KAFKA-8930: MirrorMaker v2 documentation #324

Merged: 2 commits, Jan 27, 2021

miguno
Contributor

@miguno miguno commented Jan 22, 2021

This adds a new user-facing documentation section, "Geo-replication (Cross-Cluster Data Mirroring)", to the Kafka Operations documentation. The section covers MirrorMaker v2.

@ryannedolan ryannedolan left a comment

This is great! Couple small comments so far.

27/ops.html Outdated
<h4 class="anchor-heading"><a id="georeplication-flows" class="anchor-link"></a><a href="#georeplication-flows">What Are Replication Flows</a></h4>

<p>
With MirrorMaker, Kafka administrators can replicate topics, topic configurations, consumer groups and their offsets, and ACLs from one or more source Kafka clusters to one or more target Kafka clusters, i.e., across cluster environments. In a nutshell, MirrorMaker consumes data from the source cluster with source connectors, and then replicates the data by producing to the target cluster with sink connectors.

"with sink connectors" is not true at the moment, since I don't think we have a sink connector yet. And even when we do, it would usually be sufficient to use either a source or a sink connector, not both. There are certainly cases where this sentence is true, but I think it's misleading as a general statement.

Maybe "In a nutshell, MirrorMaker uses Connectors to consume from source clusters and produce to target clusters" or something like that.

Contributor Author

Thanks, @ryannedolan. Text updated.

<h5 class="anchor-heading"><a id="georeplication-topic-naming" class="anchor-link"></a><a href="#georeplication-topic-naming">Custom Naming of Replicated Topics in Target Clusters</a></h5>

<p>
Replicated topics in a target cluster—sometimes called <em>remote</em> topics—are renamed according to a replication policy. MirrorMaker uses this policy to ensure that events (aka records, messages) from different clusters are not written to the same topic-partition. By default as per <a href="https://github.com/apache/kafka/blob/trunk/connect/mirror-client/src/main/java/org/apache/kafka/connect/mirror/DefaultReplicationPolicy.java">DefaultReplicationPolicy</a>, the names of replicated topics in the target clusters have the format <code>{source}.{source_topic_name}</code>:

I think "records" is more prevalent in Kafka docs vs "events". Maybe verify that and stick with whatever the rest of the docs use.

Contributor Author

The use of "events" has become more prevalent, see e.g. the updated introduction page (https://kafka.apache.org/intro) and quickstart (https://kafka.apache.org/quickstart). Hence I prefer to leave the text as-is. The included "(aka records, messages)" parenthesis makes the nomenclature sufficiently clear to the reader, imho.
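
For readers following the naming discussion, the default `{source}.{source_topic_name}` format quoted in the diff above can be illustrated with a minimal sketch. This is not Kafka code: the function names `format_remote_topic` and `topic_source`, the cluster aliases, and the topic name are hypothetical, and the `.` separator is assumed from the format string in the documentation text.

```python
from typing import Optional

# Assumption: "." is the separator, taken from the {source}.{source_topic_name}
# format quoted in the documentation text under review.
SEPARATOR = "."


def format_remote_topic(source_alias: str, topic: str, separator: str = SEPARATOR) -> str:
    """Build the name a replicated (remote) topic would get in the target cluster."""
    return f"{source_alias}{separator}{topic}"


def topic_source(remote_topic: str, separator: str = SEPARATOR) -> Optional[str]:
    """Return the text before the first separator, or None if there is no separator."""
    alias, found, _ = remote_topic.partition(separator)
    return alias if found else None


if __name__ == "__main__":
    # Hypothetical cluster aliases and topic name, for illustration only.
    remote = format_remote_topic("us-west", "orders")
    print(remote)                  # us-west.orders
    print(topic_source(remote))    # us-west
    print(topic_source("orders"))  # None
```

Because each source cluster contributes its own prefix, topics with the same name from different clusters land in distinct topics in the target cluster. One caveat of this simplified sketch: a local topic whose name already contains the separator would be misread as a remote topic.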

Contributor

@bbejeck bbejeck left a comment

LGTM!

I validated by rendering the changes locally

@bbejeck bbejeck merged commit 810af5a into apache:asf-site Jan 27, 2021
@omkreddy
Contributor

We should also add these docs to kafka/docs repo

@bbejeck
Contributor

bbejeck commented Jan 27, 2021

> We should also add these docs to kafka/docs repo

@omkreddy, yes a PR for kafka/docs is coming soon

@miguno
Contributor Author

miguno commented Jan 27, 2021

kafka/docs PR is up at apache/kafka#9983

bbejeck pushed a commit to apache/kafka that referenced this pull request Jan 27, 2021
This adds a new user-facing documentation "Geo-replication (Cross-Cluster Data Mirroring)" section to the Kafka Operations documentation that covers MirrorMaker v2.

Was already merged to kafka-site via apache/kafka-site#324.
Reviewers: Bill Bejeck <bbejeck@apache.org>
a0x8o added a commit to a0x8o/kafka that referenced this pull request Jan 27, 2021
This adds a new user-facing documentation "Geo-replication (Cross-Cluster Data Mirroring)" section to the Kafka Operations documentation that covers MirrorMaker v2.

Was already merged to kafka-site via apache/kafka-site#324.
Reviewers: Bill Bejeck <bbejeck@apache.org>
4 participants