Skip to content

# Kafka MirrorMaker 2 Enhancement Project#21923

Open
nitinbs1999 wants to merge 414 commits intoapache:trunkfrom
nitinbs1999:enhanced-mirror-maker2
Open

# Kafka MirrorMaker 2 Enhancement Project#21923
nitinbs1999 wants to merge 414 commits intoapache:trunkfrom
nitinbs1999:enhanced-mirror-maker2

Conversation

@nitinbs1999
Copy link
Copy Markdown

Project Overview

This project implements critical enhancements to Apache Kafka MirrorMaker 2 (MM2) to handle edge cases in data replication, particularly focusing on fail-fast mechanisms for log truncation and topic reset scenarios. The improvements ensure data consistency and prevent silent failures during Kafka topic replication between primary and standby clusters.

Project Structure

kafka/
|
|-- connect/mirror/                     # MirrorMaker 2 source code
|   `-- src/main/java/...
|-- core/                               # Kafka core modules
|-- clients/                            # Kafka clients
|-- build.gradle                        # Kafka build configuration
`-- mirror-maker2-project/              # Validation assets packaged in this branch
      |-- README.md                       # Project documentation
      |-- Test.md                         # Captured test output notes
      |-- docker-compose.yml              # Docker Compose orchestration
      |-- run_challenge.sh                # Test scenario runner script
      |-- config/
      |   `-- mm2.properties              # MirrorMaker 2 configuration
      |-- docker/
      |   `-- kafka/
      |       `-- Dockerfile/
      |           `-- dockerfile          # Dockerfile for custom MM2 image
      `-- Producer/                       # Test producer application
            |-- src/main/java/ProducerApp.java
            |-- build.gradle
            |-- settings.gradle
            |-- gradle.properties
            |-- gradlew
            |-- gradlew.bat
            |-- gradle/
            |   |-- libs.versions.toml
            |   `-- wrapper/
            |       |-- gradle-wrapper.jar
            |       `-- gradle-wrapper.properties
            `-- Dockerfile/
                  `-- dockerfile

apoorvmittal10 and others added 30 commits January 16, 2025 04:49
Removed Optional for SharePartitionManager and ClientMetricsManager as zookeeper code is being removed. Also removed asScala and asJava conversion in KafkaApis.handleListClientMetricsResources, moved to java stream.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
…fter removing zookeeper (apache#18365)

This patch introduces a new page to document the configs and metrics that have been removed in the transition to 4.0. While these removed items lack a formal deprecation cycle as they are part of KIP-500, KIP-500 itself does not provide an exhaustive list of all impacted configs and metrics. Therefore, this new page aims to assist Kafka users in understanding the specific configs and metrics that have been removed in the 4.0 release.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The PR removes dependency of server module on share-coordinator, rather it should be other way. Moved the ShareCoordinatorConfig class from server to share-coordinator.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
…18414)

In 4.0, there is no ZK mode and both of these configs are required in kraft mode.

Reviewers: Ismael Juma <ismael@juma.me.uk>
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Divij Vaidya <diviv@amazon.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
…EFAULT and NUM_RECOVERY_THREADS_PER_DATA_DIR_CONFIG (apache#18106)

Reviewers: Divij Vaidya <diviv@amazon.com>
…ns (apache#18568)


Reviewers: Mickael Maison <mickael.maison@gmail.com>
…rtitionsMetadata, ZkConfigRepository, DelayedDeleteTopics (apache#18574)


Reviewers: Mickael Maison <mickael.maison@gmail.com>
…rtitionsAssignedCallback (apache#18515)

Reviewers: Lianet Magrans <lmagrans@confluent.io>
Reviewers: Andrew Schofield <aschofield@confluent.io>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>
…pache#18565)

Since the example.com DNS lookup changed the second time within one
year, we rewrote the unit tests for ClientUtils so that they do
not make a real DNS lookup to the outside but use mocks.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
Remove KafkaController and related unused references:

* ControllerChannelContext
* ControllerChannelManager
* ControllerEventManager
* ControllerState
* PartitionStateMachine
* ReplicaStateMachine
* TopicDeletionManager
* ZkBrokerEpochManager

Reviewers: Ismael Juma <ismael@juma.me.uk>
…#18406)

Add some logs when offline/online happens.

Reviewers: David Jacot <djacot@confluent.io>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
…8240)

This PR implements the second part of KIP-996 and KAFKA-16164 (tasks KAFKA-16607, KAFKA-17642, KAFKA-17643, KAFKA-17675) which encompass the response handling of PreVotes, addition of new ProspectiveState, update to metrics, and addition of Raft simulation tests.

Voters now transition to ProspectiveState first before CandidateState to prevent unnecessary epoch bumps. Voters in ProspectiveState send PreVotes requests which are Vote requests with PreVote set to true.

Follower grants PreVotes if it has not yet fetched successfully from leader. Leader denies all PreVotes. Unattached, Prospective, Candidate, and Resigned will grant PreVotes if the requesting replica's log is at least as long as theirs. Granted PreVotes are not persisted like standard votes. It is possible for a voter to grant several PreVotes in the same epoch.

The only state which is allowed to transition directly to CandidateState is ProspectiveState. This happens on reception of majority of granted PreVotes or if at least one voter doesn't support PreVote requests.

Prospective will transition to Follower after election loss/timeout if it was already aware of last known leader and the leader's endpoint, or at any point if it discovers the leader.

Prospective will transition to Unattached after election loss/timeout if it does not know the leader endpoints.

After electionTimeout, Resigned now always transitions to Unattached and increases the epoch.

Prospective grants standard votes if it has not already granted a standard vote (no votedKey), has no leaderId, and the recipient's log is current enough

Candidate no longer backs off after election timeout. Candidate still backs off after election loss.

Reviewers: José Armando García Sancio <jsancio@apache.org>
…pache#17870)

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
)

Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
…pache#18524)

As per the discussion with @ijuma and @mumrah, the `share` module seems not required and it's advised to user `server` and `server-common` instead. The PR moves the classes from `share` module to respective server related modules.

Following has been refactored in the PR:

- Moved Share Fetch, Acknowledge, Session, Context and Cache related classes to `server` module as the classes are used by `core` and `tools` modules.
- Moved `Persister` releated classes from `share` to `server-common` as the Persister classes though currently just being used by `core` module but in [near future](apache#17775) will also be used by `group-coordinator`. Hence the Persister classes shouldn't go in `server`. The debate is mostly between `coordinator-common` vs `server-common`. We have kept the Persister in `server-common` for now, the classes are more related to the server than the coordinator. Persister is basically an abstraction in the server to let you choose how you want to persist the share group progress.
- Updated build.gradle to remove `share` module.
- Removed `import-control-share.xml`

Reviewers: Ismael Juma <ismael@juma.me.uk>
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
@github-actions github-actions bot added triage PRs from the community streams core Kafka Broker producer consumer tools connect performance kraft mirror-maker-2 dependencies Pull requests that update a dependency file storage Pull requests that target the storage module tiered-storage Related to the Tiered Storage feature KIP-932 Queues for Kafka build Gradle build or GitHub Actions docker Official Docker image generator RPC and Record code generator transactions Transactions and EOS clients group-coordinator labels Apr 1, 2026
Removed log truncation detection messages from README.
…opic reset recovery

- Add detectTruncation() method to track per-partition offset progression
- Add handleTopicReset() method for automatic recovery when topic is recreated
- Add lastSeenOffsets map to maintain offset state per TopicPartition
- Three-case logic: normal flow, offset regression (truncation/reset), offset jump
- Update README.md with new implementation details and actual test results
- Update run_challenge.sh and mm2.properties for test scenarios
- Remove Test.md (content consolidated into README.md)
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 9, 2026

A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Gradle build or GitHub Actions clients connect consumer core Kafka Broker dependencies Pull requests that update a dependency file docker Official Docker image generator RPC and Record code generator group-coordinator KIP-932 Queues for Kafka kraft mirror-maker-2 needs-attention performance producer storage Pull requests that target the storage module streams tiered-storage Related to the Tiered Storage feature tools transactions Transactions and EOS triage PRs from the community

Projects

None yet

Development

Successfully merging this pull request may close these issues.