Implement raft journal and snapshot management with Ratis #12181
Initial changes for hooking up the embedded journal state machine with the Ratis API. Implemented leader election, journal application, and snapshot taking on followers. pr-link: #11940 change-id: cid-0e6c9dff7df0bfdacd6d88bd320a5728ad938edb
pr-link: #12056 change-id: cid-45b0ca0655735d011ca3babafc83fc817b67860f
Replace the Copycat dependency with the much smaller Catalyst dependency. Catalyst is still needed for gRPC message serialization and deserialization. pr-link: #12082 change-id: cid-c8b21501a3379f0ee334ccbf7067fe1c5604a223
This change implements snapshot replication workflows for the embedded journal:
- When a Raft leader needs a snapshot, instead of taking a snapshot locally it copies a recent snapshot from one of the followers.
- When a Raft follower receives a notification to download a snapshot, it downloads the latest snapshot from the leader.
pr-link: #12053 change-id: cid-d2f322ef07db1db70e1a2ab6182afbb5cd439764
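As a rough illustration of the leader-side step, the sketch below picks which follower to copy a snapshot from. The class `SnapshotSourceSketch`, the `SnapshotInfo` descriptor, and the "highest term, then highest index" ordering are all assumptions made for this example; the change itself only says the leader copies "a recent snapshot" and does not state how the source is chosen.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Alluxio or Ratis. */
final class SnapshotSourceSketch {
  /** Illustrative snapshot descriptor: Raft term and last included log index. */
  static final class SnapshotInfo {
    final long term;
    final long index;

    SnapshotInfo(long term, long index) {
      this.term = term;
      this.index = index;
    }
  }

  /** Returns the id of the follower holding the most recent snapshot, if any. */
  static Optional<String> pickSource(Map<String, SnapshotInfo> followerSnapshots) {
    return followerSnapshots.entrySet().stream()
        // Assumed ordering: compare by term first, then by log index.
        .max(Comparator
            .comparingLong((Map.Entry<String, SnapshotInfo> e) -> e.getValue().term)
            .thenComparingLong(e -> e.getValue().index))
        .map(Map.Entry::getKey);
  }
}
```

With no follower snapshots the method returns an empty `Optional`, in which case a caller would have to wait or fall back to some other strategy.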
When the journal is suspended, a snapshot should not be taken, because some entries up to the snapshot index might not have been applied by the suspended journal applier. pr-link: #12100 change-id: cid-cd1adb3aa61dc28b8dfde0283f48303c1dd20e44
This change implements the quorum join process for a new master. Ratis provides an API for updating the quorum member list. To implement an automatic join comparable to what we did with Copycat, the new master sends a join request to the quorum leader after it has initialized the journal. The leader atomically converts the request into a Raft configuration change and applies it to the quorum. This operation is a no-op for a master that is already in the quorum. pr-link: #12099 change-id: cid-35f7de4ffcc56f0ce4969fd632f48c68adf1887a
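The idempotent nature of the join can be sketched as follows. This is a minimal in-memory model, not the Ratis API: `QuorumJoinSketch`, `handleJoin`, and the address strings are hypothetical names, and the place where a real leader would submit a Raft configuration change is reduced to a local set update.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical leader-side model of the join request; not the Ratis API. */
final class QuorumJoinSketch {
  private final Set<String> members = new LinkedHashSet<>();

  QuorumJoinSketch(Collection<String> initialMembers) {
    members.addAll(initialMembers);
  }

  /** Handles a join request; returns true only if the membership changed. */
  synchronized boolean handleJoin(String masterAddress) {
    if (members.contains(masterAddress)) {
      return false; // Already in the quorum: the request is a no-op.
    }
    // A real leader would apply this as a Raft configuration change;
    // this sketch only updates its local view of the member list.
    members.add(masterAddress);
    return true;
  }

  /** Returns a copy of the current member list. */
  synchronized Set<String> currentMembers() {
    return new LinkedHashSet<>(members);
  }
}
```

Synchronizing `handleJoin` stands in for the atomic conversion described above: the membership check and the update cannot interleave with another join.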
Added a deadline for master ping requests. pr-link: #12145 change-id: cid-9c7dcf07e967728651df7367aaf295a297e8ea0b
Sometimes a master attempts to join a quorum during leader election. This change extends the retry period so the retries can extend past an election. It also lowers the log severity, given the best-effort nature of the join. pr-link: #12152 change-id: cid-65c9b9f2f4360c5b16cc4edb237467a3d5c34ad3
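One way to picture a retry period that outlasts an election is a retry loop bounded by a total time budget rather than a fixed attempt count. The sketch below assumes hypothetical names (`JoinRetrySketch`, `retryWithin`) and arbitrary budget and interval values; the actual retry policy used in the change is not shown in this conversation.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical retry helper for illustration only. */
final class JoinRetrySketch {
  /**
   * Runs the attempt until it succeeds or the time budget is used up.
   * Returns true on success, false if the budget ran out or the thread
   * was interrupted.
   */
  static boolean retryWithin(BooleanSupplier attempt, long budgetMs, long intervalMs) {
    final long deadline = System.nanoTime() + budgetMs * 1_000_000L;
    while (true) {
      if (attempt.getAsBoolean()) {
        return true;
      }
      // Stop when there is no time left for another sleep-and-retry cycle.
      if (deadline - System.nanoTime() < intervalMs * 1_000_000L) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

Choosing a budget longer than the expected election duration lets a join request that first lands mid-election succeed on a later attempt.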
Sometimes a snapshot is requested when the journal log is not consistent with the state machine. This change adds a restriction so that snapshots can only be taken when a follower is in the running state. pr-link: #12150 change-id: cid-91dca0f71c9a9688e1a3cd58240b61b5df3df79b
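The guard can be sketched as a simple state check before a snapshot is taken. The state names, the class `SnapshotGuardSketch`, and the use of `-1` to signal a declined snapshot are assumptions for this example; the real state machine may use different states and return values.

```java
/** Hypothetical guard for illustration; state names are assumed. */
final class SnapshotGuardSketch {
  /** Illustrative journal states; the real state names may differ. */
  enum State { RUNNING, SUSPENDED, CATCHING_UP, CLOSED }

  private volatile State state = State.RUNNING;

  void setState(State newState) {
    state = newState;
  }

  /** Returns the snapshot index, or -1 when the snapshot request is declined. */
  long takeSnapshot(long lastAppliedIndex) {
    if (state != State.RUNNING) {
      // The log may be ahead of the state machine, so decline the snapshot.
      return -1;
    }
    return lastAppliedIndex;
  }
}
```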
Sometimes a Ratis client decides to time out and close its connection while the master tries to gain primacy. This change makes the master notify the Ratis client to reconnect when an exception is thrown, so we don't just keep retrying on a closed client. pr-link: #12151 change-id: cid-2da82374da558c9ca3c9ae2446134938b9343e8a
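The reconnect-on-exception idea might look like the wrapper below, which discards its client whenever a call throws so the next call builds a fresh one. `ReconnectingClientSketch` and its generic client type are hypothetical; the sketch does not use the Ratis client API.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical wrapper for illustration; C stands in for a real client type. */
final class ReconnectingClientSketch<C> {
  private final Supplier<C> factory;
  private C client; // null means "must (re)connect before the next call"

  ReconnectingClientSketch(Supplier<C> factory) {
    this.factory = factory;
  }

  /** Runs the operation, dropping the client if the operation throws. */
  synchronized <R> R call(Function<C, R> operation) {
    if (client == null) {
      client = factory.get();
    }
    try {
      return operation.apply(client);
    } catch (RuntimeException e) {
      client = null; // The client may be closed; reconnect on the next attempt.
      throw e;
    }
  }
}
```

The exception is still rethrown, so the caller's existing retry loop decides whether to try again; the wrapper only guarantees that the retry does not reuse a closed client.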
In a cluster with multiple masters using the embedded journal, when a network partition isolates the original leading master, leadership moves to a new leading master. The workers hang in gRPC connections to the previous leading master and do not register with the new leading master. This PR adds a timeout to the periodic connections (operations) between workers and leading masters to prevent workers from hanging forever. pr-link: #12149 change-id: cid-47af90077dfc0fed081a315e0f2b2bcd20f9fb96
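To show the effect of bounding a periodic call, here is a JDK-only sketch that gives up on a call after a timeout instead of blocking indefinitely. It is not how the change is implemented (gRPC has its own deadline mechanism); `DeadlineCallSketch`, `callWithDeadline`, and the timeout values are assumptions for illustration.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical JDK-only stand-in for an RPC with a deadline. */
final class DeadlineCallSketch {
  // Daemon threads so a stuck call cannot keep the JVM alive.
  private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r, "deadline-call");
    t.setDaemon(true);
    return t;
  });

  /** Returns the call result, or empty if the deadline passed first. */
  static <T> Optional<T> callWithDeadline(Callable<T> rpc, long timeoutMs) {
    Future<T> future = POOL.submit(rpc);
    try {
      return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
    } catch (TimeoutException e) {
      future.cancel(true); // Interrupt the hung call instead of waiting forever.
      return Optional.empty();
    } catch (InterruptedException e) {
      future.cancel(true);
      Thread.currentThread().interrupt();
      return Optional.empty();
    } catch (ExecutionException e) {
      throw new IllegalStateException("RPC failed", e.getCause());
    }
  }
}
```

A worker that receives an empty result can then look up the current leading master again rather than staying attached to the partitioned one.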
In an unlucky situation where the Raft server throws an exception while losing primacy, the master attempts to shut down but gets stuck in a deadlock of `RaftJournalSystem::losePrimacy -> System::exit -> ProcessUtils::stopProcessOnShutdown -> RaftJournalSystem::stop`. This can be fixed by throwing directly within `losePrimacy`. pr-link: #12166 change-id: cid-e20b652c8baf6a53f26b79635a9ad4813c66afe5
Currently, the Raft client for writing journal entries always checks for leadership and sends the request through the network interface, introducing significant overhead. This change implements a Raft client that first attempts to send messages directly to the local server and falls back to the default client if the local server is no longer the leader. The client also has a built-in retry for a Raft client that was closed due to a timeout. pr-link: #12174 change-id: cid-df05031cb47f6f931e67c9b4c853b9b426ba145d
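The local-first path with a fallback can be sketched as below. Everything here is hypothetical: `LocalFirstClientSketch`, `NotLeaderException`, and the two `Function` parameters stand in for the local Raft server and the default network client, neither of which is modeled with the real Ratis types.

```java
import java.util.function.Function;

/** Hypothetical local-first sender for illustration; not the Ratis client. */
final class LocalFirstClientSketch {
  /** Signals that the local server cannot accept writes because it is not the leader. */
  static final class NotLeaderException extends RuntimeException {
    NotLeaderException() {
      super("local server is not the leader");
    }
  }

  private final Function<String, String> localServer;
  private final Function<String, String> remoteClient;

  LocalFirstClientSketch(Function<String, String> localServer,
      Function<String, String> remoteClient) {
    this.localServer = localServer;
    this.remoteClient = remoteClient;
  }

  /** Sends an entry locally if possible, otherwise through the default client. */
  String send(String entry) {
    try {
      // Fast path: hand the entry to the in-process server, skipping the network.
      return localServer.apply(entry);
    } catch (NotLeaderException e) {
      // Slow path: the local server lost leadership, so use the default client.
      return remoteClient.apply(entry);
    }
  }
}
```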
@yuzhu PTAL.
Merged build finished. Test FAILed.
The Ratis leader sometimes sends a request to a follower to install a snapshot that is older than the follower's log. This change adds a check to avoid reloading the state machine after such snapshots are downloaded, which prevents some accounting issues. pr-link: #12179 change-id: cid-fc9d759f0ec0370b89dd8789b47e0cc7b807df12
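The check can be pictured as a comparison between the downloaded snapshot's index and the follower's last applied index. The class `StaleSnapshotSketch`, its fields, and the choice to also skip snapshots at exactly the applied index are assumptions for this sketch; the actual comparison used in the change is not shown here.

```java
/** Hypothetical follower-side check for illustration only. */
final class StaleSnapshotSketch {
  private long lastAppliedIndex;
  private int reloadCount;

  StaleSnapshotSketch(long lastAppliedIndex) {
    this.lastAppliedIndex = lastAppliedIndex;
  }

  /** Returns true if the state machine was reloaded from the snapshot. */
  boolean onSnapshotDownloaded(long snapshotIndex) {
    if (snapshotIndex <= lastAppliedIndex) {
      // The snapshot is not ahead of what was already applied: skip the reload.
      return false;
    }
    reloadCount++;
    lastAppliedIndex = snapshotIndex;
    return true;
  }

  long lastAppliedIndex() {
    return lastAppliedIndex;
  }

  int reloadCount() {
    return reloadCount;
  }
}
```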
Merged build finished. Test PASSed.
Just a minor comment otherwise LGTM
<groupId>org.apache.ratis</groupId>
<artifactId>ratis-grpc</artifactId>
</dependency>
<!-- TODO(lu) remove catalyst dependency which used for serialization and move to Grpc impl -->
Does this TODO need to be addressed?
This is a TODO for the future, not a target for 2.4.
alluxio-bot, feature-merge this please.
### What changes are proposed in this pull request?
Correct docs about Ratis.
### Why are the changes needed?
The embedded journal was introduced in #8219, then Copycat was replaced by Ratis in #12181. Some docs were not updated.
### Does this PR introduce any user facing changes?
No.
pr-link: #16985 change-id: cid-592a5c06991c5b067690031ef7fffed9d2e6fac6
This PR contains the following changes that implement a Raft-based journal system and snapshot management using the Ratis framework: