[RIP-44] Support DLedger Controller #4330
RongtongJin added the soc (Summer of Code, hosted by Google, Alibaba, Chinese Academy of Sciences and so on) and module/ha (high availability related) labels on May 16, 2022
RongtongJin changed the title from "Tracking issue: Let rocketmq support switching master-slave." to "[RIP-44] Support DLedger Controller" on Jul 1, 2022
RongtongJin pushed a commit that referenced this issue on Nov 19, 2022
drpmma pushed a commit that referenced this issue on Feb 21, 2023
This issue is stale because it has been open for 365 days with no activity. It will be closed in 3 days if no further activity occurs.
This issue was closed because it has been inactive for 3 days since being marked as stale.
Background
RocketMQ 4.5.0 introduced DLedger mode (Raft). Under this architecture, a Raft commitlog replaces the original commitlog so that a broker group can fail over. However, because replication is handled by Raft, this architecture has some disadvantages:
To have failover ability, a broker group must have 3 or more replicas.
Acks from replicas must strictly follow the majority rule of the Raft protocol: a 3-replica architecture requires acks from 2 replicas before returning, and a 5-replica architecture requires acks from 3.
Since the store in DLedger mode relies on OpenMessaging DLedger, the native storage and replication capabilities of RocketMQ (such as transientStorePool and zero-copy) cannot be reused, and maintenance becomes more difficult.
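The majority rule in the second point can be checked with a little arithmetic. The following is a minimal sketch, not code from RocketMQ or DLedger: the class and method names are hypothetical, and it assumes the ack count includes the leader's own write.

```java
// Illustrative sketch only: QuorumSketch and its methods are hypothetical
// names, not part of RocketMQ or DLedger.
final class QuorumSketch {

    /** Smallest number of acks that forms a strict majority of the group. */
    static int majority(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    /** Number of replicas that may fail while a majority can still be formed. */
    static int tolerableFailures(int replicas) {
        return replicas - majority(replicas);
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 5}) {
            System.out.printf("replicas=%d majority=%d tolerableFailures=%d%n",
                    n, majority(n), tolerableFailures(n));
        }
    }
}
```

Under this rule a 2-replica group tolerates no failures, which is why failover in DLedger mode needs at least 3 replicas.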
To address these problems, I would like to start RIP-44 Support DLedger Controller. With this improvement, the DLedger (Raft) capability is lifted into an upper layer and becomes an optional, loosely coupled coordination component named DLedger Controller.
Once DLedger Controller is deployed, the master-slave architecture also gains failover capability. The DLedger Controller can optionally be embedded into the NameServer (the NameServer itself remains stateless and cannot provide election capability when the majority is down), or it can be deployed independently.
DLedger Controller is an optional component that does not change the existing operation and maintenance mode. Unlike other components, its downtime does not affect online services. In addition, RIP-44 unifies the storage and replication of RocketMQ, resulting in lower maintenance costs and faster development iterations. In terms of compatibility, the master-slave architecture can be upgraded without compatibility problems.
I've already done the work with @RongtongJin. Our proposals are available at the links below:
https://docs.google.com/document/d/1tSJkor_3Js4NBaVA0UENGyM8Mh0SrRMXszRyI91hjJ8/edit?usp=sharing
Chinese version:
https://shimo.im/docs/N2A1Mz9QZltQZoAD/
The following PRs are the main work:
Add a state machine mode to DLedger: Feature: add statemachine for dledger openmessaging/dledger#128
Embed a strongly consistent controller based on DLedger in the NameServer: [Summer of Code] Dledger controller #4195
Add a new HAService, AutoSwitchHAService, which uses a new log replication protocol to support role switching at the HAService level: [Summer of Code] Support switch role for ha service #4236
Connect the broker to the DLedger Controller interface so that the broker can switch between master and slave: [Summer of Code] Support switch role for broker #4272
Add a learner role, which does not join the inSyncStateSet and only replicates logs from the master asynchronously: [Summer of code] Support async learner in controller mode #4367
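To illustrate how the controller, the inSyncStateSet, and the learner role fit together, here is a minimal sketch of a failover decision. It is not the controller's actual code: the class, the method, and the "first alive member wins" selection rule are all assumptions made for illustration. The only property taken from the description above is that a learner never joins the inSyncStateSet, so it can never be chosen as master.

```java
import java.util.Optional;
import java.util.Set;

// Illustrative sketch only: ElectionSketch and electMaster are hypothetical
// names, and the selection rule is an assumption, not RocketMQ's behavior.
final class ElectionSketch {

    /**
     * Picks a new master among the in-sync brokers that are still alive.
     * Brokers outside the set (for example learners) are never candidates.
     */
    static Optional<String> electMaster(Set<String> inSyncStateSet, Set<String> aliveBrokers) {
        for (String broker : inSyncStateSet) {
            if (aliveBrokers.contains(broker)) {
                return Optional.of(broker); // assumed rule: first alive in-sync broker
            }
        }
        return Optional.empty(); // no in-sync broker is alive, so no safe candidate
    }
}
```

Restricting candidates to the in-sync set is the design point of the sketch: a broker that replicates asynchronously may be missing recent messages, so promoting it could lose data.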
The following PRs are for optimization and adjustment:
Separate the controller from the NameServer so that it can be deployed independently: [Summer of code] Stand alone a new controller module #4333
Modify the definition of syncStateSet in AutoSwitchHAService and introduce the confirmOffset mechanism: [Summer of code] Shrink and expand InSyncStateSet #4355
Add admin tools for controller mode (GetSyncStateSet and GetBrokerEpochCache): [Summer of code] Add admin tools for controller mode #4388
Reuse the remotingServer in DLedger: [Summer of code] Reuse dledger remotingServer in controller mode. #4409
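The issue text does not define how the syncStateSet shrinks and expands or how confirmOffset is computed, so the following is only a rough sketch under stated assumptions. It assumes a slave stays in the set while it has caught up with the master within a time window, and that confirmOffset is the smallest max offset among the members of the set. All names and parameters are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: names, parameters and both rules are assumptions,
// not the AutoSwitchHAService implementation.
final class SyncStateSetSketch {

    /** The master plus every slave that caught up within maxLagMillis (assumed rule). */
    static Set<String> syncStateSet(String master, Map<String, Long> lastCaughtUpMillis,
                                    long nowMillis, long maxLagMillis) {
        Set<String> result = new TreeSet<>();
        result.add(master); // the master is always a member of its own set
        for (Map.Entry<String, Long> entry : lastCaughtUpMillis.entrySet()) {
            if (nowMillis - entry.getValue() <= maxLagMillis) {
                result.add(entry.getKey()); // a slave that fell behind too long is left out
            }
        }
        return result;
    }

    /** Smallest max offset among the members of the set (assumed definition). */
    static long confirmOffset(Set<String> syncStateSet, Map<String, Long> maxOffsets) {
        if (syncStateSet.isEmpty()) {
            throw new IllegalArgumentException("syncStateSet must not be empty");
        }
        long min = Long.MAX_VALUE;
        for (String broker : syncStateSet) {
            Long offset = maxOffsets.get(broker);
            if (offset == null) {
                throw new IllegalArgumentException("no offset for " + broker);
            }
            min = Math.min(min, offset);
        }
        return min;
    }
}
```

Under these assumptions, a slave that lags far behind drops out of the set and no longer holds confirmOffset back, while every remaining member is known to hold all data up to confirmOffset.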
Document
Test design
RIP42
https://shimo.im/docs/N2A1Mz9QZltQZoAD