Before Creating the Bug Report
Runtime platform environment
Ubuntu 24.04
RocketMQ version
rocketmq-all-5.5.0
JDK Version
Oracle JRE 8u251
Describe the Bug
RaftBrokerHeartBeatManager.scanNotActiveBroker() checks firstReceivedHeartbeatTime with the time comparison reversed. Once the initial wait window has passed, the method keeps skipping the scan instead of starting it. As a result, inactive broker detection and the follow-up re-election flow may never run.
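To make the reversed comparison concrete, here is a minimal standalone sketch. It is not the RocketMQ source: the class `ScanGateSketch`, both method names, and the 10-second `WAIT_WINDOW_MS` are hypothetical, and the buggy variant is only an assumed model of the behavior described above (scan skipped once the window has elapsed).

```java
public class ScanGateSketch {
    // Hypothetical wait window; the real value and its config key are not stated in this report.
    static final long WAIT_WINDOW_MS = 10_000L;

    // Assumed model of the reported bug: the comparison points the wrong way,
    // so the scan is allowed only while the window is still open and is
    // skipped on every call after the window has passed.
    public static boolean reversedShouldScan(long firstHeartbeatMs, long nowMs) {
        long elapsed = nowMs - firstHeartbeatMs;
        return elapsed < WAIT_WINDOW_MS;
    }

    // Expected behavior: hold off during the initial window, then scan on every call.
    public static boolean intendedShouldScan(long firstHeartbeatMs, long nowMs) {
        long elapsed = nowMs - firstHeartbeatMs;
        return elapsed >= WAIT_WINDOW_MS;
    }

    public static void main(String[] args) {
        long first = 0L; // time of the first received heartbeat
        for (long now : new long[] {1_000L, 10_000L, 60_000L}) {
            System.out.printf("t=%dms reversed=%b intended=%b%n",
                    now, reversedShouldScan(first, now), intendedShouldScan(first, now));
        }
    }
}
```

With the reversed form, every call made after the window has elapsed returns `false`, which matches the "keeps skipping the scan forever" symptom. Flipping the comparison is one plausible shape for the fix, but the actual patch depends on how the real method uses firstReceivedHeartbeatTime.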
Steps to Reproduce
- Put a broker into an error state (for example, OOM) while the broker's connection remains alive.
- JRaftController does not detect that this broker node is out of service.
What Did You Expect to See?
JRaftController removes this broker node from the alive set.
What Did You See Instead?
JRaftController did nothing; the broker node stayed in the alive set.
Additional Context
No response
Before Creating the Bug Report
I found a bug, not just asking a question, which should be created in GitHub Discussions.
I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.