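Before Creating the Bug Report
- I found a bug, not just asking a question, which should be created in GitHub Discussions.
- I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
- I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.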
Runtime platform environment
Ubuntu 20.04
RocketMQ version
branch: develop
JDK Version
No response
Describe the Bug
Static topic routing and mapping validation may use the wrong mapping version when comparing mapping epochs with a large difference.
Two places sort static topic mapping metadata by epoch using subtraction and casting the result to int:
(int) (o2.getValue().getEpoch() - o1.getValue().getEpoch())
and:
(int) (o2.getEpoch() - o1.getEpoch())
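If the epoch difference does not fit in an int (for example, when it is greater than Integer.MAX_VALUE), the narrowing cast keeps only the low 32 bits, so the comparator can return a value with the wrong sign, or 0 for epochs that actually differ. Older static topic mapping metadata can then be sorted ahead of newer metadata.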
Affected files:
- remoting/src/main/java/org/apache/rocketmq/remoting/rpc/ClientMetadata.java
- remoting/src/main/java/org/apache/rocketmq/remoting/protocol/statictopic/TopicQueueMappingUtils.java
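The cast behaviour described above can be checked in isolation. The following is a minimal sketch, not RocketMQ code: the class and method names (EpochCastDemo, subtractCompare, safeCompare) are made up for illustration, and only the comparison expressions mirror the ones quoted in this report.

```java
public class EpochCastDemo {
    // Same shape as the comparators quoted in the report:
    // descending order by epoch, computed by subtraction and an int cast.
    static int subtractCompare(long epoch1, long epoch2) {
        return (int) (epoch2 - epoch1);
    }

    // Overflow-safe descending comparison.
    static int safeCompare(long epoch1, long epoch2) {
        return Long.compare(epoch2, epoch1);
    }

    public static void main(String[] args) {
        long oldEpoch = 0L;
        long newEpoch = Integer.MAX_VALUE + 1L; // 2147483648, does not fit in an int

        // The cast keeps only the low 32 bits of the difference.
        System.out.println(subtractCompare(oldEpoch, newEpoch)); // -2147483648
        // The opposite direction is negative too, so the comparator contradicts itself.
        System.out.println(subtractCompare(newEpoch, oldEpoch)); // -2147483648

        // Long.compare reports the sign correctly in both directions.
        System.out.println(safeCompare(oldEpoch, newEpoch)); // 1
        System.out.println(safeCompare(newEpoch, oldEpoch)); // -1
    }
}
```

With a difference of exactly Integer.MAX_VALUE + 1L, the subtraction-based comparison is negative in both directions, which violates the Comparator contract, so the resulting sort order is not well defined.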
Steps to Reproduce
- Create old static topic mapping metadata with epoch 0.
- Create new static topic mapping metadata with epoch Integer.MAX_VALUE + 1L.
- Use both mappings for the same logical queue/global queue.
- Build routing/mapping result through:
  - ClientMetadata.topicRouteData2EndpointsForStaticTopic(...)
  - TopicQueueMappingUtils.checkAndBuildMappingItems(..., replace=true, ...)
- Verify which mapping is selected.
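The ordering problem can also be seen without a broker, using a stand-in object. This is only a sketch under stated assumptions: MappingStub and StaleEpochRepro are hypothetical names, not RocketMQ classes, and the epoch 3_000_000_000L is an assumed value chosen because it makes the subtraction-based comparator consistently inverted, so the outcome does not depend on the input order of the list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StaleEpochRepro {
    // Hypothetical stand-in for a mapping object; only the epoch matters here.
    static final class MappingStub {
        private final String name;
        private final long epoch;

        MappingStub(String name, long epoch) {
            this.name = name;
            this.epoch = epoch;
        }

        long getEpoch() { return epoch; }
        String getName() { return name; }
    }

    // Comparator with the same shape as the one quoted in the report.
    static final Comparator<MappingStub> BY_SUBTRACTION =
        (o1, o2) -> (int) (o2.getEpoch() - o1.getEpoch());

    // Sorts a two-element list and returns the name of the element that ends up first.
    static String firstAfterSort(long oldEpoch, long newEpoch) {
        List<MappingStub> list = new ArrayList<>();
        list.add(new MappingStub("new", newEpoch));
        list.add(new MappingStub("old", oldEpoch));
        list.sort(BY_SUBTRACTION);
        return list.get(0).getName();
    }

    public static void main(String[] args) {
        // Small gap: the newer mapping comes first, as intended.
        System.out.println(firstAfterSort(0L, 5L)); // new
        // Gap larger than Integer.MAX_VALUE: the stale mapping comes first.
        System.out.println(firstAfterSort(0L, 3_000_000_000L)); // old
    }
}
```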
What Did You Expect to See?
The newer mapping with the higher epoch should always be selected, even when the epoch difference is greater than Integer.MAX_VALUE.
What Did You See Instead?
The subtraction-based comparator can overflow and process stale lower-epoch metadata before newer metadata, which may cause static topic routing or mapping replacement to use stale broker mapping information.
Additional Context
Suggested fix: use Long.compare(...) instead of subtraction-based comparison:
mappingInfos.sort((o1, o2) -> Long.compare(o2.getValue().getEpoch(), o1.getValue().getEpoch()));
and:
mappingDetailList.sort((o1, o2) -> Long.compare(o2.getEpoch(), o1.getEpoch()));
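As a quick sanity check of the Long.compare approach, the sketch below sorts plain epoch values newest-first. It is illustrative only: EpochSortCheck and sortNewestFirst are hypothetical names, and the list holds bare Long values rather than the real mapping types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EpochSortCheck {
    // Sorts epochs newest-first using the overflow-safe comparison from the suggested fix.
    static List<Long> sortNewestFirst(List<Long> epochs) {
        List<Long> copy = new ArrayList<>(epochs);
        copy.sort((a, b) -> Long.compare(b, a));
        return copy;
    }

    public static void main(String[] args) {
        List<Long> epochs = Arrays.asList(0L, Integer.MAX_VALUE + 1L, 7L, Long.MAX_VALUE);
        // Prints [9223372036854775807, 2147483648, 7, 0]
        System.out.println(sortNewestFirst(epochs));
    }
}
```

Because Long.compare never subtracts, the ordering stays correct for any pair of long epochs, including the extremes of the long range.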