
[Bug] Static topic routing may use stale epoch due to comparator overflow #10580

Description

@Aias00

Before Creating the Bug Report

  • I found a bug and am not just asking a question; questions should be posted in GitHub Discussions.

  • I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.

  • I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

Runtime platform environment

Ubuntu 20.04

RocketMQ version

branch: develop

JDK Version

No response

Describe the Bug

Static topic routing and mapping validation may select the wrong mapping version when the epochs of two mappings differ by more than Integer.MAX_VALUE.

Two places sort static topic mapping metadata by epoch using subtraction and casting the result to int:

(int) (o2.getValue().getEpoch() - o1.getValue().getEpoch())

and:

(int) (o2.getEpoch() - o1.getEpoch())

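The epochs are long values. If their difference does not fit in an int (for example, it is greater than Integer.MAX_VALUE), the narrowing cast keeps only the low 32 bits, so the result can have the wrong sign or be zero and the comparator returns the wrong ordering. This can cause older static topic mapping metadata to be processed before newer metadata.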
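To make the arithmetic concrete, here is a minimal sketch of what the cast does to a few epoch gaps. `EpochCastDemo` and `subtractionCompare` are hypothetical names for illustration only; the method body mirrors the expression quoted above, not the actual RocketMQ classes.

```java
public class EpochCastDemo {
    // Same shape as the reported comparison: descending by epoch,
    // with the long difference narrowed to int.
    public static int subtractionCompare(long epoch1, long epoch2) {
        return (int) (epoch2 - epoch1);
    }

    public static void main(String[] args) {
        long oldEpoch = 0L;
        long newEpoch = Integer.MAX_VALUE + 1L; // 2^31, does not fit in an int

        // A descending comparator should return a positive value for (old, new).
        System.out.println(subtractionCompare(oldEpoch, newEpoch)); // -2147483648
        // For a gap of exactly 2^31 the swapped call is negative as well,
        // so the comparator contradicts itself.
        System.out.println(subtractionCompare(newEpoch, oldEpoch)); // -2147483648
        // A gap that is a multiple of 2^32 collapses to 0 ("equal").
        System.out.println(subtractionCompare(0L, 1L << 32));       // 0
        // Long.compare never narrows, so only the sign of the comparison matters.
        System.out.println(Long.compare(newEpoch, oldEpoch) > 0);   // true
    }
}
```

Because the comparator is negative in both directions for a gap of exactly 2^31, the outcome of a sort over such a pair can depend on the input order.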

Affected files:

  • remoting/src/main/java/org/apache/rocketmq/remoting/rpc/ClientMetadata.java
  • remoting/src/main/java/org/apache/rocketmq/remoting/protocol/statictopic/TopicQueueMappingUtils.java

Steps to Reproduce

  1. Create old static topic mapping metadata with epoch 0.

  2. Create new static topic mapping metadata with epoch Integer.MAX_VALUE + 1L.

  3. Use both mappings for the same logical queue/global queue.

  4. Build the routing/mapping result through:

    • ClientMetadata.topicRouteData2EndpointsForStaticTopic(...)
    • TopicQueueMappingUtils.checkAndBuildMappingItems(..., replace=true, ...)

  5. Verify which mapping is selected.
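The steps above can be approximated without a broker by exercising the comparator directly. The following is only a sketch: `StaleEpochRepro`, `Mapping`, and `ordersBefore` are hypothetical stand-ins, not RocketMQ classes, and the check targets the comparator's sign rather than the real routing code paths listed in step 4.

```java
import java.util.Comparator;

public class StaleEpochRepro {
    // Hypothetical stand-in for one static topic mapping entry.
    public static final class Mapping {
        final String broker;
        final long epoch;

        public Mapping(String broker, long epoch) {
            this.broker = broker;
            this.epoch = epoch;
        }

        public long getEpoch() {
            return epoch;
        }
    }

    // Same shape as the comparator quoted in the report (intended: newest epoch first).
    public static final Comparator<Mapping> SUBTRACTION =
        (o1, o2) -> (int) (o2.getEpoch() - o1.getEpoch());

    // True when the comparator claims `first` belongs ahead of `second`.
    public static boolean ordersBefore(Comparator<Mapping> c, Mapping first, Mapping second) {
        return c.compare(first, second) < 0;
    }

    public static void main(String[] args) {
        Mapping stale = new Mapping("broker-a", 0L);                     // step 1
        Mapping fresh = new Mapping("broker-b", Integer.MAX_VALUE + 1L); // step 2

        // Step 5: a correct newest-first comparator would print false here.
        System.out.println(ordersBefore(SUBTRACTION, stale, fresh)); // true
    }
}
```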

What Did You Expect to See?

The newer mapping with the higher epoch should always be selected, even when the epoch difference is greater than Integer.MAX_VALUE.

What Did You See Instead?

The subtraction-based comparator can overflow and order stale lower-epoch metadata ahead of newer metadata, so static topic routing or mapping replacement may use stale broker mapping information.

Additional Context

Suggested fix: use Long.compare(...) instead of subtraction-based comparison:

mappingInfos.sort((o1, o2) -> Long.compare(o2.getValue().getEpoch(), o1.getValue().getEpoch()));

and:

mappingDetailList.sort((o1, o2) -> Long.compare(o2.getEpoch(), o1.getEpoch()));
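As a rough regression check for the suggested fix, the sketch below sorts a small list with a `Long.compare`-based comparator and confirms that the highest epoch comes first regardless of input order. `EpochOrderFixed`, `Mapping`, and `newestFirst` are hypothetical names used for illustration; only the comparator shape is taken from the suggestion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EpochOrderFixed {
    // Hypothetical stand-in for one static topic mapping entry.
    public static final class Mapping {
        final String broker;
        final long epoch;

        public Mapping(String broker, long epoch) {
            this.broker = broker;
            this.epoch = epoch;
        }

        public String getBroker() {
            return broker;
        }

        public long getEpoch() {
            return epoch;
        }
    }

    // Returns a copy sorted newest-first, comparing epochs without subtraction.
    public static List<Mapping> newestFirst(List<Mapping> input) {
        List<Mapping> sorted = new ArrayList<>(input);
        sorted.sort((o1, o2) -> Long.compare(o2.getEpoch(), o1.getEpoch()));
        return sorted;
    }

    public static void main(String[] args) {
        Mapping stale = new Mapping("broker-a", 0L);
        Mapping fresh = new Mapping("broker-b", Integer.MAX_VALUE + 1L);

        // The newer mapping is selected in either input order.
        System.out.println(newestFirst(Arrays.asList(stale, fresh)).get(0).getBroker()); // broker-b
        System.out.println(newestFirst(Arrays.asList(fresh, stale)).get(0).getBroker()); // broker-b
    }
}
```

`Long.compare` also stays correct when the subtraction itself would overflow a long, for example when comparing Long.MIN_VALUE with Long.MAX_VALUE.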
