@ShivsundarR ShivsundarR commented Oct 9, 2025

What
https://issues.apache.org/jira/browse/KAFKA-19485

Bug:
ShareConsumeRequestManager adds acknowledgements to a request on the
initial ShareSession epoch, even though it has already checked for that
condition.
The fix includes acknowledgements in the request only when required.

This PR also adds the check at another point in the code where such
acknowledgements could be sent. One such case is when metadata is
refreshed with empty topic IDs after a broker restart, which means leader
information is not available for the node. For example:

  • Consumer subscribed to a partition whose leader was node-0.
  • Broker restart happens and node-0 is elected leader again. Broker
    starts a new ShareSession.
  • Background thread sends a fetch request with non-zero epoch.
  • Broker responds with SHARE_SESSION_NOT_FOUND.
  • Client updates session epoch to 0 once it receives this error.
  • Client updates metadata but receives empty metadata response. (Leader
    unavailable)
  • Application thread finishes processing the previous fetch and queues
    acks to piggyback on the next fetch.
  • The next fetch sends the piggybacked acknowledgements for the
    previously subscribed partitions, resulting in an error from the broker
    ("Acknowledge data present on initial epoch"). (Currently the client
    attempts to send even if the leader is unavailable.)
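The sequence above can be modelled with a small sketch. This is only an illustration under stated assumptions: `ShareSessionSketch`, its methods, and the `brokerAccepts` rule are hypothetical names invented here, not Kafka's actual classes, and the broker check is inferred solely from the error message quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of client-side share session state; not Kafka's real classes.
final class ShareSessionSketch {
    static final int INITIAL_EPOCH = 0;

    private int epoch = INITIAL_EPOCH;
    private final List<Long> pendingAcks = new ArrayList<>();

    int epoch() { return epoch; }

    // A successful fetch advances the session epoch.
    void onFetchSuccess() { epoch++; }

    // SHARE_SESSION_NOT_FOUND resets the client to the initial epoch.
    void onSessionNotFound() { epoch = INITIAL_EPOCH; }

    // The application thread queues an ack to piggyback on the next fetch.
    void queueAck(long offset) { pendingAcks.add(offset); }

    // Pre-fix behaviour in this model: acks are attached regardless of epoch.
    List<Long> drainAcksUnguarded() {
        List<Long> out = new ArrayList<>(pendingAcks);
        pendingAcks.clear();
        return out;
    }

    // Assumed broker-side rule: no acknowledge data on the initial epoch.
    static boolean brokerAccepts(int epoch, List<Long> acks) {
        return epoch != INITIAL_EPOCH || acks.isEmpty();
    }
}
```

In this model, a fetch that follows `onSessionNotFound()` and `queueAck(...)` carries acks on epoch 0, which `brokerAccepts` rejects. That is the failure the bullets describe.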

Fix: Before sending out acknowledgements, check whether we are on an
initial epoch.
Added a unit test covering the above scenario.
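A minimal sketch of the kind of guard described, assuming the initial epoch is 0 as in the scenario above. `AckGuardSketch` and `acksForNextFetch` are hypothetical names, not the methods changed in this PR, and the sketch does not model what the client does with the withheld acknowledgements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical guard; the real check lives in ShareConsumeRequestManager.
final class AckGuardSketch {
    static final int INITIAL_EPOCH = 0;

    static boolean isInitialEpoch(int epoch) {
        return epoch == INITIAL_EPOCH;
    }

    // Returns the acks that may piggyback on the next fetch request.
    static List<Long> acksForNextFetch(int epoch, List<Long> pendingAcks) {
        if (isInitialEpoch(epoch)) {
            // On the initial epoch, send the fetch without acknowledge data.
            return Collections.emptyList();
        }
        return new ArrayList<>(pendingAcks);
    }
}
```

With this guard, a request built on epoch 0 never carries acknowledge data, while requests on later epochs pass the pending acks through unchanged.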

Reviewers: Andrew Schofield <aschofield@confluent.io>

…poch. (apache#20135)

@AndrewJSchofield AndrewJSchofield self-requested a review October 13, 2025 09:39
@AndrewJSchofield AndrewJSchofield left a comment

lgtm

@AndrewJSchofield AndrewJSchofield merged commit e76f273 into apache:4.1 Oct 13, 2025
27 of 32 checks passed