Description
In what version(s) of Spring for Apache Kafka are you seeing this issue?
2.8.10
Describe the bug
This is a conceptual description of a Kafka duplicate consumption scenario that we are seeing in our load test environment:
GIVEN Two application instances on separate EC2 instances, each with one Kafka consumer and the same consumer group ID.
GIVEN Consumption with a Spring BatchAcknowledgingMessageListener that synchronously consumes a batch and then calls Acknowledgment.acknowledge().
GIVEN A single-threaded ConcurrentMessageListenerContainer with AckMode MANUAL_IMMEDIATE and syncCommits enabled (see the configuration sketch after this scenario).
WHEN One of the application instances shuts down (its consumer leaves the consumer group).
THEN The consumer group is rebalanced so that the remaining consumer gets all of the partitions.
WHEN The first application instance starts again (its consumer joins the consumer group).
THEN The consumer group is rebalanced so that each consumer gets half of the partitions.
THEN A handful of messages are successfully consumed by both consumers.
THEN The consumer that gets half of its partitions revoked logs one or more messages of the following type before the partition revocation is logged:
2022-11-17 08:24:21,809 INFO [consumer-0-C-1] o.a.k.c.c.i.ConsumerCoordinator [ConsumerCoordinator.java:1156] [Consumer clientId=consumer-xms-batch-mt-callback-3, groupId=xms-batch-mt-callback] Failing OffsetCommit request since the consumer is not part of an active group
This happens sometimes when executing a test case with an application instance restart under moderate traffic load (consumption of 1000 records per second). By contrast, when there is no duplication, the partition revocation is logged first, and only after that do the failing offset commits appear in the log.
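For clarity, here is a minimal sketch of the container setup described in the GIVENs above, assuming Spring for Apache Kafka 2.8.x. The broker address, topic name, and String deserializers are placeholder assumptions; the group ID is taken from the log line:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.BatchAcknowledgingMessageListener;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.listener.ContainerProperties;

public class ContainerSetupSketch {

    public static void main(String[] args) {
        // Placeholder broker address; group ID taken from the log line above.
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "xms-batch-mt-callback");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        DefaultKafkaConsumerFactory<String, String> consumerFactory =
                new DefaultKafkaConsumerFactory<>(props);

        // "some-topic" is a placeholder for the real topic name.
        ContainerProperties containerProps = new ContainerProperties("some-topic");
        containerProps.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
        containerProps.setSyncCommits(true);
        containerProps.setMessageListener((BatchAcknowledgingMessageListener<String, String>)
                (records, ack) -> {
                    // ... synchronous processing of the whole batch happens here ...
                    ack.acknowledge(); // commits the batch offsets immediately and synchronously
                });

        ConcurrentMessageListenerContainer<String, String> container =
                new ConcurrentMessageListenerContainer<>(consumerFactory, containerProps);
        container.setConcurrency(1); // single-threaded: one consumer per application instance
        container.start();
    }
}
```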
To Reproduce
- Set up two KafkaMessageListenerContainers with BatchAcknowledgingMessageListeners, belonging to the same consumer group, consuming from the same topic with at least two partitions, with the non-default consumer configuration settings heartbeat.interval.ms = 1500 and session.timeout.ms = 6000 (a sketch of these settings follows below).
- Start sending records with a serialized payload of roughly 100 bytes at a rate of 1000 per second.
- Stop one of the consumers, wait for a minute, then start it again.
In our test setup this triggers duplicate consumption of a handful of messages roughly two times out of ten.
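For reference, the two non-default settings map to the following ConsumerConfig keys; this sketch extends the props map from the configuration sketch above, with everything else left at its default:

```java
// Non-default timeouts used in the reproduction; all other consumer settings
// are left at their defaults.
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 1500); // heartbeat.interval.ms
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 6000);    // session.timeout.ms
```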
Expected behavior
Every produced record is consumed exactly once, by one of the two consumers.
Sample
TBD.