This repository has been archived by the owner on Nov 20, 2020. It is now read-only.

Consumer fails to commit offset periodically after partition rebalance event #152

Closed
chrnola opened this issue Jun 26, 2017 · 0 comments

chrnola commented Jun 26, 2017

My team has observed what we believe to be a bug in the Kafunk v0.1 consumer. We have a consumer group composed of 4 hosts located in Azure's uswest region all consuming a single 8-partition topic from a broker located in Azure's useast2 region (so, nontrivial round-trip times).

The attached logs span the period from 2017-06-11 22:00:00 UTC to 2017-06-19 23:00:00 UTC. Note that not a single message was written to the topic during this period. Only partition 4 has ever had a message published to it (log size = 1). The other 7 partitions have a log size of 0.

For monitoring purposes, we have a separate process that periodically logs the committed offset from the broker (obtained via `Kafunk.Consumer.fetchOffset`). This monitor service observed that there was no committed offset (`fetchOffset` returned -1) for partition 2 during the following time periods:

  • 2017-06-13 04:30 UTC to 2017-06-13 20:00 UTC
  • 2017-06-14 20:20 UTC to 2017-06-15 21:00 UTC
  • 2017-06-16 21:10 UTC to 2017-06-19 18:50 UTC
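For anyone reproducing this, windows like the ones above can be derived from a monitor's samples with something like the sketch below. It is only an illustration in Python, not our actual monitor and not Kafunk code: the function name and the `(timestamp, offset)` sample shape are assumptions, and the only detail taken from this report is that a missing committed offset shows up as -1.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Value the monitor observed when the broker had no committed offset.
NO_COMMITTED_OFFSET = -1

Sample = Tuple[datetime, int]  # hypothetical shape: (time of poll, committed offset)


def missing_offset_windows(samples: List[Sample]) -> List[Tuple[datetime, datetime]]:
    """Return (first_seen, last_seen) for each contiguous run of samples
    in which no committed offset was reported."""
    windows: List[Tuple[datetime, datetime]] = []
    start: Optional[datetime] = None
    last: Optional[datetime] = None
    for ts, offset in sorted(samples):
        if offset == NO_COMMITTED_OFFSET:
            if start is None:
                start = ts  # a new gap begins at this sample
            last = ts
        elif start is not None:
            windows.append((start, last))  # gap ended at the previous sample
            start = None
    if start is not None:
        windows.append((start, last))  # gap still open at the end of the data
    return windows
```

The window boundaries are only as precise as the polling interval, which is why the times listed above are rounded.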

Under normal circumstances the Kafunk consumer periodically commits its offsets even if the offset has not advanced since the previous commit. That has not been the case with this particular consumer group. Looking through the logs, we see several partition rebalance events during which partition 2 gets reassigned to a new host. After the rebalance, the consumer fails to begin the periodic commit loop. We suspect this may be due to the low-volume nature of the topic.
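To make the expected behavior concrete, here is a minimal Python sketch of a commit loop that commits on a timer regardless of whether any offset moved, and that starts committing right after an assignment even if no message is ever consumed. This is not Kafunk's implementation; the class, its method names, and the `commit` callback are all hypothetical and exist only to illustrate the behavior we expected to see for partition 2.

```python
from typing import Callable, Dict, Optional


class PeriodicCommitter:
    """Commits every assigned partition's offset on a fixed interval,
    whether or not the offset advanced since the last commit."""

    def __init__(self, commit: Callable[[Dict[int, int]], None], interval_s: float):
        self._commit = commit  # hypothetical callback that sends offsets to the broker
        self._interval_s = interval_s
        self._offsets: Dict[int, int] = {}
        self._last_commit_at: Optional[float] = None

    def on_assigned(self, start_offsets: Dict[int, int]) -> None:
        # After a rebalance: adopt the new assignment and reset the timer,
        # so the next tick commits even if no message ever arrives.
        self._offsets = dict(start_offsets)
        self._last_commit_at = None

    def on_consumed(self, partition: int, next_offset: int) -> None:
        self._offsets[partition] = next_offset

    def tick(self, now_s: float) -> bool:
        """Commit if the interval has elapsed; return True when a commit was sent."""
        if not self._offsets:
            return False
        if (self._last_commit_at is not None
                and now_s - self._last_commit_at < self._interval_s):
            return False
        self._commit(dict(self._offsets))
        self._last_commit_at = now_s
        return True
```

In this sketch the commit depends only on the timer and the current assignment, never on message arrival, so an idle partition would still show a committed offset after being reassigned.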
