
Consumer offset count reset issue #455

Closed
4 of 7 tasks
DavidNeko opened this issue Sep 25, 2018 · 7 comments


DavidNeko commented Sep 25, 2018

Description

I've been trying to test the correctness of a chunk of data I send to Kafka. While experimenting with multiprocessing in Fabric, I messed up both the process and the message consumer. The consumer was not shut down correctly at first, and then it stopped consuming messages entirely.

After that I restarted Kafka on my local machine (I'm using Docker). I removed the Kafka containers I had been testing with:
docker-compose -f docker-compose-single-broker.yml rm
and re-created them with:
docker-compose -f docker-compose-single-broker.yml up

After Kafka and kafka-manager were up and running, I found that although no messages had been sent to Kafka yet, the offset value of the topic I had used for testing was not reset to 0.
[screenshot]
For the data in the picture:
"gateway" is the consumer I was using before and after restarting Kafka.
"gateway_tester" is the topic I used to send test messages.
"End 54" (value in red) is the number of messages consumed from this topic after I restarted Kafka.
"Offset 899" (value in blue) is the number of messages consumed from this topic before I restarted Kafka.

I'm confused about why the offset number doesn't get reset after restarting Kafka.
When I use this consumer after the restart, it consumes all the data I send to Kafka, because the number of messages is less than 899...

Then I created a new consumer called "gateway_2" to consume data from the same topic.
[screenshot]

As shown in the picture, the offset count matched the End value this time, and everything works fine. If I send data to this topic and consume it with the new consumer "gateway_2", it consumes the new messages and ignores the messages it has consumed before. (My offset setting is 'auto.offset.reset': 'smallest'.)

I'm wondering if there's a way to reset the offset count on the consumer I used before, or whether creating a new consumer is the only way to solve this problem.

How to reproduce

  1. Start Kafka, create a consumer, and consume some data to change the offset count on that consumer.

  2. Shut down Kafka.

  3. Restart Kafka and use the same consumer to consume messages.

  4. The consumer will consume all data in the topic until the amount of data in that topic reaches the offset count.

Checklist

Please provide the following information:

  • confluent-kafka-python and librdkafka version: confluent_kafka.version() reports 0.11.4; kafka-python 1.3.5. (I could not find confluent_kafka.libversion() because the project I'm working on uses pip to manage Python packages and confluent_kafka.libversion doesn't show up in the requirements.txt file...)
  • Apache Kafka broker version: 0.9.0.1
  • Client configuration:
    KAFKA_HOST = '0.0.0.0'
    KAFKA_PORT = 9092

    KAFKA_HOST_PORT = '%(host)s:%(port)s' % {
        'host': KAFKA_HOST,
        'port': KAFKA_PORT,
    }

    kafka_configuration = {
        'bootstrap.servers': KAFKA_HOST_PORT,
        'session.timeout.ms': 6000,
        'default.topic.config': {'auto.offset.reset': 'smallest'},
    }

(I set group.id to 'gateway', and 'gateway_2' for the new consumer, in my class initializer.)
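For context, here is a sketch of how a configuration like the one above is presumably combined with a group.id before constructing the consumer (the build_consumer_config helper and the wiring are illustrative, not the issue author's actual class; the commented-out lines assume confluent-kafka-python's Consumer API):

```python
# Sketch: combine the shared settings with a per-consumer group.id.
# build_consumer_config is a hypothetical helper, not from the original project.
KAFKA_HOST = '0.0.0.0'
KAFKA_PORT = 9092
KAFKA_HOST_PORT = '%(host)s:%(port)s' % {'host': KAFKA_HOST, 'port': KAFKA_PORT}

def build_consumer_config(group_id):
    return {
        'bootstrap.servers': KAFKA_HOST_PORT,
        'group.id': group_id,
        'session.timeout.ms': 6000,
        'default.topic.config': {'auto.offset.reset': 'smallest'},
    }

config = build_consumer_config('gateway')
# from confluent_kafka import Consumer
# consumer = Consumer(config)
# consumer.subscribe(['gateway_tester'])
```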

  • Operating system: macOS 10.13.6
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue
@rnpridgeon
Contributor

Restarting the broker alone is not enough to remove offsets. You also need to delete the backing volume, since it stores the contents of the __consumer_offsets topic, which holds your consumer group's offsets.

Had you deleted the offsets, you would have received an "offset out of range" error, which would have prompted your consumer to fall back on its offset reset policy, which is smallest in your case.
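The fallback behavior described here can be illustrated with a small sketch (my own simplification, not librdkafka's actual code): if the committed offset is still within the log's valid range the consumer resumes there, otherwise it falls back to the auto.offset.reset policy.

```python
def resolve_start_offset(committed, log_start, log_end, reset_policy='smallest'):
    """Simplified illustration of offset-reset fallback.

    If the committed offset lies within the log's valid range, resume
    from it; otherwise (or if no offset was ever committed) fall back
    to the reset policy: 'smallest' -> log start, 'largest' -> log end.
    """
    if committed is not None and log_start <= committed <= log_end:
        return committed
    return log_start if reset_policy == 'smallest' else log_end

# A committed offset of 899 against a freshly recreated log ending at 54
# is out of range, so 'smallest' would restart consumption from offset 0:
print(resolve_start_offset(899, 0, 54))  # 0
print(resolve_start_offset(30, 0, 54))   # 30 (still valid, resume there)
```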

@DavidNeko
Author

Thank you for your reply. By 'delete the backing volume', did you mean using a command such as
docker-compose -f docker-compose-single-broker.yml rm -v?
I looked at the Docker docs, which say 'By default, anonymous volumes attached to containers are not removed.' So I think the command I ran to remove the container did not remove the volume stored on my local machine, which is probably what caused the issue.
I will try removing the local volume and restarting Kafka to test it right now. Thanks again!

@DavidNeko
Author

Thanks, it worked!
I used the command
docker-compose -f docker-compose-single-broker.yml rm -v
and removed Kafka along with the anonymous volumes attached to it.

After I restarted Kafka, the offset was reset.
[screenshot]

May I ask another question: is there a way to delete the backing volume other than the docker command I mentioned above?

@OneCricketeer

OneCricketeer commented Sep 26, 2018

You would need to show your compose file, but stopping and removing the container is one way to remove the volume.

The alternative would be to mount a physical directory from your host machine at the path set by the log.dirs Kafka property, which you could then clear out manually.

@XiaochenCui

docker-compose -f docker-compose-single-broker.yml down also works.
See https://docs.docker.com/compose/reference/down/

@edenhill
Contributor

edenhill commented Oct 5, 2018

librdkafka will currently not re-commit an older offset than the last committed offset:
confluentinc/librdkafka#1538 (comment)

When you remove and recreate a topic, the offsets are thus reset, but an existing consumer will not use the new offsets until they pass the old high watermark offset.
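This behavior can be sketched as a monotonic commit guard (an illustration of the linked librdkafka behavior, not its actual code): a commit for an offset lower than the last committed one is skipped, so the stored offset only moves forward.

```python
class OffsetStore:
    """Toy model of librdkafka's refusal to re-commit older offsets.

    After a topic is recreated, the stored offset stays at the old
    high watermark until consumption passes it.
    """

    def __init__(self, committed=None):
        self.committed = committed

    def maybe_commit(self, offset):
        # Only advance; commits for older offsets are ignored.
        if self.committed is None or offset > self.committed:
            self.committed = offset
        return self.committed

store = OffsetStore(committed=899)  # offset left over from before the restart
store.maybe_commit(54)              # recreated topic only has 54 messages: ignored
print(store.committed)              # 899, unchanged
store.maybe_commit(900)
print(store.committed)              # 900, advances once past the old watermark
```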

@edenhill edenhill closed this as completed Oct 5, 2018
@dragid10

dragid10 commented Oct 25, 2019

Unrelated question, but what tool are you using to view your kafka topic like that? @DavidNeko
