Problem statement:

With enable_auto_commit=False and consumer.close(autocommit=False), and without ever calling consumer.commit(), messages are not re-consumed when the program is restarted. Re-consumption only works after we commit the offset for the group at least once.
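Producer.py itself is not included in the report. A minimal sketch of what it could look like, assuming a JSON-encoded price event and the same topic the consumer reads (the payload fields and broker address are invented for illustration):

```python
import json

def encode_event(event):
    # Serialize an event dict to the UTF-8 JSON bytes the consumer
    # decodes with json.loads(msg.value).
    return json.dumps(event).encode("utf-8")

def produce_price_event(bootstrap_servers, event):
    # Hypothetical producer: send one event to the topic the consumer reads.
    from kafka import KafkaProducer  # kafka-python
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    producer.send('hems.energy-prices', encode_event(event))
    producer.flush()   # block until the broker has acknowledged the write
    producer.close()

if __name__ == "__main__":
    # Payload fields are an assumption, not taken from the report.
    produce_price_event("localhost:9092", {"price": 0.23, "unit": "EUR/kWh"})
```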
First Execution:
Consumer.py:

consumer = KafkaConsumer(
    'hems.energy-prices',
    group_id="energyPricesGroup",
    bootstrap_servers=Config.KAFKA_BROKER_ENDPOINT,
    consumer_timeout_ms=100,
    enable_auto_commit=False,
)

# Consume events until the program receives an exit signal
while not exitEvent.wait(timeout=0.01):
    try:
        msg = next(consumer)
        event = json.loads(msg.value)
        processEvent(event)
        # consumer.commit()  # Consumer not committing, should keep receiving the same events after restart
    except StopIteration:
        pass
consumer.close(autocommit=False)
I successfully received the event and processed it. Since I am not committing the offset, I should receive the same event after restarting the program. However, in the second execution below, this is not happening.
Second Execution:
Consumer.py:
consumer = KafkaConsumer(
    'hems.energy-prices',
    group_id="energyPricesGroup",
    bootstrap_servers=Config.KAFKA_BROKER_ENDPOINT,
    consumer_timeout_ms=100,
    enable_auto_commit=False,
)

# Consume events until the program receives an exit signal
while not exitEvent.wait(timeout=0.01):
    try:
        msg = next(consumer)
        event = json.loads(msg.value)
        processEvent(event)
        # consumer.commit()  # Consumer not committing, should keep receiving the same events after restart
    except StopIteration:
        pass
consumer.close(autocommit=False)
In this execution, I did not receive the event, although I expected to receive it again since I never committed the consumer offset.
Workaround:
The workaround we found was to commit the offset to the topic at least once. In the workaround's first execution, we run the producer and then the consumer with the consumer.commit() call enabled, so the group's offset is committed once. In the second execution (with the commit commented out again), we do not receive the event, because we already committed the consumer offset; this is working as expected.
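A sketch of the workaround's committing consumer: the same loop as the report's, but with consumer.commit() enabled so the group's offset is written once. The decode_event helper is added here for illustration; the configuration mirrors the report's code:

```python
import json

def decode_event(raw):
    # msg.value arrives as JSON-encoded bytes.
    return json.loads(raw)

def run_committing_consumer(bootstrap_servers, process_event):
    # Same consumer configuration as in the report, but committing after
    # each processed event, so the group offset exists for later restarts.
    from kafka import KafkaConsumer  # kafka-python
    consumer = KafkaConsumer(
        'hems.energy-prices',
        group_id="energyPricesGroup",
        bootstrap_servers=bootstrap_servers,
        consumer_timeout_ms=100,
        enable_auto_commit=False,
    )
    try:
        for msg in consumer:      # iteration stops after consumer_timeout_ms of silence
            process_event(decode_event(msg.value))
            consumer.commit()     # the one-time commit the workaround relies on
    finally:
        consumer.close(autocommit=False)
```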
Third Execution:
Producer.py: (a new event is produced)
Consumer.py:

consumer = KafkaConsumer(
    'hems.energy-prices',
    group_id="energyPricesGroup",
    bootstrap_servers=Config.KAFKA_BROKER_ENDPOINT,
    consumer_timeout_ms=100,
    enable_auto_commit=False,
)

# Consume events until the program receives an exit signal
while not exitEvent.wait(timeout=0.01):
    try:
        msg = next(consumer)
        event = json.loads(msg.value)
        processEvent(event)
        # consumer.commit()  # Consumer not committing
    except StopIteration:
        pass
consumer.close(autocommit=False)
As expected, I received the new event.
Fourth Execution:
Consumer.py:
consumer = KafkaConsumer(
    'hems.energy-prices',
    group_id="energyPricesGroup",
    bootstrap_servers=Config.KAFKA_BROKER_ENDPOINT,
    consumer_timeout_ms=100,
    enable_auto_commit=False,
)

# Consume events until the program receives an exit signal
while not exitEvent.wait(timeout=0.01):
    try:
        msg = next(consumer)
        event = json.loads(msg.value)
        processEvent(event)
        # consumer.commit()  # Consumer not committing
    except StopIteration:
        pass
consumer.close(autocommit=False)
Here, I received the same event, because I did not commit. In my understanding, this is the correct behavior. If I make a fifth execution of the consumer, I will receive the same event again, as expected.
Does anyone know how/why this is happening? And is there a way to overcome this problem, without having to make a first commit to the topic?
My goal is not to consume all the messages; it is to keep the integrity of the events. By this I mean that if the program crashes while processing an event, it should receive and process that same event again until it is committed (i.e., fully processed).
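One possible explanation (an assumption on my part, not confirmed in the report): kafka-python's KafkaConsumer defaults to auto_offset_reset='latest', so a group with no committed offset jumps to the end of the log every time it starts, and an event that was received but never committed ends up behind the starting position on the next run. Requesting 'earliest' instead would keep such events re-deliverable without the initial commit:

```python
def make_consumer(bootstrap_servers):
    # Same configuration as the report's consumer, plus an explicit
    # auto_offset_reset. With 'earliest', a group that has never committed
    # starts from the beginning of the log on restart instead of skipping
    # to the end (the default 'latest' behavior).
    from kafka import KafkaConsumer  # kafka-python
    return KafkaConsumer(
        'hems.energy-prices',
        group_id="energyPricesGroup",
        bootstrap_servers=bootstrap_servers,
        consumer_timeout_ms=100,
        enable_auto_commit=False,
        auto_offset_reset='earliest',
    )
```

Once an offset has been committed, auto_offset_reset no longer matters for that group, which would also explain why the workaround's single commit changes the behavior of every later run.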
Versions
kafka-python: 2.0.2
kafka docker image: debezium/kafka:1.6