Enforcing container to stop on transaction fencing #1612
Comments
The consumer cannot tell if the fenced exception is caused by a rebalance or timeout. This is a known limitation of Kafka itself, and will be addressed in a future release. We can certainly add an option to stop the container on these exceptions, but the proper fix can only be done when the new exception is thrown for a timeout. |
In the meantime, you should make sure the |
Reading https://issues.apache.org/jira/browse/KAFKA-9803, I can see:
Actually, the producer is not shutting down, is it? Or if it is, what happens to the container? Since it is not rolling back, does it try to re-produce the message, and does the ProducerFactory then create a new one? |
The producer factory should recycle the producer (it does for me with your test application with the short tx timeout). Any error on |
Yes, but since the exception is ignored, the consumer skips the messages from the failed producer and continues with a new batch and a new producer. |
Right, so until they fix it, the only thing we can do is stop the container. Here's another work-around: set the container's |
What kind of no-op can be done to keep a transaction alive? (without producing a message :D) |
```java
template.executeInTransaction(t -> {
    return null;
});
```
It will just do |
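The idea behind the no-op transaction above is that an empty begin/commit keeps the producer's `transactional.id` from being expired by the broker between real batches. A minimal sketch of running that no-op on a schedule, using a stand-in `Template` interface instead of the real spring-kafka `KafkaTemplate` (so the snippet is self-contained; the names here are illustrative assumptions):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class NoOpTxKeepAlive {

    // Stand-in for KafkaTemplate#executeInTransaction: the real method begins
    // and commits a Kafka transaction around the callback.
    interface Template {
        <T> T executeInTransaction(Function<Template, T> callback);
    }

    // One empty transaction: begin + commit with nothing published.
    static void keepAliveOnce(Template template) {
        template.executeInTransaction(t -> null);
    }

    // Run the no-op transaction periodically so the broker does not expire
    // the producer's transactional.id while the consumer is idle.
    static ScheduledExecutorService scheduleKeepAlive(Template template, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> keepAliveOnce(template),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

For this to help, the period must be comfortably shorter than the broker's transaction timeout.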
Resolves spring-projects#1612 **cherry-pick to 2.5.x**
Does handling | Say we have 3 instances of the application running on a topic with 3 partitions (1 partition per consumer). Can those rebalances trigger a |
Yes, it will cause a rebalance, although that can be avoided by setting a unique |
We are using dynamic auto-scaling, so our consumer partition assignments are not static. Causing a rebalance does not really matter; this FencedException should be rare, and causing a rebalance to recover is acceptable. |
It won't deadlock, even if the other instances are not well-behaved (if they don't handle the rebalance in a timely manner). |
Resolves #1612 **cherry-pick to 2.5.x**
* Add `@since` to javadocs; retain root cause of `StopAfterFenceException`.
* Add reason to `ConsumerStoppedEvent`. Resolves #1618. Also provide access to the actual container that stopped the consumer, for example to allow restarting after stopping due to a producer fenced exception.
* Add `@Nullable`s.
* Test polishing.
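The commit above adds a stop reason to `ConsumerStoppedEvent` and exposes the container that stopped, so an application can listen for the event and restart after a fence. The shape of such a handler, modeled here with stand-in types rather than the real spring-kafka classes (the enum values and accessors are illustrative assumptions):

```java
public class FencedRestartSketch {

    // Minimal stand-ins for the spring-kafka types mentioned in the commit:
    // ConsumerStoppedEvent carries a reason and the stopped container.
    enum Reason { NORMAL, FENCED }

    interface Container {
        void start();
    }

    static class ConsumerStoppedEvent {
        final Reason reason;
        final Container container;
        ConsumerStoppedEvent(Reason reason, Container container) {
            this.reason = reason;
            this.container = container;
        }
    }

    // Returns true if the handler chose to restart the container. Restarting
    // creates a fresh consumer and, with it, a fresh producer, which is safe
    // because the fenced producer can never commit again anyway.
    static boolean onConsumerStopped(ConsumerStoppedEvent event) {
        if (event.reason == Reason.FENCED) {
            event.container.start();
            return true;
        }
        return false;
    }
}
```

In a real application this logic would live in an `ApplicationListener` (or `@EventListener` method), and the restart should happen on a separate thread, not on the consumer thread publishing the event.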
Affects Version(s): 2.5.6.RELEASE
Regarding this Stack Overflow question:
https://stackoverflow.com/questions/64665725/why-i-lost-messages-when-using-kafka-with-eos-beta/64665928
When a ProducerFencedException | FencedInstanceIdException occurs, the framework assumes that the consumer will stop, and so it just logs the exception:
spring-kafka/spring-kafka/src/main/java/org/springframework/kafka/listener/KafkaMessageListenerContainer.java
Lines 1519 to 1523 in fb1977c
It would be useful to have a safeguard that forces the consumer container to stop, to prevent losing messages through this unexpected behavior around this exception.
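The requested safeguard amounts to stopping the container inside the catch block instead of only logging. A self-contained sketch of that control flow, using stand-in types (in spring-kafka the real exceptions are `ProducerFencedException` / `FencedInstanceIdException` and the container is the message listener container):

```java
public class StopOnFenceSketch {

    // Stand-in for the fenced-producer exceptions thrown by the Kafka client.
    static class FencedException extends RuntimeException { }

    interface Container {
        void stop();
    }

    // Process one batch; on a fenced producer, optionally stop the container
    // so the consumer does not silently skip the failed batch and keep polling.
    // Returns true if the batch completed normally.
    static boolean processBatch(Runnable batch, Container container, boolean stopOnFence) {
        try {
            batch.run();
            return true;
        }
        catch (FencedException ex) {
            if (stopOnFence) {
                container.stop(); // safeguard: no further batches are consumed
            }
            // previous behavior: log only, which can silently lose the batch
            return false;
        }
    }
}
```

With `stopOnFence` enabled, a fenced batch leaves the container stopped, so an operator (or an event listener) must decide when it is safe to restart.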