Consumer stops consuming after broker transport failure #548
Comments
Same behaviour and the same "broker transport failure" error: the consumer stops, and we can see the resulting lag on the topic. We have to restart the whole thing.
@webmakersteve just pinging here too, since this problem is tracked across multiple issues and, in my opinion, it's pretty critical: recovering from it in production environments is not easy.
@webmakersteve +1
@carlessistare IMHO there is a bug in librdkafka. My observations suggest that the thread stops consuming inside the library. An indirect sign of this is the "solution" in issue #222.
Same issue on our side. Has anybody got a working solution for this? It is now extremely critical for our project.
We are also facing the same issue. Is there any fix for it?
I'm also facing the same issue; it's critical and has to be fixed. Is there a working solution?
Hello, is there any update on this issue, or a possible workaround?
We are also facing the same issue. Should we switch to non-flowing mode for the time being, until a fix is available? Something like the sketch below:
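A minimal non-flowing consumption sketch with node-rdkafka (broker list, group id, topic name, and batch size are placeholders, not real settings):

```js
// Minimal non-flowing consumption sketch for node-rdkafka.
// Broker list, group id, topic, and batch size are placeholders.
const Kafka = require('node-rdkafka');

const consumer = new Kafka.KafkaConsumer({
  'group.id': 'my-group',
  'metadata.broker.list': 'BROKER_IP:9092',
}, {});

consumer.connect();

consumer.on('ready', () => {
  consumer.subscribe(['TOPIC_NAME']);

  // Poll for batches ourselves instead of relying on 'data' events.
  const poll = () => {
    consumer.consume(10, (err, messages) => {
      if (err) {
        console.error('consume error:', err);
      } else {
        messages.forEach((m) => {
          console.log(m.partition, m.offset, m.value.toString());
        });
      }
      setImmediate(poll); // schedule the next batch
    });
  };
  poll();
});
```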
Is there any progress on this?
Check the librdkafka release notes; it might be time to upgrade the librdkafka version provided by node-rdkafka.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
We are noticing a similar issue. It seems like an update to the version of …
Hi,
We encounter a problem with consumers that stop providing new messages to the 'data' listener.
This seemingly happens after a broker becomes temporarily unavailable (broker transport failure), but only rarely: we have observed it on several different consumers on different topics with similar configurations, seemingly at random, and most of the time the consumers resume operation after a broken broker connection.
The consumer stays synchronized with its consumer group (a single consumer for one topic with 5 partitions), and the high offsets increase as new messages arrive on the partitions, but the consumer lag keeps growing and the messages are never actually consumed.
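For reference, the consumer is set up in flowing mode, roughly along these lines (broker list, group id, and topic name are placeholders; the exact settings are in the configuration section below):

```js
// Simplified flowing-mode setup; placeholder broker/group/topic values.
const Kafka = require('node-rdkafka');

const consumer = new Kafka.KafkaConsumer({
  'group.id': 'my-group',
  'metadata.broker.list': 'BROKER_IP:9092',
  'statistics.interval.ms': 1000, // stats were being logged every second
}, {});

consumer.connect();

consumer.on('ready', () => {
  consumer.subscribe(['TOPIC_NAME']);
  consumer.consume(); // flowing mode: messages arrive via 'data' events
});

consumer.on('data', (message) => {
  // ... process the message ...
});
```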
We observed the following sequence of events when all partitions of a topic stopped being consumed.
This 'event.error' seems to indicate the beginning of the problem:
```
Error: broker transport failure
```
After this, no stats are logged again, although they were being logged every second before that.
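Both the error above and the fetch-level debug lines below come from node-rdkafka's event listeners; a rough sketch of that wiring (it assumes `debug: 'fetch'` is set in the consumer's global configuration):

```js
// Sketch: how the error and the severity-7 FETCH/FETCHADD lines were captured.
// Assumes `consumer` from the snippet above, created with `debug: 'fetch'`.
consumer.on('event.error', (err) => {
  // Fired once with 'broker transport failure'; stats stopped afterwards.
  console.error('event.error:', err.message);
});

consumer.on('event.log', (log) => {
  // librdkafka debug logs (severity 7) such as FETCH / FETCHADD land here.
  console.log({ severity: log.severity, fac: log.fac }, log.message);
});

consumer.on('event.stats', (stats) => {
  // Emitted per 'statistics.interval.ms'; these stopped after the error.
  console.log('stats:', stats.message);
});
```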
10 seconds after the error, the consumer stops fetching from every partition of the topic, with these two event logs appearing for each partition:
```
{ severity: 7, fac: 'FETCH' } [thrd:BROKER_IP:9092/0]: BROKER_IP:9092/0: Topic TOPIC_NAME [3] in state active at offset 39611 (10/10 msgs, 0/40960 kb queued, opv 6) is not fetchable: queued.min.messages exceeded
{ severity: 7, fac: 'FETCHADD' } [thrd:BROKER_IP:9092/0]: BROKER_IP:9092/0: Removed TOPIC_NAME [3] from fetch list (0 entries, opv 6)
```
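Since our only recovery today is restarting the whole process, a workaround we are considering is a watchdog that detects the stall and rebuilds the consumer. A rough sketch (the threshold, the single hard-coded partition, and the restart strategy are illustrative assumptions, not anything node-rdkafka provides):

```js
// Stall watchdog sketch: if no 'data' event arrives for a while but the
// high watermark keeps advancing past our committed offset, restart.
const STALL_MS = 60 * 1000;
let lastData = Date.now();

consumer.on('data', () => { lastData = Date.now(); });

setInterval(() => {
  if (Date.now() - lastData < STALL_MS) return;

  // Checks a single partition for brevity; real code would iterate
  // over consumer.assignments().
  consumer.queryWatermarkOffsets('TOPIC_NAME', 3, 5000, (err, marks) => {
    if (err) return console.error('watermark query failed:', err);
    consumer.committed([{ topic: 'TOPIC_NAME', partition: 3 }], 5000, (err2, positions) => {
      if (err2) return console.error('committed query failed:', err2);
      if (marks.highOffset > positions[0].offset) {
        console.warn('consumer appears stalled; restarting it');
        consumer.disconnect(() => {
          // ... recreate, connect, and resubscribe a fresh consumer ...
        });
      }
    });
  });
}, 10 * 1000);
```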
Probably linked to #182.
Environment Information
Consumer configuration