hang in destructor of RdKafka::KafkaConsumer #3954
Comments
https://github.com/edenhill/librdkafka/blob/2d78e928d8c0d798f341b1843c97eb6dcdecefc3/src/rdkafka_broker.c#L5393
@edenhill Is this a known issue?
FWIW, I work on a codebase where we don't use |
Make sure all outstanding objects are destroyed prior to calling close. See https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#termination |
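The termination order from INTRODUCTION.md can be sketched with the C++ API roughly as follows. This is a generic outline, not the reporter's code; it assumes every `RdKafka::Message` returned by `consume()` has already been deleted before shutdown begins:

```cpp
#include <librdkafka/rdkafkacpp.h>

// Hedged sketch of the shutdown order described in
// INTRODUCTION.md#termination: destroy outstanding objects
// (Messages, Topics, queues) first, then close(), then delete.
void shutdown_consumer(RdKafka::KafkaConsumer *consumer) {
  // 1. All RdKafka::Message objects from consume() must already be
  //    deleted here; a live Message holds a reference that can keep
  //    internal broker threads (and thus the destructor) alive.

  // 2. Leave the consumer group, commit final offsets, stop fetchers.
  consumer->close();

  // 3. Destroy the handle; with no outstanding references this
  //    should not block.
  delete consumer;

  // 4. Optionally wait (here up to 5000 ms) until all rd_kafka_t
  //    handles in the process are fully destroyed.
  RdKafka::wait_destroyed(5000);
}
```

The key design point is that `close()` and the destructor only return promptly once nothing else holds references to the handle.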
Actually, what we use is the C++ wrapper |
@rickif When we last faced a similar hang on the node-rdkafka side (which also uses the C++ wrapper), the cause was that we were no longer processing the final incoming rebalance. Not sure this applies here, but it fixed it for us.
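Processing the final rebalance, as mentioned above, can be sketched with librdkafka's C++ rebalance callback. This is a generic outline (not the node-rdkafka code); the point is that the callback must always complete the protocol with `assign()`/`unassign()`, including the revoke delivered during shutdown:

```cpp
#include <librdkafka/rdkafkacpp.h>
#include <vector>

// Hedged sketch: a rebalance callback that always completes the
// rebalance protocol. Skipping the final revoke can leave close()
// waiting on the outstanding rebalance.
class RebalanceHandler : public RdKafka::RebalanceCb {
 public:
  void rebalance_cb(RdKafka::KafkaConsumer *consumer,
                    RdKafka::ErrorCode err,
                    std::vector<RdKafka::TopicPartition *> &partitions) {
    if (err == RdKafka::ERR__ASSIGN_PARTITIONS) {
      // New assignment: start fetching from these partitions.
      consumer->assign(partitions);
    } else {
      // ERR__REVOKE_PARTITIONS (or partitions lost): always
      // acknowledge the revoke, even if the app is shutting down.
      consumer->unassign();
    }
  }
};
```

The callback is registered via the `rebalance_cb` configuration property before the consumer is created.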
@rickif , have you been able to solve this issue? I ran into it using rdkafka bindings for Rust, except for me this is triggered by deleting the topic that the consumer is subscribed to right before deleting the consumer. The stack trace looks exactly the same. There are no live Kafka objects in the code except the consumer itself. |
Upon further investigation, this only happens when I either set |
Change Kafka consumer configuration to _not_ automatically commit offsets of consumed messages to the broker, meaning that the next time the connector is instantiated it will start reading from the offset specified in `auto.offset.reset` and not from the last committed offset. The previous behavior caused `rdkafka` to hang in some circumstances (confluentinc/librdkafka#3954). Besides, the new behavior is probably more correct given that circuit state currently does not survive across pipeline restarts, so it makes sense to start feeding messages from the start rather than from the last offset consumed by the previous instance of the pipeline, whose state is lost. Once we add fault tolerance, we will likely use explicit commits, which also do not require these options. Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
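The configuration change described in the commit message above can be sketched as a consumer-properties fragment. These are standard librdkafka/Kafka consumer properties; the concrete `auto.offset.reset` value shown is an example, since the commit does not specify one:

```properties
# Do not commit consumed offsets to the broker automatically;
# each new connector instance ignores any previously committed offsets.
enable.auto.commit=false
enable.auto.offset.store=false
# Position to start from when there is no (usable) committed offset.
# "earliest" is an example value, not taken from the commit.
auto.offset.reset=earliest
```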
Description
I hit a process hang with librdkafka v1.8.2.
I did some investigation into this.
The rkb->rkb_refcnt is 3, so the check at https://github.com/edenhill/librdkafka/blob/2d78e928d8c0d798f341b1843c97eb6dcdecefc3/src/rdkafka_broker.c#L5266
never passes.
The complete pstack output of the process is attached.
How to reproduce
It's an intermittent hang and I have not found a way to reproduce it reliably.
Checklist
IMPORTANT: We will close issues where the checklist has not been completed.
Please provide the following information:
- librdkafka version: v1.8.2
- Apache Kafka version: 2.3.0
- librdkafka client configuration: enable.partition.eof=true, enable.auto.offset.store=false, statistics.interval.ms=0, auto.offset.reset=error, api.version.request=true
- Operating system: CentOS Linux release 7.9.2009
- Logs (with debug=.. as necessary): see attached librdkafkapstack.txt