log writing failed. can't be called from trap context #598
It's a common and known problem. You can bypass it by running the logging inside a thread spawned from the signal handler. This is dirty, but works for 99.999% of cases.
Also, this is Racecar-related, not a ruby-kafka issue. Could you please move it here: https://github.com/zendesk/racecar
@mensfeld Do you think it is happening due to some of my logging? I think it is due to `@logger.info "Disconnecting broker #{id}"` in the broker_pool class. PS: I only spent 10 minutes looking into this issue, as I was busy creating services. I will look into this in more detail, as it hangs termination quite a bit.
@GeminPatel Yes, but handling signals is not a part of ruby-kafka, as it is "runtime environment" agnostic. It's a bug in how Racecar handles the trap context and what it does there with the ruby-kafka client, not a ruby-kafka bug. The stack trace does not always point to the source of the error. I know that because I had a similar problem with Karafka and managed to bypass it by spawning a new thread: https://github.com/karafka/karafka/blob/1.3-wip/lib/karafka/status.rb#L24 It does not always prevent a deadlock (theoretically), but it is more than enough even for extensive instrumentation.
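The thread workaround described above can be sketched as follows. This is a hedged, self-contained illustration, not Racecar's or Karafka's actual code; the names `log_output` and `handler_threads` are hypothetical:

```ruby
require "logger"
require "stringio"

log_output = StringIO.new
logger = Logger.new(log_output)

handler_threads = []
trap("TERM") do
  # Logger#info takes a mutex, which raises ThreadError
  # ("can't be called from trap context") inside a trap handler.
  # The thread body runs outside the trap context, so it is safe there.
  handler_threads << Thread.new { logger.info("Received SIGTERM") }
end

Process.kill("TERM", Process.pid) # simulate an external SIGTERM
sleep 0.1                         # give the handler a chance to run
handler_threads.each(&:join)      # wait for the logging thread
```

As noted above, spawning a thread per signal is dirty but avoids acquiring the logger's mutex in trap context.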
@mensfeld: I created this here because I thought that Racecar is correctly using the ruby-kafka API, and because I thought it is expected that Racecar [or some other wrapper around ruby-kafka] will eventually call this broker-pool close method in the trap. So I thought it would be better for ruby-kafka to handle this. But if you say so, I will create a ticket there as well. Still, they really can't fix this 100%, as you said. But I also agree that ruby-kafka should stay agnostic. :(
I don't think this should be fixed in ruby-kafka. I hope @dasch does not disagree with me on that. As I said, ruby-kafka is not context-aware and provides a different layer in the Kafka processing stack than Racecar (and other frameworks). This happens due to the way Racecar handles ruby-kafka within the trap context and due to the instrumentation and logging that is used. Nonetheless, it is still a Racecar issue that involves ruby-kafka, not the other way around.
@mensfeld Given that the production logging level is almost always info, and that Racecar can never fix this issue 100% on its own without monkey-patching or rewriting some classes, I don't know what other way there is to be sure this does not occur. But I think the thread approach is workable if we really do not want to change ruby-kafka.
Keep in mind that the occurrence of a trap does not mean the ruby-kafka client needs to receive the shutdown signal from within it. There are other patterns to handle that.
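One such pattern (a hedged sketch, not Racecar's implementation; all names here are illustrative): the trap handler only flips a flag, and the consumer loop, running in normal context, notices the flag and performs the shutdown, where logging and closing the ruby-kafka client are safe:

```ruby
@shutdown_requested = false
trap("TERM") { @shutdown_requested = true } # no mutex, safe in a trap

# Simulate an external SIGTERM arriving while the loop runs.
Thread.new do
  sleep 0.1
  Process.kill("TERM", Process.pid)
end

until @shutdown_requested
  sleep 0.01 # stand-in for "process one message"
end
# Back in normal context: safe to log and close the Kafka client here.
```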
Yeah, I think Racecar should ensure that calls to ruby-kafka don't happen from a trap context. Can you open an issue on Racecar?
@dasch Already did. I have referenced it here as well.
Since this issue stems from Racecar and is not directly related to ruby-kafka, I'll close it.
When a Racecar consumer is running and I send a SIGTERM to it, it prints the following (not always, but sometimes):
I think this may be because we cannot use a logger in a signal handler (https://bugs.ruby-lang.org/issues/7917), but ruby-kafka does this. If you follow the trap signal handling in Racecar, you will eventually land in code in ruby-kafka that calls logger.info.
I read this somewhere, which seems really helpful:

> How to resolve it is, for example, to use IO.pipe and puts into the pipe in the signal handler, and treat the signal somewhere else, as sidekiq does: https://github.com/mperham/sidekiq/blob/b6de0bae6b86d5abae25852141768d8ecc8ddedf/lib/sidekiq/cli.rb#L53
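A minimal sketch of that self-pipe pattern (illustrative only, not sidekiq's exact code): the trap handler just writes the signal name into a pipe, which is safe in trap context, and a reader in normal context does the real handling:

```ruby
reader, writer = IO.pipe

%w[INT TERM].each do |sig|
  # Writing to an IO is allowed inside a trap handler;
  # mutex-based logging is not.
  trap(sig) { writer.puts(sig) }
end

Process.kill("TERM", Process.pid) # simulate an external SIGTERM

signal = reader.gets.strip # blocks until a signal name was written
# Normal context again: safe to log, instrument, and shut down clients.
```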