-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix high cpu sasl loop after disconnect #2032
Fix high cpu sasl loop after disconnect #2032
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff! Needs some minor tweaks
…L response Currently disconnects went unnoticed and forced the client into a tight loop while waiting for an answer on a closed socket. This resulted in high CPU usage in the client and the inability to reconnect.
308b560
to
11ba647
Compare
Rebased and squashed fixes into original commit as per contributing guidelines |
@edenhill anything else you want me to do here? According to your comments in various threads, 11.6 seems to be close and I really hope to see this included. |
0.11.6 would be a stretch, it is way past its deadline and needs to get out the door. |
This is a really nasty bug for everyone using SASL auth. A restart of any broker node will immediately make the clients go 100% CPU per thread. I've seen boxes logging CPU loads in excess of 350 due to this issue, making them completely inaccessible. Since this also affects producers it will result in message loss (on the consumer side it "only" prevents the affected threads from receiving messages). There are no known workarounds and there is no way to detect the problem in the calling code. As far as I can see all platforms are affected. Not sure how much worse a bug could be? The only reason, why this hasn't caught much more attention seems to be the fact that auth isn't used that commonly. The fix on the other hand seems trivial ans will only affect those people using auth. ... but it certainly is your call. |
@stefanseufert Valid points, I'll get back to you on this tomorrow. |
Thanks alot for this! |
…L response (#2032) Currently disconnects went unnoticed and forced the client into a tight loop while waiting for an answer on a closed socket. This resulted in high CPU usage in the client and the inability to reconnect.
Should fix #1937