Spurious SASL error messages #1218
Comments
What I think is happening here is that the SaslHandshakeRequest times out, causing the client to revert to the broker.version.fallback protocol features, which default to v0.9 and lack support for SaslHandshake (PLAIN, SCRAM, etc.). Why is it timing out? Because there was a bug that capped this timeout at... wait for it... 10 ms! The workaround at this point is to set socket.timeout.ms to a value <= 10000 (10s).
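A minimal sketch of that workaround as a client configuration object, assuming the SASL PLAIN over SSL setup described in this issue. The helper name `buildClientConfig` and its arguments are hypothetical; the keys are librdkafka configuration properties, and the 10000 ms ceiling is taken from the comment above.

```javascript
// Ceiling suggested in the comment above (10 s).
const MAX_SOCKET_TIMEOUT_MS = 10000;

// Hypothetical helper: returns a plain config object in the key/value
// form that librdkafka-based clients accept. It does not connect to anything.
function buildClientConfig(brokers, username, password, socketTimeoutMs = MAX_SOCKET_TIMEOUT_MS) {
  return {
    'metadata.broker.list': brokers,
    'security.protocol': 'sasl_ssl',
    'sasl.mechanisms': 'PLAIN',
    'sasl.username': username,
    'sasl.password': password,
    // Clamp the requested timeout so it never exceeds the workaround ceiling.
    'socket.timeout.ms': Math.min(socketTimeoutMs, MAX_SOCKET_TIMEOUT_MS),
  };
}
```

The returned object could then be passed to the client constructor in place of a hand-written config; the broker list and credentials are placeholders to fill in.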
Thanks @edenhill!
Hi @edenhill,
Here are the options that we are currently using:
@seansabour Can you reproduce this with
@edenhill We are using the node-rdkafka high-level language binding. I believe that this library currently uses librdkafka version 0.9.4. Referencing here: https://github.com/Blizzard/node-rdkafka I was able to reproduce with the debug set to
Thanks. Could you provide more logs? I would also like to see what happens before broker 0 goes down.
Sure, I hope this helps a bit more.
I'm looking at broker kafka05 in the logs and everything seems to go fine, it connects and authenticates, but is then suddenly disconnected by the remote peer (broker, network):
Can you check the broker logs for that same time and see if there is an exception? There's some weird log line duplication in the log file, not sure if you are running multiple client instances with the same log or something else is up.
The only relevant messages I can find in the logs are:
There's quite a bit of those. The broker logic that throws it is:
@edenhill Any idea what could be causing this? It is causing a lot of noise on our end, and we would rather not disable the logging in case a real issue occurs.
Since these log messages indicate that the remote peer closed the connection, you will need to check the broker or network (firewall, NAT, session timeouts, etc.) for hints.
After further investigation, it looks like these messages could come from the brokers that are not being used by the client. As librdkafka connects to all the brokers in the cluster, the brokers that don't host any of the leader partitions will time out the idle connections. I'll try to confirm that with the teams that are seeing this.
@seansabour Can you confirm #1218 (comment) ? From node-rdkafka, you can find the leaders for your partitions by calling
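To check the idle-connection theory, a sketch along these lines could list which brokers lead none of the client's partitions. It works on a plain metadata object and makes no client calls; the object shape (`brokers` with `id`, `topics` with `partitions` carrying `id` and `leader`) is an assumption about what the metadata request returns, and both function names are hypothetical.

```javascript
// Map "topic-partition" to the broker id that leads it.
function leadersByPartition(metadata) {
  const leaders = {};
  for (const topic of metadata.topics) {
    for (const partition of topic.partitions) {
      leaders[`${topic.name}-${partition.id}`] = partition.leader;
    }
  }
  return leaders;
}

// Broker ids that lead no partition in the given metadata. Connections to
// these brokers carry no produce/fetch traffic and are candidates for being
// closed as idle by the broker.
function brokersWithoutLeaders(metadata) {
  const leaderIds = new Set(Object.values(leadersByPartition(metadata)));
  return metadata.brokers
    .filter((broker) => !leaderIds.has(broker.id))
    .map((broker) => broker.id);
}
```

If the broker ids returned here match the brokers named in the disconnect messages, that would support the idle-connection explanation.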
Description
When running librdkafka (0.9.5) with node-rdkafka (0.10.2) using SASL Plain against a Kafka Cluster running 0.10.2.1 (also seen on 0.10.0.1), we often get error messages emitted back to the client. These happen intermittently and come from both the Consumer and Producer.
As far as we can tell these errors are harmless: the clients keep functioning as expected whenever they appear.
How to reproduce
Not sure yet. We've not been able to identify what triggers them. Our clients can run without a single error for days and then emit several errors every second for a while. We've also seen cases where a rebalance made them stop.
Checklist
Please provide the following information:
All default + SASL/SSL
Linux and much less frequently macOS
No
Yes
debug=.. as necessary) from librdkafka