zmq_send does not timeout when ZAP authentication fails. #882

hashstat · 2014-02-11T22:45:37Z

When a server performing ZAP authentication rejects a client with a 400 (or other non-200) error, the client, sending data with zmq_send (and presumably all send functions), reattempts the connection and authentication indefinitely, unless ZMQ_RECONNECT_IVL is set to -1. There is currently no way to limit the number of authentication failures nor is there a way for the client to tell that authentication failed (because zmq_send succeeds and performs the authentication in the background).

It seems that there are three stages to the connection: socket connection, ZMQ handshake, and ZAP handshake. A failure in any three of these should probably be considered a connection failure. Does it make sense to treat them differently? If ZMQ_IMMEDIATE is set, it seems that zmq_send (and others) should block until all three steps are performed successfully. After looking at the code, it is not immediately apparent to me how to implement any of this. Any thoughts, suggestions, etc.?

drewcrawford · 2014-02-16T12:19:19Z

For the record, I spent three (3!) days tracking down a bug that was obscured by the behavior here--e.g., nothing user-visible happens when authentication fails (keeps retrying).

This can also occur when the client has the wrong key for the server. In that case the behavior is especially bad since no ZAP authentication ever occurs, the client is never rejected, and it is not at all obvious why the connection is not succeeding. You pretty much have to resort to Wireshark to figure out why it is failing in that case.

In my opinion, the right thing to do is to report an error at zmq_connect. I realize that zmq_connect is not in the habit of reporting TCP errors to the application (for example, it seems to be by-design behavior that servers can be brought up after clients), however, an authentication error is likely a permanent failure, and it is not going to resolve itself without deliberate human intervention. And the human will not know to intervene if no error is reported anywhere.

However, the behavior I am advocating is contrary to the current tests and so I suspect may be controversial. I am probably biased by the time I spent investigating this, but in my view the benefits far outweigh the change in behavior.

hintjens · 2014-02-16T16:59:00Z

On Sun, Feb 16, 2014 at 1:19 PM, Drew Crawford notifications@github.com wrote:

For the record, I spent three (3!) days tracking down a bug that was obscured by the behavior here--e.g., nothing user-visible happens when authentication fails (keeps retrying).

Sorry about that, it's really not good. Let's make sure it doesn't hit
more people.

This can also occur when the client has the wrong key for the server.

Indeed. What about, in the short term, explicit stderr output in
libzmq when such things happen on CURVE authentication? It's not a
nice long term solution but will help as we work through the different
cases.

In my opinion, the right thing to do is to report an error at zmq_connect.

Right. At least in the connector, when it gets cut off prematurely.

I'll make a patch for some error reporting at both sides that we can
hide away later as needed.

Really sorry for the lost time... :-/

-Pieter

hintjens · 2014-02-16T19:22:24Z

I've added temporary tracing to the server side, which should be
enough. We'll fix the reconnect behavior itself separately.

#898

On Sun, Feb 16, 2014 at 5:58 PM, Pieter Hintjens ph@imatix.com wrote:

On Sun, Feb 16, 2014 at 1:19 PM, Drew Crawford notifications@github.com wrote:

For the record, I spent three (3!) days tracking down a bug that was obscured by the behavior here--e.g., nothing user-visible happens when authentication fails (keeps retrying).

Sorry about that, it's really not good. Let's make sure it doesn't hit
more people.

This can also occur when the client has the wrong key for the server.

Indeed. What about, in the short term, explicit stderr output in
libzmq when such things happen on CURVE authentication? It's not a
nice long term solution but will help as we work through the different
cases.

In my opinion, the right thing to do is to report an error at zmq_connect.

Right. At least in the connector, when it gets cut off prematurely.

I'll make a patch for some error reporting at both sides that we can
hide away later as needed.

Really sorry for the lost time... :-/

-Pieter

drewcrawford · 2014-02-16T22:39:16Z

Im not attacking you personally Peter! No need to apologize.. :-). You can't really complain about the quality of free software. All I meant to say was "this is a major usability problem".

Logging something cf. your patch is a big step in the right direction, some indication is infinitely better than no indication. So this patch would have taken me from three days to about 1.

However I can make a wish list of things that would make this better:

adding output on the client side. As I understand it, your current patch only logs on the server-side. (I may be misreading it). In my situation, the client-side was a more obvious place to look.
long-term there should be some way to catch and handle this case. If you think about the use case, an authentication failure is evidence that the user has missed type their password or similar (picked the wrong keyfile... picked the wrong server identity from a list...) The right thing for application to do here is to have the user retype the password or otherwise reconfigure the authentication options.

hintjens · 2014-02-16T23:10:15Z

Well, I did design this, so the fault is mine. Let's see how we can make
this a totally painless thing. Next step is to log in the client, to cover
at least the main errors like "you set a bad server key".

I'm not sure where in the client to log, which is why I didn't do that. In
fact we have to implement an error response to do this properly. For that I
need Martin Hurton's help.

I agree about returning the error up to the calling application.

Asmod4n · 2014-03-10T09:27:30Z

Hitting the same problem, the server never stops asking the authentication handler when the handler answers with a non 200 reply, can't release my software that way :(

This happens with all authentication methods.

https://github.com/Asmod4n/celluloid-zmq-zap/blob/master/examples/curve_redis.rb

comment out line 14 and it will ask the redis server endlessly

Asmod4n · 2014-03-10T16:42:38Z

@hintjens yeah, this needs a new error code, something like "i don't like you, don't come back again" so the requesting socket doesn't attempt to connect again.

hintjens · 2014-04-29T20:54:14Z

I'm closing this issue to make a more explicit one.

davidblewett · 2015-05-27T01:09:37Z

@hintjens what ticket were you referring to? #883? If so, that ticket doesn't appear to reference some kind of feedback to the client that an auth error occurred.

Currently, you can set ZMQ_RECONNECT_IVL to force the socket to not try to reconnect on authentication failure, but the client code is never made aware of any potential issues.

hintjens · 2015-05-27T05:24:14Z

This issue was fixed in ZeroMQ 4.1.0

johnnykv · 2016-06-27T20:18:59Z

@hintjens How was this fixed in 4.1.0? Was it only the reconnect strategy that was fixed, or is there also implemented visibility (logging) when auth fails (bad key, invalid key, etc)

hintjens · 2016-06-27T20:23:45Z

From memory, there is still debugging output whenever there's an
exceptional condition during authentication (at the libzmq side). We also
catch and deal with the timeout in libzmq, as far as I remember.
On 27 Jun 2016 10:19 pm, "Johnny Vestergaard" notifications@github.com
wrote:

@hintjens https://github.com/hintjens How was this fixed in 4.1.0? Was it
only the reconnect strategy what was fixed, or is there also implemented
visibility (logging) when auth fails (bad key, invalid key, etc)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#882 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AASzCA_SSy_gYgeRceONPCZ7QMWszndCks5qQDA9gaJpZM4BhI1G
.

sigiesec · 2017-08-23T18:04:08Z

@johnnykv Monitoring events for encryption and authentication errors were recently added by #2645 and #2665 and subsequent PRs.

Regarding the reconnect behavior: the client is sent an ERROR for ZAP 400 errors. The CURVEZMQ spec demands that the client does not reconnect in case it received an ERROR. However, the current implementation does not warrant this, to my knowledge. This is requested by #990

hashstat mentioned this issue Feb 11, 2014

Reconnect due to ZAP authentication failure ignores ZMQ_RECONNECT_IVL_MAX. #883

Closed

TheTacoScott mentioned this issue Mar 19, 2014

Assertion failed: !more (fq.cpp:99) #939

Closed

Asmod4n mentioned this issue Apr 9, 2014

Resolve issue #939 #958

Merged

hintjens closed this as completed Apr 29, 2014

sigiesec added Area (Runtime / Usage) Security labels Aug 23, 2017

Enfernuz mentioned this issue Feb 3, 2018

Добавить аутентификацию / Add authentication Enfernuz/quik-lua-rpc#1

Closed

jean-airoldie mentioned this issue May 13, 2019

Problem: outgoing messages are still queued when auth fails #3505

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

zmq_send does not timeout when ZAP authentication fails. #882

zmq_send does not timeout when ZAP authentication fails. #882

hashstat commented Feb 11, 2014

drewcrawford commented Feb 16, 2014

hintjens commented Feb 16, 2014

hintjens commented Feb 16, 2014

drewcrawford commented Feb 16, 2014

hintjens commented Feb 16, 2014

Asmod4n commented Mar 10, 2014

Asmod4n commented Mar 10, 2014

hintjens commented Apr 29, 2014

davidblewett commented May 27, 2015

hintjens commented May 27, 2015

johnnykv commented Jun 27, 2016 •

edited

Loading

hintjens commented Jun 27, 2016

sigiesec commented Aug 23, 2017 •

edited

Loading

zmq_send does not timeout when ZAP authentication fails. #882

zmq_send does not timeout when ZAP authentication fails. #882

Comments

hashstat commented Feb 11, 2014

drewcrawford commented Feb 16, 2014

hintjens commented Feb 16, 2014

hintjens commented Feb 16, 2014

drewcrawford commented Feb 16, 2014

hintjens commented Feb 16, 2014

Asmod4n commented Mar 10, 2014

Asmod4n commented Mar 10, 2014

hintjens commented Apr 29, 2014

davidblewett commented May 27, 2015

hintjens commented May 27, 2015

johnnykv commented Jun 27, 2016 • edited Loading

hintjens commented Jun 27, 2016

sigiesec commented Aug 23, 2017 • edited Loading

johnnykv commented Jun 27, 2016 •

edited

Loading

sigiesec commented Aug 23, 2017 •

edited

Loading