Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Using Mosquitto 1.6.7 (checked both 'master' and 'fixes' branches), I'm seeing an issue with the web socket 'take over' behavior where, once a client has taken over an old session, it doesn't get any messages if it subscribes to the same topics as the old session.
A publisher is writing data to
A websocket client is connected with client_id
Mosquitto then correctly prints "Client example already connected, closing old connection".
The new client happily subscribes to the
There are no errors in mosquitto's logs.
I think I've found an easy way to reproduce this by having:
And then connecting my own client using
(No data is received).
The old client is disconnected. No data is ever received for the new session. If I then, however, close this connection, and try to connect again, it'll immediately work:
It'll also immediately work if:
This makes it seem as if the old session isn't cleaned up properly and mosquitto still "sees" the old session, and all it's subscriptions. It's almost as if it feels that "the client with id 'example'" is already subscribed to "example/example1" and thus doesn't feel a need to actually subscribe to it "again".
After some debugging I've tracked this down to the way that websocket clients are cleaned up in a takeover situation, as last changed by 5088202.
This issue seems similar to #1227, but the fix there doesn't seem to have helped and I'm wondering if it's doing what it should be doing:
If I replace this call to
The rationale behind this change is the fact that
I've just started with mosquitto so I'm not knowledgeable enough to see if this change would cause any issues. I'd appreciate it if someone with more in-depth knowledge of mosquitto could take a peek.
Thanks for the detailed description. I haven't checked this yet, but I'm sure you're right. Websockets is a right pain.
I don't think what you've proposed is correct. Adding the client to the disused list (using
I'll take a proper look at it soon.
Good point, hadn't considered that. That makes sense and I could definitely see it causing issues.
Not sure if it's of any help, but I've done some more testing myself and it seems that I can only reproduce this problem when the web socket listener is using SSL: If I test with regular web sockets then everything works as expected. I can only speculate as to why this is, but it may be related to the fact that it takes a while for a SSL connection to be properly closed?
I've created a small test script (Python 3) that can always reproduce the issue for me, when I have SSL enabled: https://gist.github.com/bverhoeven/00b638487df9f0a1b24a58d6954ffaaa#file-issue-1489-py, I've also included the test configuration I've used.
Quick update: I can also easily reproduce it using the public test broker at test.mosquitto.org, so the issue doesn't seem to be limited to just my setup:
server_host = 'test.mosquitto.org' server_port = 8081 server_use_tls = True
Experimental fix (I'm clocking off for this evening, it works but I haven't tested it further):
At first glance that does seem to fix the issue, but by the looks of it only when connecting with 'clean session' set?
A quick test on my side confirms that if I set
Edit: Would it be bad if this logic was applied to all clients regardless of the "clean_start" flag?
I can confirm that the issue is that the first web socket connection doesn't get closed (and thereby cleaned up/freed) until after the second client has managed to subscribe:
I've thought of ignoring clients marked for disconnect in the subscription logic (where I figured it probably checks if the subscription already exists), but that all seems a bit too far-fetched and feels hacky.
If we'd always call
The fact that the old connection still lingers on for a few more seconds doesn't seem to be harmful.
It also seems that, as long as we don't call
Again, not familiar with mosquitto internals, or too familiar with the specs, so pardon my ignorance.
Interesting. For me, with your patch applied, I'm still seeing the issue with clean_start == false:
It does work fine if I apply the
Okay, never mind any of that...
I thought my source tree was clean, but for some reason my helpful editor decided to use an older version of
Also, my apologies for the earlier confusion: I for some reason thought that subscriptions would not be part of a 'session', and thus would not be affected by 'clean session'. After reading the v3.1.1 standard, it turns out that subscriptions are in fact very much part of the session and everything should indeed be working exactly as you described.
With a clean source tree and your patch applied all issues appear to be resolved. Thanks!