New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

large retain message causes issues on topic subscription over websockets #427

Closed
blockbomb opened this Issue Apr 17, 2017 · 2 comments

Comments

Projects
None yet
2 participants
@blockbomb

blockbomb commented Apr 17, 2017

My expereiences are as follows...

Steps to repeat

  1. Post a retain message to the broker at least > 4086 characters (this + the padding etc is more than the default 4096 buffer)
  2. connect to the broker with a websocket client (we were using the paho js library)
  3. subscribe to the topic that has the retain message

Observation when using broker on GNU/Linux

The retain message shows up in the js client.

Observation when using broker on mac

The retain message shows up in the js client.

Observation when using broker on Windows

If you compiled the libwebsockets library with DEBUG you will get an abort in output.c:123
if it is not debug, that function will return -1 which results in the websockets being shutdown and connection being lost to the broker.

Problem analysis

The issue seems to arise within the mqtt_callback when the message to be sent has been buffered internally by libwebsockets. but the mosquitto struct has more packets to send.

After reading about libwebsockets their documentation stresses the importance of not calling lws_write outside of the writeable callback. however I have not seen much direction of calling lws_write multiple times within the callback, my inkling is that this should also be avoided.

When I debugged into the libwebsockets, I noticed that there is a slight functionality difference in their lws_send_pipe_choked function that breaks out of this loop allowing for their event loop to continue on. if I edit the lws-plat-win.c implementation to function as the lws-plat-unix.c I can get windows to at least preform the same between the two platforms. I have posed this question to the libwebsockets project warmcat/libwebsockets#873?

Now let us revisit my suspicions on mosquitto using the libwebsocket library incorrectly by looping through every packet that it wants to send. If I instead do not loop with a while and change it to an if statement. the desired result also takes place (all systems preform the same). I believe this is what the origional author wished to do as the last line in the loops is

if (mosq->current_out_packet) {
    libwebsocket_callback_on_writable(mosq->ws_context, mosq->wsi);
}

which should prod the event loop that next time something is writtable go on and continue sending my packet.

I believe the only reason that it works on the unix like systems is that the choked is reporting that the pipe is choked and it is yeilding control back to the event loop, as it seems it should do in the first place.

I would appreciate if somebody with a little more experience in the area could make sure that I am not missing anything by proposing we remove the loop in the callback.

@ralight

This comment has been minimized.

Show comment
Hide comment
@ralight

ralight Jun 19, 2017

Contributor

If I understand correctly, you're saying that changing the while on line 253 of websockets.c to an if fixes this problem? The intention of it being a while there was to try and send as many mqtt packets per websockets packet as possible. There shouldn't be a downside to sending them out one mqtt packet per callback.

Contributor

ralight commented Jun 19, 2017

If I understand correctly, you're saying that changing the while on line 253 of websockets.c to an if fixes this problem? The intention of it being a while there was to try and send as many mqtt packets per websockets packet as possible. There shouldn't be a downside to sending them out one mqtt packet per callback.

ralight added a commit that referenced this issue Jun 19, 2017

@ralight ralight added this to the fixes-next milestone Jun 27, 2017

@ralight

This comment has been minimized.

Show comment
Hide comment
@ralight

ralight Jun 27, 2017

Contributor

I believe this is fixed in the upcoming 1.4.13, so I'm closing this.

Contributor

ralight commented Jun 27, 2017

I believe this is fixed in the upcoming 1.4.13, so I'm closing this.

@ralight ralight closed this Jun 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment