Slow clients slow down the whole broker #95
Comments
Hi @alexsporn! This is very interesting - the possibility never occurred to me. Currently I am inclined to think the best solution is the one you have described:
Perhaps we should also make the buffer size for inline publish a value in server.Options. @alexsporn what's your use case which triggered this? In the meantime I have increased the buffer to 4096 in v1.3.2 👍🏻
Hi @mochi-co, thanks for looking into the issue. We are using MQTT over WebSocket as a Pub/Sub mechanism to listen to messages processed by our node software. Initially I thought it could be an issue in how we handle the incoming messages and publish them, so I went on to reproduce the bug. Using a JavaScript client (https://github.com/mqttjs/MQTT.js), publishing about 2000 packets a second and forcing the client to sleep between incoming messages to simulate slow processing of each packet, I could reproduce the MQTT broker lockup. Normally I'd say this would be no issue, but this can be used as a Denial-of-Service attack on public brokers. With the proposed change, a slow QoS 0 client will no longer affect other connected clients or slow down the broker. As soon as the slow client clears up enough buffer it will start receiving messages again. If the slow client is using QoS 1/2 this opens up another "attack vector" to the broker. If a long I totally understand that QoS 1/2 give certain guarantees on how MQTT behaves, but a slow client should not influence the broker's performance. Maybe we need a max count of inflight messages per client? What do you think?
Hi @alexsporn, thanks for your comprehensive reply :) My apologies for not replying to this earlier, I have been very busy lately... I absolutely agree with all of the issues you've highlighted here, and have been trying to think about the best way to handle this and ensure we don't create any unintended consequences. I plan to look into it more thoroughly between now and the weekend if I get some time, but tentatively I think the correct (even expected) behaviour would be to drop the packet if the QoS is 0 and the client buffer is full, otherwise to add it to the inflight queue. This should apply both to inline-message publishing by the embedding service and to the case where a client publishes to the broker and the message is delegated out to subscribing clients. A brief review of the code suggests that writing to clients is blocking (inasmuch as we wait to write to the client's buffer if it's full). This makes me suspect that a client publishing to a topic with many subscribers could theoretically block until all clients are iterated, which is not ideal. I will have a think about how we might alleviate this bottleneck.
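For illustration, the tentative policy in the comment above (drop on a full buffer for QoS 0, otherwise hand the packet to the inflight queue) could be sketched roughly as follows. This is a minimal sketch and not the broker's code: the `packet`, `subscriber`, `outcome` and `deliver` names are hypothetical, the write buffer is modelled as a buffered channel, and `maxInflight` is an assumed cap in the spirit of the "max count of inflight messages per client" idea raised earlier in the thread. It only tracks packets that could not be written; a real broker also keeps written QoS 1/2 packets inflight until they are acknowledged.

```go
package main

import "fmt"

// packet is a hypothetical, simplified publish packet.
type packet struct {
	id  uint16
	qos byte
}

// outcome describes what deliver did with a packet.
type outcome int

const (
	sent     outcome = iota // written to the client's buffer
	dropped                 // QoS 0 and buffer full: discarded
	queued                  // QoS 1/2 and buffer full: kept for a later retry
	rejected                // QoS 1/2, buffer full and inflight cap reached
)

// subscriber is a stand-in for a connected client (assumption, not the broker's type).
type subscriber struct {
	outbound    chan packet       // models the client's write buffer
	inflight    map[uint16]packet // packets waiting to be re-delivered
	maxInflight int               // assumed per-client cap
}

// deliver never blocks: it writes if there is space and otherwise applies
// the QoS-dependent policy.
func (s *subscriber) deliver(p packet) outcome {
	select {
	case s.outbound <- p:
		return sent
	default:
		// Buffer is full; fall through to the policy below.
	}
	if p.qos == 0 {
		return dropped
	}
	if len(s.inflight) >= s.maxInflight {
		return rejected
	}
	s.inflight[p.id] = p
	return queued
}

func main() {
	s := &subscriber{
		outbound:    make(chan packet, 1),
		inflight:    make(map[uint16]packet),
		maxInflight: 1,
	}
	for _, p := range []packet{{1, 0}, {2, 0}, {3, 1}, {4, 1}} {
		fmt.Printf("packet %d (QoS %d): outcome %d\n", p.id, p.qos, s.deliver(p))
	}
}
```

What to do once the cap is reached (reject, disconnect the client, or evict the oldest entry) is a design choice this sketch leaves open.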
@alexsporn I merged your recent PR, can you try pulling down master and seeing if the problem still exists? :) Thank you!
@alexsporn I've reverted #97 and reopened this issue as the solution for #97 causes the broker to stall (as per #101) under heavy load. I believe this may be related to the broker dropping acks if the queue is full rather than waiting.
This issue has been resolved in v2.0.0
We are using the MQTT broker and publishing messages directly to all clients using the broker's `Publish()` func.

This func adds a new publish packet to the `inlineMessages.pub` buffered channel (size 1024) and the `inlineClient()` loop will publish those packets to all subscribed clients.

For each subscribed client this will call `client.WritePacket()`, which in the end will call `Write()` on the client's writer.

If a single subscribed client is too slow, the client's write buffer will fill up and the whole `inlineClient()` loop will hang until this client's buffer has space again (see `awaitEmpty` inside `Write()`). Shortly after, the `inlineMessages.pub` buffered channel will fill up and further calls to `Publish()` will hang.

This means a single slow client (even one using QoS 0 with no guarantees of receiving packets) can make the whole broker wait indefinitely and not deliver any more packets to any client.
A possible workaround would be to return a "client buffer full" error and skip sending the packet to this client, instead of waiting for the buffer to be freed. If the client is using QoS 1/2, the inflight message retry mechanism should try to re-deliver the message.

What do you think? I can write a PR with these changes. Or do you have a better solution to this problem?