-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reader can block when default ping handler writes pong #97
Comments
Thank you for debugging this issue and reporting it. I'll look at this in the next few days. |
Welcome. If you have trouble reproducing I'll help with it (next week) if you email me. Cheers. |
The easiest fix is to write the pong from a goroutine as you suggest. To prevent a possible DoS, the ping handler should either queue or discard the pong when a goroutine is in flight. |
I have not found a fix for this issue that does not introduce another problem. The best circumvention for an application designed to fill the write buffer is to write the PONG messages from the writing goroutine:
|
I documented that the default ping handler can block in e2e3d84. This is a corner case for most server applications. Browsers do not send the pings that trigger the issue. A full write buffer is usually the symptom of another problem (dead or stuck client, no rate limiting or back pressure on data coming into the system, ...). If needed, an application can avoid blocking by replacing the default ping handler. |
server: - always has read & write timeouts - ping handler never blocks the reader (see A) - sends regular pings to check up on a client A: at some point server write buffer can become full, so in order not to block reads from a client (see gorilla/websocket#97), server may skip some pongs. As a result, client may disconnect. But you either have to do that or block the reader. There is no third way. client: - optional read & write timeouts - optional ping/pong to measure latency
It is documented that only one goroutine should be able to write, and only one goroutine read.
It is also documented that the callback for SetPingHandler may be called from read methods, which sort of implies that it gets called from a read goroutine.
Anyways, what's not obvious is that calling write methods directly from within a PingHandler callback can cause a (temporary) deadlock. The default PingHandler implementation also has this issue.
I do not have a minimal case that can reproduce this issue. But I will try to describe how it can be replicated.
1.a A goroutine to write some short message to the server repeatedly (forever)
1.b A goroutine to read messages from the server repeatedly (forever)
What will happen is that the write goroutine from 1.a will often end up blocking because it fills up the write-buffer from the client (as well as the read buffer from the server). Meanwhile, the ping messages sent by the server will call the PingHandler callback function, which ends up writing a pong message synchronously. This causes a deadlock, at least until the deadline for the pong message.
The way to circumvent this temporary deadlock is to write the pong message on a goroutine, like this:
Took me some 6 hours to debug this issue... maybe it would be good of the default PingHandler didn't have this issue, and if users were advised accordingly in the docs.
The text was updated successfully, but these errors were encountered: