Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reader can block when default ping handler writes pong #97

Closed
jaekwon opened this issue Dec 8, 2015 · 5 comments
Closed

Reader can block when default ping handler writes pong #97

jaekwon opened this issue Dec 8, 2015 · 5 comments
Assignees

Comments

@jaekwon
Copy link

jaekwon commented Dec 8, 2015

It is documented that only one goroutine should be able to write, and only one goroutine read.
It is also documented that the callback for SetPingHandler may be called from read methods, which sort of implies that it gets called from a read goroutine.

Anyways, what's not obvious is that calling write methods directly from within a PingHandler callback can cause a (temporary) deadlock. The default PingHandler implementation also has this issue.

I do not have a minimal case that can reproduce this issue. But I will try to describe how it can be replicated.

  1. Create a gorilla client program with two threads:
    1.a A goroutine to write some short message to the server repeatedly (forever)
    1.b A goroutine to read messages from the server repeatedly (forever)
  2. Create a gorilla server program that reads a message and then replies with a longer message, and occasionally also writes ping messages.
  3. Connect these two programs together on the same development machine (e.g. localhost).

What will happen is that the write goroutine from 1.a will often end up blocking because it fills up the write-buffer from the client (as well as the read buffer from the server). Meanwhile, the ping messages sent by the server will call the PingHandler callback function, which ends up writing a pong message synchronously. This causes a deadlock, at least until the deadline for the pong message.

The way to circumvent this temporary deadlock is to write the pong message on a goroutine, like this:

    con.SetPingHandler(func(m string) error {
        go con.WriteControl(websocket.PongMessage, []byte(m), time.Now().Add(time.Second*wsWriteTimeoutSeconds))
        return nil
    })

Took me some 6 hours to debug this issue... maybe it would be good of the default PingHandler didn't have this issue, and if users were advised accordingly in the docs.

@garyburd garyburd self-assigned this Dec 8, 2015
@garyburd
Copy link
Contributor

garyburd commented Dec 8, 2015

Thank you for debugging this issue and reporting it. I'll look at this in the next few days.

@jaekwon
Copy link
Author

jaekwon commented Dec 10, 2015

Welcome. If you have trouble reproducing I'll help with it (next week) if you email me. Cheers.

@garyburd
Copy link
Contributor

The easiest fix is to write the pong from a goroutine as you suggest. To prevent a possible DoS, the ping handler should either queue or discard the pong when a goroutine is in flight.

@garyburd garyburd changed the title Temporary deadlock from handlePing/handlePong Default ping handler can block on writes Mar 10, 2016
@garyburd garyburd changed the title Default ping handler can block on writes Reader can block when default ping handler writes pong Mar 10, 2016
@garyburd
Copy link
Contributor

I have not found a fix for this issue that does not introduce another problem.

The best circumvention for an application designed to fill the write buffer is to write the PONG messages from the writing goroutine:

   pongs := make(chan string, 1) // capacity must be >= 1

   c.SetPingHandler(func(m string) error {
         select {
         case pongs <- m:
         default:
         }
         return nil
   })

   for {
       select {
       case m := <-pongs:
            if err := w.WriteMessage(websocket.PongMessage, []byte(m)); err != nil {
                // handle error
             }
       default:
       }

       if err := w.WriteMessage(websocket.TextMessage, data); err != nil {
           // handle error
       }
  }

@garyburd
Copy link
Contributor

I documented that the default ping handler can block in e2e3d84.

This is a corner case for most server applications. Browsers do not send the pings that trigger the issue. A full write buffer is usually the symptom of another problem (dead or stuck client, no rate limiting or back pressure on data coming into the system, ...).

If needed, an application can avoid blocking by replacing the default ping handler.

melekes added a commit to tendermint/tendermint that referenced this issue Aug 10, 2017
server:
- always has read & write timeouts
- ping handler never blocks the reader (see A)
- sends regular pings to check up on a client

A:
at some point server write buffer can become full, so in order not to
block reads from a client (see
gorilla/websocket#97), server may skip some
pongs. As a result, client may disconnect. But you either have to do
that or block the reader. There is no third way.

client:
- optional read & write timeouts
- optional ping/pong to measure latency
@gorilla gorilla locked and limited conversation to collaborators Feb 14, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants