Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fix reconnect loop after heartbeat timeout #533

Closed
wants to merge 4 commits into
from

Conversation

Projects
None yet
4 participants

This fixes a reconnect loop that occurs after heartbeat timeout.
The repro is something like this:

  1. Establish a socket.io connection.
  2. Kill your internet connection for long enough for the heartbeat
    timeout to fire. (turn off wifi, unplug ethernet, etc.)
  3. Reconnect your network. The socket will reconnect, but will soon be
    disconnected by a heartbeat timeout from the PREVIOUS websocket
    connection.
  4. A minute or so later, your socket will disconnect again, and so on.

This is what happens internally:

  • Heartbeat calls transport.onClose
  • transport.onClose calls socket.onDisconnect and thus socket.reconnect
  • socket.reconnect swaps out the old transport for a new one
  • socket.onDisconnect also calls transport.close and thus
    websocket.close
  • websocket.close puts the websocket into the CLOSING state
  • new transport/websocket successfully establishes a connection
  • old websocket (after some time) goes from CLOSING to CLOSED and fires
    its onclose handler which calls transport.onClose on the OLD transport
  • old transport still has a reference to the current socket, so it calls
    socket.onDisconnect, which closes the new socket
  • loop back and do it all again

In order for this to happen, the reconnect has to take place after the
socket reconnects, but before the original socket times out and
switches to the CLOSED state.

Fix:

My fix is pretty terrible. I'm checking to make sure the socket's
transport is the same as the transport handling the onClose event.
Perhaps a better solution would be to remove the onclose handler on the
websocket if the connection is explicitly closed. There also may be
some better way of cleaning up the old transport during a reconnect.

I'm open to suggestions to how to do this better. It may be as simple as getting rid of the onclose event handler in the websocket itself as soon as transport.onClose is called. I don't know enough about the internals here to make a judgement call.

I'm also not sure if this effects 1.0 at all. The code seems to be pretty different, though I haven't spent a whole lot of time looking through master.

I have the same problem on using socket.io-client with Dojo. This change solves the problem. Thank you!

bdimcheff added some commits Mar 26, 2013

This fixes the reconnect loop that occurs after heartbeat timeout.
The bug is something like this:

1. Establish a socket.io connection.
2. Kill your internet connection for long enough for the heartbeat
timeout to fire.
3. Reconnect your internet. The socket will reconnect, but will soon be
disconnected by a heartbeat timeout from the PREVIOUS websocket
connection.

This is what happens internally:

- Heartbeat calls transport.onClose
- transport.onClose calls socket.onDisconnect and thus socket.reconnect
- socket.reconnect swaps out the old transport for a new one
- socket.onDisconnect also calls transport.close and thus
  websocket.close
- websocket.close puts the websocket into the CLOSING state
- new transport/websocket successfully establishes a connection
- old websocket (after some time) goes from CLOSING to CLOSED and fires
  its onclose handler which calls transport.onClose on the OLD transport
- old transport still has a reference to the current socket, so it calls
  socket.onDisconnect, which closes the *new* socket
- loop back and do it all again

In order for this to happen, the reconnect has to take place *after* the
socket reconnects, but *before* the original socket times out and
switches to the CLOSED state.

Fix:

My fix is pretty terrible.  I'm checking to make sure the socket's
transport is the same as the transport handling the onClose event.
Perhaps a better solution would be to remove the onclose handler on the
websocket if the connection is explicitly closed.  There also may be
some better way of cleaning up the old transport during a reconnect.

@bdimcheff bdimcheff commented on the diff Mar 27, 2013

dist/socket.io.js
@@ -3865,7 +3868,4 @@ var swfobject=function(){var D="undefined",r="object",S="Shockwave Flash",W="Sho
, this
);
-if (typeof define === "function" && define.amd) {
@bdimcheff

bdimcheff Mar 27, 2013

I'm not sure what this is all about... I just re-ran make build and it removed this code.

This issue was closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment