connection failing intermittently #560

Closed
bpytlik opened this Issue Jun 7, 2013 · 3 comments

Projects

None yet

2 participants

@bpytlik
bpytlik commented Jun 7, 2013

I'm running into an issue where sometimes my client will successfully connect to my server and sometimes it will fail to connect. In those clients that successfully connect, here's the sequence of (relevant) system calls that are emitted (per strace):

socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 10

connect(10, {sa_family=AF_INET, sin_port=htons(25004), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0

write(10, "GET /socket.io/1/websocket/BajUhY_8xzo-_XRIhP8h HTTP/1.1\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: MTMtMTM3MDY0MDI5NzgwNA==\r\nHost: localhost:25004\r\n\r\n", 196) = 196

read(10, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: 7U1sT9kzbqM2W9Vv5cBYvnj2bGw=\r\n\r\n", 65536) = 129

setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0

read(10, "\201\0031::", 65536)          = 5

When the client fails to connect, here's the sequence of system calls:

connect(10, {sa_family=AF_INET, sin_port=htons(25004), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0

write(10, "GET /socket.io/1/websocket/DeS7QX6gRSkl-_DIhP84 HTTP/1.1\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: MTMtMTM3MDY0MDU2NDg0MA==\r\nHost: localhost:25004\r\n\r\n", 196) = 196

read(10, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: kEaSyYQBCSitOAWCk9adbN7e5BY=\r\n\r\n\201\0031::", 65536) = 134

setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0

AFAICT, the key difference is in what the read syscall gets as a value. In the non-working case, it seems like what should take two reads to receive is being collapsed into a single read. Supporting this is that I see the most connection issues when the system is under load. My theory is that when the system isn't loaded, the client has time to read the first value before the server sends the second one, but when the machine is loaded, it's more likely that the server will send both packets before the client can read the first.

My tracing of the server shows that the server is always doing separate writes to the socket. I'll try to use tcpdump/wireshark to observe the actual packets in flight to determine whether the packet is being formed incorrectly on the server or read incorrectly on the client.

node version: v0.10.8
socket.io-client version: 0.9.11
socket.io version: 0.9.14

@bpytlik
bpytlik commented Jun 8, 2013

And to follow up, tshark/wireshark shows that two separate packets are being sent from the server, one of length 129 and one of length 5 whether the client succeeds or fails. This strongly suggests the problem is in the client. Specifically in how it handles a second packet arriving before the first has been read.

@bpytlik
bpytlik commented Jun 8, 2013

And one more follow up, it seems like this may be confined to the websockets transport, or at least is transport specific as so far I'm unable to duplicate the issue if the transport chosen is xhr-polling.

@bpytlik
bpytlik commented Jun 8, 2013

After digging into _http_client in node's library and WebSocket.js from ws, it looks to me like the bug is outside this package and is in its dependencies. My plan is to leave this open until the problem is fixed in ws (assuming that's where the issue is) and is pulled into a release here.

@rauchg rauchg closed this Nov 25, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment