SockJS thinks connection is OPEN even though websocket handshake fails #94

Closed
cilia opened this Issue Oct 23, 2012 · 20 comments

cilia commented Oct 23, 2012

On some (probably misconfigured) networks, the websocket handshake with the server fails to complete, even on port 80. However, SockJS assumes the connection is OPEN, because the 'onopen' event is fired.

This causes the client to believe that a websocket connection is established, but any message sent between the client and the server goes missing. The client-server interaction simply hangs until timeout. It also prevents SockJS from switching to another (potentially working) transport, because websocket is already in use.

Owner

majek commented Oct 23, 2012

SockJS only marks the connection as OPEN when it receives an open SockJS frame. This is basically normal data sent over websockets. That means that if SockJS fires the onopen event, a valid data frame has arrived over websockets. Therefore, by definition, the WS handshake must have worked.

In theory it is possible that data sent from the client to the server won't work, but it's pretty unlikely.

You can always set the 'websocket:false' option in sockjs-node options, to disable ws on the server side and force the use of other protocols.

In any case - please let me know how to reproduce the situation you're talking about.

cilia commented Oct 23, 2012

I think this is one of those edge cases. I discovered this because I was testing a SockJS app in a particular network environment that prevents the browser client from properly completing the websocket connection. For example, while on this network, I go to http://websocketstest.com/, and I can only see 'connected' and 'data received' for ports 80 and 443 (without SSL). I guess that 'data received' qualifies as 'WS handshake worked' in SockJS.

For example, this is a websocket object I see using Chrome developer tools when the problem occurs:

Request URL:ws://www.mysite.com/sockjs/307/gxrzu0e6/websocket
Request Method:GET
Status Code:101 Switching Protocols

Request Headers
Connection:Upgrade
Host:www.mysite.com
Origin:http://www.mysite.com
Sec-WebSocket-Extensions:x-webkit-deflate-frame
Sec-WebSocket-Key:E7uYzLwfLi/SawcGBcGEig==
Sec-WebSocket-Version:13
Upgrade:websocket
(Key3):00:00:00:00:00:00:00:00

Response Headers
Connection:Upgrade
Sec-WebSocket-Accept:JzEG6Q5bwLIf4oVykcSkNZhEkpA=
Upgrade:websocket
(Challenge Response):00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00

And the 'Type' and 'Time Latency' for this websocket object just show 'Pending' in the developer tools. Sending a message in SockJS does not cause an error, but the message never arrives at the server.

Although it might be rare, we cannot control a user's network environment, so it would be better for SockJS to detect this problem and switch to another transport immediately. Otherwise, users in such a network environment will not be able to use SockJS at all, unless we sacrifice the websocket transport for everyone.

Owner

majek commented Oct 24, 2012

What about ws over https? Can you verify that
http://sockjs.popcnt.org/example-cursors.html doesn't work while https://sockjs.popcnt.org/example-cursors.html works?

"Although it might be rare, we cannot control a user's network environment, so it would be better for SockJS to detect this problem and switch to another transport immediately."

The situation you're describing is pretty interesting. But I will not draw any conclusions until I know how to reproduce this. If you suspect a specific network configuration, then what exactly? Some proxy? If so, which one?

cilia commented Oct 24, 2012

I was just going to point out that https would help with this problem. Indeed, within the faulty network environment that is causing the problem, I can use WS over https (https://sockjs.popcnt.org/example-cursors.html), but not the http version. In addition, with the http version, none of the polling/streaming transports (except for jsonp-polling) work either! In the https version, these work fine.

I see the following error message for the polling/streaming transports in the http version:

disconnected SimpleEvent(type=close, code=2000, reason=All transports failed, wasClean=false, last_event=SimpleEvent(type=close, code=2007, reason=Transport timeouted, wasClean=false))

There is no error message for WS over http in Chrome - it just says 'connected websocket', but the cursor does not work. In Safari, however, the same error message above appears for WS.

I don't have access to the configuration of the faulty network, so I'm not sure how I can find out about the specific offending setting(s). But if you let me know what I can do to help diagnose the problem without directly looking at the configuration itself, I am happy to do so. Thanks!

wavded commented Oct 26, 2012

experiencing the same issue for one of our clients: the websocket tries to upgrade over port 80 but never completes, sockjs thinks it's connected, and never tries another protocol (i've experienced this same thing with socket.io)

wavded commented Oct 28, 2012

fyi, switching over to SSL fixed this issue for the client in question

Owner

majek commented Oct 30, 2012

Everyone says that doing SockJS over SSL fixes the issue - great. Additionally, here are two more similar reports (@mrjoes):

http://stackoverflow.com/questions/13032005/websocket-handshake-hangs-with-haproxy/
https://groups.google.com/forum/?fromgroups=#!topic/python-tornado/09RqXjflSA0

It looks like communication from server to client works fine over WS, but communication the other way around does not.

If you encounter this, please help me reproduce the issue. We really need to find out what proxy / antivirus / firewall / network config is responsible for this behaviour!

majek added a commit that referenced this issue Oct 30, 2012

#94 - SockJS is more stable over SSL 8e045f3

majek added a commit that referenced this issue Oct 30, 2012

#94 - SockJS is more stable over SSL 433cca6

Member

mrjoes commented Oct 30, 2012

Not only does SSL fix the issue; serving SockJS on any other port fixes it as well.

Most likely, it has something to do with a passive HTTP proxy or some kind of IDS/IPS thingie, which inspects/filters HTTP traffic and fails to work properly with websockets. It appears to work in a simplex, HTTP-only mode, where it sends the headers and waits for the complete response before sending anything else to the server.

I know at least one product which causes the problem: FortiGate proxy. Not sure what the exact product name is, but it is something from their product line.

Contributor

cgbystrom commented Nov 27, 2012

I've seen this issue with antivirus/personal firewalls as well, switching to SSL did help a lot for us.

Sounds like a change in the handshake procedure for certain transports could prevent this.
Resorting to SSL is a bit draconian IMHO, since this behavior is usually due to faulty software buffering/filtering Web Socket traffic (only client -> server).

Perhaps if the Web Socket transport explicitly requested the open frame rather than having the server send it on connect? If such a handshake succeeds, it would prove that a bi-directional Web Socket has been established, free from middle men.

(ping @majek)

Owner

majek commented Nov 27, 2012

@cgbystrom "I've seen this issue with antivirus/personal firewalls as well"

What firewalls?

"Perhaps if the Web Socket transport explicitly requested the open frame rather than having the server send it on connect?"

We'll try to find a non-invasive solution. Maybe we'll send a heartbeat or something on WebSockets first. Once again, this bug only affects WebSocket connections.

Contributor

cgbystrom commented Nov 27, 2012

"What firewalls?"

avast! and Bitdefender have been particularly nasty. But overall, socket.io's list of troublemakers is quite conclusive I think and is consistent with our experiences from Beaconpush.

"We'll try to find a non-invasive solution. Maybe we'll send a heartbeat or something on WebSockets first. Once again, this bug only affects WebSocket connections."

Cool, that sounds even simpler.

I've made implementations that relied on the WebSocket.onopen callback to determine whether a successful connection was established, which in hindsight wasn't a very good idea. The client thought everything was fine (but all traffic was blocked) and didn't bother failing over to the next transport. So if we can avoid that, it would be great :)

Owner

majek commented Nov 30, 2012

@cgbystrom When I was working on the first SockJS protocol, I did look at the socket.io list of broken proxies and, believe it or not, the SockJS protocol does fail websockets correctly for all the broken firewalls/antiviruses I tested.

Once again, SockJS marks a websocket connection as successful only when two things have happened:

  1. the underlying native websocket calls onopen (i.e. the websocket handshake succeeds)
  2. an "open" frame is delivered over the websocket connection, i.e. data flows correctly in one direction.

That works 99% of the time. There is exactly one report in this thread about this method not working: some say that "FortiGate proxy" does allow the websocket handshake to succeed and does allow traffic from the server, but does not allow traffic in the other direction.
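
The two conditions above can be sketched as a small state tracker. This is only an illustration, not the actual sockjs-client code: `createOpenTracker` and its method names are hypothetical. It also makes the gap visible: both conditions depend on the handshake and on server-to-client data only, so nothing in them proves that client-to-server traffic gets through.

```javascript
// Hypothetical sketch of the two conditions listed above.
function createOpenTracker() {
  let handshakeDone = false; // condition 1: native websocket fired onopen
  let openFrameSeen = false; // condition 2: SockJS open frame ("o") arrived

  return {
    // Call when the native websocket reports onopen.
    nativeOpen() {
      handshakeDone = true;
    },
    // Call with every raw frame received from the server.
    frame(raw) {
      if (handshakeDone && raw === 'o') {
        openFrameSeen = true;
      }
    },
    // True only once both conditions hold. Note that no client -> server
    // message is involved at any point.
    isOpen() {
      return handshakeDone && openFrameSeen;
    },
  };
}
```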

So thanks for pointing out Avast and Bitdefender, but are you absolutely positive that SockJS does not work with them correctly right now?

Member

mrjoes commented Nov 30, 2012

I tested with Avast like half a year ago and they had fixed their problem. I think I had a Windows XP image with the bugged version installed, need to check. Not sure about Bitdefender though.

Owner

majek commented Nov 30, 2012

Here it is, even behind broken Avast SockJS falls back correctly: sockjs/sockjs-protocol#25

Member

mrjoes commented Nov 30, 2012

Ah, I even participated in that one. Completely forgot about it :-)

I will try to get more information about FortiGate proxy.

Contributor

cgbystrom commented Dec 1, 2012

"So thanks for pointing out Avast and Bitdefender, but are you absolutely positive that SockJS does not work with them correctly right now?"

@majek, sorry if I was unclear. We've yet to migrate over to SockJS for Beaconpush, so the incompatibility I was referring to has only been shown with our implementation of pure/Flash based Web Sockets.

I only wanted to make sure that SockJS does not experience the same issues we've seen. But as you mention, SockJS apparently handles most of the cases fine.

We are looking to migrate Beaconpush over to SockJS (sockjs-netty is the stepping stone for this). When that is done, we'll try to conduct a compatibility test as well.

Hi - I just wanted to point out that I'm experiencing the same problem with about 5% of my users on a website I've built. It seems that none of the users who work at Microsoft in Redmond can use my site (although it does work for them at home), so I'm guessing their firewalls get in the way.

The socket tries to connect to the server, but the request doesn't reach the server at all. Then the socket on the client enters the OPEN state.

I'm using http://cdn.sockjs.org/sockjs-0.3.min.js and the client communicates with the server on port 3000.

I'm fine with websockets failing for 5% of my users - but can someone please tell me how I can detect when it fails so that I can do something about it?

Thank you!
Andrei
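
One possible way to detect this from application code, sketched under assumptions: after the connection reports OPEN, send a probe message and only trust the transport if the server echoes it back within a timeout. The `verifyRoundTrip` helper, the `PROBE` value, and a server that echoes the probe are all assumptions, not part of SockJS. The sketch relies only on the WebSocket-like `send` and `onmessage` members that a SockJS connection exposes.

```javascript
// Hypothetical probe payload; the server is assumed to echo it back unchanged.
const PROBE = '__roundtrip_probe__';

// Calls onResult(true) if the probe comes back within timeoutMs, otherwise
// onResult(false). `timers` is injectable only so that the logic can be
// exercised without real delays.
function verifyRoundTrip(socket, timeoutMs, onResult, timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (id) => clearTimeout(id),
}) {
  const previousHandler = socket.onmessage;
  let settled = false;

  const finish = (ok) => {
    if (settled) return;
    settled = true;
    timers.clear(timer);
    socket.onmessage = previousHandler; // hand the socket back to the app
    onResult(ok);
  };

  // No echo before the deadline: treat client -> server traffic as blocked.
  const timer = timers.set(() => finish(false), timeoutMs);

  socket.onmessage = (event) => {
    if (event.data === PROBE) {
      finish(true);
    } else if (previousHandler) {
      previousHandler.call(socket, event); // pass through normal traffic
    }
  };

  socket.send(PROBE);
}
```

If the callback receives `false`, the application could close the connection and reconnect with the websocket transport excluded; sockjs-client 0.3 documents a `protocols_whitelist` option for restricting the transports it tries.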

@andreisoare - did you try using SSL, to see if that gets websockets through the firewalls? It seemed to do the trick for me (using Socket.io at the present time, but the concept is the same).

Contributor

brycekahle commented Oct 20, 2014

We don't even bind to the websocket 'open' event, but instead wait for an 'o' message to declare a transport successful. Closing due to inactivity. Reopen if this issue is still valid.

brycekahle closed this Oct 20, 2014
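
For reference, the 'o' message mentioned above is the SockJS open frame. A rough sketch of how the frame types of the SockJS protocol can be told apart follows; `parseFrame` is a hypothetical helper, not the sockjs-client implementation.

```javascript
// Hypothetical helper: classify a raw SockJS frame by its first character.
// 'o' = open, 'h' = heartbeat, 'a' = JSON array of messages,
// 'c' = close with a JSON [code, reason] pair.
function parseFrame(raw) {
  const type = raw.charAt(0);
  const payload = raw.slice(1);

  switch (type) {
    case 'o':
      return { type: 'open' };
    case 'h':
      return { type: 'heartbeat' };
    case 'a':
      return { type: 'messages', messages: JSON.parse(payload) };
    case 'c': {
      const [code, reason] = JSON.parse(payload);
      return { type: 'close', code, reason };
    }
    default:
      return { type: 'unknown' };
  }
}
```

Under this scheme a transport would be declared successful only once a frame of type `open` has been seen.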

grssam commented Feb 16, 2015

I have a related, though not altogether similar, issue. My websocket connection is established successfully, including a correct handshake and 'o' message. The problem I am facing is that, at times, the box which hosts the SockJS server runs out of connections, i.e. the maximum number of possible connections is reached. In this case, even an already connected websocket connection is dropped. This may happen after some of the data has been transferred or even before that.

So my question is: is there any way to handle a sudden drop in connection and reestablish it using a different mechanism for seamless data communication?
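
SockJS does not appear to reconnect by itself, so a dropped connection generally has to be handled in application code. A minimal sketch, assuming the application can simply create a fresh connection whenever `onclose` fires: `connectWithRetry`, `backoffDelay`, and the delay values are hypothetical, and `createSocket` stands for something like `() => new SockJS(url)`. Each new SockJS instance goes through transport selection again.

```javascript
// Hypothetical exponential backoff: 500 ms, 1 s, 2 s, ... capped at 30 s.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Creates a connection and re-creates it after every close.
// `schedule` is injectable only so the logic can be exercised without delays.
function connectWithRetry(createSocket, onMessage,
                          schedule = (fn, ms) => setTimeout(fn, ms)) {
  let attempt = 0;

  const connect = () => {
    const socket = createSocket();
    socket.onopen = () => {
      attempt = 0; // a successful open resets the backoff
    };
    socket.onmessage = onMessage;
    socket.onclose = () => {
      schedule(connect, backoffDelay(attempt));
      attempt += 1;
    };
  };

  connect();
}
```

Messages sent while the connection is down are not queued by this sketch; an application that needs seamless delivery would have to buffer and resend them itself.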
