Socketstream cannot connect because connect.session is not executed during handshake #219

Closed
nponeccop opened this Issue Apr 6, 2012 · 10 comments

Projects

None yet

5 participants

Contributor

In many scenarios the only HTTP request a user performs is socket.io handshake request. As connect.session middleware is not used during this request, session id becomes missing or corrupt, and socketstream cannot connect or reconnect.

Scenario 1

User's institution uses Squid proxy to cache pages. Two users visit a website one after another. The second user gets all static resources from cache so when the socket.io handshake is performed, the session cookie is not present and socketstream connection fails.

Scenario 1a

Same as Scenario 1, but the website uses a reverse proxy.

Scenario 1b

The reverse proxy from scenario 1a strips Set-Cookie headers from all cacheable content. As socketstream is designed so ALL content is cacheable, the only place a user can get session cookie is socket.io handshake request. But as connect.session is not engaged during handshake, the browser doesn't have a chance to receive a cookie, and socketstream connections fail.

Scenario 2

A user keeps a page open for a long time, so session cookie expires. Then user's loses connectivity for some time. When connectivity is restored, socketstream tries to reconnect. As session id is missing during handshake and no other HTTP requests are performed during reconnection, connection fails.

Scenario 3

Web server is upgraded while a user keeps a page open. When server is shut down, the user loses connection and when server is brought back the same problems as in Scenario 2 appear. In addition, the session may become invalid because cookie name, format or signature was changed during upgrade.


As for solution to this problem, I'm thinking of inserting cookieParser and session middleware before socket.io instead of after.

While socket.io doesn't have direct support for decoration with connect-style middleware, we can use the same approach as socket.io itself. Here are the relevant fragments from manager.js of socket.io:

  // reset listeners
  this.oldListeners = server.listeners('request');
  server.removeAllListeners('request');

  server.on('request', function (req, res) {
    self.handleRequest(req, res);
  });
...

    for (var i = 0, l = this.oldListeners.length; i < l; i++) {
      this.oldListeners[i].call(this.server, req, res);
    }
Contributor

Our fix consists of 3 parts:

  1. insertion of HTTP middleware before socket.io (no patching of anything required)
  2. storing req.sessionID set by connect middleware in socket.io handshake object (1 line in socket.io lib/manager.js)
  3. using handshake.sessionID instead of handshake.headers.cookie in socketstream processSession() (1 line in socketstream websocket/transports/socketio/index.coffee)

It is far from being a pull request, but you should understand the idea.

Putting middleware before socket.io

This code we use in app.js to start the server. Dependency on express can be easily removed and the rest can be put inside ss.start and http.middleware.

Socket.io decorates a passed instance of builtin node HTTP server with its handlers for various events (connect, close, upgrade, request and maybe something else). We are only interested in the request handler installed by socket.io. We remove the handlers (an array of functions with 2 arguments and this) and convert it to a connect middleware (a single function with 3 arguments, 3rd is unused in our case).

Then we install the middleware function in our HTTP middleware chain so it comes after connect.session middleware and not before.

var socketIoRequestHandler

function routes(app)
{
    app.get('/', function (req, res) {
        res.serve('main')
    })
    // This can be implemented in socketstream just by comparison of req.url
    // without express router. This redirection of certain requests 
    // should be in ss.http.middleware at any point after connect.session
    app.all('/socket.io/*', socketIoRequestHandler)
}

var app = express.createServer()

var server = app.listen(3000);
var defaultListeners = server.listeners('request')
ss.start(server)

var ioRequestListener = server.listeners('request')

socketIoRequestHandler = function (req, res, next)
{
    ioRequestListener.forEach(function(listener)
    {
        listener.call(server, req, res)
    })
}

server.removeAllListeners('request');
defaultListeners.forEach(function (l)
{
    server.on('request', l)
})
// at this stage we removed listeners for request events installed by
// socket.io and replaced then with default listeners
// the removal and replacement should be a part of ss.serve()

app.use(ss.http.middleware)
app.use(express.router(routes))
Contributor

Storing req.sessionID in socket.handshake

A single line in Manager.prototype.handshakeData function in lib/manager.js of socket.io:

  return {
      headers: data.headers
    , address: connectionAddress
    , time: date.toString()
    , query: data.query
    , url: data.request.url
    , xdomain: !!data.request.headers.origin
    , secure: data.request.connection.secure
    , issued: +date
    // we just add this line
    , sessionID : data.request.sessionID
  };

Note that connect session object is also accessible here. So we could just use session : data.request.session and remove the need to reimplement an identical session object (and stores!) in socketstream. But we chose just to get sessionID to keep changes in socketstream minimal.

We cannot use data.headers.cookie because it only contains request headers. If a socket.io handshake request is the first request reaching origin servers, then connect.session engages because of the previous fix and sets Set-cookie header before handshakeData() receives control. But it's a response header and not a request header, so it's not in the headers field.

Node doesn't provide a way to inspect response headers already injected into current response, but fortunately connect.session middleware also monkey patches request object by adding .session and .sessionID fields. The latter is just a session ID without a HMAC signature, so it can be used by socketstream to fetch data from session store without additional parsing.

Basically, data.headers.cookie is not always present, but data.request.sessionID and .session are always present after our previous fix and contain the session information we need.

Using handshake.sessionID instead of handshake.cookie

This step is trivial. Here is a dirty fix:

diff --git a/src/websocket/transports/socketio/index.coffee b/src/websocket/transports/socketio/index.coffee
old mode 100644
new mode 100644
index b07843f..c46875e
--- a/src/websocket/transports/socketio/index.coffee
+++ b/src/websocket/transports/socketio/index.coffee
@@ -66,10 +66,7 @@ processSession = (socket) ->

   # Parse session ID from initial hankshake data
   try
-    rawCookie = socket.handshake.headers.cookie
-    cookie = qs.parse(rawCookie, '; ')
-    sessionId = cookie['connect.sid'].split('.')[0]
-    socket.sessionId = sessionId
+    socket.sessionId = socket.handshake.sessionID
   catch e
     console.log('Warning: connect.sid session cookie not detected. User may have cookies disabled or session cookie has expired
     false
owenb commented Apr 14, 2012

Many thanks @nponeccop. I'm currently working on releasing a few more things in time for recording the Node Up podcast tomorrow, but I'll take a look at this early next week.

atrefz commented Jul 18, 2012

Is there any Progress on this? I am hitting this problem right now quite a bit on my testing server which make use of the new feature to provide a socket url for socket.io directly. Since i have done that change, i get that cookie error, but only if i go on the "http" URL(testing.productname.com) and not if i go to the "real" url (host.com:port) that my websockets use. That Fix is really dirty for userland and i would love to avoid it.

Perhaps orhogonal to this issue or not, but SockJS recommends tokens instead of session cookies
https://github.com/sockjs/sockjs-node#authorization
https://github.com/socketstream/ss-sockjs/blob/master/client/wrapper.js#L14-21
Also engine.io support might become important (and does not rely on cookies either)
https://github.com/LearnBoost/engine.io/blob/master/test/server.js#L77-86

Contributor

Cookies serve only the purpose of linking HTTP sessions to Websocket connections. If you want your users authenticated by openid/oauth, you need this linking. If linking is not needed, cookies should not be used. It is not needed in case of password-based authentication, but password based auth is so 20th century and Web 1.0 :)

@nponeccop, Yes, and the linking can also be done via a token. One can get the token via many means, such as from HTTP requests to the original domain or via RPC calls, but when one gets hold of the token, one can link the token to the socketid by registering via a RPC call first thing after reconnect. Registering can also keep browser tabs identified. This could be assisted via RPC middleware.

owenb commented Aug 30, 2012

Hi guys

I need to make some decisions on this for SocketStream 0.4.

I'm all for going down the random token route, but one of the great features of 0.3 that I want to carry though to 0.4 is the ability to share sessions between WS & HTTP requests.

Hence, keeping some level of cookie support is likely to be necessary, though we will need to move back to reading this via client-side JS and sending the session ID to the server as the initial WS request, as Engine.io does not have a 'handshake' object (at least not that I can see). This is no bad thing as it will keep things more consistent between transports (as SockJS already uses this method).

If anyone has any smart ideas, now is great time to share them :)

Owen

@owenb, in engine.io you can get the original handshake request (with cookie headers) as socket.request.

Also if one wants to utilize the same connect.cookieParser instance (with same secret etc) one can apply it on the req by filling res and next with null and a nop function.

Contributor

I'm going to close this, and if it crops up again, I'll pick it up.

@paulbjensen paulbjensen closed this Feb 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment