
xhr poll error: using cluster #300

Closed

konieshadow opened this issue Dec 29, 2014 · 13 comments

Comments

@konieshadow

server.js

var cluster = require('cluster')
var numCPUs = require('os').cpus().length
var engine = require('engine.io')

if (cluster.isMaster) {
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork()
  }
}
else {
  var server = engine.listen(1337, function () {
    console.log('server bound')
  })

  server.on('connection', function (socket) {
    socket.on('error', function (err) {
      console.error(err)
    })

    socket.on('message', function (data) {
      console.log(data)
    })

    socket.on('close', function (reason) {
      console.log(reason)
      socket.close()
    })

    socket.send('hello from ' + '\r\n')
  })
}

client.js

var eio = require('engine.io-client')

function request() {
  setTimeout(function () {
    var socket = eio('ws://localhost:1337')

    socket.on('error', function (err) {
      console.error(err)
    })

    socket.on('open', function () {
      socket.on('message', function (data){
        console.log(data)
        socket.close()
      })
    })

    request()
  }, 10)
}

request()

engine.io version: 1.4.3
engine.io-client version: 1.4.3
node.js version: 0.10.34
OS: Windows 8.1 64-bit
C++ compiler: Microsoft Visual Studio Community 2013 (Visual C++ 2013)

@defunctzombie
Contributor

I can confirm this issue. What is even stranger is that with a cluster size of 2 there is no problem, but with 3 or more the problem starts to happen.

@defunctzombie
Contributor

Interesting find, but the reason this is happening is the same reason we require sticky sessions on load balancers when running engine.io servers.

The reason for the xhr poll error is that the different poll requests are being sent to different cluster backends. Each cluster backend is a separate Node.js process and does not share memory with the other processes. What happens is that the session is established with the first request (and a session id assigned), but future requests get routed to a different process which does not know about that session id.

Further, the actual error from the response is being masked by the 'xhr poll error' hardcoded string. Upon inspecting the responseText, the following message is shown:

{"code":1,"message":"Session ID unknown"}

This is an amusing way to expose the fact that we require sticky sessions so that requests can be routed to the correct backend that is aware of active session ids.
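
A quick way to surface that underlying response is to ask the polling endpoint directly for a sid the worker never issued. The sketch below is a minimal probe, not part of engine.io's API; the /engine.io path and the EIO query value are assumptions for the 1.4.x protocol, and the port matches the repro above.

// Minimal probe (not engine.io's API): ask a worker for a sid it never issued.
// The /engine.io path and EIO version are assumptions for engine.io 1.4.x.
var http = require('http')

http.get('http://localhost:1337/engine.io/?EIO=3&transport=polling&sid=unknown-sid', function (res) {
  var body = ''
  res.on('data', function (chunk) { body += chunk })
  res.on('end', function () {
    // Without sticky routing, a sid issued by a different worker gets the same reply:
    // {"code":1,"message":"Session ID unknown"}
    console.log(res.statusCode, body)
  })
})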

If you want to use cluster, you will need an adapter on top of the engine.io server that shares session ids and session data between servers. Alternatively, avoid cluster and instead run multiple separate processes behind a load balancer that supports sticky sessions.

I think we should update our README/docs/guide to mention that cluster should be avoided due to this limitation. We should also pass along the response text error so that debugging this is easier in the future.

@defunctzombie
Contributor

Additional reference: https://github.com/indutny/sticky-session

(though it may not work 100% of the time, it's a good starting point for an engine.io cluster-support module)
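
For reference, the core idea behind that module looks roughly like this (a minimal sketch, not the module's actual code; the port, hash function and message name are illustrative, and the pauseOnConnect option needs a newer Node than the 0.10 used in the report):

// Minimal sketch of IP-hash sticky routing with cluster (the idea behind indutny/sticky-session,
// not its actual code).
var cluster = require('cluster')
var net = require('net')
var http = require('http')
var numCPUs = require('os').cpus().length

if (cluster.isMaster) {
  var workers = []
  for (var i = 0; i < numCPUs; i++) workers.push(cluster.fork())

  // Same client IP -> same worker, so all poll requests for a session share one process.
  function workerFor(ip) {
    var hash = 0
    for (var j = 0; j < ip.length; j++) hash = (hash * 31 + ip.charCodeAt(j)) | 0
    return workers[Math.abs(hash) % workers.length]
  }

  net.createServer({ pauseOnConnect: true }, function (connection) {
    workerFor(connection.remoteAddress || '').send('sticky:connection', connection)
  }).listen(1337)
} else {
  // Workers never listen on the public port; the master hands raw sockets over.
  var server = http.createServer(function (req, res) { /* attach engine.io here */ })
  server.listen(0, 'localhost')

  process.on('message', function (msg, connection) {
    if (msg !== 'sticky:connection') return
    server.emit('connection', connection)
    connection.resume()
  })
}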

@neemah

neemah commented Feb 11, 2015

sticky-session does not help if the project runs on Heroku with several dynos, and this has become a show-stopper for horizontal scaling in our app :(

What happens is:

  1. The client connects to the website (let's say backend no. 1 handles this connection). engine.io saves the socket id in a local variable.
  2. During the protocol upgrade, the client reconnects to the website (backend no. 2 handles this connection). engine.io looks up the socket id of the second request in its local clients hash and fails (server.js: Server.verify()) with an UNKNOWN_SID error.

This appears to be the cause of the problem; a simplified sketch of that failing check follows below. Any suggestions on how to handle this would be very helpful.

Thanks in advance.
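
A simplified model of that check (illustrative only, not the engine.io source) shows why it fails:

// Illustrative model only, not engine.io's source: each worker keeps its own
// in-memory map of clients, so a sid issued by backend A is unknown to backend B.
var clients = {} // per-process state; cluster workers and separate dynos share nothing

function handshake(sid) {
  clients[sid] = { transport: 'polling' }
}

function verify(sid) {
  // This is the lookup that fails with UNKNOWN_SID when another backend handled the handshake.
  return sid in clients
}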

@3rd-Eden
Contributor

@neemah Just get off Heroku and use a hosting provider that actually supports real-time applications (and has a load balancer that uses sticky sessions).

@3rd-Eden
Contributor

@defunctzombie sticky-session is seriously flawed, as it does the sticky load balancing based on the incoming IP address. So when you run it behind another load balancer, all IPs will be the same as the load balancer's IP, causing all connections to go to a single Node process.

@defunctzombie
Contributor

@3rd-Eden yep, that is why we don't recommend it outright

@defunctzombie
Contributor

@3rd-Eden there are problems with using Amazon as well, since their ELB doesn't support HTTP 1.1, so you have to pick between having WebSockets (TCP load balancing) or polling (HTTP with sticky sessions).

@neemah

neemah commented Feb 13, 2015

@3rd-Eden I'd be very pleased if you could suggest one that handles sticky sessions.

@3rd-Eden
Contributor

HAProxy, nginx, http-proxy(node) and many others.
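
For the node option, a sticky setup with http-proxy might look roughly like this (a minimal sketch with illustrative ports and hash; being IP-based, it has the same limitation behind another load balancer noted above):

// Minimal sketch of IP-hash stickiness with node-http-proxy (ports and hash are illustrative).
var http = require('http')
var httpProxy = require('http-proxy')

var targets = ['http://localhost:9001', 'http://localhost:9002', 'http://localhost:9003']
var proxy = httpProxy.createProxyServer({})

// Same client IP -> same backend, so the handshake and later poll requests stay together.
function targetFor(req) {
  var ip = req.connection.remoteAddress || ''
  var hash = 0
  for (var i = 0; i < ip.length; i++) hash = (hash * 31 + ip.charCodeAt(i)) | 0
  return targets[Math.abs(hash) % targets.length]
}

var server = http.createServer(function (req, res) {
  proxy.web(req, res, { target: targetFor(req) })
})

// WebSocket upgrades must reach the same backend as the polling requests.
server.on('upgrade', function (req, socket, head) {
  proxy.ws(req, socket, head, { target: targetFor(req) })
})

server.listen(1337)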


@defjamuk

defjamuk commented May 6, 2015

Is there any reason we need sticky sessions? It's an anti-pattern. I would like to store the session information in a distributed database such as Cassandra. If someone could point me in the right direction, I would be willing to develop a module to do this with a Cassandra data store. It would help our application scale horizontally on AWS.

@wzrdtales

@3rd-Eden That is why I built https://github.com/wzrdtales/socket-io-sticky-session, which supports hashing information from layer 4 instead. But I would also prefer to be able to use something other than sticky sessions. With layer-4 information it is now possible to balance in a somewhat more controlled way, but the best thing would be to be able to just balance clients without caring too much about the handshake.

Thus the best option would be for engine.io to finally support a handshake that works across servers, for example in combination with a shared store in between, like Redis.
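
As a rough illustration of that idea (hypothetical; engine.io does not support this out of the box), handshake verification would have to consult a shared store instead of process memory, for example:

// Hypothetical sketch only (not an existing engine.io feature): record issued sids
// in Redis so any server can verify a handshake. Key prefix and TTL are illustrative.
var redis = require('redis').createClient()

function registerSession(sid, callback) {
  // TTL roughly matching the ping timeout; the value here is arbitrary.
  redis.set('eio:sid:' + sid, '1', 'EX', 60, callback)
}

function verifySession(sid, callback) {
  redis.exists('eio:sid:' + sid, function (err, found) {
    callback(err, found === 1)
  })
}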

@darrachequesne
Member

For future readers: I think it is implemented this way because without sticky sessions you would have something like this:

[diagram: without sticky sessions, HTTP requests from the same client are routed to different server instances]

Since the event handlers are registered upon connection (in the current implementation, at least), any subsequent HTTP request would have to be forwarded to the 1st instance, but that wouldn't scale well, would it? The same goes for outgoing packets: if you call socket.send() on the 1st instance while the HTTP long-polling connection is established on the 3rd instance, the packet cannot reach the client.

Besides, we have published @socket.io/sticky in order to use Socket.IO within a cluster. Unlike sticky-session and socketio-sticky-session, it is based on the sid query parameter.

Sample usage:

const cluster = require("cluster");
const http = require("http");
const { Server } = require("socket.io");
const redisAdapter = require("socket.io-redis");
const numCPUs = require("os").cpus().length;
const { setupMaster, setupWorker } = require("@socket.io/sticky");

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  const httpServer = http.createServer();
  setupMaster(httpServer, {
    loadBalancingMethod: "least-connection", // either "random", "round-robin" or "least-connection"
  });
  httpServer.listen(3000);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on("exit", (worker) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork();
  });
} else {
  console.log(`Worker ${process.pid} started`);

  const httpServer = http.createServer();
  const io = new Server(httpServer);
  io.adapter(redisAdapter({ host: "localhost", port: 6379 }));
  setupWorker(io);

  io.on("connection", (socket) => {
    /* ... */
  });
}

The documentation was updated accordingly: https://socket.io/docs/v3/using-multiple-nodes/#Using-Node-JS-Cluster
