New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server crash during hot code push (w/ websockets enabled) #513

Closed
speigg opened this Issue Nov 28, 2012 · 14 comments

Comments

Projects
None yet
4 participants
@speigg

speigg commented Nov 28, 2012

I have enabled web-sockets on the client-side for meteor apps (I need the higher performance). I have a Collection with documents being updated very frequently (as often as every requestAnimationFrame). When I make a hot code push, I sometimes get the following errors:

events.js:68
        throw arguments[1]; // Unhandled 'error' event
                       ^
Error: This socket is closed.
    at Socket._write (net.js:518:19)
    at Socket.write (net.js:510:15)
    at Socket.HttpProxy.proxyWebSocketRequest.reverseProxy.incoming.socket.on.listeners.onOutgoing (/Users/gspeiginer/.meteorite/meteors/meteor/meteor/067113a37dd4e7824db1a16d5f7a90c3739eed9c/dev_bundle/lib/node_modules/http-proxy/lib/node-http-proxy/http-proxy.js:515:35)
    at Socket.EventEmitter.emit (events.js:93:17)
    at TCP.onread (net.js:396:14)

OR

events.js:68
        throw arguments[1]; // Unhandled 'error' event
                       ^
Error: write EPIPE
    at errnoException (net.js:769:11)
    at Object.afterWrite (net.js:593:19)

I believe I have been able to prevent the latter error with the following change to Meteor._LivedataSession:

    Meteor._LivedataSession.prototype.send = function(msg) {
      var self = this;
        if (self.socket && self.socket.writable) // check for writable socket to avoid error
          self.socket.send(JSON.stringify(msg));
        else
          self.out_queue.push(msg);
    }

However, the first error is still appearing on hot code pushes.

@speigg

This comment has been minimized.

speigg commented Nov 29, 2012

Never-mind that last part, I'm still getting both errors occasionally. The EPIPE error is less frequent than the "closed socket" error.

@gschmidt

This comment has been minimized.

Member

gschmidt commented Nov 29, 2012

@speigg, thanks, I appreciate the report. The core team won't be able to take this on until we start work on official websocket support -- before we push a patch we need to dig in and understand exactly what's going on.

But if you happen to have a minute, I'd love to hear more about how you're using Meteor. You're doing a write across the network every requestAnimationFrame? This certainly isn't anything I ever anticipated and it would be interesting to better understand the use case :)

@speigg

This comment has been minimized.

speigg commented Nov 30, 2012

@gschmidt, Sure. It's a research project. Quick summary: I'm using Meteor to create a concurrent collaborative authoring system involving 3D content across multiple devices, users, and modes of interaction. Right now I'm just sending 3D transformation matrices + projection matrices across the network in order for users to see representations of each other's perspectives in realtime. Using websockets reduces the latency significantly.

@meawoppl

This comment has been minimized.

Contributor

meawoppl commented Dec 7, 2012

@speigg Can you post where this code is housed? I am doing a similarly demanding low-latency application, and I would love to work this avenue with you. . .

@speigg

This comment has been minimized.

speigg commented Dec 7, 2012

You mean the snippet I gave above? I am overwritting the send function on Meteor's Meteor._LiveDataSession prototype object:

https://github.com/meteor/meteor/blob/master/packages/livedata/livedata_server.js#L128

This does not fix the error though. It does seem to reduce the frequency of crashes for me, but that might just be placebo :). I'm still at a loss as to what the real source of these errors are. If you can make any progress in finding/fixing the real problem, that would be great. Let me know if I can help somehow.

@meawoppl

This comment has been minimized.

Contributor

meawoppl commented Dec 8, 2012

That snippet is useful, but what I was looking for is this meteor SO post.

It looks like websockets is already in the current dev, so I suspect these errors will get more attention soon :)

@speigg

This comment has been minimized.

speigg commented Dec 8, 2012

I've read through that SO post before. While websockets support is currently enabled on the server side, it isn't on the client side. This is the code I am using to enable websockets on the client side:

if (Meteor.isClient) {
    Meteor._Stream.prototype._launch_connection = function () {
        var self = this;
        self._cleanup_socket(); // cleanup the old socket, if there was one.

        self.socket = new SockJS(self.url, undefined, { debug: false});

        // ********* Disable the whitelist, allow all protocols (esp websocket!) *************
        //, protocols_whitelist: [
        //    // only allow polling protocols. no websockets or streaming.
        //    // streaming makes safari spin, and websockets hurt chrome.
        //    'xdr-polling', 'xhr-polling', 'iframe-xhr-polling', 'jsonp-polling'
        //  ]});

        self.socket.onmessage = function (data) {
          // first message we get when we're connecting goes to _connected,
          // which connects us. All subsequent messages (while connected) go to
          // the callback.
          if (self.current_status.status === "connecting")
            self._connected(data.data);
          else if (self.current_status.connected)
            _.each(self.event_callbacks.message, function (callback) {
              callback(data.data);
            });

          //self._heartbeat_received(); // *********** this causes timers to be created/destroyed multiple times a frame :P
        };
        self.socket.onclose = function () {
          // Meteor._debug("stream disconnect", _.toArray(arguments), (new Date()).toDateString());
          self._disconnected();
        };
        self.socket.onerror = function () {
          // XXX is this ever called?
          Meteor._debug("stream error", _.toArray(arguments), (new Date()).toDateString());
        };

        self.socket.onheartbeat =  function () {
          self._heartbeat_received();
        };

        if (self.connection_timer)
          clearTimeout(self.connection_timer);
        self.connection_timer = setTimeout(
          _.bind(self._fake_connect_failed, self),
          self.CONNECT_TIMEOUT);
    };    

    /// ********** restart connection *******
    Meteor.reconnect({_force: true});
}
@meawoppl

This comment has been minimized.

Contributor

meawoppl commented Dec 8, 2012

Also, @gschmidt this is a major use case for me as well. I am using mongo as a synchronization tool because meteor makes rendering a cross-platform gui trivial.

In a nutshell, it is really easy to get anything (C/C++/Python) to talk to a mongo database, and hence it is an excellent tool for synchronizing several asyncronous interfaces. In our case, we have a high speed camera, running a concurrent acquisition thread, and several concurrent compression threads, as well as a thread watching the various camera "properties" and state, letting us know whether it is acquiring, waiting, at what rate, in error etc.

We have the same story for 3 other attached yes hardware devices, all of which must interact with non-trivial dependencies, and preferably have near real-time displayable information. The browser is the only real cross-platform rendering engine, and that is what brought me to your doorstep :)

@meawoppl

This comment has been minimized.

Contributor

meawoppl commented Dec 8, 2012

Thanks @speigg. That is what I was seeking.

@speigg

This comment has been minimized.

speigg commented Dec 8, 2012

@meawoppl No problem.

FYI, the code is based on the Meteor._Stream._launch_connection function here:
https://github.com/meteor/meteor/blob/master/packages/stream/stream_client.js#L310

My changes are:

  1. Disabling the whitelist (allowing all protocols including websockets)
  2. Disabling the heartbeat_received function for onmessage events.
  • This was necessary because the heartbeat_received function creates and destroys a timer every time it is called. If you have data being pushed around very frequently, timers will be created faster than the garbage collector can keep up (at least in Chrome and Safari), which slows down the frame-rate (at least in my use case).
  • I realize this may re-introduce another issue w/ spuriously closing connections (6049771)

Finally, don't forget to reset the connection:
Meteor.reconnect({_force: true});

@meawoppl

This comment has been minimized.

Contributor

meawoppl commented Dec 8, 2012

Perhaps the heartbeat_received function just needs to be debounced to a reasonable level?

Perhaps 1/second or similar? I am going to start a fork to play with this all. Perhaps disabling the heartbeats triggers the server to assume client death an axe connections? I don't 100% understand what all is happening under the hood, but in ZMQ installations the heartbeat nomenclature is certainly connoted to treating the life and death cycle of connections cleanly. . . Digging in more this PM.

@glasser

This comment has been minimized.

Member

glasser commented May 28, 2013

Avi and I looked at this and it looks like it's a bug in the old version of node http-proxy we use. This error handling was changed multiple times since then. Will look into upgrading.

@glasser

This comment has been minimized.

Member

glasser commented Jun 7, 2013

Hmm, I think our upgrade (in 0.6.4) improved the situation a bit but the "This socket is closed" is still reproducible. With the new version of http-proxy, it's because there's no error handler set on the socket from proxy to the proxied target. If you set such a thing (eg to proxyError, in reverseProxy.on('upgrade')) then that fixes this issue.

Would consider giving them a pull request to fix it but that would require taking the totally broken websocket changes in 0.10.2. We should either fork, or continue to wait for the Node 0.10 stream-based rewrite that they claim is forthcoming.

@glasser

This comment has been minimized.

Member

glasser commented Apr 19, 2014

We've upgraded to a fully rewritten version of http-proxy (in 0.6.6), and have put a fair amount of effort into fixing its error handling. Closing as assumed to be working now.

@glasser glasser closed this Apr 19, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment