uncaught exception breaks connection - poss due to heartbeat breaking chatty connection? #32

Closed
hjwp opened this Issue Mar 7, 2012 · 6 comments

Comments

Projects
None yet
2 participants

hjwp commented Mar 7, 2012

Hey,

we've noticed several Uncaught exception in /socket.io/1/websocket/ types of errors in our logs. We think it tends to happen when there's a lot of small bits data being transmitted between the client and server in quick succession.

We find that, with every error, an incoming message is received at the same time as a heartbeat... so it looks like that's probably what's causing the issue?

Here's a minimal repro, which should crash within a minute or so... :

from os import path as op
import logging
from tornado import web
from tornadio2 import SocketConnection, TornadioRouter, SocketServer
ROOT = op.normpath(op.dirname(__file__))

class IndexHandler(web.RequestHandler):
    """Regular HTTP handler to serve the ping page"""
    def get(self):
        self.render('index.html')

class SocketIOHandler(web.RequestHandler):
    def get(self):
        self.render('../socket.io.js')

class PongConnection(SocketConnection):
    def on_message(self, message):
        print message
        self.send('pong')

PingRouter = TornadioRouter(PongConnection,
                            dict(enabled_protocols=['websocket', 'xhr-polling',
                                                    'jsonp-polling', 'htmlfile']))
application = web.Application(
    PingRouter.apply_routes([
        (r"/", IndexHandler),
        (r"/socket.io.js", SocketIOHandler)
    ]),
    flash_policy_port = 843,
    flash_policy_file = op.join(ROOT, 'flashpolicy.xml'),
    socket_io_port = 8001
)

if __name__ == "__main__":
    # watch logs for error
    logging.basicConfig(filename='repro.log', level=logging.DEBUG, format='%(asctime)s %(levelname)s:%(message)s')
    SocketServer(application)
<!DOCTYPE html>
<html>
<head>
  <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
  <script src="socket.io.js"></script>
  <script>
    $(function() {
        var socketio = new io.connect('http://' + window.location.host);

        socketio.on('disconnect', function() {
            console.log('disconnected');
        });
        socketio.on('error', function() {
            console.log('error');
        });

        socketio.on('connect', function() {
          console.log('connected');
          for (var i=0; i< 100000; i++){
            socketio.send('ping' + i);
          }
        });
    });
</script>
</head>
<body><h3>Minimal repro</h3></body>
</html>

and here is what we find in the logs -- the <<<3 is a normal socketio message, and if we understand it correctly, the <<<2 is a heartbeat.

2012-03-07 15:49:22,313 DEBUG:>>> 3:::ping10717
2012-03-07 15:49:22,314 DEBUG:<<< 3:::pong
2012-03-07 15:49:22,317 DEBUG:>>> 3:::ping10718
2012-03-07 15:49:22,317 DEBUG:<<< 3:::pong
2012-03-07 15:49:22,323 DEBUG:>>> 3:::ping10719
2012-03-07 15:49:22,324 DEBUG:<<< 3:::pong
2012-03-07 15:49:22,326 DEBUG:<<< 2::
2012-03-07 15:49:22,326 DEBUG:<<< 0::
2012-03-07 15:49:22,327 ERROR:Uncaught exception in /socket.io/1/websocket/9292bc6ecf30cab96d548ae0b87e4e18
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/websocket.py", line 254, in wrapper
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tornadio2/persistent.py", line 136, in on_message
    self.session.close()
AttributeError: 'NoneType' object has no attribute 'close'

hjwp commented Mar 7, 2012

having added a bit more debug code - it seems to be that we exit with a self.close inside the _heartbeat function of session.py... Could it be that having a very "busy" client-side leads to the client missing its heartbeat acks?

Owner

mrjoes commented Mar 7, 2012

It is different issue. When server sends heartbeat to the client, client closes connection by sending close packet.

I fixed tornadio2 crash - it won't choke on missing self.session, but I have no idea why socket.io 0.9 (right?) closes connection on a heartbeat.

hjwp commented Mar 7, 2012

aha! may have a fix for you... will submit pull request...

Owner

mrjoes commented Mar 7, 2012

Nevermind, you're right - looks like socket.io client does not respond with a heartbeat... Investigating further.

hjwp commented Mar 7, 2012

just sent a pull request from work account...

On Wed, Mar 7, 2012 at 4:23 PM, Serge S. Koval
reply@reply.github.com
wrote:

Nevermind, you're right - looks like socket.io client does not respond with a heartbeat... Investigating further.


Reply to this email directly or view it on GitHub:
MrJoes#32 (comment)


Harry J.W. Percival

Twitter: @hjwp
Mobile:  +44 (0) 78877 02511
Skype:         harry.percival

Owner

mrjoes commented Mar 7, 2012

OK, I found issue.

Browser will queue all outgoing messages, so they will be send in sequence. If you have queue of 100k messages, there's no chance for heartbeat to go through in timely fashion. Server will send few more heartbeats and will close the connection.

I'm going to rework heartbeat handling, so it won't even send heartbeat request if there was activity in last X seconds.

@mrjoes mrjoes closed this in 35df83b Mar 7, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment