Heartbeat trying to emit on None object #63

Closed
maxekman opened this issue Jun 7, 2013 · 11 comments

@maxekman

maxekman commented Jun 7, 2013

This happens from time to time with 40+ clients that are all under heavy network and CPU load. I have no idea if this causes the request to fail, but I'm sure it crashes the current (heartbeat?) greenlet, which is never a good thing.

Traceback (most recent call last):
  File "/madcrew/applications/software/virtualenvs/otis_v0_1_0-linux/lib/python2.7/site-packages/gevent/greenlet.py", line 328, in run
    result = self._run(*self.args, **self.kwargs)
  File "/madcrew/applications/software/virtualenvs/otis_v0_1_0-linux/lib/python2.7/site-packages/zerorpc/heartbeat.py", line 79, in _heartbeat
    self._channel.emit('_zpc_hb', (0,))  # 0 -> compat with protocol v2
AttributeError: 'NoneType' object has no attribute 'emit'
<Greenlet at 0x70a7370: <bound method HeartBeatOnChannel._heartbeat of <zerorpc.heartbeat.HeartBeatOnChannel object at 0x88c4c90>>> failed with AttributeError

We also get a lot of these, but I suspect we are calling too aggressively under high network load, which makes it acceptable:

/!\ gevent_zeromq BUG /!\ catching up after missing event (RECV) /!\
@bombela
Member

bombela commented Jun 10, 2013

Hello,

Do you have any simple test case to reproduce this problem?

The close() function in the heartbeat object is supposed to kill the heartbeat task before destroying the channel and setting _channel to None. I am really curious about what is happening over there.

Best,
fx
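
To make the ordering described above concrete, here is a minimal sketch. It is not zerorpc's code: FakeChannel and HeartBeat are hypothetical names, and a stdlib thread stands in for the gevent greenlet. It only illustrates the idea of stopping the heartbeat task first and clearing the channel afterwards, with a defensive None check inside the loop.

```python
import threading
import time


class FakeChannel(object):
    """Hypothetical stand-in for a zerorpc channel; records emitted events."""

    def __init__(self):
        self.events = []

    def emit(self, name, args):
        self.events.append((name, args))


class HeartBeat(object):
    """Hypothetical heartbeat sender, NOT zerorpc's HeartBeatOnChannel."""

    def __init__(self, channel, interval):
        self._channel = channel
        self._interval = interval
        self._stop = threading.Event()
        # A thread is an assumption here; zerorpc uses a gevent greenlet.
        self._task = threading.Thread(target=self._heartbeat)
        self._task.daemon = True
        self._task.start()

    def _heartbeat(self):
        # Event.wait() returns True as soon as close() sets the flag.
        while not self._stop.wait(self._interval):
            channel = self._channel  # take a local reference once
            if channel is None:      # defensive guard against a cleared channel
                break
            channel.emit('_zpc_hb', (0,))

    def close(self):
        # 1. Ask the task to stop and wait until it has really finished.
        self._stop.set()
        self._task.join()
        # 2. Only then drop the channel reference.
        self._channel = None


channel = FakeChannel()
hb = HeartBeat(channel, interval=0.01)
time.sleep(0.1)
hb.close()
sent_at_close = len(channel.events)
time.sleep(0.05)
sent_later = len(channel.events)
print(sent_at_close == sent_later)  # no heartbeat is emitted after close()
```

With this ordering the loop can never observe a cleared channel; the guard inside the loop is only a second line of defence in case close() is ever reordered.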

@maxekman
Author

Thanks for the explanation.

One cause could be that I set the heartbeat to 1s; setting it to 10s gets rid of the error completely. I still get some of the gevent_zeromq bug prints, though.

I'll see if I can produce a test case for it. I'm also reading along in #37, which could be related (we are also doing similar long-lasting tasks).

@jpetazzo
Contributor

Seeing just the same thing, since we upgraded from zerorpc 0.4.2 + pyzmq 2.2.0.1 to zerorpc 0.4.3 + pyzmq 13.1:

Traceback (most recent call last):
  File ".../20130613010810832055_57f451d7f9f3_57f451d7f9f3/lib/python2.6/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File ".../20130613010810832055_57f451d7f9f3_57f451d7f9f3/lib/python2.6/site-packages/zerorpc/heartbeat.py", line 79, in _heartbeat
    self._channel.emit('_zpc_hb', (0,))  # 0 -> compat with protocol v2
AttributeError: 'NoneType' object has no attribute 'emit'
<Greenlet at 0x36d52d0: <bound method HeartBeatOnChannel._heartbeat of <zerorpc.heartbeat.HeartBeatOnChannel object at 0x36ecf90>>> failed with AttributeError

Exception greenlet.GreenletExit: GreenletExit() in <bound method HeartBeatOnChannel.__del__ of <zerorpc.heartbeat.HeartBeatOnChannel object at 0x36ecf90>> ignored

It looks like it doesn't have any other side-effect (for now!), so maybe it's just crashing the greenlet that was spawned to handle a specific request, at the end of the request.

@bombela
Member

bombela commented Jun 14, 2013

I guess setting it to 1s makes the heartbeat loop wake up more often, which raises the probability of the bug/race occurring... I guess it's going to be fun to test.

@jpetazzo
Contributor

I tried to reproduce with different settings (low heartbeat server-side, client-side, both sides); with short and long jobs (gevent.sleep(x) with x from 1 to 60); with up to 100 clients in parallel, and couldn't reproduce the issue :-(

@crystaldust

I hit the same issue when using a Node.js client with a Python server. It appears when I frequently invoke a method on the Python server from Node.js. Here is the result:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent-1.1-py2.7-linux-x86_64.egg/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/zerorpc-0.4.4-py2.7.egg/zerorpc/heartbeat.py", line 79, in _heartbeat
    self._channel.emit('_zpc_hb', (0,))  # 0 -> compat with protocol v2
AttributeError: 'NoneType' object has no attribute 'emit'
<Greenlet at 0xf0b410: <bound method HeartBeatOnChannel._heartbeat of <zerorpc.heartbeat.HeartBeatOnChannel object at 0xf1c2d0>>> failed with AttributeError

Exception greenlet.GreenletExit: GreenletExit() in <bound method HeartBeatOnChannel.__del__ of <zerorpc.heartbeat.HeartBeatOnChannel object at 0xf0a710>> ignored

@bombela
Member

bombela commented Mar 21, 2014

I have a theory for this bug:

  • the heartbeat channel object is destroyed before the greenlet is properly terminated, which means the greenlet runs on a destroyed (thus all None) object.
  • this would happen only when the server is handling tons of requests and Python starts to garbage collect aggressively

This is just a guess because normally, the heartbeat channel object shouldn't be destroyed before the greenlet... since the greenlet references it. But at the same time, there is a circular reference... and that is maybe the only way the circle gets broken up?

Just thinking out loud here...

If you could find a way to reproduce the bug with a minimal amount of code, that would greatly help!
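
The theory above hinges on how CPython treats reference cycles. As a rough illustration only, the sketch below shows that an object caught in a cycle outlives its last external reference and is reclaimed only when the cyclic garbage collector runs. The Channel and HeartBeat classes are hypothetical stand-ins, not zerorpc code, and a stored bound method plays the role of the greenlet that references the object.

```python
import gc
import weakref


class Channel(object):
    """Hypothetical stand-in for a zerorpc channel."""

    def emit(self, name, args):
        pass


class HeartBeat(object):
    """Hypothetical object that ends up in a reference cycle."""

    def __init__(self, channel):
        self._channel = channel
        # The bound method references self, and self references the bound
        # method: a cycle, similar to an object owning a greenlet that
        # runs one of its own methods.
        self._task = self._heartbeat

    def _heartbeat(self):
        self._channel.emit('_zpc_hb', (0,))


gc.disable()  # keep the cycle collector from running on its own
try:
    hb = HeartBeat(Channel())
    probe = weakref.ref(hb)

    del hb  # drop the last external reference
    alive_after_del = probe() is not None

    gc.collect()  # the cycle collector breaks the cycle
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)  # on CPython: True False
```

This does not confirm the theory. It only shows that, for objects in a cycle, the moment of destruction is decided by the garbage collector rather than by the last explicit reference going away.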

@grokcore

I get this when connecting to a Python instance from a Node client. My test case is simple: run a method that takes 5 seconds to complete. Changing the client timeout parameter doesn't yield any different results.

test.js

var zerorpc = require("zerorpc");
var client = new zerorpc.Client({timeout: 10000});

client.connect("tcp://127.0.0.1:4242");

// Fire 10 invocations back to back without waiting for the replies.
for (var invoked = 0; invoked < 10; invoked++) {
    client.invoke("testCase", {something: "here"}, function(error, res, more) {
        console.log('Results', res);
    });
}

test.py

import time

import zerorpc


class testServe(object):
    def testCase(self, data):
        # Simulate a call that takes 5 seconds to complete.
        time.sleep(5)
        return data

s = zerorpc.Server(testServe())
s.bind("tcp://127.0.0.1:4242")
s.run()

Results (x9):

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/zerorpc-0.4.4-py2.7.egg/zerorpc/heartbeat.py", line 79, in _heartbeat
    self._channel.emit('_zpc_hb', (0,))  # 0 -> compat with protocol v2
AttributeError: 'NoneType' object has no attribute 'emit'
<Greenlet at 0x8637cfc: <bound method HeartBeatOnChannel._heartbeat of <zerorpc.heartbeat.HeartBeatOnChannel object at 0x865234c>>> failed with AttributeError

On the client end, the first request goes through; the rest print: Result: undefined.

Removing the time.sleep results in a successful data echo.

...
addendum:
Using a similar method from the command line utility rather than the node client works:

 zerorpc -j tcp://localhost:4242 testCase '{"something":"here"}'&
 zerorpc -j tcp://localhost:4242 testCase '{"something":"here"}'&
 zerorpc -j tcp://localhost:4242 testCase '{"something":"here"}'&
 zerorpc -j tcp://localhost:4242 testCase '{"something":"here"}'&
 zerorpc -j tcp://localhost:4242 testCase '{"something":"here"}'&
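
One possible reading of this test case, not confirmed anywhere in this thread: zerorpc-python runs on gevent, and a plain time.sleep (without monkey patching) blocks the whole process, so no heartbeat can go out while the handler sleeps. The sketch below is only an analogy. It uses the stdlib asyncio event loop as a stand-in for gevent, and heartbeat and run_handler are hypothetical names. It counts how many heartbeat ticks happen while a handler is busy, once with a blocking sleep and once with a cooperative one.

```python
import asyncio
import time


async def heartbeat(ticks, interval):
    """Hypothetical heartbeat loop: record a tick, then yield to the loop."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def run_handler(blocking, work=0.3, interval=0.02):
    """Run a fake handler and return how many heartbeats fired meanwhile."""
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks, interval))
    if blocking:
        time.sleep(work)           # blocks the event loop: heartbeat is starved
    else:
        await asyncio.sleep(work)  # yields to the loop: heartbeat keeps running
    count = len(ticks)
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return count


blocked_ticks = asyncio.run(run_handler(blocking=True))
cooperative_ticks = asyncio.run(run_handler(blocking=False))
print(blocked_ticks, cooperative_ticks)
```

If the same mechanism applies to the server in this test case, swapping time.sleep(5) for gevent.sleep(5) would be the equivalent cooperative call and might be worth trying when reproducing the problem.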

@pikeas

pikeas commented Dec 4, 2014

Any update on this issue?

@bombela
Member

bombela commented Dec 4, 2014

Try to reproduce with this version of zerorpc:
https://github.com/bombela/zerorpc-python


@bombela
Member

bombela commented Jun 16, 2015

This bug should now be fixed. Finally. Latest package v5.0.1 on pypi.

@bombela bombela closed this as completed Jun 16, 2015