Server becomes unresponsive #61
Comments
Hello, can you try to reproduce the problem with the least amount of code? And in that case, can you share it with us? When you say that the server stops accepting new requests, it sounds to me like your process is waiting in some coroutine (gevent green thread blabla) forever, thus never switching to the other coroutines, for example the one zerorpc uses to handle new connections. So make sure that the tasks you are executing are not blocking the process, but only YIELDING coroutines. Do you have any disk I/O or a non-gevent-compatible database driver, for example? Best, |
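The difference between blocking and yielding described above can be observed in a small experiment. This is a minimal sketch, not zerorpc code: it uses the standard library's asyncio as a stand-in for gevent (an assumption, chosen so it runs without extra dependencies), since both schedule tasks cooperatively. The names `heartbeat`, `worker`, and `count_heartbeats` are hypothetical.

```python
import asyncio
import time


async def heartbeat(counter, interval=0.01):
    """Background task that ticks as long as the scheduler gives it a turn."""
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(interval)


async def worker(blocking, duration=0.2):
    """Simulate a long request, either blocking or cooperative."""
    if blocking:
        time.sleep(duration)           # blocks the whole loop; nothing else runs
    else:
        await asyncio.sleep(duration)  # yields; other tasks keep running


async def _run(blocking):
    counter = {"ticks": 0}
    task = asyncio.create_task(heartbeat(counter))
    await asyncio.sleep(0)             # give the heartbeat its first turn
    await worker(blocking)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return counter["ticks"]


def count_heartbeats(blocking):
    """Return how many heartbeat ticks happened while the worker ran."""
    return asyncio.run(_run(blocking))


if __name__ == "__main__":
    print("blocking worker:", count_heartbeats(True), "ticks")
    print("yielding worker:", count_heartbeats(False), "ticks")
```

With the blocking worker the background task is starved for the whole duration, which is the same kind of starvation that would keep a server from handling new connections.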
I really apologize for opening an issue without being able to provide a repeatable example of the issue - I knew it was a long shot. I ended up just writing a simple process monitor that will restart the ZeroRPC server if it becomes unresponsive. It's not an ideal fix, as I'd still like to discover what was going wrong in the first place, however it seems to serve its purpose quite well so far. |
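A process monitor of the kind mentioned above could look roughly like the following. This is a hedged sketch under stated assumptions, not the commenter's actual monitor: the function `supervise` and its `start`, `is_healthy`, and `stop` callbacks are hypothetical names, and how health is probed is left to the caller.

```python
import time


def supervise(start, is_healthy, stop, checks, interval=1.0, sleep=time.sleep):
    """Probe a server `checks` times and restart it whenever a probe fails.

    start()            -> launches the server and returns a handle
    is_healthy(handle) -> True if the server answered the probe
    stop(handle)       -> terminates the server behind the handle
    Returns the number of restarts that were needed.
    """
    server = start()
    restarts = 0
    for _ in range(checks):
        sleep(interval)
        if not is_healthy(server):
            # The server did not answer: kill it and bring up a fresh one.
            stop(server)
            server = start()
            restarts += 1
    stop(server)
    return restarts
```

In practice `start` might launch the server script with `subprocess.Popen`, `stop` might kill and wait on that process, and `is_healthy` might issue a cheap RPC with a short timeout. As the comment says, this only hides the symptom and does not explain why the server froze.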
I realize this is an old issue, but I've come across the same problem and I do have a minimal reproducible example.

# server.py
import zerorpc

class MyServer(object):
    @zerorpc.stream
    def streaming_range(self, fr, to, step=1):
        return range(fr, to, step)

if __name__ == '__main__':
    server = zerorpc.Server(MyServer())
    server.bind('tcp://127.0.0.1:1234')
    server.run()

# client.py
import zerorpc

client = zerorpc.Client()
client.connect("tcp://127.0.0.1:1234")
for item in client.streaming_range(0, 200):
    print(item)

// client.js
const zerorpc = require('zerorpc');

const client = new zerorpc.Client();
client.connect('tcp://127.0.0.1:1234');
client.invoke('streaming_range', 0, 200, (error, res, more) => {
    console.log(res);
});

Try this first:
You can do this as many times as you want. No problem there. The problem seems to be specific to the node-client:
It works exactly once. No response from the server for any additional requests. Interestingly, requesting a smaller range through the node-client works just fine. Change the range in
But for such small datasets we wouldn't need streaming, would we? Conversely, I can crank up the range in the python-client without any issues. Any hints to what's going on and how to fix this would be greatly appreciated! |
Hi @bombela, Does it make sense to discuss this (see above) in this old and closed issue or should I open a new one? |
Thank you for this reproducible test. It fails exactly as you described. I will allocate some time to the problem. |
Thanks for the update, @bombela. Did you get a chance to look into this yet? |
Is it possible that the single-threaded nature of node is causing the heartbeats to be missed and the server to consider this client disconnected? |
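The mechanism this question has in mind, a peer being declared lost after its heartbeats stop arriving, can be modelled in a few lines. This is an illustrative sketch only: the class `HeartbeatMonitor` is hypothetical, and the interval and the number of tolerated missed beats are assumed values, not settings verified against zerorpc.

```python
class HeartbeatMonitor:
    """Track the last heartbeat per peer and report peers that went silent."""

    def __init__(self, interval, missed_allowed=2):
        self.interval = interval              # expected seconds between beats
        self.missed_allowed = missed_allowed  # beats that may be missed
        self.last_seen = {}

    def beat(self, peer, now):
        """Record that `peer` sent a heartbeat at time `now`."""
        self.last_seen[peer] = now

    def lost_peers(self, now):
        """Return peers whose last beat is older than the allowed silence."""
        deadline = self.interval * self.missed_allowed
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > deadline
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(interval=5.0, missed_allowed=2)
    monitor.beat("node-client", now=0.0)
    monitor.beat("python-client", now=0.0)
    monitor.beat("python-client", now=8.0)  # only this peer keeps beating
    print(monitor.lost_peers(now=11.0))
```

If a client is too busy to send its beats in time, a monitor like this reports it as lost even though the process is still alive, which is the scenario the question raises.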
I only briefly looked. But it seems the whole zmq router socket on the
server gets stuck.
|
@bombela, Thanks for looking into this. I'd really need to be able to use the node-client in combination with streaming responses but I don't have the resources to dig deeper into this myself.
How do you think we should proceed from here? |
It has to be a regression, because in 2012 I was using infinite streams from python -> node with no problem. The cross-language integration tests are also not failing. And it fails after a specific number of events, which differs between your machine and mine. The only thing I can think of doing is to spend a few evenings on it, debugging carefully until I understand exactly what is happening! Will try to look into it this week. But as usual, no promises. |
Hi @bombela, just floating this to the top of your inbox in case you forgot about it. Thanks, David |
I looked at it 2 weeks ago, but I couldn't figure out the problem. This looks like a nasty interaction between many things. Could be the zmq python layer, zmq itself, some logic in either zerorpc-python or zerorpc-node... |
As the maintainer, what do you suggest we do next? |
I suggest the maintainer moves his ass and fixes the problem :D Joke aside, I spent some time on it this weekend and found out the following: as soon as a nodejs client consuming a stream terminates, the python server is frozen. This means I can start a nodejs client that streams for a while, then connect as many streaming python clients as I want. They can all go their merry way, disconnect, connect back and so on, until the first nodejs client terminates. Then everything is frozen. I am going to look into head-of-line blocking in the zmq router socket on the server. |
Any update on this issue? is this still a problem? |
yes still an issue. still hope to find some time to dive deeper one day.
|
I was able to come up with a workaround for this problem, @bombela. My team hit the problem earlier today. We had a very long streaming request in our Python server, implemented using an iterator. I added gevent.sleep(0). For Python functions that return especially long iterators, I think the streaming code tries to return each response to the client without giving up control to other gevent code (greenlets). This blocks other RPCs, and also blocks heartbeat messages. |
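The snippet that belonged to this comment did not survive, so the following is only a guess at the shape of the workaround. The follow-up question names `gevent.sleep(0)`; putting it inside the loop that produces the streamed items is an assumption. The helper `yielding_stream` and its `yield_control` parameter are hypothetical names.

```python
def yielding_stream(items, yield_control, every=1):
    """Yield each item, calling yield_control() after every `every` items.

    Passing `lambda: gevent.sleep(0)` as yield_control would hand control
    back to the gevent scheduler so other greenlets get a chance to run.
    """
    for count, item in enumerate(items, start=1):
        yield item
        if count % every == 0:
            yield_control()


# Assumed usage inside a zerorpc server (requires gevent and zerorpc):
#
#   import gevent
#   import zerorpc
#
#   class MyServer(object):
#       @zerorpc.stream
#       def streaming_range(self, fr, to, step=1):
#           return yielding_stream(range(fr, to, step),
#                                  lambda: gevent.sleep(0))
```

Taking the yield function as a parameter keeps the helper free of dependencies and lets `every` be raised if yielding after each item turns out to cost too much.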
@Prgrmman, Where did you add the gevent.sleep(0)? I think I am encountering this same issue. |
Gosh haha 4 years ago and I moved to a different project. |
Thanks! I'll experiment with that! |
I'm running into some really strange behavior between a python zerorpc server and a zerorpc-node client - there are no logs or errors to help me diagnose why this happens, but when I use the streaming feature, after the client receives all of the messages, the server will simply stop accepting any new requests. Clients can connect fine and attempt to send messages, but they will always time out until I restart the server.
I ended up having to use the streaming feature because some processes can run for a (very) long time. This isn't a simple timeout issue, though I am also occasionally encountering the issue mentioned in #37.
I don't see any obvious starting points to help troubleshoot this other than putting log statements throughout the zerorpc code, but I thought I'd open this issue in case this has been encountered before and there is a solution.