ZMQ Pipeline instability #2

Closed
maggedotno opened this issue Oct 20, 2012 · 8 comments

Comments

@maggedotno
Contributor

I'm having a hard time with a project of mine where I use React + Ratchet as a WAMP server. React is also set up with ZMQ and receives data from a pipeline. After varying periods of time the pipeline stops working and I have to restart my React + Ratchet server.

To track down what was going on, I went back to basics and created a vanilla ZMQ pipeline where one process pulls and writes a minus (-) while another process pushes and writes a plus (+).

See the pull code here: https://gist.github.com/3924664
See the push code here: https://gist.github.com/3924672

When running this it works as I expect: I get a (seemingly) never-ending mix of pluses and minuses.

See a test run here: https://gist.github.com/3924682
(remember to scroll horizontally ;) )

OK, so the basics work. What I did next was to implement the same minimalistic pull code with React+ZMQ.

See the React pull implementation here: https://gist.github.com/3924668

When running this it starts out fine, but after a little while I end up only pushing to the pipeline; the pulls seem to hang for some reason. My question to you: what's up with that?

See a test run with React pull here: https://gist.github.com/3924709
(remember to scroll horizontally ;) )

Setup:

PHP 5.3.10-1ubuntu3.2 with Suhosin-Patch (cli) (built: Jun 13 2012 17:20:55)
libzmq version => 2.2.0
react/event-loop v0.2.1
react/socket v0.2.1
react/zmq dev-master

@igorw
Contributor

igorw commented Oct 20, 2012

Kudos for the excellent test case. I can reproduce this issue. I'll try and find out what is going wrong here. At some point the read callback is no longer getting triggered.

Only thing I have found so far: the client process keeps running, so it is not waiting for a blocking call to return. If you find anything more, let me know. Feel free to join #reactphp on freenode if you want to brainstorm. :)

@igorw
Contributor

igorw commented Oct 21, 2012

This problem is most likely related to incomplete handling of edge-triggered events in some cases. A similar issue existed in gevent-zeromq.

igorw added a commit that referenced this issue Oct 21, 2012
Accessing SOCKOPT_EVENTS nullifies the edge-triggered read/write events.
We need to only get the option once and check it against POLL_IN/POLL_OUT,
so that the read event gets triggered, and we do not accidentally drop one of
the read events, leading to an empty pull socket.

Includes a test script reproducing the problems.

Unfortunately the fix currently causes an infinite loop with the pubsub
example and the push/pull integration test.
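
The pattern this commit message describes can be sketched roughly as follows. This is a minimal sketch, not the code from the commit: it assumes the php-zmq extension, `handleSocketEvents()` and `$onMessage` are hypothetical names, and only the read side is shown. It fetches `SOCKOPT_EVENTS` once per pass, tests that single value against `POLL_IN`, and keeps receiving until nothing is left, so one notification that covers several queued messages is handled completely.

```php
<?php
/**
 * Sketch: drain a ZMQ socket after its file descriptor signalled activity.
 *
 * Assumes the php-zmq extension. handleSocketEvents() and $onMessage are
 * hypothetical names for illustration, not part of react/zmq.
 *
 * @param object   $socket    a ZMQSocket (or anything with the same methods)
 * @param callable $onMessage called with the array of parts of each message
 * @return int number of messages handled
 */
function handleSocketEvents($socket, $onMessage)
{
    $handled = 0;

    while (true) {
        // Fetch the event state once per pass and reuse the value; per the
        // commit message, each access resets the edge-triggered events.
        $events = $socket->getSockOpt(ZMQ::SOCKOPT_EVENTS);

        if (!($events & ZMQ::POLL_IN)) {
            break; // nothing (more) to read
        }

        // Non-blocking receive; false means the call would have blocked.
        $parts = $socket->recvMulti(ZMQ::MODE_NOBLOCK);
        if ($parts === false) {
            break;
        }

        call_user_func($onMessage, $parts);
        $handled++;
    }

    return $handled;
}
```

In an event loop, a function like this would be called from the read callback registered for the socket's file descriptor. A write side would test the same `$events` value against `POLL_OUT` in the same pass.
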
@igorw
Contributor

igorw commented Oct 21, 2012

Update: I have a fix, but it causes an infinite loop in one of the other example scripts and one of the tests. I need to figure out a real solution. Partial patch in edge-trigger branch.

Feel free to play with it and help find a way to fix the recursion problem.

igorw added a commit that referenced this issue Oct 21, 2012
* edge-trigger:
  Fix event handling, centralize it in socket wrapper
  Partial fix for #2, do not access SOCKOPT_EVENTS so often
@igorw
Contributor

igorw commented Oct 21, 2012

I have pushed a fix to master, see also 3bccdbf. Please try it and confirm the fix.

@igorw igorw closed this as completed Oct 21, 2012
@maggedotno
Contributor Author

I've fried my head on this issue most of the day today, but I see you beat me to it, hehe. If you want you can see what I ended up doing in my fork here:

https://github.com/maggedotno/zmq/tree/socket-buffer-nomulti-refactor

I found the culprit to be recvmulti, but I'm not completely sure why. Replacing it with a recv followed by a check of SOCKOPT_RCVMORE and a further recv for multipart messages works fine.
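
For readers following along, the replacement described here (a `recv`, then a `SOCKOPT_RCVMORE` check, then further `recv` calls) might look roughly like the sketch below. It is not the code from the fork: it assumes the php-zmq extension, and `recvAllParts()` is a hypothetical helper name.

```php
<?php
/**
 * Sketch: receive one complete (possibly multipart) message part by part.
 *
 * Assumes the php-zmq extension. recvAllParts() is a hypothetical helper,
 * not taken from the fork or from react/zmq.
 *
 * @param object $socket a ZMQSocket (or anything with the same methods)
 * @return array|false list of message parts, or false if nothing was waiting
 */
function recvAllParts($socket)
{
    // First part: non-blocking, so an empty socket returns false at once.
    $part = $socket->recv(ZMQ::MODE_NOBLOCK);
    if ($part === false) {
        return false;
    }

    $parts = array($part);

    // SOCKOPT_RCVMORE stays truthy while further parts of the same
    // message are still to be received.
    while ($socket->getSockOpt(ZMQ::SOCKOPT_RCVMORE)) {
        $parts[] = $socket->recv();
    }

    return $parts;
}
```

The later `recv` calls can be blocking because ZeroMQ delivers a multipart message as a whole: once the first part has arrived, the remaining parts are already available.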

I've run the 2 test scripts in my branch for over 6 hours and they were still behaving OK.

Please pick any pieces you can use or let me know if you need me to do a pull req, etc.

BTW: I saw your issue on multipart messages, did I maybe improve on that while I refactored code? :)

@igorw
Contributor

igorw commented Oct 21, 2012

Oh damn, thanks for spending time on it. Took me quite a while to figure it out.

Can you try with master and see if that RCVMORE stuff is still an issue? And if it is, can you make a test case (or test script) against master, and ideally a PR with a fix?

I haven't had any issues with multipart myself, so if you have a reproduction case then I can get a better picture.

Also, your changes do not fix #3. That ticket is about getting rid of this messy check:

$this->emit('message', count($message) == 1 ? $message : array($message));

Thanks, man. You get a personal two thumbs up from me: 👍 👍
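
To see why the check quoted in this comment is awkward for listeners, here is a small self-contained sketch. It rests on two assumptions: that `$message` is the array of parts returned by `recvMulti`, and that `emit` spreads its second argument over the listener's parameters (as Evenement's emitter did at the time). `emitTo()`, `messageArguments()` and `$describe` are hypothetical stand-ins.

```php
<?php
// Hypothetical stand-in for the emitter: spreads $arguments over the
// listener's parameters (assumed to match the emitter's behaviour).
function emitTo($listener, array $arguments)
{
    return call_user_func_array($listener, $arguments);
}

// The check from the comment above, wrapped in a hypothetical helper.
function messageArguments(array $message)
{
    return count($message) == 1 ? $message : array($message);
}

// A listener that reports what kind of value it was handed.
$describe = function ($msg) {
    return is_array($msg)
        ? 'array of ' . count($msg) . ' parts'
        : 'string "' . $msg . '"';
};

echo emitTo($describe, messageArguments(array('hello'))), "\n";  // string "hello"
echo emitTo($describe, messageArguments(array('a', 'b'))), "\n"; // array of 2 parts
```

Under those assumptions a listener receives a plain string for a single-part message and an array for a multipart one, so every listener has to handle both shapes.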

@maggedotno
Contributor Author

No worries, really great that you fixed it over the weekend.

I've run the tests I did for my refactoring related to multipart (and also related to the issue here, since the test is kind of stressful). Multipart is working fine and I can't make it break wrt this issue either. Good work! 👍

I will create a PR with the tests in a little bit, so you can include them if you want. There's one test pushing from vanilla ZMQ and one test pushing from a React event loop on a 1-second period.

@igorw
Contributor

igorw commented Oct 23, 2012

Great, thanks!
