Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace select() usage for a Poller class witch uses either kqueue(), poll() or select() backend #129

Merged
merged 12 commits into from
Aug 10, 2014

Conversation

igorsobreira
Copy link
Contributor

This patch aims to fix the issue #26.

The Poller() class uses kqueue() or poll(), if available, with a fall back to select() (witch remains with problem).

I've tested with this config file: http://ideone.com/sH2Z7 :

sudo bash -c "launchctl limit maxproc 2048 2048 ; supervisord -nc sample-cat-stress.conf"

Both poll() and kqueue() work as expected. But select() raises (as reported on #26):

ValueError: filedescriptor out of range in select()

@chihhsiw
Copy link

Hello I downloaded the poll() version of supervisord and try to run 1500 process of 40 programs.

We start those processes through supervisorctl by using 200 client connection at the same time, each use supervisorctl to start 7 process like supervisorctl -c /tmp/supervisord_cat.conf start program37:program37_6 program37:program37_7 program37:program37_8 program37:program37_9 program38:program38_1 program38:program38_2 program38:program38_3 program38:program38_4.

Sometimes this kind of start will hang supervisord, that means not all supervisorctl returns and hangs forever. Even though I tried just use /bin/cat as program for running.

If I reduce number of client to 20 connection, the hang wont happen. Does anybody have clue why there is hang? We want to get all the processes start as soon as possible, using 20 connections will start all processes but take double time compare with 200 connections

@desaintmartin
Copy link

Hello,
Using Linux, even while using one client (i.e running a shell script that does supervisorctl start program1:program1; supervisorctl start program2:program2; ...), it will eventually hang after a few starting processes.
@igorsobreira, are you able to reproduce this bug?
Thanks for the pull request anyway, it will be very useful when it's bug-free.

@desaintmartin
Copy link

I will try to dive into the code this evening. My first reaction is "could we extract this code and make it a generic Python library?".
I am not able to reproduce the error using the "cat" examples. But from our supervisor configuration (with many different processes outputing many things and even crashing), it hangs after a few "start groupXX:" from supervisorctl. Let me try to reproduce it.

@mcdonc mcdonc merged commit b442130 into Supervisor:master Aug 10, 2014
@mcdonc
Copy link
Member

mcdonc commented Aug 10, 2014

Wonderful patch, thank you! Its in master now, and will be released in supervisor 4.0.

@msabramo
Copy link
Contributor

Neato! I wonder if the code could be made simpler and more robust using a library - e.g.: https://pypi.python.org/pypi/pyev/ ?

Or I wonder if gevent greenlets could be used? (Or tulip/asyncio/trollius).

@mcdonc
Copy link
Member

mcdonc commented Aug 11, 2014

@masmbro maybe although it's remarkably clean right now

@kevin1024
Copy link

Wow, thanks for merging this! This will be a great improvement.

@igorsobreira
Copy link
Contributor Author

Awesome, thanks for merging!

@mcdonc
Copy link
Member

mcdonc commented Aug 11, 2014

@msabramo to be clear I dont mean all of supervisor is clean ;-) i meant the patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants