SyncWorker.wait() method returns None #983

Closed
awol opened this Issue Feb 17, 2015 · 3 comments

awol commented Feb 17, 2015

I am running gunicorn 19.2.0 serving Django 1.7.4, with nginx as a proxy, on CentOS 7.

When the timeout passes, the wait() method of the SyncWorker class gets a tuple of empty lists from its call to select.select(). It then falls through to the end of the function and returns None to its caller. The caller (run_for_multiple in my case) raises an exception when it tries to iterate over the result of wait().
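The failure mode described above can be reproduced in isolation. This is a minimal sketch, not gunicorn's code: `wait_with_fall_through` and `listener` are hypothetical names, and the function only mimics the reported fall-through.

```python
import select
import socket

# Hypothetical stand-in for a worker's listening socket; nothing ever connects to it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# No connection is pending, so select() times out and returns three empty lists.
timed_out = select.select([listener], [], [], 0.05)


def wait_with_fall_through(sockets, timeout):
    """Illustrative only: returns the ready sockets, or falls through on timeout."""
    ready = select.select(sockets, [], [], timeout)
    if ready[0]:
        return ready[0]
    # No return statement is reached here, so the function returns None.


result = wait_with_fall_through([listener], 0.05)

error = None
try:
    for sock in result:  # iterating over None raises TypeError
        pass
except TypeError as exc:
    error = exc

listener.close()
print(timed_out, result, type(error).__name__)
```

The point is that select.select() signals a timeout through empty lists rather than an exception, so any code path that only handles the non-empty case needs an explicit return value.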

As a result the worker dies and is then rebooted by the master. Since I am in a low-volume world, this happens to each worker every timeout seconds (15.0) and is really filling my logs.

Is this the expected behaviour? Perhaps it is an artefact of select() not raising an exception when the timeout expires, where elsewhere it would.

My workaround is to return an empty list at the end of the wait method. An alternative would be to return self.sockets, as is done when the select.select() call raises an EINTR exception. I am not sure which is more idiomatic (if either) for the gunicorn model, as I have not had the opportunity to investigate the impact of the workaround further up the stack. However, the empty list approach appears to be working well for me.
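A rough sketch of the empty-list workaround, assuming a standalone `wait` function rather than the real SyncWorker method (the signature and the sample `listener`/`client` sockets are assumptions for illustration):

```python
import errno
import select
import socket


def wait(sockets, timeout):
    """Return the sockets ready to accept, or an empty list on timeout."""
    try:
        ready, _, _ = select.select(sockets, [], [], timeout)
    except OSError as exc:  # select.error is an alias of OSError on Python 3.3+
        if exc.errno == errno.EINTR:
            # Interrupted by a signal: hand back every socket, as described above.
            return sockets
        raise
    # On timeout `ready` is [], which the caller can iterate over safely.
    return ready


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

idle = wait([listener], 0.05)  # nothing pending yet
client = socket.create_connection(listener.getsockname())
busy = wait([listener], 1.0)   # one connection is now pending

client.close()
listener.close()
```

With this shape the caller's loop simply runs zero times on a timeout. On Python 3.5 and later the EINTR branch is rarely reached, because select.select() is retried automatically after a signal (PEP 475).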

@tilgovi tilgovi added this to the R19.3 milestone Feb 17, 2015

Collaborator

tilgovi commented Feb 17, 2015

Added to R19.3 milestone because this is not a pleasant experience at all and makes it seem like something is very wrong when it is not. Thanks for the detailed report.

@benoitc benoitc modified the milestones: R19.2 19.2.1, R19.3 Feb 18, 2015

Owner

benoitc commented Feb 18, 2015

If we provide all the sockets here, it would mean we could miss a connection on one of the sockets. IMO a better way would be to return to the loop if no socket is ready to accept, and wait for at least one.

One possible issue with doing this is the thundering herd problem, but that could be handled later. Thoughts?
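One way to read the suggestion above is a loop that keeps calling select() until at least one socket is ready. This is only a sketch under that reading, not the change that was eventually committed; `wait_for_ready`, `notify`, and `alive` are hypothetical names.

```python
import select
import socket


def wait_for_ready(sockets, timeout, notify=lambda: None, alive=lambda: True):
    """Keep selecting until at least one socket is ready or alive() turns false."""
    while alive():
        notify()  # hypothetical heartbeat hook, called once per pass
        ready, _, _ = select.select(sockets, [], [], timeout)
        if ready:
            return ready  # only sockets that actually have a pending connection
    return []  # asked to stop before anything became ready
```

Because only ready sockets are returned, the caller never tries to accept on an idle listener. If several workers run this loop on the same listeners, all of them can wake for a single connection, which is the thundering herd concern mentioned above.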

jfarrimo commented Mar 6, 2015

I'm encountering this problem as well. One side effect is that clients making HTTP requests to gunicorn occasionally get "socket hang up" errors, apparently because there are no free workers to service the request. This isn't a giant problem for me, and doesn't happen very frequently, but it does show that this problem has real-world implications and is not totally benign. I'm eagerly awaiting a new gunicorn release that fixes this.

@benoitc benoitc closed this in 803a2d7 Mar 6, 2015
