Worker failed to boot #338

Closed
tijs opened this Issue Apr 25, 2012 · 33 comments


tijs commented Apr 25, 2012

I have a reasonably straightforward Django site running in a virtualenv. It runs fine when I use ./manage.py runserver, but when I try to run it with gunicorn I get a "Worker failed to boot" error with no further explanation.

tijs@python:~/projects/hoogtij.net$ sudo ./runscript.sh 
2012-04-25 14:02:08 [12681] [INFO] Starting gunicorn 0.14.1
2012-04-25 14:02:08 [12681] [INFO] Listening at: http://127.0.0.1:8001 (12681)
2012-04-25 14:02:08 [12681] [INFO] Using worker: sync
2012-04-25 14:02:08 [12696] [INFO] Booting worker with pid: 12696
2012-04-25 14:02:08 [12697] [INFO] Booting worker with pid: 12697
2012-04-25 14:02:08 [12698] [INFO] Booting worker with pid: 12698
Traceback (most recent call last):
  File "/home/tijs/.virtualenvs/hoogtij/bin/gunicorn_django", line 8, in <module>
    load_entry_point('gunicorn==0.14.1', 'console_scripts', 'gunicorn_django')()
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 129, in run
    DjangoApplication("%prog [OPTIONS] [SETTINGS_PATH]").run()
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 129, in run
    Arbiter(self).run()
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 184, in run
    self.halt(reason=inst.reason, exit_status=inst.exit_status)
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 279, in halt
    self.stop()
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 327, in stop
    self.reap_workers()
  File "/home/tijs/.virtualenvs/hoogtij/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 413, in reap_workers
    raise HaltServer(reason, self.WORKER_BOOT_ERROR)
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>

Where should I go looking to resolve this issue?

Collaborator

tilgovi commented Apr 26, 2012

Is gunicorn installed inside the virtualenv or outside?

tijs commented Apr 26, 2012

Inside. By now I think it's an app issue, since gunicorn starts when I throw out most of the settings; it's just that you can't tell what's failing when looking at that output.

Collaborator

tilgovi commented Apr 26, 2012

Try with "--debug --log-level debug". See if you get any more info.

tijs commented Apr 26, 2012

This is the output with the debug log level, I'm afraid.

Collaborator

tilgovi commented Apr 26, 2012

😞
I'm not sure what's happening yet. After the "Booting worker" message I expect to see "Exception in worker process" if there is a problem. Just past midnight here, but I'll sleep on it and maybe come up with something.

tijs commented Apr 26, 2012

Thanks man, if I figure it out myself I'll let you know. I'm now working through settings.py, turning stuff back on until I hit the error again. Old-school debugging :)

benoitc closed this in 0b3314b Apr 26, 2012

Collaborator

tilgovi commented Apr 26, 2012

@benoitc why did log.exception not work?

Owner

benoitc commented Apr 27, 2012

I generally don't like to use that one, since it's just a Python thing, if that makes sense.

Debug sounds more appropriate there. The real addition is the traceback :)

panyam commented May 14, 2012

Actually I am having a similar issue, and I think it has to do with write permissions on the log file. If I make the log files (stderr and stdout) publicly writable, it seems to work fine. So I guess that brings me to the question: when you first start gunicorn from supervisord, the log files are created under root before gunicorn takes over. What is the best way to ensure that log files are created by gunicorn and not by supervisord?

Owner

benoitc commented May 14, 2012

How do you launch gunicorn? Can you reproduce it with the latest head?

panyam commented May 14, 2012

I launch it with supervisord. Here is my gunicorn config for supervisord:

[program:sliceweb]
directory = /opt/src/slicephone/webapp/webapp/
command = /opt/src/slicephone/webapp/scripts/gunicorn.sh
stdout_logfile = /opt/logs/supervisord.stdout.log
stderr_logfile = /opt/logs/supervisord.stderr.log

And the actual gunicorn script (note how I touch and chown the log files to the gunicorn user/group):

#!/bin/bash

set -e
DATASTORE_SOFTWARE="XXXXX"
ACCESS_LOGFILE=/opt/logs/gunicorn.slice.access.log
ERROR_LOGFILE=/opt/logs/gunicorn.slice.error.log
LOGFILE=/opt/logs/gunicorn.slice.log
LOGDIR=$(dirname $LOGFILE)
WEBAPPDIR=$(dirname $0)/../
NUM_WORKERS=3
DEBUG_FLAGS="--debug --log-level debug"

# user/group to run as

USER=www-data
GROUP=www-data
cd $WEBAPPDIR/webapp
test -d $LOGDIR || mkdir -p $LOGDIR
echo WebAppDir: $WEBAPPDIR
echo "User: $USER"
echo "Group: $GROUP"

touch $ACCESS_LOGFILE $ERROR_LOGFILE $LOGFILE
chown -R $USER:$GROUP $ACCESS_LOGFILE $ERROR_LOGFILE $LOGFILE

export DATASTORE_SOFTWARE=$DATASTORE_SOFTWARE
exec /opt/virtenvs/django_slice/bin/gunicorn_django $DEBUG_FLAGS \
    -w $NUM_WORKERS \
    --user=$USER --group=$GROUP \
    --log-level=debug \
    --log-file=$LOGFILE \
    --access-logfile=$ACCESS_LOGFILE \
    --error-logfile=$ERROR_LOGFILE

Good question about the latest head. pip freeze shows it to be 0.14.1 (0.14.2 is the latest, isn't it?)

@sirkonst pushed a commit to sirkonst/gunicorn that referenced this issue May 14, 2012

@benoitc display traceback when the work fails to boot (close #338) 1c27d36

Have you solved the problem? I have the same issue and currently have no idea how to solve it.

Owner

benoitc commented May 20, 2012

It should be solved yes. Which version of gunicorn are you running? Can you provide me more information about your config so I can reproduce it? Thanks.

@panyam did you try with 0.14.3?

panyam commented May 21, 2012

@benoitc Sorry mate, I didn't try it on 0.14.3. I managed to get it working on 0.14.1 by making the gunicorn user the owner of the log files as soon as supervisord starts the script.

panyam commented May 21, 2012

Just to clarify, I don't think this is a gunicorn issue. @anyeguyue - the best place to check would be the supervisord log or the gunicorn log (usually in /var/log/gunicorn/....) to see if there is a "permission denied" error in there.

I just got the same issue, and solved it by first running 'python manage.py syncdb', which surfaced an error. After fixing that settings issue, gunicorn worked (after restarting nginx).

I solved this problem too, but in a different way. The first line of the gunicorn_django file was "#!/opt/django/env/mysite/bin/python", the path of my virtualenv's Python. The problem was solved by replacing it with "#!/usr/bin/env python". Although both point to the same Python interpreter, there are some PYTHONPATH differences, which is why it failed before.

@anyeguyue I faced the same problem on gunicorn 0.14.5 and Django 1.4, but with even worse symptoms: gunicorn_django -b 0.0.0.0:8000 refused to run. Your fix works, and in hindsight it is obvious: one should always ask /usr/bin/env which python to use! Thanks.

@asimihsan, glad to help

Contributor

vibrant commented Aug 1, 2012

I also had this issue. It turned out to be a problem with wrong imports in my code. If you want to debug, try running gunicorn_django --preload - this will hopefully spit out the right exception.

For anyone facing the same issue: the problem is usually something in Django itself.
Activate your venv and run ./manage.py runserver

This will usually give you a more detailed error message.

👍 @fergalmoran Helped me solve my problem (it was a problem with PIL)

I'm facing the same problem... and don't know how to solve it

Collaborator

tilgovi commented Dec 5, 2014

@pythdasch there have been a couple different problems talked about here. If you can provide some more information please open a new ticket, or simply send a message to the mailing list and others might be able to help you debug.

Collaborator

berkerpeksag commented Dec 5, 2014

I think we solved @pythdasch's problem on #gunicorn. @pythdasch, can you confirm?

I confirm, thanks to berker :)

My wsgi.py was in the same folder as my manage.py, so I had to use the path wsgi:application and not project.wsgi:application.

Contributor

collinanderson commented Dec 5, 2014

(It was the wrong path to wsgi. In this case needed to be wsgi:application instead of project.wsgi:application)
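For reference, the module:attribute lookup gunicorn performs on that path string can be sketched in a few lines (a simplified approximation, not gunicorn's actual `util.import_app` implementation):

```python
import importlib

def import_app(app_uri):
    # "project.wsgi:application" -> module "project.wsgi", attr "application"
    # "wsgi:application"         -> module "wsgi" (a wsgi.py next to manage.py)
    module_name, _, attr = app_uri.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr or "application")

# Demo with a stdlib module and attribute instead of a real Django project:
fn = import_app("os:getcwd")
```

The module part is resolved against sys.path, which is why starting gunicorn in the directory containing wsgi.py makes wsgi:application the right spec while project.wsgi:application fails.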

@tilgovi yes, sorry, I went to the #gunicorn IRC channel to solve it :)

I encountered the same issue yesterday, with gunicorn 0.18 and the following invocation:

gunicorn 'app:create_app()' --name X --workers 5 --user=apprunner --group=apprunner --bind='0.0.0.0:5000' --log-config=resource/config/di/logging.conf --timeout=360 --debug --log-level debug

It was due to an ImportError in my Flask application. The takeaway here is that gunicorn does not display the traceback from the application if the worker crashes at boot, so I had to start the Flask app manually to see the actual problem.

Contributor

collinanderson commented Dec 10, 2014

I wonder if we could get it to display the actual traceback.

This would be very nice. IMO, the next best thing, if displaying the original traceback is not possible, would be to include a message suggesting running the application directly, instead of under gunicorn, to look for the actual error.

Collaborator

berkerpeksag commented Dec 10, 2014

If I understand the problem correctly, Gunicorn already shows the exception messages like ImportError:

$ gunicorn a:app --error-logfile=- --access-logfile=-
[2014-12-10 17:13:31 +0000] [12176] [INFO] Starting gunicorn 19.2.0
[2014-12-10 17:13:31 +0000] [12176] [INFO] Listening at: http://127.0.0.1:8000 (12176)
[2014-12-10 17:13:31 +0000] [12176] [INFO] Using worker: sync
[2014-12-10 17:13:31 +0000] [12181] [INFO] Booting worker with pid: 12181
[2014-12-10 17:13:31 +0000] [12181] [ERROR] Exception in worker process:
Traceback (most recent call last):
  File "/home/berker/projects/gunicorn/gunicorn/arbiter.py", line 517, in spawn_worker
    worker.init_process()
  File "/home/berker/projects/gunicorn/gunicorn/workers/base.py", line 117, in init_process
    self.wsgi = self.app.wsgi()
  File "/home/berker/projects/gunicorn/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/home/berker/projects/gunicorn/gunicorn/app/wsgiapp.py", line 65, in load
    return self.load_wsgiapp()
  File "/home/berker/projects/gunicorn/gunicorn/app/wsgiapp.py", line 52, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/home/berker/projects/gunicorn/gunicorn/util.py", line 355, in import_app
    __import__(module)
  File "/home/berker/projects/testdjango/a.py", line 1, in <module>
    from flask import Flask
ImportError: No module named flask

and other kind of error messages:

$ gunicorn a:app --error-logfile=- --access-logfile=-
[2014-12-10 17:18:52 +0000] [12294] [INFO] Starting gunicorn 19.2.0
[2014-12-10 17:18:52 +0000] [12294] [ERROR] Connection in use: ('127.0.0.1', 8000)
[2014-12-10 17:18:52 +0000] [12294] [ERROR] Retrying in 1 second.
[2014-12-10 17:18:53 +0000] [12294] [ERROR] Connection in use: ('127.0.0.1', 8000)
[2014-12-10 17:18:53 +0000] [12294] [ERROR] Retrying in 1 second.
[2014-12-10 17:18:54 +0000] [12294] [ERROR] Connection in use: ('127.0.0.1', 8000)
[2014-12-10 17:18:54 +0000] [12294] [ERROR] Retrying in 1 second.

I faced the same issue when I tried to run the app like this:

gunicorn app:application -b localhost:3000

The problem was the presence of a directory named app in addition to the file app.py. It worked after renaming the Python file.
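This shadowing is easy to reproduce (a throwaway layout built in a temp directory, purely for illustration): a regular package directory named app takes precedence over an app.py module on the same sys.path entry, so the import never sees the file's application object.

```python
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()

# An "app" package directory sitting next to an "app.py" module...
os.mkdir(os.path.join(tmp, "app"))
open(os.path.join(tmp, "app", "__init__.py"), "w").close()
with open(os.path.join(tmp, "app.py"), "w") as f:
    f.write("application = 'my wsgi app'\n")

sys.path.insert(0, tmp)
import app  # ...resolves to the package, not to app.py

print(hasattr(app, "application"))  # → False, so "app:application" fails
```

Renaming either the directory or the module removes the ambiguity.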

@jay-tyler added a commit to jay-tyler/codefellows-ansible-example that referenced this issue Aug 20, 2015

@jay-tyler bad request currently; trying rename re last comment here benoitc/gun… 715b59b