Supervisord Failed to Start #89

Closed
mtanvir21 opened this issue Jan 21, 2015 · 9 comments
@mtanvir21

I've installed api-umbrella on two different Ubuntu 14.04 VMs, and as soon as I try to start api-umbrella with:

sudo /etc/init.d/api-umbrella start

I get the following error message in the terminal:

Starting api-umbrella... [FAIL]

supervisord failed to start

See /opt/api-umbrella/var/log/supervisor/supervisord_forever.log for more details

Stopping api-umbrella... [ OK ]
error: Forever detected script exited with code: 1

I have to hit CTRL + C to get back control of the terminal.

Below is what I see in supervisord_forever.log. Any idea why this is happening? I had no luck finding a process that was using a conflicting port.

2015-01-20 17:50:49,924 CRIT Supervisor running as root (no user in config file)
2015-01-20 17:50:49,961 INFO RPC interface 'supervisor' initialized
2015-01-20 17:50:49,962 INFO RPC interface 'laforge' initialized
2015-01-20 17:50:49,962 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2015-01-20 17:50:49,962 INFO RPC interface 'supervisor' initialized
2015-01-20 17:50:49,962 INFO RPC interface 'laforge' initialized
2015-01-20 17:50:49,962 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-01-20 17:50:49,964 INFO supervisord started with pid 2066
2015-01-20 17:50:49,967 INFO spawned: 'router-log-listener' with pid 2070
2015-01-20 17:50:49,978 INFO spawned: 'gatekeeper2' with pid 2071
2015-01-20 17:50:49,990 INFO spawned: 'gatekeeper3' with pid 2072
2015-01-20 17:50:50,005 INFO spawned: 'gatekeeper1' with pid 2074
2015-01-20 17:50:50,033 INFO spawned: 'gatekeeper4' with pid 2076
2015-01-20 17:50:50,045 INFO spawned: 'config-reloader' with pid 2078
2015-01-20 17:50:50,112 INFO spawned: 'varnishd' with pid 2081
2015-01-20 17:50:50,144 INFO spawned: 'web-delayed-job' with pid 2083
2015-01-20 17:50:50,179 INFO spawned: 'log-processor' with pid 2085
2015-01-20 17:50:50,248 INFO spawned: 'mongod' with pid 2086
2015-01-20 17:50:50,302 INFO spawned: 'redis' with pid 2087
2015-01-20 17:50:50,365 INFO spawned: 'distributed-rate-limits-sync' with pid 2089
2015-01-20 17:50:50,422 INFO spawned: 'web-nginx' with pid 2090
2015-01-20 17:50:50,475 INFO spawned: 'router-nginx' with pid 2092
2015-01-20 17:50:50,537 INFO spawned: 'varnishncsa' with pid 2093
2015-01-20 17:50:50,622 INFO spawned: 'elasticsearch' with pid 2096
2015-01-20 17:50:50,700 INFO spawned: 'web-puma' with pid 2097
2015-01-20 17:50:55,184 INFO success: gatekeeper2 entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,184 INFO success: gatekeeper3 entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,184 INFO success: gatekeeper1 entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,184 INFO success: gatekeeper4 entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,184 INFO success: varnishd entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,910 INFO success: redis entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,910 INFO success: distributed-rate-limits-sync entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,910 INFO success: web-nginx entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,910 INFO success: router-nginx entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:50:55,910 INFO success: varnishncsa entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2015-01-20 17:51:00,423 INFO success: router-log-listener entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:00,423 INFO success: config-reloader entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:00,423 INFO success: log-processor entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:00,423 INFO success: mongod entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:00,744 INFO success: elasticsearch entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:00,744 INFO success: web-puma entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2015-01-20 17:51:10,210 INFO success: web-delayed-job entered RUNNING state, process has stayed up for > than 20 seconds (startsecs)
2015-01-20 17:51:30,103 INFO exited: web-delayed-job (exit status 0; expected)
2015-01-20 17:51:48,133 CRIT Supervisor running as root (no user in config file)
Error: Another program is already listening on a port that one of our HTTP servers is configured to use. Shut this program down first before starting supervisord.
For help, use /opt/api-umbrella/embedded/bin/supervisord -h
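
In a log like the one above, the lines that matter are the ones that are not routine INFO messages. A minimal sketch for pulling them out, assuming the log format shown here (`non_info_lines` is a hypothetical helper name, not something API Umbrella ships):

```shell
# Sketch: hide routine INFO lines so CRIT and Error lines stand out.
# Reads the files given as arguments, or stdin when no file is given.
non_info_lines() {
  grep -v ' INFO ' "$@"
}

# Usage (path taken from the error message above):
#   non_info_lines /opt/api-umbrella/var/log/supervisor/supervisord_forever.log
```

This only filters on the literal ` INFO ` level marker, so it may need adjusting if the log format differs.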

@GUI
Member

GUI commented Jan 21, 2015

Hm, API Umbrella defaults to using several ports in the 14000-14100 range. Could you try stopping API Umbrella (sudo /etc/init.d/api-umbrella stop) and then run this command?

$ sudo netstat -tulpn | grep ":140[0-9][0-9]"

If you could share that output, it will tell us which processes are listening on ports in the 14000-14100 range.
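
If the raw netstat output is hard to scan, the same check can be sketched as a small filter that prints only the port and the owning process. This is an illustration that assumes the `netstat -tulpn` column layout; the function name and its default range arguments are made up for this example:

```shell
# Sketch: list "port process" pairs for listening TCP sockets in a port range.
# Reads `netstat -tulpn` style lines on stdin. The function name and the
# default range are illustrative and not part of API Umbrella.
listeners_in_range() {
  awk -v lo="${1:-14000}" -v hi="${2:-14100}" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")   # local address, e.g. 0.0.0.0:14010 or :::14010
      port = parts[n] + 0         # the port is the last colon-separated piece
      if (port >= lo && port <= hi) print port, $7
    }'
}

# Usage (root is needed to see process names):
#   sudo netstat -tulpn | listeners_in_range 14000 14100
```

Unlike the grep pattern, this compares the port numerically, so it will not match unrelated text that happens to contain ":140".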

@mtanvir21
Author

Hi, thanks for the prompt reply. Below is the output that I got:

tcp 0 0 0.0.0.0:14010 0.0.0.0:* LISTEN 2749/varnishd
tcp 0 0 0.0.0.0:14011 0.0.0.0:* LISTEN 2092/router.conf
tcp 0 0 0.0.0.0:14012 0.0.0.0:* LISTEN 2090/web.conf
tcp 0 0 0.0.0.0:14013 0.0.0.0:* LISTEN 2092/router.conf
tcp 0 0 127.0.0.1:14050 0.0.0.0:* LISTEN 2074/node
tcp 0 0 127.0.0.1:14051 0.0.0.0:* LISTEN 2071/node
tcp 0 0 127.0.0.1:14052 0.0.0.0:* LISTEN 2072/node
tcp 0 0 127.0.0.1:14053 0.0.0.0:* LISTEN 2076/node
tcp 0 0 0.0.0.0:14000 0.0.0.0:* LISTEN 2087/redis-server *
tcp 0 0 0.0.0.0:14001 0.0.0.0:* LISTEN 2086/mongod
tcp 0 0 127.0.0.1:14004 0.0.0.0:* LISTEN 5151/python
tcp6 0 0 :::14010 :::* LISTEN 2749/varnishd
tcp6 0 0 :::14000 :::* LISTEN 2087/redis-server *
tcp6 0 0 :::14002 :::* LISTEN 2096/java
tcp6 0 0 :::14003 :::* LISTEN 2096/java

I've noticed that I can't kill the varnishd & the nginx processes. Also, when I kill this process:

tcp 0 0 127.0.0.1:14004 0.0.0.0:* LISTEN 5151/python

then start API-Umbrella, I get a new error:

Starting api-umbrella.....Monitor died unexpectedly with exit code 0.

@mtanvir21
Author

Also, some additional information: API-Umbrella seems to already be running when I boot my VM. When I run the stop command, it fails and reports that API-Umbrella is not currently running. However, when I run the netstat command afterwards, all of the listening processes appear to be API-Umbrella related. I also checked whether the API-Umbrella web page loads on localhost, and it seems to be working. This is a bit confusing to describe, so please let me know if you'd like me to clarify anything.

The running processes upon boot-up:

tcp 0 0 0.0.0.0:14010 0.0.0.0:* LISTEN 3396/varnishd
tcp 0 0 0.0.0.0:14011 0.0.0.0:* LISTEN 3341/router.conf
tcp 0 0 0.0.0.0:14012 0.0.0.0:* LISTEN 3337/web.conf
tcp 0 0 0.0.0.0:14013 0.0.0.0:* LISTEN 3341/router.conf
tcp 0 0 127.0.0.1:14050 0.0.0.0:* LISTEN 3321/node
tcp 0 0 127.0.0.1:14051 0.0.0.0:* LISTEN 3318/node
tcp 0 0 127.0.0.1:14052 0.0.0.0:* LISTEN 3320/node
tcp 0 0 127.0.0.1:14053 0.0.0.0:* LISTEN 3322/node
tcp 0 0 0.0.0.0:14000 0.0.0.0:* LISTEN 3334/redis-server *
tcp 0 0 0.0.0.0:14001 0.0.0.0:* LISTEN 3333/mongod
tcp 0 0 127.0.0.1:14004 0.0.0.0:* LISTEN 3314/python
tcp6 0 0 :::14010 :::* LISTEN 3396/varnishd
tcp6 0 0 :::14000 :::* LISTEN 3334/redis-server *
tcp6 0 0 :::14002 :::* LISTEN 3344/java
tcp6 0 0 :::14003 :::* LISTEN 3344/java

@perfaram
Contributor

I get the exact same issue on CentOS 7 with the latest API-Umbrella version: "Supervisord failed to start".
I can access the home page, /signup and /docs, but /admin returns a 502 Bad Gateway.
(Did you try the packages on fresh, clean environments?)

PS: Could you give me the path to the translations from #60, so that I can try the one I made in French?

@GUI
Member

GUI commented Jan 25, 2015

Thanks for the extra details. Sorry for the trouble, though. I haven't experienced this on CentOS 6 systems, but I'll poke around and see what might be leading to these startup oddities under Ubuntu and CentOS 7.

@perfaram
Contributor

Thanks

GUI added a commit to NREL/omnibus-api-umbrella that referenced this issue Jan 26, 2015
@GUI
Member

GUI commented Jan 26, 2015

So it would appear that API Umbrella was starting up during boot on Ubuntu systems after being installed. However, due to a bug on our end, subsequent /etc/init.d/api-umbrella * calls were failing because the init script didn't realize that API Umbrella was actually already started. This was caused by differences between how sudo handles the HOME environment variable under Ubuntu versus CentOS 6.5.

This actually got fixed on master a while ago due to other HOME environment variable issues (NREL/api-umbrella-router@7fd29fc), but I've added additional tests to cover this more explicitly: NREL/omnibus-api-umbrella@ea527f4

So this and other possible startup issues should be fixed as part of API Umbrella's v0.7 release, which I'm going to try to get out later this week.

As a temporary workaround under API Umbrella v0.6, you can try prefixing your commands with an explicit HOME like:

$ sudo env HOME=/root /etc/init.d/api-umbrella start # or stop/status/restart

In my testing, that allows the init script to work as expected under Ubuntu systems. But I realize this is far from ideal, so, like I said, I'll try to get the new release done later this week.
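
To see why the explicit HOME matters, it can help to check what HOME a child process actually receives. A rough sketch, assuming a POSIX shell; `home_seen_with` is a hypothetical helper used only for illustration, and which value plain sudo passes through depends on the sudoers policy of the system:

```shell
# Sketch: report the HOME that a command sees when it is started with an
# explicit HOME, mirroring the `env HOME=/root ...` form of the workaround.
# `home_seen_with` is a hypothetical helper name, not part of API Umbrella.
home_seen_with() {
  env HOME="$1" sh -c 'printf "%s\n" "$HOME"'
}

home_seen_with /root

# To compare against what sudo passes through on a given system, run manually:
#   sudo printenv HOME
#   sudo env HOME=/root printenv HOME
```

If the two manual sudo commands print different values, the init script is probably seeing a HOME other than /root when run through plain sudo.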

Sorry for the issue, but thanks for reporting it!

GUI closed this as completed Jan 26, 2015
GUI added this to the v0.7 milestone Jan 26, 2015
@perfaram
Contributor

Hey @GUI!
First, thanks for your answer. Unfortunately, it doesn't work, at least not under CentOS.
I'm still getting:

Starting api-umbrella......... [FAIL]


Stopping api-umbrella... [  OK  ]
supervisord failed to start

  See /opt/api-umbrella/var/log/supervisor/supervisord_forever.log for more details
error: Forever detected script was killed by signal: SIGKILL
error: Script restart attempt #1
Starting api-umbrella.......... [FAIL]


Stopping api-umbrella... [  OK  ]
supervisord failed to start

  See /opt/api-umbrella/var/log/supervisor/supervisord_forever.log for more details
error: Forever detected script exited with code: 1

I'll just switch to Ubuntu, even though I hate this OS, to test my translation (translating out of context can be pretty hard).

@GUI
Member

GUI commented Feb 10, 2015

As a heads up, the new v0.7 packages have been released, which should greatly improve these types of startup issues.
