
Unicorn does not come up (error 502) after hard restart of Docker server #1305

Open

IlyaSemenov opened this issue Jul 27, 2017 · 27 comments

@IlyaSemenov

IlyaSemenov commented Jul 27, 2017

Steps to reproduce

  1. Run GitLab using the guide
  2. Power cycle the server running Docker

Actual result

GitLab will never come up fully, showing error 502.

The docker container logs will have this:

2017-07-26 23:20:38,558 INFO spawned: 'unicorn' with pid 612
2017-07-26 23:20:39,160 INFO exited: unicorn (exit status 1; not expected)
...
2017-07-26 23:20:46,864 INFO spawned: 'unicorn' with pid 647
2017-07-26 23:20:47,312 INFO exited: unicorn (exit status 1; not expected)
2017-07-26 23:20:48,313 INFO gave up: unicorn entered FATAL state, too many start retries too quickly

unicorn_stderr.log will have this:

...
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:195:in `pid=': Already running on PID:601 (or pid=/home/git/gitlab/tmp/pids/unicorn.pid is stale) (ArgumentError)
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:127:in `start'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn_rails:209:in `<top (required)>'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `load'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `<main>'

Workaround

The only way to bring GitLab up is to docker exec into the container, manually delete the stale pid file, and restart the container:

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab
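
For anyone scripting this from the Docker host, a slightly safer variant is to remove the pid file only when the recorded PID is no longer alive inside the container. This is just a sketch built around the command above (the container name gitlab and the pid path are the ones used in this thread):

PIDFILE=/home/git/gitlab/tmp/pids/unicorn.pid
PID=$(docker exec gitlab cat "$PIDFILE" 2>/dev/null)
# kill -0 only checks whether the process exists; it sends no signal
if [ -n "$PID" ] && ! docker exec gitlab kill -0 "$PID" 2>/dev/null; then
    docker exec gitlab rm -f "$PIDFILE"
    docker restart gitlab
fi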

Expected result

GitLab comes up without manual intervention.

@asbjornenge

Any progress here? I'm trying to run gitlab:latest in a Docker swarm and getting stuck on this. Is the pid file still located there in the latest version? What version of GitLab were you trying, @IlyaSemenov?

@lucpolak

lucpolak commented Oct 11, 2017

Hello,
I have exactly the same issue.
Sometimes GitLab doesn't start successfully after a server reboot.
I'm interested in a resolution of this issue.

@asbjornenge

@lucpolak I finally got it working just by using a beefier server. I was trying to run on a g1-small on GCP, but upgrading to an n1-standard-2 did the trick 👍

@lucpolak

Hey @asbjornenge, my server is pretty good. The VM is hosted on ESXi with a 4-core Intel CPU and 32 GB RAM.
It is provided by OVH.
The VM runs Ubuntu with Docker installed and 4 GB RAM allocated.

I have another VM with the same config running gitlab-ce without Docker, and everything works fine ;-(

@arthurkrupa

We had the same issue on a DigitalOcean 4-core VPS with 8GB RAM (~30 regular users and a lot of CI pipelines).

What helped was reducing the number of unicorn workers from 8 to 6 (using the UNICORN_WORKERS variable).
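
A minimal sketch of that mitigation, i.e. passing the variable when creating the container (all other required options, volumes, and environment settings are omitted for brevity; the image tag is only illustrative):

docker run --name gitlab -d -e UNICORN_WORKERS=6 sameersbn/gitlab:latest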

@Mario-Eis

Exact same issue on a Synology NAS. Reducing the workers did not solve the issue.
It should perhaps be mentioned that it had already been working fine. The issues started about 1-2 months ago, maybe with 10.2.x or 10.3.x.

@Mario-Eis

Is there a workaround like automatically removing the pid file at startup?

@HengCC

HengCC commented Apr 22, 2018

I ran into the same problem.

INFO exited: unicorn (exit status 1; not expected)
2018-04-22 06:08:53,643 INFO spawned: 'unicorn' with pid 587
2018-04-22 06:08:54,647 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

@bsakweson

bsakweson commented Apr 22, 2018

sameersbn/gitlab:10.6.4

I am seeing this same behavior at the moment and can hardly figure out how to resolve it. I am in the process of deploying GitLab in our on-prem Kubernetes cluster. Some googling shows that some people have had success beefing up memory for the running instance. I bumped the pod spec to use up to 4 GB RAM, but that has also been futile. Here is what I see in the log before Kubernetes restarts the container in an effort to repair it. In essence, it hangs here:

2018-04-22 09:01:04,820 CRIT Supervisor running as root (no user in config file)
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitaly.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitlab-workhorse.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/mail_room.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/nginx.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sidekiq.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sshd.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/unicorn.conf" during parsing
2018-04-22 09:01:04,824 INFO RPC interface 'supervisor' initialized
2018-04-22 09:01:04,825 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-04-22 09:01:04,825 INFO supervisord started with pid 1
2018-04-22 09:01:05,827 INFO spawned: 'gitaly' with pid 592
2018-04-22 09:01:05,829 INFO spawned: 'sidekiq' with pid 593
2018-04-22 09:01:05,831 INFO spawned: 'unicorn' with pid 594
2018-04-22 09:01:05,833 INFO spawned: 'gitlab-workhorse' with pid 595
2018-04-22 09:01:05,835 INFO spawned: 'cron' with pid 600
2018-04-22 09:01:05,853 INFO spawned: 'nginx' with pid 601
2018-04-22 09:01:05,855 INFO spawned: 'sshd' with pid 603
2018-04-22 09:01:07,564 INFO success: gitaly entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sidekiq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: gitlab-workhorse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request
2018-04-22 09:06:03,656 INFO waiting for sshd, gitlab-workhorse, sidekiq, cron, nginx, gitaly, unicorn to die
2018-04-22 09:06:03,657 INFO stopped: sshd (exit status 0)
2018-04-22 09:06:03,662 INFO stopped: nginx (exit status 0)
2018-04-22 09:06:03,663 INFO stopped: cron (terminated by SIGTERM)
2018-04-22 09:06:03,665 INFO stopped: gitlab-workhorse (terminated by SIGTERM)
2018-04-22 09:06:05,094 INFO stopped: unicorn (exit status 0)
2018-04-22 09:06:07,097 INFO waiting for sidekiq, gitaly to die
2018-04-22 09:06:07,669 INFO stopped: sidekiq (exit status 0)
2018-04-22 09:06:07,676 INFO stopped: gitaly (exit status 1)

See these two lines where it dies and look at how long it took for it to stop:

2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request

It just sits at this point until the container is restarted by Kubernetes. I have also increased initialDelaySeconds to 300, a relatively high number, to see if that resolves it, but no luck.
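
In case the same stale pid file is the culprit inside the pod, the manual workaround from this thread translates roughly to the following (the pod name gitlab-0 and the pid path are placeholders/assumptions, not verified values):

kubectl exec gitlab-0 -- rm -f /home/git/gitlab/tmp/pids/unicorn.pid
kubectl delete pod gitlab-0   # let the Deployment/StatefulSet recreate the pod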

@LM1LC3N7

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

@fover0932

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

Me too, thanks!!!

@compurator

compurator commented Jun 2, 2018 via email

@herrmanthegerman

I'm seeing this issue just now on my GitLab installation on a Synology NAS.

I installed GitLab via Package Center, i.e., I'm using the package provided by Synology, which is based on an old version (sameersbn/gitlab:9.4.4).

Fixed the issue by removing the stale PID file. Thanks!

@sharkymcdongles

Any actual solution rather than a mitigation? Is it just that my docker container doesn't have enough memory assigned or is there something misconfigured?

@StefanCristian

In my case, the problem was that I was using wrongly signed SSL certificates, and in particular the wrong dhparam.pem file.
Nginx didn't recognize them and failed because of that.
The bad part of the story is that it didn't show up in the logs anywhere.
@bsakweson @sharkymcdongles did you try with self-signed certificates for a short test?

@jcberthon

We experienced a full file system and had to restart GitLab. After the restart we also got an error 502.

I did:

# gitlab-ctl status
run: alertmanager: (pid 551) 1449s; run: log: (pid 545) 1449s
run: gitaly: (pid 593) 1449s; run: log: (pid 589) 1449s
run: gitlab-monitor: (pid 597) 1449s; run: log: (pid 592) 1449s
run: gitlab-pages: (pid 558) 1449s; run: log: (pid 556) 1449s
run: gitlab-workhorse: (pid 553) 1449s; run: log: (pid 548) 1449s
run: logrotate: (pid 596) 1449s; run: log: (pid 591) 1449s
run: nginx: (pid 579) 1449s; run: log: (pid 578) 1449s
run: node-exporter: (pid 552) 1449s; run: log: (pid 547) 1449s
run: postgres-exporter: (pid 563) 1449s; run: log: (pid 560) 1449s
run: postgresql: (pid 561) 1449s; run: log: (pid 557) 1449s
run: prometheus: (pid 594) 1449s; run: log: (pid 590) 1449s
run: redis: (pid 549) 1449s; run: log: (pid 543) 1449s
run: redis-exporter: (pid 550) 1449s; run: log: (pid 544) 1449s
run: registry: (pid 542) 1449s; run: log: (pid 540) 1449s
run: sidekiq: (pid 541) 1449s; run: log: (pid 539) 1449s
run: sshd: (pid 20) 1480s; run: log: (pid 19) 1480s
run: unicorn: (pid 33646) 1s; run: log: (pid 559) 1449s

All services were up except unicorn, which kept restarting.

I checked unicorn's log files and they stated:

ArgumentError: Already running on PID:777 (or pid=/opt/gitlab/var/unicorn/unicorn.pid is stale)

So, as already mentioned above, a simple rm /opt/gitlab/var/unicorn/unicorn.pid was enough. In fact, because GitLab (omnibus installation) kept restarting unicorn, I did not have to restart anything. After a second, unicorn was up and running and GitLab was healthy again! :-)

@gjrtimmer
Contributor

Removing the PID file and restarting also solved my issue.
It was caused by a reboot of my Synology NAS.

@solidnerd @sameersbn
Can we fix this permanently by adding a cleanup step in the entrypoint?

Example:

#!/bin/bash

# Define cleanup procedure
cleanup() {
    echo "Container stopped, performing cleanup..."
}

# Run cleanup when the container receives SIGTERM
trap 'cleanup' SIGTERM

# Execute the passed command in the background
"${@}" &

# Wait for it to exit (the trap can fire while we wait)
wait $!

# Run cleanup after a normal exit as well
cleanup
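
A complementary sketch that attacks the stale pid file directly: wrap the image's entrypoint and remove the pid file before GitLab starts. The pid path is the one reported in this thread; the /sbin/entrypoint.sh path is an assumption about the image and should be verified:

#!/bin/bash
# Remove a unicorn pid file left over from a hard stop,
# then hand off to the image's original entrypoint (path assumed).
rm -f /home/git/gitlab/tmp/pids/unicorn.pid
exec /sbin/entrypoint.sh "$@"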

@JMLX42

JMLX42 commented Jun 21, 2019

Could it be that Docker kills the gitlab container before unicorn has had enough time to shut down?
Maybe we could try setting Docker's --stop-timeout option to a higher value.
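
For example (a sketch only, with all other required options omitted; the equivalent docker-compose setting would be stop_grace_period):

docker run --name gitlab --stop-timeout 120 -d sameersbn/gitlab:latest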

@mgscreativa

Same here. This command worked for me, but the PID path is different in my case: /opt/gitlab/var/unicorn/unicorn.pid

docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab

I've put that in my cron file and it works!
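
One way to write that as a root crontab entry on the Docker host (the -it flags are dropped because cron has no TTY; the @reboot schedule, the sleep, and the container name are assumptions, so adjust to taste):

@reboot sleep 120 && docker exec gitlab rm -f /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab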

@abulka

abulka commented Jan 29, 2020

The 502 problem happens when I stop and then start GitLab from the Synology package manager UI, which I don't consider to be a "hard restart". As such, it's a problem every Docker GitLab Synology deployment is going to run into very quickly.

Users will have to be lucky enough to find this thread and learn to run a docker command (the rm and restart mentioned above, which works for me) to fix the problem. And running a docker command on a Synology is not straightforward via the UI; you have to do the following:

(screenshot: gitlab synology restart fix)

This command has to be issued every time the NAS restarts, etc., unless they use the cron job fix mentioned above, which I'm not sure how to do on a Synology. @mgscreativa can you please elaborate?

I think this issue is pretty serious and needs a proper fix.

@mgscreativa

Hi @abulka, sorry, I don't have Synology hardware!

@hannes-ucsc

hannes-ucsc commented Jan 31, 2020

Same here on an EC2 instance booting from a RancherOS AMI. So this is not specific to Synology. This occurred after sudo reboot. The workaround of running docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab worked.

12.4.5 (539f5fc0384)

@stale

stale bot commented May 6, 2020

This issue has been automatically marked as stale because it has not had any activity for the last 60 days. It will be closed if no further activity occurs during the next 7 days. Thank you for your contributions.

@stale stale bot added the wontfix label May 6, 2020
@IlyaSemenov
Author

(image)

@stale stale bot removed the wontfix label May 7, 2020
@sameersbn
Owner

Sorry about this. Does this issue still exist with the newer releases?

@sameersbn
Owner

Ah, I see it's present in 12.4.5 too. Will make a fix soon.

@solidnerd
Collaborator

@sameersbn I think we need to do this with puma now instead of unicorn.
