New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: shutdown supervisor in healthcheck.sh on fatal error, fixes #5993 #6033
Conversation
Download the artifacts for this pull request:
See Testing a PR. |
Just looking at the code says this is fine, will manually test, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great and works fine. The one request I have is not to include gunicorn, which I think will make it easier for python folks to debug. Gunicorn is more like web_extra_daemons in that it's easier to mess up, and when things go bad you want to be able to restart it.
I tested this with a bad nginx configuration and it behaved correctly. I also tried a bad PHP extra config, and surprisingly php-fpm just ignored that.
I really liked the clear output when nginx is the problem:
Waiting for web/db containers to become ready: [web db] Failed waiting for web/db containers to become ready: web container failed: log=&{2024-04-01 07:04:12.760287761 -0600 MDT 2024-04-01 07:04:13.11287521 -0600 MDT 143 nginx:FATAL Shut down }, err=ddev-d10-web container is unhealthy: &{2024-04-01 07:04:12.760287761 -0600 MDT 2024-04-01 07:04:13.11287521 -0600 MDT 143 nginx:FATAL Shut down
We don't get good info like that from apache or php.
@@ -20,6 +20,14 @@ if [ -f /tmp/healthy ]; then | |||
sleep ${sleeptime} | |||
fi | |||
|
|||
# Shutdown the supervisor if one of the critical processes is in the FATAL state | |||
for service in php-fpm nginx apache2 gunicorn; do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the hardest things about python/gunicorn is that it doesn't stay up if we have a settings problem. I'd rather treat it like we do the web_extra_daemons. It might be a step forward with python.
for service in php-fpm nginx apache2 gunicorn; do | |
for service in php-fpm nginx apache2; do |
be1317a
to
01f0529
Compare
01f0529
to
7f542a6
Compare
Rebased, and pushed the image again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worked great on my first try. I set up https://github.com/ddev/test-django4-bakerydemo and didn't remember that other things had to be done. So it failed.
But I was able to ddev ssh
and study it, which is the good thing here.
And after I did the things it was OK.
The Issue
How This PR Solves The Issue
I checked https://docs.docker.com/reference/dockerfile/#healthcheck before doing any change and when we use
start_period
, no matter what we do during this period, the container cannot becomeunhealthy
.I also searched for a fast fail when starting the container, but that's all I found:
https://forums.docker.com/t/healthcheck-fail-fast-feature-during-container-start-up/121201
So, in the end, I decided to do the same thing as suggested in:
but in a simpler way.
Also I decreased
startretries=10
tostartretries=3
in supervisor, there is no reason to try more and wait longer, if the process cannot start several times already.Manual Testing Instructions
See error for php-fpm after 20-30 seconds (without waiting for 2 minutes default timeout):
Automated Testing Overview
Related Issue Link(s)
Release/Deployment Notes