Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test nginx config on SIGHUP #206

Closed
stefansundin opened this issue Sep 13, 2023 · 4 comments · Fixed by #207
Closed

Test nginx config on SIGHUP #206

stefansundin opened this issue Sep 13, 2023 · 4 comments · Fixed by #207

Comments

@stefansundin
Copy link
Contributor

Hello,

It would be great if the nginx config could be tested with nginx -t when SIGHUP is received. I messed up my config and the result was (unsurprisingly) that the whole docker container died. I think it would be great if it just printed the output from nginx -t and didn't actually reload nginx in this case.

Thanks for a great project!

@JonasAlfredsson
Copy link
Owner

Hi stefansundin,

Before diving deeper into this I must ask what changes to the config were made in order to break Nginx. I was under the impression that if Nginx has managed to load a valid config once it will just print an error if presented with a new erroneous config after a reload. The exception here is probably if there are missing files or similar.

I am therefore not entirely sure what benefit including this check in the reload would provide, since it is intended more for the administrator to review its output to catch any obvious errors. The SIGHUP function here is just an automatic process which will basically never be inspected by a human until something goes wrong and then most of the stuff would already be in the error log.

So I am not rejecting this idea, it is just that we probably need to better specify what the actual issue is and what you want to solve. :)

@stefansundin
Copy link
Contributor Author

Hello again, thanks for getting back to me :)

I was under the impression that if Nginx has managed to load a valid config once it will just print an error if presented with a new erroneous config after a reload. The exception here is probably if there are missing files or similar.

Unfortunately that does not seem to be the case. Here's an example of a mistake I just made:

I wanted to add a log_format directive but unfortunately I placed it inside of the server block instead of the http block. I sent the SIGHUP signal to reload the config file, which caused the container to crash again. Here's the log output:

bash-5.1# docker kill --signal=HUP ae8424eb96be
bash-5.1# docker logs -f ae8424eb96be

[... log output cut for brevity ...]

2023/09/15 22:05:24 [info] Received SIGHUP signal; terminating the autorenewal sleep process
Terminated
2023/09/15 22:05:24 [info] Running the autorenewal service
2023/09/15 22:05:24 [info] Starting certificate renewal process
2023/09/15 22:05:24 [info] Requesting an ECDSA certificate for 'redacted.example.com' (http-01 through webroot)
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Certificate not yet due for renewal

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certificate not yet due for renewal; no action taken.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2023/09/15 22:05:28 [emerg] 1209#1209: "log_format" directive is not allowed here in /etc/nginx/conf.d/redacted.example.com.conf:27
nginx: [emerg] "log_format" directive is not allowed here in /etc/nginx/conf.d/redacted.example.com.conf:27
2023/09/15 22:05:28 [notice] 107#107: signal 15 (SIGTERM) received from 1, exiting
2023/09/15 22:05:28 [notice] 1051#1051: exiting
2023/09/15 22:05:28 [notice] 1052#1052: exiting
2023/09/15 22:05:28 [notice] 1051#1051: exit
2023/09/15 22:05:28 [notice] 1053#1053: exiting
2023/09/15 22:05:28 [notice] 1052#1052: exit
2023/09/15 22:05:28 [notice] 107#107: signal 14 (SIGALRM) received
2023/09/15 22:05:28 [notice] 107#107: signal 17 (SIGCHLD) received from 1053
2023/09/15 22:05:28 [notice] 107#107: worker process 1052 exited with code 0
2023/09/15 22:05:28 [notice] 107#107: cache manager process 1053 exited with code 0
2023/09/15 22:05:28 [notice] 107#107: signal 29 (SIGIO) received
2023/09/15 22:05:28 [notice] 107#107: signal 17 (SIGCHLD) received from 1052
2023/09/15 22:05:28 [notice] 107#107: signal 17 (SIGCHLD) received from 1051
2023/09/15 22:05:28 [notice] 107#107: worker process 1051 exited with code 0
2023/09/15 22:05:28 [notice] 107#107: exit
bash-5.1# docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
bash-5.1# 

Hopefully this is enough information. Please let me know if there's anything else you need.

If you agree that a config test should be added but you don't have time to add it yourself then I can give it a try.

@JonasAlfredsson
Copy link
Owner

Sorry, yes, I am a little bit busy this week so I apologize for the slow responses.

However, I am sitting here thinking about the change and I am not sure this should be included in the SIGHUP loop. I feel like this automatic loop needs to be able to draw attention to itself in case something is significantly wrong, and a container restart is about as clear it get that manual intervention is necessary.
Changing this behavior to "silently" ignoring problems by just printing to logs feels like a major change to how this container behaves right now, which could lead to users not realizing something has gone wrong with their deployment unless they actually look in the logs.

Would it perhaps be better to add this check to another signal, or just execute the check manually (in order to see the result directly)

docker exec -it <container_name> bash -c 'DEBUG=1; . /scripts/util.sh; symlink_user_configs && nginx -t'

@stefansundin
Copy link
Contributor Author

Hmm. I can't say I agree but you should do what you think it best, of course. :)

In my personal opinion the logs are the standard way for a program to communicate with the administrator what is going on. I don't know if it needs to be terribly different from running nginx as a systemctl service.

It appears that the nginx systemctl service does what you originally thought, check the config before actually performing the reload. Not sure if all distributions use this config though, I wouldn't be surprised if some modify it.

ExecStartPre=/usr/sbin/nginx -t

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants