scheduler: avoid waiting for stdout eof of /etc/network/ scripts #274

Merged
merged 1 commit into from
Dec 18, 2023

Contributor

@frwbr frwbr commented Sep 28, 2023

Hi,

As mentioned in the commit message, this patch is intended to increase compatibility with ifupdown with regard to how /etc/network/ scripts are run (see the commit message for a simple example).

It also fixes an issue where a Debian 12 host hangs indefinitely on boot if current ifupdown2 master as well as the packages ntpsec and ntpsec-ntpdate are installed.

The hang happens because ntpsec-ntpdate installs a script /etc/network/if-up.d/ntpsec-ntpdate -- the following excerpt is relevant:

#!/bin/sh
...
(
...
service="ntpsec"
...
invoke-rc.d --quiet "$service" start >/dev/null 2>&1 || true
) &

The invoke-rc.d call amounts to systemctl start ntpsec, so the script essentially runs systemctl start ntpsec in a background subshell. Unfortunately this command blocks indefinitely (see below). Since ifupdown2 waits for stdout end-of-file, and the background subshell inherits the script's stdout, ifupdown2 waits for the command to terminate even though it is run in the background. As a result, the boot process hangs indefinitely.

I presume the systemctl start ntpsec command blocks indefinitely because ntpsec.service sets Wants=network.target, but network.target will only become available when ifupdown2 finishes, which it never does. With ifupdown instead of ifupdown2, the boot does not hang as the blocking command is executed in the background.
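To illustrate the mechanism outside of ifupdown2, here is a minimal Python sketch. It is not ifupdown2 code: the helper name `timed_captured_run` is made up for this example, and a 2-second `sleep` stands in for the blocking `systemctl start ntpsec` call. It assumes a POSIX system with `/bin/sh`. The inner command redirects its own output to /dev/null, as in the ntpsec-ntpdate excerpt, but the surrounding background subshell still holds the write end of the stdout pipe, so `communicate()` only returns once that subshell exits.

```python
import subprocess
import time


def timed_captured_run(script: str) -> float:
    """Run `script` via /bin/sh with stdout captured; return elapsed seconds.

    Hypothetical helper for illustration only, not part of ifupdown2.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        ["/bin/sh", "-c", script],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # communicate() reads until end-of-file on the pipe. EOF only arrives
    # once every process holding the write end has closed it, including
    # background children that outlive the script itself.
    proc.communicate()
    return time.monotonic() - start


if __name__ == "__main__":
    # `sleep 2` stands in for the blocking command (an assumption for the demo).
    script = "( sleep 2 >/dev/null 2>&1 || true ) &"
    print(f"captured run took {timed_captured_run(script):.1f}s")
```

The shell itself exits right away, yet the call takes roughly as long as the background command runs. With a command that never finishes, the caller never returns.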

I guess the hang issue can also be fixed by patching the unit file of ntpsec and/or the network script of ntpsec-ntpdate -- to be honest I have not dug deep enough yet to fully grasp the network script. As this PR also increases compatibility with ifupdown, I decided to send it here anyway. Let me know what you think.

Thanks and best wishes,

Friedrich

Scripts in /etc/network/ are executed using `exec_command` which
captures stdout by default, and thus waits for stdout end-of-file via
`Popen.communicate()`. However, this can cause hangs if the network
script executes a long-running command in the background. Can be
reproduced by putting the following (executable) script in
/etc/network/if-up.d/:

	#!/bin/sh
	sleep 5&

This script will cause `ifreload -a` to wait for 5 seconds per network
interface.

To avoid waiting, do not capture stdout when executing /etc/network/
scripts. This also improves compatibility with ifupdown, which runs
the above script in the background.
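A rough sketch of the difference between the two behaviours, written as standalone Python rather than as the actual patch: `run_script` and its `capture` parameter are hypothetical names, and sending output to `subprocess.DEVNULL` in the non-capturing branch is an assumption made to keep the example self-contained (the real change may handle stdout differently). A 2-second `sleep` is used so the example finishes quickly.

```python
import subprocess
import time


def run_script(script: str, capture: bool) -> tuple[int, float]:
    """Run `script` via /bin/sh; return (exit status, elapsed seconds).

    Hypothetical helper for illustration only, not part of ifupdown2.
    """
    start = time.monotonic()
    if capture:
        # Capturing: communicate() waits for stdout EOF, so it also waits
        # for any background process that inherited the pipe.
        proc = subprocess.Popen(
            ["/bin/sh", "-c", script],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        proc.communicate()
    else:
        # Not capturing: wait() only waits for the shell itself to exit.
        proc = subprocess.Popen(
            ["/bin/sh", "-c", script],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        proc.wait()
    return proc.returncode, time.monotonic() - start


if __name__ == "__main__":
    for capture in (True, False):
        status, elapsed = run_script("sleep 2 &", capture=capture)
        print(f"capture={capture}: exit status {status}, {elapsed:.1f}s")
```

The exit status of the script is still available in both cases; only the wait for background children goes away when stdout is not captured.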

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
Contributor Author

frwbr commented Dec 14, 2023

Hi, any thoughts on this patch?

I first started looking into this because some Proxmox VE users reported hanging boots if both ntpsec and ntpsec-ntpdate are installed (Proxmox VE ships a variant of ifupdown2). I've also created an issue on the Proxmox VE bugtracker with some more detailed information.


xabix commented Dec 18, 2023

Hi All,

What should we be doing to get this solved? I lost a few hours troubleshooting this, so now I am stronger :) and thanks @frwbr for reporting the issue.

@julienfortin julienfortin self-assigned this Dec 18, 2023
@julienfortin julienfortin merged commit 7a28bcb into CumulusNetworks:master Dec 18, 2023
Contributor Author

frwbr commented Dec 19, 2023

Thanks for the merge!


3 participants