task [06] incorrectly runs on startup #352

Closed
cornerfix opened this issue Apr 20, 2023 · 8 comments

@cornerfix

cornerfix commented Apr 20, 2023

task [06] incorrectly runs on startup

I added the following line to my sshd startup file, expecting killall to be run on reboot and halt:

task [06] killall -TERM sshd

When I omit the 0 (leaving only [6]), the command successfully executes on reboot. However, with [06], the command correctly executes on reboot and halt, but also executes during the boot process.

During boot, finit prints the message "[ OK ] killall" on the boot screen.

sshd processes need to be killed on reboot and halt, otherwise connected clients freeze.
sysvinit, openrc and others send the TERM signal to all sshd PIDs on reboot and shutdown.

@cornerfix cornerfix changed the title task [06] runs on startup task [06] incorrectly runs on startup Apr 20, 2023
@troglobit
Owner

I'll have to get back to you on the "incorrectly runs on startup" issue. I'm more curious what your sshd service line looks like, because Finit also sends TERM to sshd on reboot and shutdown by default.

In https://github.com/troglobit/finit-skel/blob/main/skel/etc/finit.d/available/sshd.conf I have specified:

task [S] /usr/bin/ssh-genhostkeys --
service [2345789] <usr/ssh-hostkeys> env:-/etc/default/sshd /usr/sbin/sshd -D $SSHD_OPTS -- OpenSSH daemon

Here, ssh-genhostkeys is a small task that runs before sshd is allowed to start. I've reserved runlevel 1 for single-user mode (no networking), and runlevels 0 and 6 are also reserved, for poweroff and reboot respectively. This means that just by moving to runlevel 1 we can verify that Finit actually stops the SSH service, which it does. So you should definitely not need your task (above).

What version of Finit are you using?

@cornerfix
Author

cornerfix commented Apr 20, 2023

I am using the 4.3 tar.gz from the "releases" section.

Stopping SSHD is a little more complicated than that.

Finit correctly stops main SSHD process.

However, SSHD spawns new sshd processes for every connected ssh session.

Spawned instances run independently of the main sshd process. Stopping the main sshd process does not stop the spawned instances. This is intentional, e.g. so that you can change options in /etc/ssh/sshd_config and restart sshd without disconnecting the session used to edit the configuration file.

Other init systems do the following on reboot and halt: (1) stop the main sshd instance and then (2) take care to stop all spawned sshd instances (which serve connected sessions).

If step (2) is skipped, connected ssh clients will freeze (they will not receive the TCP packet indicating that the TCP connection is closed).

You can see this by connecting with ssh to an sshd started by finit. On poweroff, the ssh client freezes indefinitely. On reboot, the ssh client freezes temporarily and drops to the shell after the machine reboots and starts sshd again.

With other init systems - ssh client immediately drops to the shell on both reboot and poweroff.

You can see how openrc does this in "stop ()" section of /etc/init.d/sshd - by executing 'kill -TERM $(pgrep sshd)'

I like finit very much and would like to use it on hundreds of servers.

However, we will need to take care of correctly stopping postgresql and application servers. So I have a question: what is the best way to execute commands on reboot and halt? Is there a better way than 'task [06]'?

@troglobit
Owner

I see, I did not know this. Thank you for elaborating on it! I'm guessing sshd then not only forks these to the background but also changes their process group, because Finit sends the TERM/KILL signal to all processes in the same process group.
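The guess above can be illustrated with plain POSIX calls. The following is a minimal sketch, not sshd's or Finit's actual code: a stand-in "daemon" leads its own process group, a stand-in "worker" leaves that group with setsid(), and the caller then signals the whole group. The function and helper names are hypothetical, and whether sshd detaches its session processes in exactly this way is an assumption made only for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-connection worker: detaches into its own session. */
static void worker_main(int wfd)
{
    sigset_t set;
    pid_t self;
    char ok = 1;
    int sig;

    setsid();                            /* new session, new process group */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, NULL);  /* SIGUSR1 is collected by sigwait() */

    self = getpid();
    if (write(wfd, &self, sizeof(self)) != (ssize_t)sizeof(self))
        _exit(1);

    sigwait(&set, &sig);                 /* wait for the "are you alive?" ping */
    if (write(wfd, &ok, 1) != 1)
        _exit(1);
    _exit(0);
}

/*
 * Returns 1 if the detached worker survives a SIGTERM sent to the
 * daemon's process group, 0 if it was killed too, -1 on setup errors.
 */
int detached_worker_survives_group_term(void)
{
    int fds[2], status, survived;
    pid_t daemon_pid, worker_pid;
    char ok = 0;

    if (pipe(fds) < 0)
        return -1;

    daemon_pid = fork();
    if (daemon_pid < 0)
        return -1;

    if (daemon_pid == 0) {
        /* Stand-in for the supervised daemon: lead our own group. */
        signal(SIGTERM, SIG_DFL);
        setpgid(0, 0);
        close(fds[0]);
        if (fork() == 0)
            worker_main(fds[1]);
        close(fds[1]);
        for (;;)
            pause();
    }

    close(fds[1]);
    if (read(fds[0], &worker_pid, sizeof(worker_pid))
        != (ssize_t)sizeof(worker_pid)) {
        kill(daemon_pid, SIGKILL);
        waitpid(daemon_pid, &status, 0);
        close(fds[0]);
        return -1;
    }
    assert(worker_pid > 0);

    kill(-daemon_pid, SIGTERM);          /* signal the daemon's whole group */
    waitpid(daemon_pid, &status, 0);

    kill(worker_pid, SIGUSR1);           /* ping the worker */
    survived = (read(fds[0], &ok, 1) == 1);  /* EOF means it is gone */
    close(fds[0]);
    return survived;
}
```

Calling detached_worker_survives_group_term() from a small test program shows whether a group-wide TERM reaches a process that has left the group. On a POSIX system the worker is expected to survive, which is why such processes need a separate signal at shutdown.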

There are lots of ways to run stuff at reboot/halt, but for service specific things like this it's better to group all related commands in a dedicated .conf file using run/task statements. I'll poke around a bit and get back to you on the "runs at startup" issue.

@troglobit
Owner

I've looked into this now, and it seems to be a very old bug/design mistake. Runlevel S is translated into 0, because an array is used as the data structure for tracking levels. That's why your shutdown task also runs at bootstrap.

I need to look into this in more detail to see if I can redesign it in a way that's still backwards compatible. Can't give you any timeline though, sorry.
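To make the aliasing concrete, here is a minimal sketch of how treating S as slot 0 makes a [06] task match the bootstrap level. It is an illustration only, not Finit's parser: it uses a bitmask instead of the array mentioned above, and the names parse_runlevels, runs_in, LEVEL_S_ALIASED and LEVEL_S_SEPARATE (including the value 10) are hypothetical.

```c
#include <assert.h>

#define LEVEL_S_ALIASED   0   /* hypothetical: 'S' shares slot 0 with halt */
#define LEVEL_S_SEPARATE 10   /* hypothetical: 'S' gets a slot of its own  */

/*
 * Parse a runlevel spec such as "[S06]" into a bitmask.
 * s_slot selects which bit represents the bootstrap level 'S'.
 */
unsigned parse_runlevels(const char *spec, int s_slot)
{
    unsigned mask = 0;

    assert(s_slot >= 0 && s_slot < 32);
    for (; spec && *spec; spec++) {
        if (*spec == 'S' || *spec == 's')
            mask |= 1u << s_slot;
        else if (*spec >= '0' && *spec <= '9')
            mask |= 1u << (*spec - '0');
        /* brackets and other characters are skipped */
    }
    return mask;
}

/* Non-zero if a task with this mask is eligible in the given level. */
int runs_in(unsigned mask, int level)
{
    return (mask >> level) & 1u;
}
```

With the aliased slot, runs_in(parse_runlevels("[06]", LEVEL_S_ALIASED), LEVEL_S_ALIASED) is 1, so the task also fires at bootstrap, while "[6]" does not match, which is consistent with the behaviour reported in this issue. With a separate slot for S, the same "[06]" spec no longer matches the bootstrap level.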

@troglobit troglobit added the bug label Apr 20, 2023
@troglobit troglobit self-assigned this Apr 20, 2023
@troglobit troglobit added this to the 4.4 milestone Apr 20, 2023
@troglobit
Owner

Fixed the issue of S and 0 being the same runlevel yesterday. Then I spent a while testing reboot/poweroff and all my ssh login sessions are properly closed, so I don't see the issue you mentioned. Finit has sig.c:do_shutdown(), which properly sends TERM to all remaining processes at reboot/poweroff:

finit/src/sig.c

Lines 289 to 318 in 4d47227

void do_shutdown(shutop_t op)
{
    struct sched_param sched_param = { .sched_priority = 99 };
    int in_cont = in_container();
    int signo = SIGTERM;

    if (!in_cont) {
        /*
         * On a PREEMPT-RT system, Finit must run as the highest prioritized
         * RT process to ensure it completes the shutdown sequence.
         */
        sched_setscheduler(1, SCHED_RR, &sched_param);
    }

    halt = op;

    if (sdown)
        run_interactive(sdown, "Calling shutdown hook: %s", sdown);

    /* Update UTMP db */
    utmp_set_halt();

    /*
     * Tell remaining non-monitored processes to exit, give them
     * time to exit gracefully, 2 sec was customary, we go for 1.
     */
    do_iterate_proc(kill_cb, &signo);
    if (do_wait(1)) {
        signo = SIGKILL;
        do_iterate_proc(kill_cb, &signo);
    }

I'm a bit curious as to why that is not sufficient in your case. But then I don't know how you've set up Finit, maybe on top of an existing sysv init install using existing start/stop scripts?
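The TERM, wait, then KILL escalation in the excerpt above can be tried out in isolation. This is a minimal sketch for a single child process, assuming the caller is the parent of that process. It does not use Finit's do_iterate_proc(), kill_cb or do_wait() helpers, and the name stop_child_gracefully and the 10 ms polling step are hypothetical choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Send SIGTERM to a child, poll for up to grace_ms milliseconds, and
 * escalate to SIGKILL if it is still running. Returns the number of the
 * signal that ended the child, 0 if it exited on its own, -1 on error.
 */
int stop_child_gracefully(pid_t pid, int grace_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int status;

    assert(pid > 0);    /* pid <= 0 would address a group, not one child */

    if (kill(pid, SIGTERM) < 0)
        return -1;

    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0)
            return -1;
        nanosleep(&step, NULL);
    }

    /* Grace period is over: the child ignored or blocked SIGTERM. */
    if (kill(pid, SIGKILL) < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A child with the default SIGTERM disposition should be reported as ended by SIGTERM, while one that ignores SIGTERM should be reported as ended by SIGKILL once the grace period expires. Note that this only covers processes the signal is actually sent to, which is the open question in this thread.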

@cornerfix
Author

cornerfix commented Apr 23, 2023

Thanks for fixing the issue, that's great :)

I am afraid I found 2 more small bugs. Will report them soon.

No, my finit installation is not on top of sysv init. It's on Alpine Linux and replaces openrc (via symlink /sbin/init -> finit).

If you want, I can easily publish this VM on an external IP address and give you access so you can have a look?

This may also help with the other 2 bugs (they are reproducible on this same VM).


P.S. I just sent VM login to your @gmail address

@troglobit
Owner

OK, I usually don't do free consulting since expectations are high and my spare time very limited. But I can have a look later tonight.

Meanwhile, it'd be good if you could tell me a bit more about the installation. For instance, have you also replaced the BusyBox reboot/shutdown/halt programs with the Finit equivalents? What did your configure line look like, and which version (git hash) are you using? Is it still v4.3?

@troglobit
Owner

OK I've logged in. Looks like Finit is not properly installed. Did you see this blog post?

There is also mention about this in the README, linking to this HowTo in the source tree:
