New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
honor-first-shutdown-request #15400
honor-first-shutdown-request #15400
Conversation
@keszybz Is there anything I need to do for this pull request? Is there something wrong? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so you are only doing this for shutdowns initiated via the emergency action logic. That is intended, yeah? I think I am fine with this, I am just wondering if you are aware that this does not affect "systemctl poweroff" and so on, as those are regular, human requested shutdowns and not emergency ones...
src/core/emergency-action.c
Outdated
if ((manager_state(m) == MANAGER_STOPPING) && | ||
((action == EMERGENCY_ACTION_REBOOT) || | ||
(action == EMERGENCY_ACTION_POWEROFF))) { | ||
log_info("EmergencyAction: Shutdown is already active Skipping %s request", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, so manager_state() does a lot of things internally, most of which aren't interesting here. I think I would make the test differently:
Unit *u;
u = manager_get_unit(m, SPECIAL_SHUTDOWN_TARGET);
if (u && unit_active_or_pending(u) &&
IN_SET(action, EMERGENCY_ACTION_REBOOT, EMERGENCY_ACTION_POWEROFF, EMERGENCY_ACTION_EXIT)) {
…
Also, I think this should be log_notice(), not log_info()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ANd yeah, EMERGENCY_ACTION_EXIT should be covered too here, it's just another way to shutdown (one you would only use in a container, but that's a detail)
|
||
[Service] | ||
ExecStart=/test-honor-first-shutdown.sh | ||
ExecStop=pkill -9 test-honor-first-shutdown.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExecStop=kill -KILL $MAINPID
should just work, and not pull in pkill
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pkill was the method I was using to cause termination of the loop inside the script and to cause the FailureAction=reboot emergency action during the shutdown. Otherwise the shutdown would just SIGTERM the script and no failure would be seen correct? I was thinking pkill might be a handy tool to have for the test environment. In any case, I will work on a different way so I won't have to pull in pkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the line above is equivalent to the pkill. It uses the $MAINPID
env var systemd passes to ExecStop= calls anyway, so the pkill is not necessary. The effect of the line I proposed and yours i the same: signal 9 (SIGKILL) is sent to the process invoked via ExecStart=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it. Will do. Thanks
My idea was that a shutdown has already been triggered via human or other way and the fix was to prevent another shutdown, via the emergency path, from interrupting that. I was thinking it would be rather difficult to manually initiate a second shutdown if one is in progress. I suppose someone could have some service that could have an ExecStop that tries to do a systemctl reboot(etc). Would that be something you would want to guard against or should we leave that to user error? Thank you for your comments, I will work on your change suggestions. |
I think your approach is fine. If people place a "systemctl reboot" in ExecStop= they get to live with the effect... |
@poettering, I believe I have addressed all of your comments, please review when you get a chance. |
src/core/emergency-action.c
Outdated
[EMERGENCY_ACTION_POWEROFF_FORCE] = "poweroff-force", | ||
[EMERGENCY_ACTION_POWEROFF_IMMEDIATE] = "poweroff-immediate", | ||
[EMERGENCY_ACTION_EXIT] = "exit", | ||
[EMERGENCY_ACTION_EXIT_FORCE] = "exit-force", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please align the rhs values to the same column. It's much easier to scan by eye then.
src/core/emergency-action.c
Outdated
IN_SET(action, EMERGENCY_ACTION_REBOOT, | ||
EMERGENCY_ACTION_POWEROFF, EMERGENCY_ACTION_EXIT)) { | ||
log_notice("EmergencyAction: Shutdown is already active Skipping %s request", | ||
emergency_action_table[action]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation, and the sentences run together. Maybe "Shutdown is already active. Skipping emergency action request %s."
|
||
[Service] | ||
ExecStart=/test-honor-first-shutdown.sh | ||
ExecStop=/bin/sh -x -c 'kill -SIGKILL $MAINPID' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shell is not nedeed here. Just ExecStop=kill -SIGKILL $MAINPID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keszybz I have tried to use this ExecStop without the shell but simply cannot make it work. I found that testsuite-09 uses the same shell to implement a kill in it's ExecStop. When I remove the shell and run the test I get the following error when running the image.
TEST-48-HONORFIRSTSHUTDOWN RUN: testing honor first shutdown
unit_file_build_name_map: normal unit file: /usr/lib/systemd/tests/testdata/units/test-honor-first-shutdown.service
test-honor-first-shutdown.service: Trying to enqueue job test-honor-first-shutdown.service/start/replace
Sent message type=error sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=s error-name=org.freedesktop.systemd1.BadUnitSetting error-message=Unit test-honor-first-shutdown.service has a bad unit file setting.
Failed to process message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a: Unit test-honor-first-shutdown.service has a bad unit file setting.
I changed it to be more like what is in testsuite-09 as follows:
ExecStop=sh -c 'kill -SIGKILL $MAINPID'
If using the shell operation is not acceptable please help me understand how to fix it.
echo "Honor first shutdown test script" | ||
while true; do | ||
sleep 3; | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe sleep infinity
instead of the loop?
|
||
# setup the testsuite service | ||
cp ../testsuite-48.units/testsuite-48.service $initdir/etc/systemd/system | ||
cp ../testsuite-48.units/testsuite-48.sh $initdir/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please please just do what the other tests do. There is no need to copy files.
test/test-functions
Outdated
elif [[ "$UNIFIED_CGROUP_HIERARCHY" = "default" ]]; then | ||
_nspawn_pre=("${nspawn_pre[@]}" env --unset=UNIFIED_CGROUP_HIERARCHY --unset=SYSTEMD_NSPAWN_UNIFIED_HIERARCHY) | ||
_nspawn_pre=("${_nspawn_pre[@]}" env --unset=UNIFIED_CGROUP_HIERARCHY --unset=SYSTEMD_NSPAWN_UNIFIED_HIERARCHY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bug should be fixed in a separate commit.
Description=Testsuite service | ||
|
||
[Service] | ||
ExecStart=/testsuite-48.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No files in root. Just do what the other tests do.
please generate "perfect" PRs, i.e. where each individual commit is a logical step that makes sense, and not a historical one. We want perfect bisectability, and that means that commits in a PR shouldn't fix up commits in the same PR. (or in other words, please squash the commits so that only logical steps remain) |
also, plese provide a useful commit msg in the first place |
Inspired by systemd#15400 (comment).
0da704e
to
e89406d
Compare
Additional note with this push: The tests have been failing due to a timeout issue. It may be due to my fork being so far behind the source. I will look into any failures that may occur with the current tests. |
Create unit tests per established norm at position 52 check in_set first before getting unit
e89406d
to
9f1fa85
Compare
@poettering @keszybz The last commit was done to fix a numbering conflict of the test files with the main repo. |
LGTM. I think the test could be still simplified a bit, but this isn't terribly important. Let's merge. |
@keszybz
Here is the new pull request I spoke of in the previous one I just closed.
I went ahead with the use of log_info instead of using log_unit_info since using manager_state() instead of manager_get_unit() precludes the use of log_unit_info. If you wish that instead I can change it but I think the use of manager_state is cleaner.
I also added a test in the test directory which incorporates the use of the test service file and script.