[r8] TestStorageStratisReboot.testEncrypted fails #20373
Completely wiping the currently running root partition is too aggressive: During shutdown, recent versions of systemd-shutdown [1] re-execute themselves to free all open fds on the root fs (so that it can be unmounted cleanly), which doesn't work if /usr/sbin/systemd-shutdown is gone. This leads to reboot hanging at

```
[ OK ] Reached target reboot.target - System Reboot.
[!!!!!!] Failed to execute shutdown binary.
```

Instead, make the partition unbootable by cleaning /etc (i.e. allowed SSH keys, boot targets, etc.). Even that is redundant, as post reboot the test already checks that it booted from the encrypted partition. But let's keep the belt and suspenders.

[1] https://github.com/systemd/systemd/blob/main/src/shutdown/shutdown.c
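To illustrate the difference between wiping the whole root and only cleaning /etc, here is a minimal sketch that works on a throwaway directory tree rather than a real root file system. The helper names `build_fake_root` and `clean_etc_only` are hypothetical and not taken from the test; the actual test presumably runs a shell command on the test machine instead.

```python
import shutil
import tempfile
from pathlib import Path


def build_fake_root() -> Path:
    """Create a tiny stand-in for a root fs in a temporary directory."""
    root = Path(tempfile.mkdtemp(prefix="fake-root-"))
    (root / "etc" / "ssh").mkdir(parents=True)
    (root / "etc" / "ssh" / "authorized_keys").write_text("placeholder key\n")
    (root / "usr" / "sbin").mkdir(parents=True)
    (root / "usr" / "sbin" / "systemd-shutdown").write_text("#!/bin/sh\n")
    return root


def clean_etc_only(root: Path) -> None:
    """Remove everything below root/etc, but leave the rest of the tree alone."""
    for entry in (root / "etc").iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


fake_root = build_fake_root()
clean_etc_only(fake_root)

# The shutdown binary is still present, so a re-exec during shutdown can succeed.
shutdown_survives = (fake_root / "usr" / "sbin" / "systemd-shutdown").exists()
# The configuration (SSH keys, boot targets, ...) is gone.
etc_is_empty = not any((fake_root / "etc").iterdir())

shutil.rmtree(fake_root)  # clean up the temporary tree
```

The point of the sketch is only that /usr stays intact while /etc is emptied, which is the property the commit message relies on.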
Hey, this commit rings a bell! 😁
If the bots are happy, so am I. Thanks!
They are not, so I'll have to fetch the image and debug this one, I'm afraid.
Ah, so it's not even a regression in the image; it fails on the current one as well.
Hmm, could it be the switch to our different CI, which is a bit slower in general?
Could be -- this test is really demanding. But then it should have failed before in the 50% of cases where it was running on RHOS. I have no idea, sorry.
I'll run it locally and maybe debug it in a separate PR (or this one).
Thanks! The passwordless systemd agent seems to be the issue.
Indeed, the agent got started and stopped again, so it was at least triggered.
The journal:
So this smells more like a problem with the stratis setup than with the agent?
I have become suspicious whether we really fixed that on main, or whether it's an actual regression in RHEL 8.10. We haven't run main tests on RHEL 8 for a fair while, so that's what I'd like to try.
That fails as on the rhel-8 branch, so conclusion: It's not a change in the tests. So let's try latest main storaged on an otherwise clean rhel-8-10 image:
but that now crashes in
That crashes on the overview now. So copy that as well.

Conclusion

We can't fix this with a backport. This is genuinely broken in RHEL 8.
The last "green" (really in quotes) rhel-8-10 image refresh was cockpit-project/bots#6255, but there the test already failed and got naughtied:
The image refresh a week later in cockpit-project/bots#6311 then failed, but there were zero packaging changes. So I can only conclude that somehow this changed in our naughty matching. Running

The reason is quite obvious in retrospect: bots changed the module structure. In the matching case:
and currently:
The traceback now includes the top-level `machine` module. Delete the pattern on RHEL 9. The bug is fixed there, and the naughty hasn't matched any more since April for the same reason. Closes cockpit-project/cockpit#20373
I sent cockpit-project/bots#6369 to update the naughties. This PR is obsolete.
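For readers unfamiliar with why a module restructuring breaks a naughty pattern, here is a minimal sketch. It assumes glob-style matching of the pattern against the whole traceback via Python's `fnmatch`; the module path `vm_core.errors`, the failure message, and the log text are made up for illustration and are not the real pattern or traceback from this issue.

```python
import fnmatch

# Hypothetical known-issue pattern, written when the exception class was
# reported without a leading top-level package.
OLD_PATTERN = "*\nvm_core.errors.Failure: Timed out waiting for unlock*"

# Hypothetical updated pattern that includes the new top-level `machine` module.
NEW_PATTERN = "*\nmachine.vm_core.errors.Failure: Timed out waiting for unlock*"

# Made-up traceback as it looked before the module structure changed.
OLD_LOG = (
    "Traceback (most recent call last):\n"
    '  File "test.py", line 1, in run\n'
    "vm_core.errors.Failure: Timed out waiting for unlock\n"
)

# Same failure after the restructuring: the qualified class name gained a prefix.
NEW_LOG = OLD_LOG.replace("\nvm_core.", "\nmachine.vm_core.")

# fnmatchcase treats `*` as "any characters, including newlines".
old_matches = fnmatch.fnmatchcase(OLD_LOG, OLD_PATTERN)    # pattern and log agree
new_matches = fnmatch.fnmatchcase(NEW_LOG, OLD_PATTERN)    # stale pattern, new log
fixed_matches = fnmatch.fnmatchcase(NEW_LOG, NEW_PATTERN)  # updated pattern, new log

print(old_matches, new_matches, fixed_matches)
```

Because the stale pattern anchors the exception name at the start of a line, the extra `machine.` prefix is enough to stop it from matching, even though the failure itself is unchanged.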