client: Don't invoke `systemctl start` if unit is already active #3523
(Only compile tested locally, my rhel8 devenv bitrotted, looking at resurrecting it)
c21cc41 to b21cbd9
Aside: the original patch for this was partly added because the daemon was taking too long to start and racing with the D-Bus timeout, but I'm 73% sure that now with #3406, this will no longer be an issue. The error-reporting part still applies though.
rust/src/client.rs
Outdated
    let activeres = Command::new("systemctl")
        .args(&["is-active", "rpm-ostreed"])
        .output()?;
    if !activeres.status.success() {
`is-active` returns nonzero if the service is not active. But even then, I think it'd be better to not make this a hard error.
So maybe let's drop this if-statement entirely and key off purely on its stdout regardless of the exit code?
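A minimal sketch of that suggestion, not the code that landed in this PR: run `systemctl is-active`, ignore the exit status, and decide purely from stdout. The helper names `is_active_output` and `unit_is_active` are hypothetical, and treating only the exact string `active` as "already running" is an assumption (states such as `activating` are counted as not active here).

```rust
use std::io;
use std::process::Command;

/// Hypothetical helper: interpret the stdout of `systemctl is-active <unit>`.
/// Only the exact state "active" counts; this is an assumption of the sketch.
fn is_active_output(stdout: &[u8]) -> bool {
    String::from_utf8_lossy(stdout).trim() == "active"
}

/// Hypothetical helper: ask systemd whether `unit` is active.
/// The exit status is deliberately ignored, since `is-active` exits nonzero
/// for any non-active state; only a failure to spawn `systemctl` is an error.
fn unit_is_active(unit: &str) -> io::Result<bool> {
    let out = Command::new("systemctl")
        .args(&["is-active", unit])
        .output()?;
    Ok(is_active_output(&out.stdout))
}
```

With this shape, the caller would only fall back to `systemctl start` when `unit_is_active("rpm-ostreed")` returns `Ok(false)`, and a nonzero exit from `is-active` never becomes a hard error.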
Argh, I am so used to my custom fish prompt that clearly shows exit status of last command that I get tripped up in bash cases when it isn't present.
I don't think the GPG key parsing was ever a big problem on RHEL:
Yeah.
Adding onto the pile of hacks here unfortunately. Basically RHEL8 systemd seems to count explicit `systemctl start` invocations against the restart limit. We hit this in tests in openshift/machine-config-operator which are invoking `rpm-ostree status` and `rpm-ostree kargs` frequently.
b21cbd9 to 0556152
Hmm, Jenkins CI seems to have gotten a bit worse recently. Clicking the restart button one more time but if that fails, going to override and we'll need to dig into that.
Hmm, doesn't seem like a flake. It looks like something going wrong with the vmcheck test harness itself, but no obvious error messages.
Can't reproduce locally so far. Opened #3528. Feel free to push there too to debug this.
Restarted CI. SSH bug should be fixed now!
The main motivation here is to work around coreos/rpm-ostree#3523 (which is itself a workaround for a RHEL8 systemd bug). Basically, this e2e is invoking `rpm-ostree kargs` in a pretty tight loop, which triggers that bug. To read the kernel command line, we can just read `/proc/cmdline` instead. (Now, this is the *actual* cmdline instead of just rpm-ostree's view of it, but it should be fine.)
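For illustration only, here is roughly what "just read `/proc/cmdline`" amounts to, sketched in Rust to match the snippet reviewed above rather than in the language of the referencing test suite. The function names `read_kernel_cmdline` and `has_karg` are hypothetical, and the sketch assumes arguments are separated by plain whitespace, so a quoted value containing spaces would be split incorrectly.

```rust
use std::fs;
use std::io;

/// Hypothetical helper: read the booted kernel command line.
/// This is the kernel's actual cmdline, not rpm-ostree's view of it.
fn read_kernel_cmdline() -> io::Result<String> {
    fs::read_to_string("/proc/cmdline")
}

/// Hypothetical helper: check whether `karg` appears as a whole argument.
/// Assumes whitespace-separated arguments; quoted values with embedded
/// spaces are not handled in this sketch.
fn has_karg(cmdline: &str, karg: &str) -> bool {
    cmdline.split_ascii_whitespace().any(|arg| arg == karg)
}
```

A test would then call `has_karg(&read_kernel_cmdline()?, "some-arg")` instead of spawning `rpm-ostree kargs`, which avoids touching the daemon at all.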