v243 breaks libvirt lxc guest support #13629
The containers that libvirt starts get put into their own control group / systemd scope under /machine.slice. The exception, though, is the libvirt_lxc controller process, which is still in the main libvirtd.service control group. I guess the latter is what's being killed off and causing the containers to go away from libvirt's POV. The libvirtd.service unit file explicitly uses KillMode=process so that these supplementary processes still remain after libvirtd is stopped. The same is true of the dnsmasq processes libvirtd starts, btw. Are you seeing those killed off too, by chance? |
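To confirm which kill mode is actually in effect on a given host, `systemctl show -p KillMode libvirtd.service` prints a `KillMode=...` line. A small Python sketch that wraps this check; the helper names are illustrative, and it assumes `systemctl` is on `PATH`:

```python
import subprocess


def parse_show_output(text):
    """Parse `systemctl show` output ("Key=Value" per line) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that carry no "=" at all
            props[key] = value
    return props


def unit_kill_mode(unit="libvirtd.service"):
    """Return the effective KillMode of a unit (needs a systemd host)."""
    out = subprocess.run(
        ["systemctl", "show", "-p", "KillMode", unit],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_show_output(out).get("KillMode")
```

On a host where the unit file is unmodified, `unit_kill_mode()` would be expected to report `process`, matching the setting described above.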
FTR this is libvirt 5.4 due to the Ubuntu Feature freezes. But I have not seen any related change in libvirt, nor did @berrange know of any that would be worth trying. |
@berrange the PIDs of the dnsmasq processes associated to libvirt do not change with systemd 243 installed. So the issue seems to be more specific to libvirt-lxc in this case. |
I made systemd logging verbose which is a lot, but I was able to gather some logs that might be useful. I did Comparing (a) and (b) I see the message related to commit 0219b35 showing up in line 425 / 421.
This is the "removal" of the processes. With the old code the same message was triggered, but as we have seen in the commit, it was an early exit that stopped the cleanup at this point. Interestingly, dnsmasq is reported the same way as the libvirt-lxc guest, but the former "survives" while the latter is gone afterwards. At the tail of the bad-case log we see the formerly reported libvirt error and the related systemd cleanup now:
|
I was trying to check for cgroup/hierarchy differences between dnsmasq and libvirt-lxc guests. The same service scope detects them:
And:
I agree that there is "some" machine slice grouping that lxc guest has on top of dnsmasq:
There is no counterpart to that for the dnsmasq processes to this. The grouping that the lxc guests have seems to cover the following:
So maybe their grouping in these makes them actually susceptible to the cleanup that hits them with the new systemd version? |
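One way to make this kind of comparison repeatable is to read each process's `/proc/<pid>/cgroup` and group the PIDs by cgroup path. A minimal Python sketch, assuming the standard `hierarchy-ID:controller-list:cgroup-path` line format; the function names are illustrative and not part of libvirt or systemd:

```python
from collections import defaultdict


def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into {controllers: path}.

    Each line has the form "hierarchy-ID:controller-list:cgroup-path".
    On cgroup v2 the controller list is empty, so the key is "".
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only twice: the cgroup path itself may contain ':'.
        _hier_id, controllers, path = line.split(":", 2)
        result[controllers] = path
    return result


def group_by_cgroup(pid_to_text, controller="name=systemd"):
    """Group PIDs by their cgroup path within one hierarchy."""
    groups = defaultdict(list)
    for pid, text in pid_to_text.items():
        path = parse_proc_cgroup(text).get(controller)
        groups[path].append(pid)
    return dict(groups)


def read_proc_cgroup(pid):
    """Read /proc/<pid>/cgroup for a live process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as fh:
        return fh.read()
```

Calling `group_by_cgroup({pid: read_proc_cgroup(pid) for pid in pids})` with the PIDs of dnsmasq, the libvirt_lxc controller, and the container's init process shows at a glance which of them share the libvirtd.service cgroup and which sit in a separate scope.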
What does the systemd-cgls output look like when run on the host while the offending container is running? |
Hi @poettering , the first snippet of the comment above already was from As mentioned above they are grouped with the libvirtd.service under the |
Since I know that @berrange is usually right :-) I was wondering why the guest isn't in that machine slice he mentioned. So I was revisiting
Unfortunately I have no systemd 243 for Fedora at the moment to try it there and F31 didn't want to work for me this morning. Chances are that we have two issues at once here:
|
@cpaelzer it occurred to me that libvirt's support for cgroups has significantly changed over the last 6 months as we've integrated support for cgroups v2. The existing v1 support should not have changed, but there's a non-negligible risk that something was broken by mistake that I don't know about. Also, can you confirm that you actually have systemd-machined installed? It can alter the way libvirt deals with cgroups, and the lack of machine.slice in your setup makes me worry it might not be installed. |
@berrange that cgroupv2 code was exactly why I also added libvirt 5.0 results above to check if behavior was different back then. In the meantime I also checked latest Debian:
And Debian is affected just like Ubuntu is: both versions lack the machine.slice, and systemd 243 breaks the guest on libvirtd restart. |
@berrange For your question about Installing it on the affected systems, rebooting and recreating the case showed that your guess was right. With @berrange what is libvirt upstream's expectation/recommendation on running libvirt-lxc with/without |
I don't think we've made a clear statement upstream on this matter. libvirt has cgroups code written so that it will try to work correctly on non-systemd based OS distros. So when you have a systemd host without machined present, we'll be falling back to that non-systemd cgroups code. I don't think it makes much sense to support this as an option though - it just increases the size of the test matrix & introduces new failure modes as you hit here. So on balance, I think my suggestion is that the QEMU & LXC drivers should both depend on systemd-container. This is what I did for the Fedora / RHEL RPM spec with
It might be a good idea if we change libvirt to at least issue a warning if it sees a systemd host without machined |
Ok, I now tested this on Ubuntu with systemd 243 and systemd-container installed. I can confirm that then the slicing is correct
But, and that is the non-fun part, the libvirt-lxc container still gets reaped on restart of the I can make sure that the dependency to I'll re-summarize logs of this latest setup (with |
With the need for The initial question remains: since commit 0219b35, either systemd needs to stop reaping these processes or libvirt needs to manage the processes representing the guest differently so that they do not fall victim to it. |
AFAIK, libvirt was already requesting that systemd should not reap these processes. In the libvirtd.service we set
|
Any suggestions on how we should proceed with this? As @berrange outlined, the expectation from libvirt's POV is that nothing should be reaped, but it is. Is there guidance from systemd on how libvirt should further mark/set up the processes/groups/slices so that they do not fall victim to the new code? |
So what is this bug about now? It seems initially this bug report was caused by libvirt not following the documented logic for acquiring a delegate cgroup tree. But now it turned into something else about killing processes? |
(btw, the delegation concept is documented here: https://systemd.io/CGROUP_DELEGATION.html) |
Sorry if that got lost in the former discussion. The bug always was (and still is) about libvirtd service restart unexpectedly reaping processes since 0219b35. With systemd-container installed the problem looks as @berrange expected, see this comment. And the following one for systemd-cgls output and debug enabled journal.
No, the service start/stop directly only starts the libvirtd daemon itself. Later on, a user might take actions to start a guest, which will let libvirt spawn:
I think that follows point #4 in the linked design doc. Without checking the code and/or the live system, I'm not sure which design it follows with regard to the three scenarios, but maybe @berrange can advise here.
That is interesting, as restart obviously is stop+start we might run into this then. The expectation (I'd think) would be that In the already attached logs you see two phases, initially it detects leftover processes that were part of the .service scope:
The control process 1524 is part of it and stays alive at this point (correct). But later on we see the child process in the machine slice being removed (that is the new behavior since 0219b35).
I'm not sure if it is related, but in between those two sections I see:
This is the scope that contained the /bin/bash process and I'm not sure why it is considered empty. I saw nothing removing the process from it. With the new code systemd v243 now ignores that the resource is busy and continues cleanup; that seems to me to be what eventually really kills A user of libvirt-lxc would not want that to happen, so the question is: what would libvirt want/need to do differently so that this process in the machine slice is not killed on restart of the service? |
With recent systemd 243 as in Ubuntu's systemd packages
Note: This was reported to Ubuntu in bug 1844879
While formerly a LXC guest of the following style worked:
Attached guest definition smoke-lxc.xml
from Debian/Ubuntu testcase.
Up until recently this worked, and guests survived a libvirtd restart.
I pinged upstream libvirt (possible that LXC guest cgroup management would need to be fixed there or in lxc itself) but it seems this behavior wasn't reported there yet either.
Note: restarting libvirt-lxc guests always triggered
And so far the processes stayed around and libvirt still had a container to manage after restart.
But since the systemd fix for issue #12386 came in via commit 0219b35, this broke libvirt-lxc.