node-labeller.sh: Consider AppArmor restrictions #8692
Conversation
Use `set -x` to print commands and their arguments as they are executed. Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Even though the virt-handler pod is privileged, on systems with AppArmor there might be a host profile that is automatically applied to the `/usr/sbin/libvirtd` binary. That may block the execution of `/usr/libexec/qemu-kvm`. In such a case, try moving the qemu executable to a location that is more common for AppArmor-enabled Linux distros. Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
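A minimal sketch of what the workaround from these commits might look like inside node-labeller.sh, assuming AppArmor is detected via securityfs; the exact paths and the symlink step are illustrative assumptions, not the merged diff:

```sh
#!/bin/bash
# Sketch only: approximates the workaround described in the commits above.
set -x  # print commands and their arguments as they are executed

QEMU_SRC=/usr/libexec/qemu-kvm
QEMU_DST=/usr/bin/qemu-kvm  # a location more commonly allowed by AppArmor profiles

# If AppArmor is active on the host and qemu sits in the blocked location,
# relocate the binary so a host libvirtd profile cannot deny its execution.
if [ -d /sys/kernel/security/apparmor ] && [ -x "$QEMU_SRC" ]; then
    mv "$QEMU_SRC" "$QEMU_DST"
    ln -sf "$QEMU_DST" "$QEMU_SRC"  # assumption: keep the old path resolvable
fi
```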
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: … The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
BTW, this commit …
@vasiliy-ul: The following tests failed, say `/retest` to rerun all failed tests:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/hold This needs some more discussion.
/unhold
According to the mailing list conversation, this needs some more discussion IIUC.
I think we clarified some concerns in the mailing list, though in general, yes, it probably makes sense to have some more discussion.
@vasiliy-ul this looks very hacky and makes me uncomfortable, mainly because the binary path would then not match between the capabilities XML generated by virt-handler and virt-launcher. Are the capabilities generated by virt-handler only used for the purpose of node labeling? And is the libvirtd instance running inside each virt-launcher container then generating its own capabilities from scratch at startup?

IIUC the denial is caused by the fact that libvirtd running inside the virt-handler container gets the host AppArmor policy applied to it. If that's the case, since the AppArmor policy is maintained as part of libvirt itself, could we perhaps just patch it so that it allows QEMU binaries installed under `/usr/libexec/`?

OTOH, doesn't the AppArmor profile affect the virt-handler container only because the libvirt package is installed on the host? Does it even make sense to have that installed, if you're going to be using the host as a node in a Kubernetes cluster? Perhaps we should document that the libvirt package should not be installed on the host when KubeVirt is going to be deployed?

Apologies if the questions are silly, I'm just not very familiar with how SELinux / AppArmor interact with containers.
Yes, this is what I do not like about the approach either.
AFAIK, yes, the XML generated by node-labeller.sh is used only for labeling the node, and it is not used by virt-launcher.
I think it is already fixed to a certain extent in the latest libvirt profile. Mentioned it here: #8692 (comment). The addition of … should cover the KubeVirt case (though only if …).
We already have some documentation on how to adapt the host AppArmor profile to the KubeVirt case. IMHO, we should not restrict users in what they might have installed on the host.
No, in fact those are good questions, thanks. I am not insisting on getting this PR in, especially since it raises doubts and concerns. It is more of a quick workaround to address the problem. As already mentioned on the mailing list, perhaps having the use case documented is good enough.
/hold
That's not the case for the Debian / Ubuntu package, which passes … to meson. But that's for internal libvirt components such as … I don't see why we couldn't add a similar rule for QEMU.
Can you please point me to it?
Is it common practice to run workloads unrelated to Kubernetes on a Kubernetes node? I would have thought that you'd dedicate the machine fully to that role, but perhaps that doesn't match reality.
Yeah, I think that would be the right thing to do 👍
It's been added recently: https://kubevirt.io/user-guide/operations/installation/#integration-with-apparmor
Well, it depends, I guess... I just think that restrictions on the host side make KubeVirt less user-friendly, especially when this particular problem can be solved by adapting the profile on the host.
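For illustration, adapting the host profile could look roughly like this; the local-override path and the `PUx` transition rule are assumptions based on common AppArmor/libvirt packaging, not an excerpt from the linked guide:

```sh
# Append a local override so libvirtd (e.g. inside the virt-handler
# container) is allowed to execute the QEMU binary under /usr/libexec,
# then reload the profile.
cat <<'EOF' | sudo tee -a /etc/apparmor.d/local/usr.sbin.libvirtd
/usr/libexec/qemu-kvm PUx,
EOF
sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.libvirtd
```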
I'll prepare a libvirt patch later this week.
Perhaps we could amend that to suggest uninstalling libvirt as a first possible workaround. Users might not actually need libvirt on the host, and overall it's much simpler to uninstall a package than it is to edit configuration files. The latter might also result in prompts on package upgrades.
Patch posted to the list: https://listman.redhat.com/archives/libvir-list/2022-November/235395.html
This may actually be a concern for me. I carry a patch which I could not contribute yet, where I am copying capability files between virt-handler and virt-launcher to significantly speed up VM starts in virt-launcher. If the capabilities look different, libvirt may attempt to retry the discovery... So I would appreciate any approach which would not move the binary. I am however also interested in AppArmor support :)
It definitely would. From the libvirt side (full discussion here), there's confusion as to whether the interactions that we see today between the host's AppArmor configuration and the libvirtd process running inside the virt-handler container are working as intended.

IIUC the whole issue stems from the fact that, when spawned by virt-handler, libvirtd is running in a privileged context, whereas that's not the case when spawned by virt-launcher. I understand that virt-handler needs to be privileged, but is there any way that the libvirtd part could be moved to a separate, unprivileged container?

It seems to me like that would be a good idea in general, because it would mean that libvirtd would always have the same privileges, regardless of context. I can't point to any specific instance right now, but I wouldn't be surprised if the privileged libvirtd spawned by virt-handler advertised more capabilities than the unprivileged one spawned by virt-launcher, potentially leading to a mismatch.

Would this be feasible at all? Or are there reasons why it can't be done?
I think right now the only reason why we have it still privileged is a chicken-and-egg problem: virt-handler brings up the kvm device plugin, which the discovery would need in the init-container to run unprivileged. So if we can make …
@rmohr this is all pretty in-depth when it comes to the inner workings of KubeVirt, but I think I more or less get the gist. Would it be possible to turn the node-labeller job from an init container into a regular one-off pod that virt-handler creates once the kvm device plugin is available? I'm sure it's more complicated than that :) but do you think it would be doable?

Honestly, the pushback I've been getting about changing the AppArmor policy to allow running QEMU from /usr/libexec …

I think in the short term we should just document uninstalling libvirt on the host as the preferred workaround for the AppArmor issues, in addition to the ones that are already documented.
Yes, something like that could work, it just complicates things a bit. Another option would be to run the node-labeller as a sidecar instead of an init-container; it could run unprivileged. virt-handler could boot up to a certain point, run …
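A rough sketch of what such an unprivileged node-labeller sidecar could look like; the image, entrypoint, and the `devices.kubevirt.io/kvm` resource request are assumptions for illustration, not actual KubeVirt manifests:

```sh
# Illustrative pod spec showing the sidecar arrangement, applied via kubectl.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: virt-handler-sketch
spec:
  containers:
  - name: virt-handler
    image: quay.io/kubevirt/virt-handler:latest  # assumed image
    securityContext:
      privileged: true                # virt-handler itself stays privileged
  - name: node-labeller               # sidecar instead of init-container
    image: quay.io/kubevirt/virt-handler:latest  # assumed image
    command: ["node-labeller.sh"]     # assumed entrypoint
    securityContext:
      privileged: false               # the labeller runs unprivileged
    resources:
      limits:
        devices.kubevirt.io/kvm: "1"  # served by the kvm device plugin
EOF
```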
Yes, that sounds like a reasonable short-term solution to me. @vasiliy-ul ?
Tracking issue opened: #8744
PR opened: kubevirt/user-guide#618
@vasiliy-ul now that we have an issue tracking the implementation of a proper solution and the documentation has been enhanced, can we close this PR?
Yep, closing it. |
What this PR does / why we need it:

Even though the virt-handler pod is privileged, on systems with AppArmor there might be a host profile that is automatically applied to the `/usr/sbin/libvirtd` binary. That may block the execution of `/usr/libexec/qemu-kvm`. In such a case, try moving the qemu executable to a location that is more common for AppArmor-enabled Linux distros.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Fixes #7771, fixes #7638

Special notes for your reviewer:

The fix is needed only when the script is executed in a privileged container. In that case, the container runtime will not apply any profile (i.e. the container will run `unconfined`). However, if an AppArmor profile for libvirt is installed on the host, it will be automatically applied to the binary `/usr/sbin/libvirtd`.

For the VM workloads the problem is not relevant, since the `virt-launcher` pod is not privileged, hence the runtime will apply the default profile (e.g. `cri-containerd.apparmor.d` in the output below):

Also note: after moving the binary, the new path will be reflected in `/var/lib/kubevirt-node-labeller/virsh_domcapabilities.xml`:

But I do not think it will cause issues, as AFAIK this path element is not used by the node-labeller.
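As a quick sanity check (a sketch; assumes `xmllint` is available wherever the XML is readable), one could print the recorded emulator path:

```sh
# Print the emulator path from the generated domain capabilities XML;
# after the move it should point at the relocated qemu binary.
xmllint --xpath 'string(/domainCapabilities/path)' \
    /var/lib/kubevirt-node-labeller/virsh_domcapabilities.xml
```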
Release note: