Image garbage collection in K8S? #4437
Comments
|
what args do you use to run the kubelet, what's the output of |
|
|
Does this happen when you start kubelet after crio? we've seen this in the past when Just wondering, what's in |
|
I haven't implemented any specific startup order for kubelet / cri-o, just using the default systemd services. For cri-o that is the systemd service shipped with the package for CentOS 8 Stream. For the node above I can see that kubelet started 10 seconds before cri-o. My I haven't made any changes to this file so it should be the default one shipped with Fedora CoreOS. |
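One way to check which of the two services became active first is to compare the monotonic start timestamps that systemd records for each unit. This is only a minimal sketch: the helper names `unit_start_usec` and `started_before` are hypothetical, and it assumes the units are named `kubelet.service` and `crio.service`, as in this thread.

```shell
# Sketch: compare when two systemd units last became active.
# Helper names are hypothetical; unit names are assumed from the thread.

# Print the monotonic timestamp (microseconds since boot) at which a unit
# last entered the active state; systemd reports 0 if it never did.
unit_start_usec() {
    systemctl show --property=ActiveEnterTimestampMonotonic --value "$1"
}

# Succeed when the first timestamp is strictly earlier than the second.
started_before() {
    [ "$1" -lt "$2" ]
}

# Example use on a node (not executed here):
#   if started_before "$(unit_start_usec kubelet.service)" \
#                     "$(unit_start_usec crio.service)"; then
#       echo "kubelet started before cri-o"
#   fi
```

A timestamp of 0 means the unit has not been active since boot, so the comparison is only meaningful when both values are non-zero.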
|
what happens when you make the kubelet depend on cri-o on startup? does this error pop up? |
|
After some testing, it appears that the kubelet error message and the node's error events go away when crio is started before kubelet. The reverse also holds: the error consistently appears when crio is started after kubelet. So my fix for now is a systemd override for crio.service containing: |
|
Awesome to hear! I was worried kubelet never registers the images filesystem if it starts before cri-o. Sounds like this issue is mostly cosmetic, and there's a workaround. Are we good to close this?
As far as I can tell, this is exactly the problem I am seeing: kubelet never registers the images filesystem if it starts before cri-o. You probably want to fix this in the systemd service for crio that you bundle in the OS packages, or at least document it prominently as a known issue in the installation instructions. This caused our nodes to run out of disk space, yet Kubernetes didn't mark the nodes as having DiskPressure, I assume because kubelet had not registered the images filesystem correctly. So our clusters were trying to schedule pods on nodes that didn't have enough free disk space to pull the images. |
|
Ah, I see! I would think kubelet should be able to register it if it starts before cri-o, but we can work around this in cri-o for now |
@nyxi Is it really necessary to have an empty |
|
@nyxi could you elaborate on how to implement the service order fix you suggested above? I am seeing this on both of the nodes that I just built with cri-o. I'm assuming you mean something like the following, but would like validation if possible 👍 I am also curious about @artheus's question above regarding the additional, blank
sudo vi /usr/lib/systemd/system/crio.service
# Add the following under [Unit]
Before=kubelet.service |
|
yes that should fix the problem @icpenguins |
|
Encountered the same issue. As a workaround until it gets fixed in the crio package itself, I would recommend not editing the unit file shipped by the crio package: a package update would then install an .rpmnew file next to your edited copy instead of overwriting it with the new version, and neither outcome is what you actually want. You can issue the following command:
and then paste the following content into the editor and save it: This would create a new file: |
|
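The ordering directive confirmed earlier in the thread (`Before=kubelet.service` under `[Unit]`) can be placed in a systemd drop-in so that the packaged unit file stays untouched. The following is a minimal sketch and not the exact content posted in this thread: the function name `write_crio_dropin`, the drop-in file name `10-before-kubelet.conf`, and the default directory are assumptions chosen for illustration.

```shell
# Sketch: order crio.service before kubelet.service with a systemd drop-in.
# Function name, file name, and default directory are assumptions.
write_crio_dropin() {
    # Drop-ins for a unit live in a "<unit>.d" directory; /etc takes
    # precedence over the packaged files under /usr/lib.
    dropin_dir="${1:-/etc/systemd/system/crio.service.d}"
    mkdir -p "$dropin_dir"

    # Before= only affects ordering when both units are started in the same
    # transaction (for example at boot); it does not pull kubelet in.
    printf '[Unit]\nBefore=kubelet.service\n' \
        > "$dropin_dir/10-before-kubelet.conf"
}
```

On a node this would be run as root, followed by `systemctl daemon-reload` so that systemd picks up the new drop-in. The ordering then applies the next time both services are started together, typically on reboot.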
This was (finally) fixed upstream by #4443. The related backports, along with new packaged versions, will fix it downstream.
containerd.service: Order containerd.service before kubelet.service, as this way it is started before a kubelet (if that unit is enabled) and allows for garbage collection to work (see cri-o/cri-o#4437 for comparison). As docker.service in https://github.com/moby/moby/blob/1f8d44babf18811ff9020de667bf6fda8d3c4401/contrib/init/systemd/docker.service#L4 is ordered after containerd.service this should suffice to ensure garbage collection to work with docker as kubernetes container runtime. Signed-off-by: David Runge <dave@sleepmap.de>
contrib/init/systemd/docker.service: Order docker.service before kubelet.service, as this way it is started before a kubelet (if that unit is enabled) and allows for garbage collection to work (see cri-o/cri-o#4437 for comparison). Signed-off-by: David Runge <dave@sleepmap.de>
So with Kubernetes you are supposed to let kubelet handle garbage collection of images.
But I can't seem to figure out how to configure that with cri-o as the container runtime? For my nodes I just get a lot of:
Warning ImageGCFailed 2s (x1416 over 4d21h) kubelet failed to get imageFs info: non-existent label "crio-images"
For now I had to implement a simple systemd service/timer to run crictl rmi --prune.
Kubernetes 1.19.3
cri-o 1.19.0
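The interim workaround described above, a systemd service and timer that run `crictl rmi --prune`, could look roughly like the sketch below. The actual units were not posted in this issue, so the unit names (`crictl-prune.service`, `crictl-prune.timer`), the `/usr/bin/crictl` path, the daily schedule, and the function name `write_prune_units` are all assumptions.

```shell
# Sketch: a oneshot service plus a daily timer running `crictl rmi --prune`.
# Unit names, the crictl path, and the schedule are assumptions.
write_prune_units() {
    unit_dir="${1:-/etc/systemd/system}"
    mkdir -p "$unit_dir"

    # Oneshot service: runs the prune command once and exits.
    cat > "$unit_dir/crictl-prune.service" <<'EOF'
[Unit]
Description=Remove unused container images

[Service]
Type=oneshot
ExecStart=/usr/bin/crictl rmi --prune
EOF

    # Timer: triggers the service of the same name once a day and catches
    # up after downtime because of Persistent=true.
    cat > "$unit_dir/crictl-prune.timer" <<'EOF'
[Unit]
Description=Run crictl-prune.service once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

After writing the units as root, `systemctl daemon-reload` followed by `systemctl enable --now crictl-prune.timer` would activate the schedule. This only stands in for kubelet's own image garbage collection and does not restore DiskPressure reporting.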