kubelet fails to pull docker.io/rancher/pause:3.6 on Windows 11 Pro node #3915
Comments
Inspecting the manifest for
When checking how containerd evaluates platform compatibility though, it looks like it goes off string prefix. So even though the image is compatible with
There are probably two paths I can see to resolve this in the immediate term:
|
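To make the "goes off string prefix" observation concrete, here is a minimal sketch of a prefix-style OS version check. It is not containerd's code; the function names and the example version strings (a Windows 11 22H2 host build and a Windows Server 2022 image build) are assumptions chosen for illustration and do not come from this issue.

```python
# Illustrative sketch only: this is NOT containerd's implementation.
# It mimics a prefix-style comparison to show why a Windows 11 host
# could reject a manifest entry built for Windows Server 2022.


def host_version_prefix(host_os_version: str) -> str:
    """Return 'major.minor.build' from a full Windows version string."""
    return ".".join(host_os_version.split(".")[:3])


def matches_by_prefix(host_os_version: str, image_os_version: str) -> bool:
    """True when the image's os.version starts with the host's major.minor.build."""
    return image_os_version.startswith(host_version_prefix(host_os_version))


# Assumed example values (hypothetical, not taken from the issue).
WIN11_HOST = "10.0.22621.1105"       # a Windows 11 22H2 build
LTSC2022_IMAGE = "10.0.20348.1487"   # a Windows Server 2022 image build

print(matches_by_prefix(WIN11_HOST, LTSC2022_IMAGE))         # False
print(matches_by_prefix("10.0.20348.1487", LTSC2022_IMAGE))  # True
```

Under this kind of check the image is rejected purely because the build numbers differ, regardless of whether the host could actually run it.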
Looks like |
We don't technically support RKE2 on Windows 11 Pro, so we're not mirroring the image for that OS version. If you want to change the pause image to pull directly from MCR, you should be able to set |
There isn't an OS image for Windows 11 in any registry; the OS image that you are officially meant to use on Windows 11 is
I couldn't figure out where the
I would really like to override the containerd binary that's being used so I can test a fix for this. In the meantime I'll test adding this entry to the

- config:
    pause-image: mcr.microsoft.com/oss/kubernetes/pause:3.6
  machineLabelSelector:
    matchExpressions:
      - key: cattle.io/os
        operator: In
        values:
          - windows
    matchLabels:
      cattle.io/os: windows

|
This comes from the rancher/rke2-runtime image on Docker Hub, with a tag matching the running RKE2 version. You can override this with the --runtime-image flag.
You can't, short of replacing the whole runtime image. You can however install and start containerd on your own, and point RKE2 at its socket with the --container-runtime-endpoint flag.
I would probably recommend using the standard |
Ah I see, Docker Hub doesn't show the Windows version in its UI for that image:
but it does actually exist in the manifest:
Presumably because it doesn't have an OS version filter on it. I'm guessing the image isn't a real image you can run, but instead just contains the files which are extracted out onto the host for execution? If that's the case I should still at least be able to use
In this case I just need to know where
Unfortunately, it looks like setting

- config:
    pause-image: mcr.microsoft.com/oss/kubernetes/pause:3.6
  machineLabelSelector:
    matchLabels:
      cattle.io/os: windows
- config:
    pause-image: mcr.microsoft.com/oss/kubernetes/pause:3.6
  machineLabelSelector:
    matchLabels:
      kubernetes.io/os: windows

The file remains unchanged even after Rancher rolls out the changes:
|
Even this doesn't work:
so maybe Rancher just can't apply the |
We build it at https://github.com/rancher/image-build-containerd and push to rancher/hardened-containerd on Docker Hub. The binaries are then copied from that image into the rke2-runtime image in the RKE2 Dockerfile: https://github.com/rancher/rke2/blob/master/Dockerfile.windows#L100
I'm not sure I can help with that. I would probably just create |
Yeah I think I might just have to do this. Rancher doesn't seem capable of setting that onto nodes, which is a little disappointing. In any case I can always patch the install script to write out the file manually and then apply it to all nodes once I have a working configuration. |
Huzzah, setting it manually into This at least gives me a path forward to make my own manifest that targets the right OS version, to see if that at least gets things over the line in the short term. |
Ok so putting a guide together for anyone else who runs into this situation and how you can work around things for now:

Build some images with fixed OS versions (or use mine)

You only need to do this step if you don't want to use the images I provide. The images I provide are currently:
These are identical to the originals; they just have a fixed up OS version in the metadata so they'll be pulled on Windows 11 hosts. If you'd prefer to make your own images and push them to a registry, then create the following files:
then on a Windows 11 machine with the Docker Engine for Windows installed, run something similar to the following. You'll need to replace the
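To illustrate what "a fixed up OS version in the metadata" means in the image-building step above, here is a small sketch that rewrites the `os.version` field of the Windows entries in an OCI-style image index. This is not the tooling used for the provided images; the function name, the example index, the placeholder digests, and the version strings are all hypothetical.

```python
import copy


def retarget_windows_entries(index: dict, os_version: str) -> dict:
    """Return a copy of an image index with each Windows entry's os.version replaced."""
    patched = copy.deepcopy(index)
    for manifest in patched.get("manifests", []):
        platform = manifest.get("platform", {})
        if platform.get("os") == "windows":
            platform["os.version"] = os_version
    return patched


# Hypothetical index; digests are placeholders, versions are assumed examples.
example_index = {
    "schemaVersion": 2,
    "manifests": [
        {
            "digest": "sha256:<linux-digest>",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "digest": "sha256:<windows-digest>",
            "platform": {
                "architecture": "amd64",
                "os": "windows",
                "os.version": "10.0.20348.1487",
            },
        },
    ],
}

patched = retarget_windows_entries(example_index, "10.0.22621.1105")
print(patched["manifests"][1]["platform"]["os.version"])
```

The image layers stay the same; only the platform metadata that the runtime compares against the host changes. A modified index has a different digest, so in practice it has to be pushed to a registry as a new manifest list.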
Patch the install script to use Get-WindowsOptionalFeature and to set up the pause-image override

Download the patched script here: install-patched.txt
You need to replace the
You also then need to add the following code after

if (!(Test-Path "C:\etc\rancher\rke2\config.yaml.d")) {
    New-Item -ItemType Directory -Path "C:\etc\rancher\rke2\config.yaml.d"
}
Set-Content -Path "C:\etc\rancher\rke2\config.yaml.d\60-pause-image.yaml" -Value @"
pause-image: registry.redpoint.games/redpointgames/containers-for-windows-11/rke2-pause:3.6
"@

Then run the install script on all your nodes. It'll take several minutes for them to update and restart, so don't get too eager on running the test below.

Test that Windows pods fetch images and work properly

After the nodes have restarted, you can create a Windows pod with:
If it is pulling correctly, you should see events like this when you do
If you're using the full image, it will take a while to pull (with no progress) because it's several GBs. I'd recommend testing with the nano image to start with. Once the image pulls and the pod starts, running
|
I'm going to close this issue out, as I ended up writing my own Kubernetes manager and doing a pull request against containerd to support 2022 containers on Windows 11 properly. |
NOTE: If you're coming to this issue and you just want something that works, see #3915 (comment).
Environmental Info:
RKE2 Version: v1.24.9+rke2r2 (installed via Rancher)
Node(s) CPU architecture, OS, and Version:
(a few of these went NotReady in my attempts to work around this bug, but it was present even when all nodes were Ready)
Cluster Configuration:
A single Linux master node and 4 Windows 10/11 Pro nodes. I had to slightly modify the feature checks in the install script to use Get-WindowsOptionalFeature instead of Get-WindowsFeature, but other than that everything seemed to install properly.
Describe the bug:
When scheduling nodes, the kubelet can't seem to pull the pause image, even though there's a windows/amd64 version of it:
Steps To Reproduce:
ps1 extension: install-patched.txt
REPLACE_ME with the hostname of the Rancher server.
kubectl run win-test --image=mcr.microsoft.com/windows/server:ltsc2022
Expected behavior:
It should correctly pull the windows/amd64 version of the docker.io/rancher/pause:3.6 image. I can do docker pull docker.io/rancher/pause:3.6 when running against the Docker Engine for Windows on the same machine, so this is an RKE2/Kubernetes specific bug.
Actual behavior:
kubelet fails to pull the image.
I can successfully run both mcr.microsoft.com/windows/server:ltsc2022 and mcr.microsoft.com/windows/nanoserver:ltsc2022 through Docker on the machine, so this is not some kind of fundamental OS incompatibility. It just seems like kubelet is failing to pull the image properly. Unfortunately I couldn't find anything in the logs to indicate what type of platform kubelet is trying to pull for (i.e. is it incorrectly detecting it as Linux or something like that?). I also couldn't find a way to override the platform that kubelet tries to pull images for, so I can't force it to windows/amd64.