Kubernetes 1.30: kueue controller fails to remove scheduling gates #2029
Comments
When I have seen issues like that, it seems to be related to a misbehaving webhook.
Why this would impact 1.30 and not 1.29 is a really interesting question.
Well, the following deployment does not reproduce it; the pods start normally on a 1.30 cluster. I'll continue to look for a reproducer. The issue likely has something to do with either Istio injecting a sidecar into the Pod or something about the Argo Workflows template.
It's not Istio either; with the sidecar added, the pod starts fine.
Alright I have a reproducer:
Contents of the kueue-manager-config Configmap:
Reproducer Pod (remove the metadata.annotations.'container.apparmor.security.beta.kubernetes.io/main' field to make it work):
The likely cause is that the AppArmor annotation graduated to Stable in 1.30, and perhaps some logic exists in the apiserver to automatically copy it into the container security context? The stacktrace in the initial comment is still the same, and I think I see a value in it related to AppArmorProfile. I'm not sure Kueue can actually fix this. I'll check whether Kueue admits the pods properly if I set the appArmorProfile in the securityContext.
Setting the value in the container securityContext works, I'll close this as it requires a minor change to manifests to fix it.
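For readers making the same manifest change, here is a minimal Go sketch of how an annotation value could be translated into the securityContext.appArmorProfile field. It is not taken from Kueue or Kubernetes code. The struct and function names are hypothetical, and the mapping assumes the three documented annotation forms: runtime/default, unconfined, and localhost/<profile>.

```go
package main

import (
	"fmt"
	"strings"
)

// appArmorProfile is a hypothetical stand-in that mirrors the shape of the
// securityContext.appArmorProfile field (type plus optional localhostProfile).
type appArmorProfile struct {
	Type             string
	LocalhostProfile string
}

// profileFromAnnotation translates the value of a
// container.apparmor.security.beta.kubernetes.io/<container> annotation into
// the equivalent field value. Unknown values are reported as errors.
func profileFromAnnotation(value string) (appArmorProfile, error) {
	const localhostPrefix = "localhost/"
	switch {
	case value == "runtime/default":
		return appArmorProfile{Type: "RuntimeDefault"}, nil
	case value == "unconfined":
		return appArmorProfile{Type: "Unconfined"}, nil
	case strings.HasPrefix(value, localhostPrefix) && len(value) > len(localhostPrefix):
		return appArmorProfile{
			Type:             "Localhost",
			LocalhostProfile: strings.TrimPrefix(value, localhostPrefix),
		}, nil
	default:
		return appArmorProfile{}, fmt.Errorf("unrecognized AppArmor annotation value %q", value)
	}
}

func main() {
	for _, v := range []string{"runtime/default", "localhost/my-profile", "unconfined"} {
		p, err := profileFromAnnotation(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> type=%s localhostProfile=%q\n", v, p.Type, p.LocalhostProfile)
	}
}
```

The resulting type and localhostProfile values are what would go under the container's securityContext.appArmorProfile in the manifest, in place of the annotation.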
@alculquicondor or @tenzen-y, is there anything you think we'd want in the repo from this research?
I guess the apiserver is rejecting the change? You should have seen some error logs in the Kueue manager for the pod reconciler.
Yes, that's the stacktrace I posted in the initial comment.
We don't support K8s v1.30 yet, but it might be worth documenting this restriction.
I think you should open an issue on kubernetes/kubernetes. The apiserver shouldn't be trying to change a Pod if it cannot be changed. I left a comment at kubernetes/kubernetes#123435 (comment), but @bh-tt had better follow up with a dedicated bug.
How is Kueue removing the scheduling gates? Is it doing a read into a typed Pod object, then a simple update back? If so, it is likely dropping the new AppArmor field, and that dropped field is what the apiserver sees as an attempt to mutate the Pod on update. Clients doing updates of Pod objects should either stay perfectly up to date with their API definitions so they never drop fields, or use a patch to modify just the field they want to touch and nothing else.
The OP is using
Opened #2056
1.29 or 1.30? The fields were added in 1.30.
We could upgrade to 1.30 by the next Kueue release, but we have some blockers for upgrading to 1.30: #2004. So maybe we should keep using v1.29.
Using 1.29 libs would be OK if #2056 is fixed.
Thank you for letting me know!
What happened:
We have a Pod with the kueue.x-k8s.io/queue-name label to be managed by Kueue. Kueue correctly picks up the Pod, adds the scheduling gate, the managed label and the finalizer. The workload resource admits the Pod, but then the schedulingGate is never removed. As a result the Pod remains pending forever.
What you expected to happen:
I expected Kueue to remove the scheduling gate so the Pod can start.
How to reproduce it (as minimally and precisely as possible):
Not completely sure, but it appears to be related to upgrading to k8s 1.30 (on our other clusters with 1.29.3, the same Kueue versions and config work fine). I'll try to get a reproducer set up.
I do have a stacktrace:
Anything else we need to know?:
For now we have disabled the pod framework controller. The batch/job controller seems unaffected, likely because (I think) it does not use scheduling gates.
Environment:
Kubernetes version (use kubectl version): v1.30.0
Kueue version (use git describe --tags --dirty --always): v0.6.2
OS (e.g. cat /etc/os-release): "Debian GNU/Linux 12 (bookworm)"
Kernel (e.g. uname -a): Linux 6.1.0-20-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.85-1 (2024-04-11) x86_64 GNU/Linux