dra: update for Kubernetes 1.28 #41856
Conversation
/sig node
content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md
Hello @pohly 👋 please take a look at Documenting for a release - PR Ready for Review to get your PR ready for review before Tuesday 25th July 2023. Thank you!
Several improvements under the hood don't need to be documented here. API changes (like storing generated resource claim names in the pod status) are part of the generated API documentation. What is worth mentioning, because it was listed as a "limitation" before, is that pre-scheduled pods are now supported better.
PR updated. Sorry, I was on vacation and had to catch up this week.
However, it is better to avoid this because a Pod which is assigned to a node
blocks normal resources (RAM, CPU) that then cannot be used for other Pods
while the Pod is stuck. To make a Pod run on a specific node while still going
through the normal scheduling flow, create the Pod with a node selector that
- through the normal scheduling flow, create the Pod with a node selector that
+ through the normal scheduling flow, create the Pod with a `nodeSelector` that
Not quite sure if this is correct, but you capitalize other API fields or use code style for them elsewhere. Thought it looked a bit off.
"a node selector" refers to the general concept here, not the specific field. That is then shown in the example. I think using plain English is fine here and also used elsewhere.
/cc
Looks ready for tech review. The suggestions here wouldn't block a merge.
blocks normal resources (RAM, CPU) that then cannot be used for other Pods
while the Pod is stuck. To make a Pod run on a specific node while still going
through the normal scheduling flow, create the Pod with a node selector that
matches exactly the desired node:
- matches exactly the desired node:
+ exactly matches the desired node:
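For readers following along: the doc text here leads into an example manifest, and a fuller version of that pattern might look like the sketch below. It assumes the Kubernetes 1.28 DRA API (`spec.resourceClaims` entries with a `source`); the Pod, container, image, claim, template, and node names are illustrative, not taken from the PR.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                  # illustrative name
spec:
  # Exact-match node selector: the Pod still goes through normal scheduling,
  # but can only be placed on the intended node.
  nodeSelector:
    kubernetes.io/hostname: name-of-the-intended-node
  containers:
  - name: app                        # illustrative
    image: registry.example/app:1.0  # illustrative
    resources:
      claims:
      - name: gpu                    # references the entry in spec.resourceClaims
  resourceClaims:
  - name: gpu
    source:
      resourceClaimTemplateName: example-claim-template  # illustrative template name
```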
detects this and tries to make the Pod runnable by triggering allocation and/or
reserving the required ResourceClaims.
However, it is better to avoid this because a Pod which is assigned to a node |
(nit)
- However, it is better to avoid this because a Pod which is assigned to a node
+ However, it is better to avoid this because a Pod that is assigned to a node
future.

## Pre-scheduled Pods
When creating a Pod with `nodeName` already set, the scheduler gets bypassed. |
- When creating a Pod with `nodeName` already set, the scheduler gets bypassed.
+ When you - or another API client - create a Pod with `.spec.nodeName` already set, the
+ scheduler gets bypassed.
```
nodeSelector:
  kubernetes.io/hostname: name-of-the-intended-node
...
```
Optionally:
You may also be able to mutate the incoming Pod, at admission time, to unset the `.spec.nodeName`
field and to use a node selector instead.
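To make that suggestion concrete: a mutating admission webhook could return a JSON Patch along these lines (a sketch only, shown as YAML; the webhook plumbing itself is not shown, and this assumes the node's `kubernetes.io/hostname` label matches its name).

```yaml
# JSON Patch a mutating webhook might return for a Pod that arrives with
# .spec.nodeName set: drop the pre-assigned node and replace it with an
# equivalent exact-match node selector so normal scheduling still happens.
- op: remove
  path: /spec/nodeName
- op: add
  path: /spec/nodeSelector
  value:
    # The webhook would copy this value from the incoming .spec.nodeName.
    kubernetes.io/hostname: name-of-the-intended-node
```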
When creating a Pod with `nodeName` already set, the scheduler gets bypassed.
If some ResourceClaim needed by that Pod does not exist yet, is not allocated
or not reserved for the Pod, then the kubelet will fail to run the Pod and
It would be great to add instructions on how to unstick it. I think simply deleting the Pod will do. The only thing: mention that if the Pod is part of a ReplicaSet, another instance may be created with the same issue.
There are numerous reasons for what might be wrong, so listing all possible remediations here is not feasible. The goal of this section is more about raising awareness of the problem ("don't do this!") and explaining that some automatic mitigation is now available through the kube-controller-manager changes.
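For context on the failure mode discussed in this thread, a pre-scheduled Pod of the problematic kind might look like the following sketch (Kubernetes 1.28 DRA API assumed; Pod, container, image, and claim names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pre-scheduled-pod              # illustrative name
spec:
  nodeName: name-of-the-intended-node  # scheduler is bypassed entirely
  containers:
  - name: app                          # illustrative
    image: registry.example/app:1.0    # illustrative
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    source:
      resourceClaimName: some-existing-claim  # illustrative; must already be allocated and reserved
```

If that ResourceClaim is missing, unallocated, or not reserved for this Pod, the kubelet cannot run it; per the doc text quoted above, the control plane now detects this and tries to make the Pod runnable by triggering allocation and/or reservation, but going through normal scheduling with a node selector remains the safer path.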
lgtm overall, thanks
Thanks /lgtm Fixup PRs welcome!
LGTM label has been added. Git tree hash: 9f280718ce19836c8db17404415c3fe8328bca25
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: SergeyKanzhelev, sftim. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
This is a follow-up to kubernetes#41856 with the suggested enhancements.
Darn, too slow! Sorry, I should have updated the PR yesterday. I was still catching up after my vacation. See #42445 for the follow-up.
dra: update for Kubernetes 1.28
Related-to: kubernetes/enhancements#3063
Fixes: #38841