Explain the cause of the TopologyAffinityError #96890
Hi @pablitoergosum. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: pablitoergosum
The full list of commands accepted by this bot can be found here.
I like this idea in general (= make TM's behaviour easier to understand), but it seems this PR needs to be rebased? Also, not sure why we have a merge commit here.
@@ -83,13 +89,15 @@ func filterProvidersHints(providersHints []map[string][]TopologyHint) [][]Topolo
    if len(hints[resource]) == 0 {
        klog.Infof("[topologymanager] Hint Provider has no possible NUMA affinities for resource '%s'", resource)
        allProviderHints = append(allProviderHints, []TopologyHint{{nil, false}})
        lackingResources = append(lackingResources, resource)
        err = fmt.Errorf("not enough '%s' to allocate a pod", strings.Join(lackingResources, ", "))
I'm not sure in general about this specific log, but this assignment can be done outside the loop, right before we return.
Thanks for diving into the PR! The idea behind it was to build a slice of all missing resources and return the whole slice with the TopologyAffinityError, which could be truly helpful for those not familiar with Scope and the TopologyManager itself.
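The two comments above (collect every lacking resource, but build the error once, outside the loop) can be sketched as follows. This is a minimal, self-contained illustration and not the kubelet code: `TopologyHint` is a simplified stand-in type, `findLackingResources` and `lackingResourcesError` are hypothetical helper names, and treating a nil hint slice as "no preference" is an assumption of this sketch. Only the error message format is taken from the diff.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// TopologyHint is a simplified stand-in for the kubelet type; the real
// type uses a bitmask for the NUMA affinity (assumption for this sketch).
type TopologyHint struct {
	NUMANodeAffinity []int
	Preferred        bool
}

// findLackingResources (hypothetical helper) collects the names of all
// resources for which a hint provider returned an empty, non-nil hint list.
func findLackingResources(providersHints []map[string][]TopologyHint) []string {
	var lacking []string
	for _, hints := range providersHints {
		for resource, resourceHints := range hints {
			if resourceHints == nil {
				// Assumed to mean "no preference", so not a missing resource.
				continue
			}
			if len(resourceHints) == 0 {
				lacking = append(lacking, resource)
			}
		}
	}
	// Map iteration order is random; sort for a stable message.
	sort.Strings(lacking)
	return lacking
}

// lackingResourcesError (hypothetical helper) builds the error once,
// after the loop has finished, or returns nil if nothing is lacking.
func lackingResourcesError(lacking []string) error {
	if len(lacking) == 0 {
		return nil
	}
	return fmt.Errorf("not enough '%s' to allocate a pod", strings.Join(lacking, ", "))
}

func main() {
	providersHints := []map[string][]TopologyHint{
		{"cpu": {}},
		{
			"example.com/device": {},
			"memory":             {{NUMANodeAffinity: []int{0}, Preferred: true}},
		},
	}
	fmt.Println(lackingResourcesError(findLackingResources(providersHints)))
}
```

Building the error after the loop avoids re-formatting the message on every iteration and makes it clear that the message always reflects the complete list.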
Hey, sorry for the long delay, I had other things to take care of.
Please rebase this PR, because the merge commits make it a bit too hard to review; for example, it was not obvious that you want to bubble the Topology Manager error up into the Topology Affinity Error.
I think the idea of making the topology manager behaviour more transparent is very nice and should totally be explored.
I was wondering if we want to make the actual error easier to consume, for example by leveraging the pod status conditions (https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/). This is just a proposal; I think we can discuss it a bit more.
Along with rebasing the PR, @pablitoergosum, could you also add a release note, as this is a user-facing change?
/kind cleanup
@pablitoergosum: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community.
Stale issues rot after 30d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR.
What type of PR is this?
What this PR does / why we need it:
It adds information to the error explaining why a TopologyAffinityError occurred.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: