DRA: generated resource claim names #117351

Merged

Conversation

pohly
Contributor

@pohly pohly commented Apr 14, 2023

What type of PR is this?

/kind cleanup
/kind api-change

What this PR does / why we need it:

One concern raised during the initial DRA reviews and again in #116254 (comment) was the risk of name conflicts between automatically and manually created ResourceClaims.

This PR addresses that by switching to generated names for automatically created ResourceClaims, with a pod.status extension to record that generated name for code where listing all ResourceClaims is impossible (kubelet) or more complex (node authorizer).

Which issue(s) this PR fixes:

Fixes #113722

Special notes for your reviewer:

Does this PR introduce a user-facing change?

ResourceClaims created from a ResourceClaimTemplate now get generated names. The base name is still `<pod>-<claim name>`, but a random suffix avoids name collisions.
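
As a rough illustration of the resulting name shape, here is a minimal sketch that builds `<pod>-<claim name>-<suffix>` locally. In a real cluster the suffix would normally come from the API server (Kubernetes offers `metadata.generateName` for this); the helper `generatedClaimName`, the suffix length, and the alphabet are assumptions made for this example and are not taken from the PR.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// suffixAlphabet is an assumption for this sketch: lowercase letters and
// digits that are valid in Kubernetes object names.
const suffixAlphabet = "bcdfghjklmnpqrstvwxz2456789"

// generatedClaimName is a hypothetical helper that returns
// "<pod>-<claim name>-<random suffix>". It only imitates the effect of a
// server-side generated name; it is not the controller's code.
func generatedClaimName(podName, podClaimName string) (string, error) {
	const suffixLen = 5 // assumed length, chosen for illustration
	buf := make([]byte, suffixLen)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generate suffix: %w", err)
	}
	// Map each random byte onto the alphabet.
	for i, b := range buf {
		buf[i] = suffixAlphabet[int(b)%len(suffixAlphabet)]
	}
	return fmt.Sprintf("%s-%s-%s", podName, podClaimName, buf), nil
}
```

Because the suffix is random, two pods or manually created claims that share the same base name no longer collide by construction, which is the concern this PR addresses.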

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- KEP: https://github.com/kubernetes/enhancements/issues/3063

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/code-generation area/kubelet sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 14, 2023
@bart0sh bart0sh added this to WIP in SIG Node PR Triage Apr 14, 2023
@fedebongio
Contributor

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Apr 20, 2023
Member

@thockin thockin left a comment

Also need to fix comments in type ClaimSource - "The name of the ResourceClaim will be..."

// the corresponding ResourceClaim.
type PodResourceClaimStatus struct {
// Name uniquely identifies this resource claim inside the pod.
// This must be a DNS_LABEL.
Member

And must match a defined resource claim?

Contributor Author

@pohly pohly Apr 26, 2023

Yes. I wonder whether that should replace "must be a DNS_LABEL" because that follows from "must be the name of an entry in pod.spec.resourceClaims". I'll use both for now.

Name string

// ResourceClaimName is the name of the ResourceClaim that was
// generated for the Pod.
Member

...in the same namespace as this pod.

?

Contributor Author

Right, let's call that out explicitly.

@pohly pohly force-pushed the dra-generated-resource-claim-names branch from d0c7f40 to 820fdc9 Compare May 3, 2023 10:58
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 3, 2023
@pohly pohly force-pushed the dra-generated-resource-claim-names branch from 820fdc9 to 67ac68e Compare May 3, 2023 12:04
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 3, 2023
@pohly pohly force-pushed the dra-generated-resource-claim-names branch from a8e4a55 to f408285 Compare July 7, 2023 19:47
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 7, 2023
@pohly pohly force-pushed the dra-generated-resource-claim-names branch from f408285 to ac5ddb8 Compare July 7, 2023 20:51
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2023
@pohly pohly force-pushed the dra-generated-resource-claim-names branch from ac5ddb8 to a7597e7 Compare July 8, 2023 10:38
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2023
Contributor

@soltysh soltysh left a comment

With the comments from Jordan addressed through the use of MutationCache, the controller changes lgtm.

SIG Node CI/Test Board automation moved this from Archive-it to PRs - Needs Approver Jul 10, 2023
-func Name(pod *v1.Pod, podClaim *v1.PodResourceClaim) string {
-	if podClaim.Source.ResourceClaimName != nil {
-		return *podClaim.Source.ResourceClaimName
+func Name(pod *v1.Pod, podClaim *v1.PodResourceClaim) (name *string, mustCheckOwner bool, err error) {
Contributor

How much overhead is it to just always check for the owner? It just feels awkward to have this extra boolean returned from here. Or better yet, why can't we just check for the owner in here and return an error if the pod passed in is not the owner whenever it "must" be checked.

klog.V(3).InfoS("Processing resource", "podClaim", podClaim.Name, "pod", pod.Name)
claimName, mustCheckOwner, err := resourceclaim.Name(pod, podClaim)
if err != nil {
	return err
Contributor

We should probably wrap this error to be consistent.

Contributor Author

Okay.

Comment on lines +77 to +80
if claimName == nil {
	// Nothing to do.
	continue
}
Contributor

When would there ever be no error, but claimName == nil? I would think that that would be an error condition from within resourceclaim.Name().

Contributor Author

That's the new feature: if creating a claim was intentionally skipped (nil in the claim name field), then all code dealing with the pod can skip the claim. resourceclaim.Name indicates that with a nil claimName. Returning a special error code also would work.

This was a use case that came up with Multi-Network. We don't use this with "normal" claims yet, but if we ever need it, all releases >= 1.28 will support it.

Contributor

If it's not an error to have a nil claim name, then it shouldn't be hidden behind an error. Maybe just adding a comment above the check as to why it might ever be nil is sufficient (at which point we can remove the comment of "Nothing to do" as that should be obvious from the new comment).

Contributor Author

Comment changed.
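
To make the outcome of this thread concrete, here is a minimal sketch of a caller that wraps the error from the name lookup and skips pod claims for which no ResourceClaim was generated. The `nameFunc` type only mirrors the shape of `resourceclaim.Name` with plain strings, and `claimsToProcess` is a hypothetical helper; neither is the code from the PR, and the ownership check is left out for brevity.

```go
package main

import "fmt"

// nameFunc mirrors the shape of resourceclaim.Name from this PR, reduced to
// plain strings so that the sketch stays self-contained (an assumption).
type nameFunc func(podName, podClaimName string) (claimName *string, mustCheckOwner bool, err error)

// claimsToProcess is a hypothetical helper that returns the names of the
// ResourceClaims a component has to handle for the given pod claims.
func claimsToProcess(podName string, podClaims []string, resolve nameFunc) ([]string, error) {
	var names []string
	for _, podClaim := range podClaims {
		claimName, _, err := resolve(podName, podClaim)
		if err != nil {
			// Wrap the error so that the caller sees which pod claim failed.
			return nil, fmt.Errorf("pod %q, claim %q: %w", podName, podClaim, err)
		}
		// The claim name might be nil if no underlying resource claim
		// was generated for the referenced claim. There are valid use
		// cases when this might happen, so we simply skip it.
		if claimName == nil {
			continue
		}
		names = append(names, *claimName)
	}
	return names, nil
}
```

Keeping the nil case out of the error path matches the conclusion above: a skipped claim is a valid state, so it is documented with a comment rather than hidden behind an error.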

Comment on lines -275 to -280
// The error usually has already enough context ("resourcevolumeclaim "myclaim" not found"),
// but we can do better for generic ephemeral inline volumes where that situation
// is normal directly after creating a pod.
if isEphemeral && apierrors.IsNotFound(err) {
	err = fmt.Errorf("waiting for dynamic resource controller to create the resourceclaim %q", claimName)
}
Contributor

I don't have a whole lot of context on why this was needed to begin with, but why is this not needed anymore? I guess it's all captured within the resourceclaim.Name() call now?

Contributor Author

Correct. It'll check for this based on the pod status field.

Contributor

Ack

@pohly pohly force-pushed the dra-generated-resource-claim-names branch from a7597e7 to aff8342 Compare July 11, 2023 11:29
Comment on lines 76 to +80

if claimName == nil {
	// Nothing to do.
	continue
}
Contributor

Can we add a comment such as the following here? Looks like it will need to be propagated to 3 places in total.

// The claim name might be nil if no underlying resource claim
// was generated for the referenced claim. There are valid use
// cases when this might happen, so we simply skip it.

Contributor Author

Changed - please take another look.

-func Name(pod *v1.Pod, podClaim *v1.PodResourceClaim) string {
-	if podClaim.Source.ResourceClaimName != nil {
-		return *podClaim.Source.ResourceClaimName
+func Name(pod *v1.Pod, podClaim *v1.PodResourceClaim) (name *string, mustCheckOwner bool, err error) {
Contributor

OK. I don't feel too strongly, so we can leave it as is.
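
For readers following the signature change, here is a minimal sketch of how a three-value `Name` could behave, based only on what is discussed in this conversation. The types are simplified stand-ins for the `v1` API types (the spec and status fields are flattened onto `Pod`), and the two error variables are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the v1 API types; only the fields needed here.
type ClaimSource struct {
	ResourceClaimName         *string
	ResourceClaimTemplateName *string
}

type PodResourceClaim struct {
	Name   string
	Source ClaimSource
}

type PodResourceClaimStatus struct {
	Name              string
	ResourceClaimName *string
}

type Pod struct {
	Name                  string
	ResourceClaimStatuses []PodResourceClaimStatus // pod status in the real API
}

// Hypothetical error values for this sketch.
var (
	ErrClaimNotCreated = errors.New("ResourceClaim not created yet")
	ErrNoKnownSource   = errors.New("no known field is set in the claim source")
)

// Name returns the ResourceClaim name for a pod claim. A nil name without an
// error means that creating a claim was intentionally skipped.
func Name(pod *Pod, podClaim *PodResourceClaim) (name *string, mustCheckOwner bool, err error) {
	switch {
	case podClaim.Source.ResourceClaimName != nil:
		// Manually created claim: referenced by name, not owned by the pod.
		return podClaim.Source.ResourceClaimName, false, nil
	case podClaim.Source.ResourceClaimTemplateName != nil:
		// Generated claim: the name is recorded in the pod status.
		for _, status := range pod.ResourceClaimStatuses {
			if status.Name == podClaim.Name {
				return status.ResourceClaimName, true, nil
			}
		}
		return nil, false, fmt.Errorf("pod %q, claim %q: %w", pod.Name, podClaim.Name, ErrClaimNotCreated)
	default:
		return nil, false, fmt.Errorf("pod %q, claim %q: %w", pod.Name, podClaim.Name, ErrNoKnownSource)
	}
}
```

In this sketch `mustCheckOwner` is true only for generated claims, which is one way to read the "must check ownership" case: the caller is expected to verify that the ResourceClaim it finds under that name is really owned by the pod.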

Generating the name avoids all potential name collisions. It's not clear how
much of a problem that was because users can avoid them and the deterministic
names for generic ephemeral volumes have not led to reports from users. But
using generated names is not too hard either.

What makes it relatively easy is that the new pod.status.resourceClaimStatuses
map stores the generated name for kubelet and node authorizer, i.e. the
information in the pod is sufficient to determine the name of the
ResourceClaim.

The resource claim controller becomes a bit more complex and now needs
permission to modify the pod status. The new failure scenario of "ResourceClaim
created, updating pod status fails" is handled with the help of a new special
"resource.kubernetes.io/pod-claim-name" annotation that, together with the
owner reference, identifies exactly which pod claim a ResourceClaim was
generated for, so updating the pod status can be retried for existing
ResourceClaims.

The transition from deterministic names is handled with a special case for that
recovery code path: a ResourceClaim with no annotation and a name that follows
the Kubernetes <= 1.27 naming pattern is assumed to be generated for that pod
claim and gets added to the pod status.
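
The recovery path described in the two paragraphs above can be sketched as follows. The `ResourceClaim` type is a simplified stand-in (the owner reference is reduced to a single UID field) and `findGeneratedClaim` is a hypothetical helper name; only the annotation key and the `<pod>-<claim name>` legacy pattern come from the text.

```go
package main

// podClaimNameAnnotation is the annotation named in the commit message.
const podClaimNameAnnotation = "resource.kubernetes.io/pod-claim-name"

// ResourceClaim is a simplified stand-in for the API type.
type ResourceClaim struct {
	Name        string
	OwnerUID    string // UID of the owning pod, reduced from metadata.ownerReferences
	Annotations map[string]string
}

// findGeneratedClaim returns the existing ResourceClaim that was generated
// for the given pod claim, or nil if there is none yet.
func findGeneratedClaim(claims []ResourceClaim, podName, podUID, podClaimName string) *ResourceClaim {
	for i := range claims {
		claim := &claims[i]
		if claim.OwnerUID != podUID {
			// Not owned by this pod, so it cannot have been generated for it.
			continue
		}
		if value, ok := claim.Annotations[podClaimNameAnnotation]; ok {
			if value == podClaimName {
				return claim
			}
			// Annotated for a different pod claim of the same pod.
			continue
		}
		// Transition case: no annotation and a name that follows the
		// deterministic Kubernetes <= 1.27 pattern.
		if claim.Name == podName+"-"+podClaimName {
			return claim
		}
	}
	return nil
}
```

A claim found this way can be written into the pod status on retry instead of creating a second ResourceClaim.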

There's no immediate need for it, but in case it becomes relevant, the name of
the generated ResourceClaim may also be left unset to record that no claim was
needed. Components processing such a pod can skip whatever they normally would
do for the claim. To ensure that they do and also cover other cases properly
("no known field is set", "must check ownership"), resourceclaim.Name gets
extended.

This is not something that normally happens, but the API supports it because it
might be needed at some point, so we have to test it.

This addresses the following bad sequence of events:
- controller creates ResourceClaim
- updating pod status fails
- pod gets retried before the informer receives
  the created ResourceClaim
- another ResourceClaim gets created

Storing the generated ResourceClaim in a MutationCache ensures that the
controller knows about it during the retry.

A positive side effect is that ResourceClaims now get indexed by pod owner and
thus iterating over existing ones becomes a bit more efficient.
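
The bad sequence above, and how remembering freshly created claims avoids it, can be sketched as below. This is not the client-go MutationCache: a plain map stands in for it, and `controller`, `claimKey`, `ensureClaim`, and `ErrStatusUpdateFailed` are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStatusUpdateFailed is a hypothetical error for simulating the
// "updating pod status fails" step.
var ErrStatusUpdateFailed = errors.New("pod status update failed")

// claimKey identifies a generated claim by owning pod and pod claim name.
type claimKey struct {
	podUID       string
	podClaimName string
}

type controller struct {
	informerView map[claimKey]string // possibly stale view from the informer
	created      map[claimKey]string // stand-in for the MutationCache
	createCalls  int
}

func newController() *controller {
	return &controller{
		informerView: map[claimKey]string{},
		created:      map[claimKey]string{},
	}
}

// lookup prefers claims the controller created itself over the informer view.
func (c *controller) lookup(key claimKey) (string, bool) {
	if name, ok := c.created[key]; ok {
		return name, true
	}
	name, ok := c.informerView[key]
	return name, ok
}

// ensureClaim creates the claim only if none is known, then records its name.
func (c *controller) ensureClaim(key claimKey, updateStatus func(claimName string) error) error {
	name, found := c.lookup(key)
	if !found {
		c.createCalls++
		// Stand-in for a generated name returned by the API server.
		name = fmt.Sprintf("%s-%d", key.podClaimName, c.createCalls)
		c.created[key] = name
	}
	if err := updateStatus(name); err != nil {
		return fmt.Errorf("update pod status for claim %q: %w", name, err)
	}
	return nil
}
```

If the first `ensureClaim` call fails in `updateStatus`, a retry before the informer catches up still finds the claim in `created`, so `createCalls` stays at one.
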
@pohly pohly force-pushed the dra-generated-resource-claim-names branch from aff8342 to fec2578 Compare July 11, 2023 12:24
@klueska
Contributor

klueska commented Jul 11, 2023

/lgtm
/approve

For kubelet changes

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: klueska, pohly, soltysh, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 11, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 71a041eaa079748cc524c5a3a0c5bf53fd73857e

@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@k8s-ci-robot k8s-ci-robot merged commit e0dafe5 into kubernetes:master Jul 11, 2023
15 checks passed
SIG Node CI/Test Board automation moved this from PRs - Needs Approver to Done Jul 11, 2023
SIG Node PR Triage automation moved this from Needs Reviewer to Done Jul 11, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jul 11, 2023
pohly added a commit to pohly/kubernetes that referenced this pull request Jul 13, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/code-generation area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

dynamic resource allocation: use generated name for ResourceClaim from template
10 participants