Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Attachment ID matching #106557

Closed

Conversation

jsafrane
Copy link
Member

When deciding if a string is VolumeAttachment ID or a volume unique name, make sure that volumes of a CSI driver named "csi-foo.bar" are recognized as unique volume names and not attachment IDs.

Use a regexp that allows only characters that are in sha256 (0-9,a-f), which is part of the attachment ID. Unique volume names always have "^" in their name and the regexp won't match them.

/kind bug

Which issue(s) this PR fixes:

Fixes #97230

Fixed detaching CSI volumes from nodes when a CSI driver name has prefix "csi-".

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 19, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2021
When deciding if a string is VolumeAttachment ID or a volume unique name,
make sure that volumes of a CSI driver named "csi-foo.bar" are recognized
as unique volume names and not attachment IDs.

Use a regexp that allows only characters that are in sha256 (0-9,a-f),
which is part of the attachment ID. Unique volume names always have "^"
in their name and the regexp won't match them.
@jsafrane
Copy link
Member Author

/priority important-soon
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 22, 2021
@@ -645,7 +649,7 @@ func getAttachmentName(volName, csiDriverName, nodeName string) string {
// and false otherwise
func isAttachmentName(unknownString string) bool {
// 68 == "csi-" + len(sha256hash)
return strings.HasPrefix(unknownString, "csi-") && len(unknownString) == 68
return len(unknownString) == 68 && attachmentNameRegexp.MatchString(unknownString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been wondering if we can get rid of this check altogether. Do you remember why was this check added? It sounds like it was added because we record intree volume names in node.Status.VolumeAttached field and when ASOW is rebuild during ADC restart, we try to detach using intree names?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment says it's related to CSI migration:

if isAttachmentName(volumeName) {
// Detach can also be called with the attach ID as the `volumeName`. This codepath is
// hit only when we have migrated an in-tree volume to CSI and the A/D Controller is shut
// down, the pod with the volume is deleted, and the A/D Controller starts back up in that
// order.
attachID = volumeName

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I did read the comment, but I am trying to understand condition under which we will hit this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked it with in-tree AWS EBS PV and in-line AWS EBS volume in a pod, both with CSI migration enabled and disabled, and I've never experienced attachment ID in volumeName.

Looking at the code, there was a corner case where attachment ID was used in the volume name, 5dde1df#diff-3e8f8a6e5eca57db8e84bcc6c896f0a3b10289fe55d8e999b4bdf1cfdc0de394R447-R452

But the code got removed later on in 129f153#diff-3e8f8a6e5eca57db8e84bcc6c896f0a3b10289fe55d8e999b4bdf1cfdc0de394L542-L546

It seems to me that volumeName is always unique volume ID in format plugin^unique ID. I think we could remove the whole isAttachmentName().

@ddebroy, you wrote the later commit referenced here, what do you think?

Copy link
Member Author

@jsafrane jsafrane Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, node.status.volumesAttached with an in-tree in-line AWS EBS volume migrated to CSI looks like a CSI volume:

  volumesAttached:
  - devicePath: ""
    name: kubernetes.io/csi/ebs.csi.aws.com^vol-07acc366093a6af02
  volumesInUse:
  - kubernetes.io/csi/ebs.csi.aws.com^vol-07acc366093a6af02

(today's master, almost Kubernetes 1.23)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! I have filed #106749 to ensure the corner case situation is handled by reintroducing 129f153#diff-3e8f8a6e5eca57db8e84bcc6c896f0a3b10289fe55d8e999b4bdf1cfdc0de394L510 which was incorrectly removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #107025 to remove the whole AttachID mathing, IMO it's not used anywhere.

@jsafrane
Copy link
Member Author

/hold
#107025 looks like a better solution

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Dec 15, 2021
@k8s-ci-robot
Copy link
Contributor

@jsafrane: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jsafrane
Copy link
Member Author

/close

@k8s-ci-robot
Copy link
Contributor

@jsafrane: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/bug Categorizes issue or PR as related to a bug. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Volumeattachments are not being removed on deleting pods
4 participants