Fix UnmountDevice with deleted pod. #64882

Merged: 1 commit, Jun 20, 2018

jsafrane
Member

@jsafrane jsafrane commented Jun 7, 2018

When a pod is deleted, kubelet can't read VolumeAttachment objects, so UnmountDevice cannot look up the volume. Kubelet should cache all the information it needs in a JSON file.

Fixes #63827

Work in progress: missing (unit?) tests

Release note:

NONE

@saad-ali @vladimirvivien @sbezverk
/sig storage

@k8s-ci-robot added the do-not-merge/work-in-progress, release-note-none, sig/storage, size/M, and cncf-cla: yes labels Jun 7, 2018
@k8s-ci-robot added the approved label Jun 7, 2018
@k8s-ci-robot added the size/L label and removed the size/M label Jun 8, 2018
@jsafrane
Member Author

jsafrane commented Jun 8, 2018

Unit tests added. The previous unit tests were not all running, because of the return in this block:

// Verify
if err != nil {
	if !tc.shouldFail {
		t.Errorf("test should not fail, but error occurred: %v", err)
	}
	return
}

I.e. the test loop stopped after the first tc.shouldFail case. I fixed the tests so that every case runs.
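
A minimal sketch of why the `return` cut the table-driven loop short, and how `continue` avoids it. The `testCase` type, `run`, and `countChecked` are hypothetical stand-ins for illustration; they are not taken from the PR, and the PR's actual fix may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// testCase is a hypothetical stand-in for the entries of the real test table.
type testCase struct {
	name       string
	shouldFail bool
}

// run simulates the operation under test: it fails exactly when the case expects a failure.
func run(tc testCase) error {
	if tc.shouldFail {
		return errors.New("simulated failure")
	}
	return nil
}

// countChecked reports how many cases the loop reaches.
// stopOnError=true mimics the original `return`; false uses `continue` instead.
func countChecked(cases []testCase, stopOnError bool) int {
	checked := 0
	for _, tc := range cases {
		checked++
		if err := run(tc); err != nil {
			if stopOnError {
				return checked // every later case is silently skipped
			}
			continue // go on to the next case
		}
	}
	return checked
}

func main() {
	cases := []testCase{
		{name: "ok"},
		{name: "expected failure", shouldFail: true},
		{name: "skipped when the loop returns"},
	}
	fmt.Println("with return:", countChecked(cases, true))
	fmt.Println("with continue:", countChecked(cases, false))
}
```

With these three cases, the `return` variant reaches only two of them, while the `continue` variant reaches all three.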

@jsafrane changed the title from "WIP: Fix UnmountDevice with deleted pod." to "Fix UnmountDevice with deleted pod." Jun 8, 2018
@k8s-ci-robot removed the do-not-merge/work-in-progress label Jun 8, 2018
@jsafrane
Member Author

/assign @saad-ali @vladimirvivien

@vladimirvivien
Member

/LGTM

@k8s-ci-robot added the lgtm label Jun 18, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, vladimirvivien

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@jsafrane
Member Author

I'll pick it into 1.10 and 1.11 when it gets merged (tomorrow?)

Member

@saad-ali saad-ali left a comment

A few questions, otherwise lgtm

@vladimirvivien can you please review this from the block device perspective.

/assign @vladimirvivien

CC @davidz627 who authored the DeviceMount code for CSI

@@ -269,6 +269,25 @@ func (c *csiAttacher) MountDevice(spec *volume.Spec, devicePath string, deviceMo
return err
}

dataDir := filepath.Dir(deviceMountPath)
Member

Won't this end up putting the file in the same dir that is mounted?
On the mount side, it saves the file in /blah/savefile.txt and puts the mount in /blah/mount/

Member Author

No, that's why there is filepath.Dir. It ends up in /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-938861e4756911e8/vol_data.json and the volume is mounted into /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-938861e4756911e8/globalmount.
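
A small sketch of the path arithmetic described above, using only the standard `path/filepath` package. The helper name `dataFilePath` is hypothetical; the file name and the example path are the ones quoted in this comment.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// volDataFileName is the metadata file name quoted in the comment above.
const volDataFileName = "vol_data.json"

// dataFilePath is a hypothetical helper: it places the metadata file next to
// the mount point (in its parent directory), not inside the mounted directory.
func dataFilePath(deviceMountPath string) string {
	return filepath.Join(filepath.Dir(deviceMountPath), volDataFileName)
}

func main() {
	mount := "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-938861e4756911e8/globalmount"
	fmt.Println(dataFilePath(mount))
}
```

On a Unix-like system this prints the `vol_data.json` path given above, a sibling of `globalmount` rather than a file underneath it.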


data := map[string]string{
volDataKey.volHandle: csiSource.VolumeHandle,
volDataKey.driverName: csiSource.Driver,
Member

The mount side saves a lot more:

		volDataKey.specVolID:    spec.Name(),
		volDataKey.volHandle:    pvSource.VolumeHandle,
		volDataKey.driverName:   pvSource.Driver,
		volDataKey.nodeName:     node,
		volDataKey.attachmentID: attachID,

Are we sure we don't need anything else from the spec?

Member Author

No, UnstageVolume needs just the volume ID (+ driver).

Member

@saad-ali the mount side saves more info because it's needed to reconstruct the VolumeAttachment ID in subsequent steps.

volDataKey.volHandle: csiSource.VolumeHandle,
volDataKey.driverName: csiSource.Driver,
}
if err := saveVolumeData(dataDir, volDataFileName, data); err != nil {
Member

@vladimirvivien: you recently moved saveVolumeData(...) from SetupAt(...) to NewMounter(...). Does something similar need to happen here?

Member

@saad-ali unfortunately, the way the Attacher API is written, no params are available during csi_plugin.go#NewAttacher/#NewDetacher that would allow the volume info to be persisted at that point, as is done with NewMounter/Unmounter. So the saving of the volume info has to happen during the MountDevice call.

}
return err
}

Member

If the MountDevice(...) operation fails below, how is the save file cleaned up?

Member Author

If MountDevice fails, then the whole directory is cleaned up by removeMountDir. But you raised a valid point: there are other error paths (e.g. failure to load secrets) that leave the JSON file behind. I'll fix it.

Member Author

Filed #65323 + PR

@saad-ali saad-ali added this to the v1.12 milestone Jun 20, 2018
@saad-ali saad-ali modified the milestones: v1.12, v1.11 Jun 20, 2018
@k8s-github-robot

[MILESTONENOTIFIER] Milestone Pull Request Labels Incomplete

@jsafrane @saad-ali @vladimirvivien

Action required: This pull request requires label changes. If the required changes are not made within 3 days, the pull request will be moved out of the v1.11 milestone.

kind: Must specify exactly one of kind/bug, kind/cleanup or kind/feature.
priority: Must specify exactly one of priority/critical-urgent, priority/important-longterm or priority/important-soon.

@jsafrane
Member Author

/retest

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 92dfcfc into kubernetes:master Jun 20, 2018
Member

@vladimirvivien vladimirvivien left a comment

Saad's dir clean up concerns should be addressed.

k8s-github-robot pushed a commit that referenced this pull request Jul 12, 2018
#65323-upstream-release-1.11

Automatic merge from submit-queue.

Automated cherry pick of #64882: Fix UnmountDevice with deleted pod. #65323: Fix cleanup of volume metadata json file.

Cherry pick of #64882 #65323 on release-1.11.

#64882: Fix UnmountDevice with deleted pod.
#65323: Fix cleanup of volume metadata json file.

```release-note
Fixed cleanup of CSI metadata files.
```
k8s-github-robot pushed a commit that referenced this pull request Jul 19, 2018
#65323-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #64882: Fix UnmountDevice with deleted pod. #65323: Fix cleanup of volume metadata json file.

Cherry pick of #64882 #65323 on release-1.10.

#64882: Fix UnmountDevice with deleted pod.
#65323: Fix cleanup of volume metadata json file.

```release-note
Fixed cleanup of CSI metadata files.
```
Labels: approved, cncf-cla: yes, lgtm, milestone/incomplete-labels, release-note-none, sig/storage, size/L
6 participants