Distinguish between volume path and mount path #74653
Conversation
@@ -917,7 +917,7 @@ func (oe *operationExecutor) CheckVolumeExistenceOperation(
 	if attachable != nil {
 		var isNotMount bool
 		var mountCheckErr error
-		if isNotMount, mountCheckErr = mounter.IsLikelyNotMountPoint(mountPath); mountCheckErr != nil {
+		if isNotMount, mountCheckErr = mounter.IsNotMountPoint(mountPath); mountCheckErr != nil {
I guess for safety, we should use mounter.IsNotMountPoint here, because the mount source may be a directory on the host (bind mount). (Reverted, because it checks for attachable volumes only.)
A very special case here: mount.IsNotMountPoint/IsLikelyNotMountPoint may fail because the mount point is broken (e.g. a stale NFS file handle). If this happens, reconstruction will fail and never recover, because reconstruction happens only once.
I'm not sure if this is expected. If it is not expected, maybe we need to use another way to check volume existence in reconstruction or don't check it at all?
Force-pushed from e85a1ca to 8940976
/test pull-kubernetes-e2e-gce-csi-serial
		if checkErr != nil {
			return nil, checkErr
		}
		// If mount or symlink doesn't exist, volume reconstruction should be failed
@jingxu97 can this check be removed? This means that the directory still exists and should be cleaned up.
Also this assumes that all volumes use mounts, which is not necessarily true (esp for ephemeral types).
For reference, this is the PR that introduced this check. Reading through the original bug and discussion, I'm not sure how this fix actually addresses the original issue... cc @chakri-nelluri
Ok @jingxu97 and I discussed and I understand now.
If the volume was partially torn down (i.e. the mount point is gone, but the directory still exists), then we don't want reconstruction to add the volume back to the ASW (because it's actually not mounted) and cause an empty directory to be mounted into the pod. Let's update the comment here to make that reason more obvious.
However, this brings up some other issues:
- What happens if the volume plugin doesn't actually use mounts? Maybe this isn't an issue for attachable plugins. I can't think of a real attachable plugin that is not mount-based.
- If there was a case of kubelet/node crash in the middle of cleanup, then we may not be able to properly cleanup the directories for CSI because we only try reconstruction once during kubelet initialization, when the CSI plugin may not be registered yet. This is also more of a corner case and we don't need to consider fixing it urgently for now.
@jingxu97 can this check be removed? This means that the directory still exists and should be cleaned up.
Also this assumes that all volumes use mounts, which is not necessarily true (esp for ephemeral types).
@msau42 This code is conditional and was only meant for plugins which support attach and need to be reattached and mounted on a node reboot.
- What happens if the volume plugin doesn't actually use mounts? Maybe this isn't an issue for attachable plugins. I can't think of a real attachable plugin that is not mount-based.
Block devices?
- If there was a case of kubelet/node crash in the middle of cleanup, then we may not be able to properly cleanup the directories for CSI because we only try reconstruction once during kubelet initialization, when the CSI plugin may not be registered yet. This is also more of a corner case and we don't need to consider fixing it urgently for now.
@msau42 This might show up if we restart kubelet to recover from any other issue.
This CheckVolumeExistence function does handle raw block, although I wonder if it's possible for the link to change across reboots? I think this is what the losetup is supposed to protect against. cc @wongma7
Only kubelet manipulates these links. Technically a link can change, but so can a mount device (we only check that a mount point exists, not whether it points to the right thing). losetup doesn't help here, as the failure scenario the losetup code is intended to prevent is something different: a volume at /dev/sda somehow gets detached, then another volume gets attached at /dev/sda, silently. The loop device holding onto /dev/sda is supposed to prevent the detach from happening at all. If there's a reboot, I don't believe the loop devices persist anyway.
We should test the reboot behavior for raw block separately to make sure we're not going to accidentally give the wrong device to container if the device names change on a reboot.
/lgtm
/assign
/approve
/assign @jingxu97
Jing is our reconstruction expert, let's make sure she takes a look at this.
@@ -516,6 +507,16 @@ func (rc *reconciler) reconstructVolume(volume podVolume) (*reconstructedVolume,
 			newMounterErr)
 	}

 	// Check existence of mount point for filesystem volume or symbolic link for block volume
Can you update the comment?
This comment describes how CheckVolumeExistenceOperation works internally to check volume existence. It's correct.
I think @saad-ali means to add a comment about the change? Instead of using volume.volumePath, we need to use volumeMounter.GetPath() because the mount point setup is different for different plugins. We have to be careful about this when adding/modifying code.
oh, sorry, because it's already merged, we can add it later.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cofyc, saad-ali. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/kind bug
…upstream-release-1.13 Automated cherry pick of #74653: Distinguish volume path with mount path
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
partially addresses #74650
Special notes for your reviewer:
Does this PR introduce a user-facing change?: