Distinguish between volume path and mount path #74653

Merged
29 changes: 15 additions & 14 deletions pkg/kubelet/volumemanager/reconciler/reconciler.go
@@ -341,7 +341,7 @@ func (rc *reconciler) StatesHasBeenSynced() bool {
type podVolume struct {
podName volumetypes.UniquePodName
volumeSpecName string
mountPath string
volumePath string
pluginName string
volumeMode v1.PersistentVolumeMode
}
@@ -477,7 +477,7 @@ func (rc *reconciler) reconstructVolume(volume podVolume) (*reconstructedVolume,
pod.UID,
volume.podName,
volume.volumeSpecName,
volume.mountPath,
volume.volumePath,
volume.pluginName)
if err != nil {
return nil, err
@@ -492,15 +492,6 @@ func (rc *reconciler) reconstructVolume(volume podVolume) (*reconstructedVolume,
} else {
uniqueVolumeName = util.GetUniqueVolumeNameFromSpecWithPod(volume.podName, plugin, volumeSpec)
}
// Check existence of mount point for filesystem volume or symbolic link for block volume
isExist, checkErr := rc.operationExecutor.CheckVolumeExistenceOperation(volumeSpec, volume.mountPath, volumeSpec.Name(), rc.mounter, uniqueVolumeName, volume.podName, pod.UID, attachablePlugin)
if checkErr != nil {
return nil, checkErr
}
// If mount or symlink doesn't exist, volume reconstruction should be failed
if !isExist {
return nil, fmt.Errorf("Volume: %q is not mounted", uniqueVolumeName)
}

volumeMounter, newMounterErr := plugin.NewMounter(
volumeSpec,
@@ -516,6 +507,16 @@ func (rc *reconciler) reconstructVolume(volume podVolume) (*reconstructedVolume,
newMounterErr)
}

// Check existence of mount point for filesystem volume or symbolic link for block volume
Member

Can you update the comment?

Member Author

This comment describes how CheckVolumeExistenceOperation works internally to check volume existence. It's correct.

Contributor

I think @saad-ali means to add a comment about the change? Instead of using volume.volumePath, we need to use volumeMounter.GetPath() because the mount point setup is different for different plugins. We have to be careful about this when adding/modifying code.
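A minimal standalone sketch of that distinction, assuming a CSI-style layout where the bind mount lives in a "mount" subdirectory of the volume directory; the paths and names below are illustrative only, not code from this PR:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	podUID := "1234-abcd"
	volName := "pvc-0001"

	// volumePath: the directory getVolumesFromPodDir discovers by walking
	// pods/{podUID}/volumes/{escapedPluginName}/{volumeName} on disk.
	volumePath := filepath.Join("/var/lib/kubelet/pods", podUID,
		"volumes", "kubernetes.io~csi", volName)

	// mountPath: what a CSI-style Mounter's GetPath() would report for the
	// same volume; the actual mount point is one level deeper.
	mountPath := filepath.Join(volumePath, "mount")

	fmt.Println("volumePath:", volumePath)
	fmt.Println("mountPath: ", mountPath)
}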

Member Author

Oh, sorry; since it's already merged, we can add it later.

isExist, checkErr := rc.operationExecutor.CheckVolumeExistenceOperation(volumeSpec, volumeMounter.GetPath(), volumeSpec.Name(), rc.mounter, uniqueVolumeName, volume.podName, pod.UID, attachablePlugin)
if checkErr != nil {
return nil, checkErr
}
// If mount or symlink doesn't exist, volume reconstruction should be failed
Member

@jingxu97 can this check be removed? This means that the directory still exists and should be cleaned up.

Also this assumes that all volumes use mounts, which is not necessarily true (esp for ephemeral types).

Member

For reference, this is the PR that introduced this check. Reading through the original bug and discussion, I'm not sure how this fix actually addresses the original issue... cc @chakri-nelluri

#51985

Contributor

@msau42 PR #51985 adds an extra check to make sure that the volume is mounted on a reboot, and returns a failure for volume reconstruction if it is not. Hope this clarifies the need for the check.

Member

OK, @jingxu97 and I discussed this and I understand now.

If the volume was partially torn down (i.e. the mount point is gone, but the directory still exists), then we don't want reconstruction to add the volume back to the ASW (because it's actually not mounted) and cause an empty directory to be mounted into the pod. Let's update the comment here to make that reason more obvious.

However, this brings up some other issues:

  • What happens if the volume plugin doesn't actually use mounts? Maybe this isn't an issue for attachable plugins. I can't think of a real attachable plugin that is not mount-based.
  • If there was a case of kubelet/node crash in the middle of cleanup, then we may not be able to properly cleanup the directories for CSI because we only try reconstruction once during kubelet initialization, when the CSI plugin may not be registered yet. This is also more of a corner case and we don't need to consider fixing it urgently for now.
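For intuition about the check being discussed, a self-contained Linux-only sketch (not the kubelet implementation, which goes through its mount utilities) of the usual device-ID heuristic for deciding whether a directory is still a mount point:

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// isLikelyMountPoint reports whether dir sits on a different device than its
// parent directory, the common heuristic for "something is mounted here".
// Bind mounts on the same filesystem are not caught by this simple check.
func isLikelyMountPoint(dir string) (bool, error) {
	st, err := os.Stat(dir)
	if err != nil {
		return false, err
	}
	parent, err := os.Stat(filepath.Dir(dir))
	if err != nil {
		return false, err
	}
	return st.Sys().(*syscall.Stat_t).Dev != parent.Sys().(*syscall.Stat_t).Dev, nil
}

func main() {
	mounted, err := isLikelyMountPoint("/proc")
	fmt.Println(mounted, err) // true <nil> on a typical Linux host
}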

Contributor

@jingxu97 can this check be removed? This means that the directory still exists and should be cleaned up.

Also this assumes that all volumes use mounts, which is not necessarily true (esp for ephemeral types).

@msau42 This code is conditional and was only meant for plugins which support attach and need to be reattached and mounted on a node reboot.

Contributor

  • What happens if the volume plugin doesn't actually use mounts? Maybe this isn't an issue for attachable plugins. I can't think of a real attachable plugin that is not mount-based.

Block devices?

Contributor

  • If there was a case of kubelet/node crash in the middle of cleanup, then we may not be able to properly cleanup the directories for CSI because we only try reconstruction once during kubelet initialization, when the CSI plugin may not be registered yet. This is also more of a corner case and we don't need to consider fixing it urgently for now.

@msau42 This might show up if we restart Kubelet to recover from any other issue.

Member

This CheckVolumeExistenceOperation function does handle raw block, although I wonder if it's possible for the link to change across reboots? I think this is what the losetup is supposed to protect against. cc @wongma7

Contributor

Only kubelet manipulates these links; technically a link can change, but so can a mount device (we only check that a mount point exists, not that it points to the right thing). (losetup doesn't help here, as the failure scenario the losetup code is intended to prevent is different: a volume at /dev/sda somehow gets detached, then another volume gets attached at /dev/sda, silently. The loop device holding onto /dev/sda is supposed to prevent that detach from happening at all. If there's a reboot, I don't believe the loop devices persist anyhow.)

Member

We should test the reboot behavior for raw block separately to make sure we're not going to accidentally give the wrong device to a container if the device names change on a reboot.

if !isExist {
return nil, fmt.Errorf("Volume: %q is not mounted", uniqueVolumeName)
}

// TODO: remove feature gate check after no longer needed
var volumeMapper volumepkg.BlockVolumeMapper
if utilfeature.DefaultFeatureGate.Enabled(features.BlockVolume) && volume.volumeMode == v1.PersistentVolumeBlock {
@@ -680,12 +681,12 @@ func getVolumesFromPodDir(podDir string) ([]podVolume, error) {
}
unescapePluginName := utilstrings.UnescapeQualifiedName(pluginName)
for _, volumeName := range volumePluginDirs {
mountPath := path.Join(volumePluginPath, volumeName)
klog.V(5).Infof("podName: %v, mount path from volume plugin directory: %v, ", podName, mountPath)
volumePath := path.Join(volumePluginPath, volumeName)
klog.V(5).Infof("podName: %v, volume path from volume plugin directory: %v, ", podName, volumePath)
volumes = append(volumes, podVolume{
podName: volumetypes.UniquePodName(podName),
volumeSpecName: volumeName,
mountPath: mountPath,
volumePath: volumePath,
pluginName: unescapePluginName,
volumeMode: volumeMode,
})
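The loop above builds one volumePath per discovered volume directory. A standalone sketch of the same walk, assuming the standard pods/{podUID}/volumes/{escapedPluginName}/{volumeName} layout (simplified: no volumeDevices handling and no plugin-name unescaping):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// listVolumePaths walks pods/{podUID}/volumes/{plugin}/{volume} and returns
// one path per volume directory found, the value handed to reconstruction.
func listVolumePaths(podDir string) ([]string, error) {
	var paths []string
	volumesDir := filepath.Join(podDir, "volumes")
	pluginDirs, err := os.ReadDir(volumesDir)
	if err != nil {
		return nil, err
	}
	for _, plugin := range pluginDirs {
		volDirs, err := os.ReadDir(filepath.Join(volumesDir, plugin.Name()))
		if err != nil {
			return nil, err
		}
		for _, vol := range volDirs {
			paths = append(paths, filepath.Join(volumesDir, plugin.Name(), vol.Name()))
		}
	}
	return paths, nil
}

func main() {
	paths, err := listVolumePaths("/var/lib/kubelet/pods/1234-abcd")
	fmt.Println(paths, err)
}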
6 changes: 3 additions & 3 deletions pkg/volume/plugins.go
@@ -139,10 +139,10 @@ type VolumePlugin interface {
NewUnmounter(name string, podUID types.UID) (Unmounter, error)

// ConstructVolumeSpec constructs a volume spec based on the given volume name
// and mountPath. The spec may have incomplete information due to limited
// and volumePath. The spec may have incomplete information due to limited
// information from input. This function is used by volume manager to reconstruct
// volume spec by reading the volume directories from disk
ConstructVolumeSpec(volumeName, mountPath string) (*Spec, error)
ConstructVolumeSpec(volumeName, volumePath string) (*Spec, error)

// SupportsMountOption returns true if volume plugins supports Mount options
// Specifying mount options in a volume plugin that doesn't support
@@ -275,7 +275,7 @@ type BlockVolumePlugin interface {
// The spec may have incomplete information due to limited information
// from input. This function is used by volume manager to reconstruct
// volume spec by reading the volume directories from disk.
ConstructBlockVolumeSpec(podUID types.UID, volumeName, mountPath string) (*Spec, error)
ConstructBlockVolumeSpec(podUID types.UID, volumeName, volumePath string) (*Spec, error)
}

// VolumeHost is an interface that plugins can use to access the kubelet.
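To illustrate the renamed parameter, a toy sketch with local stand-in types (not the real volume.Spec API) of a plugin reconstructing a spec from the volumePath it is handed; the path and names are hypothetical:

package main

import (
	"fmt"
	"path/filepath"
)

// spec is a local stand-in for volume.Spec; reconstruction can only recover
// what the on-disk directory layout encodes, so it may be incomplete.
type spec struct {
	VolumeName string
	VolumePath string
}

// constructVolumeSpec mirrors the shape of ConstructVolumeSpec(volumeName,
// volumePath): the last path element is the volume name kubelet recorded.
func constructVolumeSpec(volumeName, volumePath string) (*spec, error) {
	if filepath.Base(volumePath) != volumeName {
		return nil, fmt.Errorf("volumePath %q does not end in volume name %q",
			volumePath, volumeName)
	}
	return &spec{VolumeName: volumeName, VolumePath: volumePath}, nil
}

func main() {
	s, err := constructVolumeSpec("my-vol",
		"/var/lib/kubelet/pods/1234/volumes/kubernetes.io~example/my-vol")
	fmt.Printf("%+v %v\n", s, err)
}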
14 changes: 7 additions & 7 deletions pkg/volume/util/operationexecutor/operation_executor.go
@@ -145,7 +145,7 @@ type OperationExecutor interface {
// ExpandVolumeFSWithoutUnmounting will resize volume's file system to expected size without unmounting the volume.
ExpandVolumeFSWithoutUnmounting(volumeToMount VolumeToMount, actualStateOfWorld ActualStateOfWorldMounterUpdater) error
// ReconstructVolumeOperation construct a new volumeSpec and returns it created by plugin
ReconstructVolumeOperation(volumeMode v1.PersistentVolumeMode, plugin volume.VolumePlugin, mapperPlugin volume.BlockVolumePlugin, uid types.UID, podName volumetypes.UniquePodName, volumeSpecName string, mountPath string, pluginName string) (*volume.Spec, error)
ReconstructVolumeOperation(volumeMode v1.PersistentVolumeMode, plugin volume.VolumePlugin, mapperPlugin volume.BlockVolumePlugin, uid types.UID, podName volumetypes.UniquePodName, volumeSpecName string, volumePath string, pluginName string) (*volume.Spec, error)
// CheckVolumeExistenceOperation checks volume existence
CheckVolumeExistenceOperation(volumeSpec *volume.Spec, mountPath, volumeName string, mounter mount.Interface, uniqueVolumeName v1.UniqueVolumeName, podName volumetypes.UniquePodName, podUID types.UID, attachable volume.AttachableVolumePlugin) (bool, error)
}
@@ -862,14 +862,14 @@ func (oe *operationExecutor) ReconstructVolumeOperation(
uid types.UID,
podName volumetypes.UniquePodName,
volumeSpecName string,
mountPath string,
volumePath string,
pluginName string) (*volume.Spec, error) {

// Filesystem Volume case
if volumeMode == v1.PersistentVolumeFilesystem {
// Create volumeSpec from mount path
klog.V(5).Infof("Starting operationExecutor.ReconstructVolumepodName")
volumeSpec, err := plugin.ConstructVolumeSpec(volumeSpecName, mountPath)
volumeSpec, err := plugin.ConstructVolumeSpec(volumeSpecName, volumePath)
if err != nil {
return nil, err
}
podName,
uid)
}
// mountPath contains volumeName on the path. In the case of block volume, {volumeName} is symbolic link
// volumePath contains volumeName on the path. In the case of block volume, {volumeName} is symbolic link
// corresponding to raw block device.
// ex. mountPath: pods/{podUid}}/{DefaultKubeletVolumeDevicesDirName}/{escapeQualifiedPluginName}/{volumeName}
volumeSpec, err := mapperPlugin.ConstructBlockVolumeSpec(uid, volumeSpecName, mountPath)
// ex. volumePath: pods/{podUid}}/{DefaultKubeletVolumeDevicesDirName}/{escapeQualifiedPluginName}/{volumeName}
volumeSpec, err := mapperPlugin.ConstructBlockVolumeSpec(uid, volumeSpecName, volumePath)
if err != nil {
return nil, err
}
return volumeSpec, nil
}

// CheckVolumeExistenceOperation return a func() to check mount path directory if volume still exists
// CheckVolumeExistenceOperation checks mount path directory if volume still exists
func (oe *operationExecutor) CheckVolumeExistenceOperation(
volumeSpec *volume.Spec,
mountPath, volumeName string,
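For the block-volume branch described in the comment above, a self-contained sketch (layout assumed from the ex. path in that comment, not kubelet code) of checking for the {volumeName} symbolic link instead of a mount point:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// blockVolumeLinkExists checks for the symbolic link named after the volume
// under pods/{podUID}/volumeDevices/{escapedPluginName}/, which is what the
// block-volume case looks for instead of a mount point.
func blockVolumeLinkExists(podDir, escapedPluginName, volumeName string) (bool, error) {
	link := filepath.Join(podDir, "volumeDevices", escapedPluginName, volumeName)
	fi, err := os.Lstat(link)
	if os.IsNotExist(err) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return fi.Mode()&os.ModeSymlink != 0, nil
}

func main() {
	ok, err := blockVolumeLinkExists("/var/lib/kubelet/pods/1234-abcd",
		"kubernetes.io~example", "my-vol")
	fmt.Println(ok, err)
}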