Currently, volume plugins have a SetUp method which is called in the context of a higher-level
workflow within the kubelet which has externalized the problem of managing the ownership of volumes.
This design has a number of drawbacks that can be mitigated by completely internalizing all concerns
of volume setup behind the volume plugin SetUp method.
- Ownership management is currently applied repeatedly, which breaks packages that require special permissions in order to work correctly
- There is a gap between the time files are mounted or created by volume plugins and the time their ownership is set correctly, which leaves a window for race conditions
- Applying ownership management correctly in an externalized model is difficult, which makes it clear that the externalized design breaks a transaction boundary
Fully externalizing any one concern of volumes is difficult for a number of reasons:
- Many types of idempotence checks exist, and are used in a variety of combinations and orders
- Workflow in the kubelet becomes much more complex to handle:
  - composition of plugins
  - correct timing of application of ownership management
  - callback to volume plugins when we know the whole SetUp flow is complete and correct
  - callback to touch sentinel files
  - etc.
- We want to support fully external volume plugins, which would require complex orchestration or a chatty remote API in an externalized model
Since all of the ownership information is known in advance of the call to the volume plugin SetUp
method, we can easily internalize these concerns into the volume plugins and pass the ownership
information to SetUp.
The volume Builder interface's SetUp method changes to accept the group that should own the
volume. Plugins become responsible for ensuring that the correct group is applied. The volume
Attributes struct can be modified to remove the SupportsOwnershipManagement field.

    package volume

    type Builder interface {
        // other methods omitted

        // SetUp prepares and mounts/unpacks the volume to a self-determined
        // directory path and returns an error. The group ID that should own the volume
        // is passed as a parameter. Plugins may choose to ignore the group ID directive
        // in the event that they do not support it (example: NFS). A group ID of -1
        // indicates that the group ownership of the volume should not be modified by the plugin.
        //
        // SetUp will be called multiple times and should be idempotent.
        SetUp(gid int64) error
    }

Each volume plugin will have to change to support the new SetUp signature. The existing
ownership management code will be refactored into a library that volume plugins can use:

    package volume

    func ManageOwnership(path string, fsGroup int64) error {
        // 1. recursive chown of path
        // 2. make path +setgid
    }

The workflow from the Kubelet's perspective for handling volume setup and refresh becomes:

    // go-ish pseudocode
    func mountExternalVolumes(pod) error {
        // A group ID of -1 tells plugins not to modify group ownership.
        var fsGroup int64 = -1
        if pod.Spec.SecurityContext != nil &&
            pod.Spec.SecurityContext.FSGroup != nil {
            fsGroup = *pod.Spec.SecurityContext.FSGroup
        }

        for i := range pod.Spec.Volumes {
            volSpec := &pod.Spec.Volumes[i]

            // Try to use a plugin for this volume.
            spec := volume.NewSpecFromVolume(volSpec)
            builder, err := kl.newVolumeBuilderFromPlugins(spec, pod)
            if err != nil {
                return err
            }
            if builder == nil {
                return errUnsupportedVolumeType
            }

            // The plugin now handles ownership itself; propagate any failure.
            if err := builder.SetUp(fsGroup); err != nil {
                return err
            }
        }

        return nil
    }
