Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSI: Skip attach/detach for non-attachable volumes #2523

Merged
merged 3 commits into from Aug 27, 2018

Conversation

jsafrane
Copy link
Member

@jsafrane jsafrane commented Aug 13, 2018

In this proposal, we describe a way how to skip creation of VolumeAttachments for non-attachable CSI drivers.

Benefits:

  • Admins don't need to deploy external attacher for non-attachable CSI drivers. It was very confusing to them, they could not understand why they should run an attacher for non-attachable drivers. Now they won't need to do it.
  • Somewhat faster pod startup

This PR depends on CSIPlugin CSIDriver introduced in #2514

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 13, 2018
@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Aug 13, 2018
@jsafrane
Copy link
Member Author


In all cases listed above, a non-attachable CSI driver becomes "attachable" (i.e. requires external attacher). Downgrade does not affect attachable CSI drivers, both "old" and "new" Kubernetes processes them in the same way.

For non-attachable volumes, if the volume was mounted by "new" Kubernetes, it has no VolumeAttachment instance. "Old" A/D controller does not know about it. However, it will periodically call plugin's `VolumesAreAttached()` that checks for `VolumeAttachment` presence. Volumes without `VolumeAttachment` will be reported as not attached and A/D controller will call `Attach()` on these.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, the VolumeAttachment object will be created, but the CSI driver does not have external-attacher running, so we will be calling Attach() forever?

Or do we require that if you downgrade Kubernetes, you also downgrade the CSI driver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you downgrade Kubernetes (or delete CSIDriver instance or disable the alpha feature), you're required to run an external CSI attacher for non-attachable CSI drivers, just like before this proposal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note about external attacher.


* A/D controller will not create `VolumeAttachment` instances for non-attachable volumes. Etcd load will be reduced.

* On the other hand, all CSI volumes still must go though A/D controller. A/D controller **must** process every CSI volume and kubelet **must** wait until A/D controller marks a volume as attached, even if A/D controller basically does nothing. All CSI volumes must be added to `Node.Status.VolumesInUse` and `Node.Status.VolumesAttached`. This does not introduce any new API calls, all this is already implemented, however this proposal won't reduce `Node.Status` update frequency in any way. If *all* volumes will move to CSI eventually, pod startup will be slower than when using in-tree volume plugins that don't go through A/D controller.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Node object will also get much bigger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note about it.

@jsafrane
Copy link
Member Author

Fixed the comments. And renamed CSIPlugin to CSIDriver, following updates in #2514


### Kubelet / VolumeManager
* Kubelet must create a (shared) informer to watch CSIDriver and pass it to CSI volume plugin in [`ProbeVolumePlugins`](https://github.com/kubernetes/kubernetes/blob/8db5328c4c1f9467ab0d70ccb991a12d4675b6a7/cmd/kubelet/app/plugins.go#L101).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have to update RBAC policies so as kubelet can read that CR?

Copy link
Member

@liggitt liggitt Aug 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and NodeAuthorizer. if the access is read-only, updating the system:node role's rules would be sufficient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubelet needs only read-only access to CSIDrivers, IMO we don't need to update NodeAuthorizer.


In all cases listed above, an "attachable" CSI driver becomes non-attachable. Upgrade does not affect attachable CSI drivers, both "old" and "new" Kubernetes processes them in the same way.

For non-attachable volumes, if the volume was attached by "old" Kubernetes (or "new" Kubernetes before CSIDriver instance was created), it has `VolumeAttachment` instance. It will be deleted by `Detach()`, as it deletes `VolumeAttachment` instance also for non-attachable volumes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a problem when controller-manager is newer but kubelet is older? The controller-manager will not create VolumeAttachment object but kubelet will be stuck waiting for WaitForAttach ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admin should not enable the alpha feature in this case, I think. Can we solve it by documentation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about for beta/GA? The standard upgrade procedure is to update master first

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is the requirement to not update your drivers until both master+nodes are beyond some version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that whatever creates CSIDriver instances should create it only when both A/D controller and kubelet understand it. I.e. they're both at the right version and they have necessary alpha features enabled.

On downgrade, A/D controller gets updated first and it starts creating VolumeAttachment instances. It's up to the admin to start external attacher in this case, because CSI required it at that old version (or version with disabled alpha features).

Copy link
Member

@vladimirvivien vladimirvivien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few questions. Main concerns is complexity introduced to avoid racing conditions and up/down grades.

* In addition to the above, `Detach()` removes `VolumeAttachment` instance even if the volume is not attachable. This deletes `VolumeAttachment` instances created by old A/D controller or before `CSIDriver` instance was created.

### A/D controller
* A/D controller must pass a (shared) informer to watch CSIDriver and pass it to CSI volume plugin in [`ProbeVolumePlugins`](https://github.com/kubernetes/kubernetes/blob/8db5328c4c1f9467ab0d70ccb991a12d4675b6a7/cmd/kube-controller-manager/app/plugins.go#L82).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying to inject an informer from the A/D controller loop into the CSIPlugin.NewAttacher during plugin boot up? If so, that sounds a bit leaky where controller concerns making their way to the plugin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rephrase it, it does not need to be a/d controller, it can be kube-controller-manager itself. And I tried to add a parameter with the informer to ProbeVolumePlugins() and it's too early. kubelet does not have fully established API client at that time. So I'll either create a new non-shared informer in Init() of the plugin or add it to VolumeHost.

* A/D controller must pass a (shared) informer to watch CSIDriver and pass it to CSI volume plugin in [`ProbeVolumePlugins`](https://github.com/kubernetes/kubernetes/blob/8db5328c4c1f9467ab0d70ccb991a12d4675b6a7/cmd/kube-controller-manager/app/plugins.go#L82).

### Kubelet / VolumeManager
* Kubelet must create a (shared) informer to watch CSIDriver and pass it to CSI volume plugin in [`ProbeVolumePlugins`](https://github.com/kubernetes/kubernetes/blob/8db5328c4c1f9467ab0d70ccb991a12d4675b6a7/cmd/kubelet/app/plugins.go#L101).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the Kubelet is creating the informer and pass it to the CSI volume plugin. I am guessing the informer is created in a place like CSIVolumePlugin.Init() and made available to ProbeVolumePlugin. If so, that same informer could be made available for the CSIPlugin.{Attacher|Detacher}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the informer will be in csiPlugin struct, so it's available for Mounter and Attacher. I need to figure out who creates the informer, see above.

* If CSIDriver for the driver does not exist, perform usual logic (i.e. treat the volume as attachable).
* This keeps the behavior the same as in old Kubernetes version without CSIDriver object.
* This also happens when CSIDriver informer has not been quick enough. It is suggested that CSIDriver instance is created **before** any pod that uses corresponding CSI driver can run.
* In case that CSIDriver informer (or user) is too slow, CSI volume plugin `Attach()` will create `VolumeAttachment` instance and wait for (non-existing) external attacher to fulfill it. The CSI plugin shall recover when `CSIDriver` instance is created and skip attach. Any `VolumeAttachment` instance created here will be deleted on `Detach()`, see the next bullet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What situation would cause a driver not to have a CSIDriver object representing it? We should try not to have this racing and synchronization case. Resolution seems complex. Now I understand why @saad-ali was thinking of having a CSIDriver.Status.Ready flag.

Also, maybe the ideal thing to have is the CSI registrar create the CSIDriver, instead of Human, which would guarantee that CSIDriver object does not show up before CSI Driver is ready. Once feature is introduced, I think CSIDriver object should always be coupled with a running/deployed driver it describes.

Copy link
Member Author

@jsafrane jsafrane Aug 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saad-ali wants CSIDriver instances optional. So if admin decides that they don't want the instance for a driver, Kubernetes must work. Admin can create the instance at any later time and kubernetes must carry on. Admin can even delete the instance and kubernetes and all running pods must survive.

I'd like to have the instance mandatory and immutable (users can't switch driver from attachable to non-attachable), but that's perhaps unrealistic requirement.


## Alternatives considered
A/D controller and kubelet can be easily extended to check if a given volume is attachable. This would make mounting of non-attachable volumes easier, as kubelet would not need to wait for A/D controller to mark the volume as attached. However, there would be issues when upgrading or downgrading Kubernetes (or marking CSIDriver as attachable or non-attachable, which has basically the same handling).
* On upgrade (i.e. a previously attachable CSI volume becomes non-attachable), A/D controller could discover that an attached volume is not attachable any longer. A/D controller could clean up `Node.Status.VolumesAttached`, but since A/D controller does not know anything about `VolumeAttachment`, we would either need to introduce a new volume plugin call to clean it up in CSI volume plugin, or something else would need to clean it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a realistic scenario ? Can a driver that was attachable becomes non-attachable? That sounds like a misconfiguration. Again, maybe the feature should be gated during alpha/beta forcing users to buy into the CSIDriver registry thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a driver that was attachable becomes non-attachable?

Yes, admin creates / modifies CSIDriver for the driver.

Or admin enables alpha feature that gates the CSIDriver logic in Kubernetes. This one is particularly interesting, because there will be some (short?) time when the feature is enabled on some components and disabled on others.

## Alternatives considered
A/D controller and kubelet can be easily extended to check if a given volume is attachable. This would make mounting of non-attachable volumes easier, as kubelet would not need to wait for A/D controller to mark the volume as attached. However, there would be issues when upgrading or downgrading Kubernetes (or marking CSIDriver as attachable or non-attachable, which has basically the same handling).
* On upgrade (i.e. a previously attachable CSI volume becomes non-attachable), A/D controller could discover that an attached volume is not attachable any longer. A/D controller could clean up `Node.Status.VolumesAttached`, but since A/D controller does not know anything about `VolumeAttachment`, we would either need to introduce a new volume plugin call to clean it up in CSI volume plugin, or something else would need to clean it.
* On downgrade (i.e. a previously non-attachable CSI volume becomes attachable), kubelet must discover that already mounted volume has changed from non-attachable to attachable and put it into `Node.Status.VolumesInUse`. This would race with A/D controller detaching the volume when a pod was deleted at the same time a CSIDriver instance was made attachable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here, is that possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, admin deletes CSIDriver instance. Or changes it to "attachable". Or disables the alpha feature. Or downgrades Kubernetes to version that does not implement CSIDriver logic. From our point of view (A/D controller + kubelet) , a driver that was not attachable becomes attachable

* GA: 1.14

## Alternatives considered
A/D controller and kubelet can be easily extended to check if a given volume is attachable. This would make mounting of non-attachable volumes easier, as kubelet would not need to wait for A/D controller to mark the volume as attached. However, there would be issues when upgrading or downgrading Kubernetes (or marking CSIDriver as attachable or non-attachable, which has basically the same handling).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this issue is introduced as alpha and feature-gated, that should help avoid misconfigurations right? Meaning, this feature will only work when enabled along with a properly-configured CSIDriver object (and supporting code) is running.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sure we need to support downgrade to older Kubernetes that did not have this feature. And IMO we should also support disabling the feature in a running cluster.

@jsafrane
Copy link
Member Author

Added some clarification about informer creation and upgrade/downgrade in "Alternatives considered".

@jsafrane
Copy link
Member Author

Is there anything missing in this proposal? I'm working on proof-of-concept and it looks good so far. Creating / deleting CSIDriver instances while pod with non-attachable CSI volumes (surprisingly ;-) ) works.

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 27, 2018
@saad-ali
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 27, 2018
@k8s-ci-robot k8s-ci-robot merged commit 69686f9 into kubernetes:master Aug 27, 2018
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Sep 5, 2018
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

CSI: skip attach for non-attachable drivers

**What this PR does / why we need it**:
This is implementation of kubernetes/community#2523. CSI volumes that don't need attach/detach now don't need external attacher running.

WIP:
 * contains #67803 to get CSIDriver API. Ignore the first commit.
 * ~~missing e2e test~~

/sig storage

cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied @davidz627 

**Release note**:
```release-note
CSI volume plugin does not need external attacher for non-attachable CSI volumes.
```
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
CSI: Skip attach/detach for non-attachable volumes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants