Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSI: send pod information in NodePublishVolumeRequest #2439

Merged
merged 1 commit into from
Jan 29, 2019

Conversation

jsafrane
Copy link
Member

@jsafrane jsafrane commented Jul 31, 2018

This proposal adds possibility to send pod information in CSI NodePublishVolumeRequest call.

Fixes kubernetes/kubernetes#64984

It depends on #2514

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jul 31, 2018
@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 31, 2018
* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume.
* `csi.kubernetes.io/pod.namespace`: namespace of the pod that wants the volume.
* `csi.kubernetes.io/pod.uid`: uid of the pod that wants the volume.
* `csi.kubernetes.io/serviceAccount.name`: name of the service account under which the pod operates. Namespace is the same as `pod.namespace`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be the annotation names be like x.y.z/w1-w2-w3 ?

Example: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L98

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not annotation.

As I wrote, I follow names introduced in Flex, but I don't have strong opinion about the format. CSI itself does not describe how volume_attributes should look like.

Copy link

@kfox1111 kfox1111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, looks very good. :) A couple of things inline.

## Detailed design

### CSI enhancement
We don't need to change CSI protocol in any way. It allows kubelet to pass `pod.name`, `pod.uid` and `pod.spec.serviceAccountName` in [`NodePublish` call as `volume_attributes`]((https://github.com/container-storage-interface/spec/blob/master/spec.md#nodepublishvolume)). `NodePublish` is roughly equivalent to Flex `mount` call.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NodeServer has two sets of calls that I think are applicable. NodePublishVolume/Unpublish, but also NodeStageVolume/Unstage https://github.com/container-storage-interface/spec/blob/master/spec.md#nodestagevolume. Those calls also need the additional info to function properly.

Copy link
Member Author

@jsafrane jsafrane Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case several pods use the same volume on the same node, NodeStageVolume is called only once. It would include just the first Pod. NodePublishVolume is called separately for each pod that uses the volume and CSI driver can do the right thing there.

In addition, Flex driver did not get any pod in mountdevice call (which is equivalent of NodeStageVolume in CSI and nobody complained.

Flex did not provide pod information in unmount (=NodeUnpublishVolume) and nobody complained. Do you have concrete use case when pod can be useful in NodeUnpublishVolume?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, really? I thought stage was called once per pod, but mount was called per container mounting it? I guess I was mistaken. So I'll have to rewrite the image driver to have those semantics then. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsafrane For inline CSI volume, what would be the equivalent of flexvolume's options? Currently we can only pass in volumeAttributes to csi as part of a PersistentVolume deployment. Is there a way to support volumeAttributes as part of inline CSI volume as part of the pod spec?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9. CSI volume plugin stores this flag for the CSI driver.

When a volume is being mounted for a pod:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional csi stage call here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added NodeStageVolume to the example, with a note that it won't get any pod information.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Thanks. The clarity will help others writing drivers.


### CSI driver wants pod information in `NodePublish`
1. CSI driver author provides documentation and templates that the driver need pod information in `NodePublish`.
2. Admin installs CSI driver with CSI Driver Registrar `--requires-publish-pod`, e.g. by applying helm template from CSI diver author.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/diver/driver

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks

@jsafrane
Copy link
Member Author

We're considering using cluster-global API object to hold "configuration" of CSI driver in Kubernetes, i.e. how Kubernetes talks to the driver. It should contain also a flag if kubelet should pass pod name/namespace/uid in NodePublish. We won't need new gRPC API in kubelet this way.

The design is it flux right now, stay tuned.

@kfox1111
Copy link

hmm. cluster global info might allow for pod inline validation to happen earlier too. +1.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 15, 2018
@jsafrane
Copy link
Member Author

I reworked the proposal to use CSIDriver introduced in #2514. It's much shorter and easier now.

@jsafrane
Copy link
Member Author

And I kept the original commit there, just in case we need to revert to it.

Copy link

@kfox1111 kfox1111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. just a couple of typo's to fix.

## High-level design
We decided to pass the pod information as `NodePublishVolumeRequest.volume_attributes`.

* Kubernetes passes pod information only to CSI drivers that explicitly require that information in their [`CSIDriver` instance](https://github.com/kubernetes/community/pull/2523). These drivers are tightly coupled to Kubernetes and may not work or may require reconfiguration on other cloud orchestrators. It is expected (but not limited to) that these drivers will provider ephemeral volumes similar to Secrets or ConfigMap, extending Kubernetes secret or configuration sources.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo at 'provider ephemeral'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks!

We decided to pass the pod information as `NodePublishVolumeRequest.volume_attributes`.

* Kubernetes passes pod information only to CSI drivers that explicitly require that information in their [`CSIDriver` instance](https://github.com/kubernetes/community/pull/2523). These drivers are tightly coupled to Kubernetes and may not work or may require reconfiguration on other cloud orchestrators. It is expected (but not limited to) that these drivers will provider ephemeral volumes similar to Secrets or ConfigMap, extending Kubernetes secret or configuration sources.
* Kubernetes will not pass pod information to CSI drivers that don't know or don't care about pods and service accounts. It is expected (but not limited to) that these drivers will provide real persistent storage. Such CSI driver would reject a CSI call with pod information as invalid. This is current behavior of Kubernete and it will be the default behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubernete typo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link

@kfox1111 kfox1111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add one more that I think will be useful?
csi.kubernetes.io/volume.name: name of the volume as specified in the spec.

ex:

volumes:
  - name: foo
    csi:
      driver: image

it would be "foo" in this case.

@jsafrane
Copy link
Member Author

@kfox1111, do you have concrete use case for csi.kubernetes.io/volume.name? I am not against per se, but I'd like to know why can be this information useful.

* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume.
* `csi.kubernetes.io/pod.namespace`: namespace of the pod that wants the volume.
* `csi.kubernetes.io/pod.uid`: uid of the pod that wants the volume.
* `csi.kubernetes.io/serviceAccount.name`: name of the service account under which the pod operates. Namespace of the service account is the same as `pod.namespace`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strings above define sort of API between Kubernetes and kubernetes-aware CSI drivers. @liggitt, does it need any special approval? It will be set to stone when we implement this proposal.

Background: volume_attributes is generic map[string]string, without any prescribed format of keys in CSI standard. I used the above keys because similar ones are already used in flex volumes. Now is the time to change the keys if we don't like them.

@kfox1111
Copy link

At very least, volume.name + pod.id is a guaranteed unique identifier for the volume instance. So could be useful to drivers that are pod aware and want to track lifecycle? There may be other uses too. Haven't thought too much about it.

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other then one comment.

We don't need to change CSI protocol in any way. It allows kubelet to pass `pod.name`, `pod.uid` and `pod.spec.serviceAccountName` in [`NodePublish` call as `volume_attributes`]((https://github.com/container-storage-interface/spec/blob/master/spec.md#nodepublishvolume)). `NodePublish` is roughly equivalent to Flex `mount` call.

The only thing we need to do is to **define** names of the `volume_attributes` keys that CSI drivers can expect:
* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

csi.storage.kubernetes.io to align with CSIDriver group name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to csi.storage.k8s.io/*

@jsafrane
Copy link
Member Author

Renamed parameters to csi.storage.k8s.io/* and squashed.

@jsafrane jsafrane force-pushed the csi-pod-info branch 2 times, most recently from b97efb4 to 5068303 Compare August 28, 2018 12:42
@jsafrane
Copy link
Member Author

And added feature gate CSIPodInfo.

@jsafrane
Copy link
Member Author

Implementation: kubernetes/kubernetes#67945

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Sep 6, 2018
Automatic merge from submit-queue (batch tested with PRs 68171, 67945, 68233). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

CSI pod information in NodePublish

**What this PR does / why we need it**:
This is implementation of kubernetes/community#2439. It brings CSI closer to Flex volume features, CSI driver can (optionally) get name/namespace/UID/ServiceAccount of the pod that requires the CSI volume mounted. This allows CSI drivers to either do AAA before the volume is mounted or tailor content of  volume to the pod.

Work in progress:
  * contains #67803 to get `CSIDriver` API. Ignore the first commit.

Related to #64984 #66362 (fixes only part of these issues)

/sig storage

cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied 

**Release note**:

```release-note
CSI NodePublish call can optionally contain information about the pod that requested the CSI volume.
```
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2019
@saad-ali
Copy link
Member

This was implemented as alpha in k8s 1.12 (kubernetes/kubernetes#67945). Merging the doc so that it can be updated.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 29, 2019
@k8s-ci-robot k8s-ci-robot merged commit 7fadb45 into kubernetes:master Jan 29, 2019
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
CSI: send pod information in NodePublishVolumeRequest
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants