
Introduce credentials on create and publish calls and call NodePublishVolume MULTIPLE times #83

Merged
3 commits merged into container-storage-interface:master on Aug 30, 2017

Conversation

@saad-ali (Member) commented Aug 16, 2017

  1. Modify NodePublishVolume so that it is expected to be called multiple times even for the same volume/node combo to handle the case where a volume is being referenced by multiple workloads and each workload does not have equal access to the volume.
  2. Introduce the ability to pass through end user credentials for authentication/authorization on calls that create and publish volumes.
  3. Add wording requiring uniqueness of target_path in NodePublishVolume.

Closes #40 - NodePublish : Ability to call NodePublish multiple times on the same node
Partially addresses #17 - NodePublish – Secure access to shared mounted volumes
Closes #49 - NodePublishVolume target_path uniqueness requirement

CC @oritnm

csi.proto Outdated
// the arbitrary (possibly non-string) data value here.
// This information is sensitive and should be treated as such (not logged,
// etc.)
map<string, bytes> Data = 1;

How does the SP know how to decode these strings? Are you planning to provide a set of predefined strings (enums) such as username, secret?

Member

is it viable that an SP would advertise (via documentation) the credentials that it expects here? and then it's up to the CO (via operator or end-user specification) to include the appropriate details? if so then we should probably document the expectation more clearly

why the need for a standard enum?

Member Author

How does the SP know how to decode these strings? Are you planning to provide a set of predefined strings (enums) such as username, secret?

As @jdef said the SP would advertise (via documentation) the credentials that it expects here (same as the map<string, string> parameters on CreateVolume(...) requests).

is it viable that an SP would advertise (via documentation) the credentials that it expects here? and then it's up to the CO (via operator or end-user specification) to include the appropriate details? if so then we should probably document the expectation more clearly

Will add clarification.

why the need for a standard enum?

  1) Future-proofing: if we want to extend it at some point, having a separate message already defined will make it cleaner. 2) I've learned type aliasing is good practice; explicit type checking reduces bugs.

That said, I don't feel super strongly so I'd be willing to change it.

@jdef (Member) left a comment

In addition to comments made above:

Credentials are added in this PR for one half of some symmetric RPC pairs. For example, added to CreateVolume but not DeleteVolume; added to *PublishVolume but not *UnpublishVolume. If the v1 spec aims to support stateless plugins, shouldn't it be possible to pass Credentials to each RPC of a symmetric pair, e.g. to CreateVolume AND DeleteVolume? Otherwise the implication is that the plugin MAY need to cache credentials for some time -- and who knows, those credentials might have a shorter lifetime than that of a volume/workload.
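
To make the caching concern concrete, here is a minimal Go sketch of the state a plugin would have to keep if credentials arrived only on the create/publish half of a pair. All types are illustrative stand-ins, not generated CSI code:

```go
package main

import "sync"

// plugin sketches an SP that receives credentials only on CreateVolume and
// must therefore cache them itself to honor a later DeleteVolume.
type plugin struct {
	mu    sync.Mutex
	creds map[string]map[string]string // volume ID -> cached credentials
}

func newPlugin() *plugin {
	return &plugin{creds: make(map[string]map[string]string)}
}

func (p *plugin) rememberCreds(volumeID string, creds map[string]string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// This cache must live as long as the volume, and survive restarts if
	// DeleteVolume can arrive after a crash; exactly the state a stateless
	// plugin is supposed to avoid.
	p.creds[volumeID] = creds
}
```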

spec.md Outdated
@@ -558,6 +563,18 @@ message VolumeMetadata {
// each Plugin keeps this information as small as possible.
map<string, string> values = 1;
}

// A standard way to encode credential data. The total bytes of the values in
// the Data field must be less than MaxSecretSize bytes.
Member

what is the definition of MaxSecretSize?

Member Author

Wanted to define a const, but protobuf doesn't let you do that. So changing to a comment.

Member Author

Done.

spec.md Outdated
// the arbitrary (possibly non-string) data value here.
// This information is sensitive and should be treated as such (not logged,
// etc.)
map<string, bytes> Data = 1;
Member

explicitly declare whether this field is OPTIONAL or REQUIRED

Member Author

Done.

spec.md Outdated
message Credentials {
// Data contains the credential data, for example username and password. Each
// key must consist of alphanumeric characters, '-', '_' or '.'. The
// serialized form of the secret data is a base64 encoded string, representing
Member

the value type of a map entry is bytes -- why the need for any encoding?

Member Author

Ah, right. I am going to change bytes to string. That way string data can be passed through without worrying about character encoding, and binary data can be passed in encoded as base64 or quoted-printable.

Member Author

Done.


@saad-ali (Member Author)

In addition to comments made above:

Credentials are added in this PR for one half of some symmetric RPC pairs. For example, added to CreateVolume but not DeleteVolume; added to *PublishVolume but not *UnpublishVolume. If the v1 spec aims to support stateless plugins, shouldn't it be possible to pass Credentials to each RPC of a symmetric pair, e.g. to CreateVolume AND DeleteVolume? Otherwise the implication is that the plugin MAY need to cache credentials for some time -- and who knows, those credentials might have a shorter lifetime than that of a volume/workload.

Yep, I was debating this. Much more likely is that the CO loses the credentials (the user deletes the secret before the volume is detached/unmounted), so I'd prefer the SP cache the credentials so that cleanup operations can proceed even if the CO loses the credential information. I can see it both ways. If you feel strongly, I'm ok changing it.

@saad-ali (Member Author)

Feedback addressed. PTAL.

I added another commit with my new changes, to make it easier to code review. Once the PR is approved, I will squash these commits down before merge.

@jdef (Member) left a comment

Much more likely is that the CO loses the credentials (the user deletes the secret before the volume is detached/unmounted), so I'd prefer the SP cache the credentials so that cleanup operations can proceed even if the CO loses the credential information

OK, makes sense. I'm fine keeping this as-is for now. We can always revisit later if it becomes a big problem.

Can we call out this strategy in the docs for clarity? Something about the fact that the SP is expected to cache the credentials if they're needed for the symmetric unpublish/delete RPC.

spec.md Outdated
The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
This RPC may be called by the CO multiple times: at least once per volume per workload
(in particular, if the same volume is referenced by two different workloads, `NodePublishVolume` will be called twice, once for each workload, with possibly different `target_path` and/or auth credentials).
Member

nit: according to our style guide, this whole sentence should be one line?

Member Author

Will fix. I wasn't sure whether to count the giant parenthesised statement as a separate line or not. But strictly speaking, you're right, it's one big sentence.

Member Author

Done.

spec.md Outdated
The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
This RPC may be called by the CO multiple times: at least once per volume per workload
Member

I am not sure we want to dictate that in the spec and call out "workload" specifically. Or we can just say: the CO MAY call NodePublishVolume multiple times on the node with different target_path. It's up to the CO to decide whether to call this RPC multiple times or not.

Also, I think we need to call out the reverse workflow in the spec. More specifically, when should ControllerUnpublishVolume be called if NodeUnpublishVolume is called multiple times? One idea is to say: ControllerUnpublishVolume MUST be called after NodeUnpublishVolume is called for each target_path and succeeds.

Member Author

I am not sure we want to dictate that in the spec and call out "workload" specifically. Or we can just say: the CO MAY call NodePublishVolume multiple times on the node with different target_path. It's up to the CO to decide whether to call this RPC multiple times or not.

Good point. Will change the wording.

Also, I think we need to call out the reverse workflow in the spec. More specifically, when should ControllerUnpublishVolume be called if NodeUnpublishVolume is called multiple times? One idea is to say: ControllerUnpublishVolume MUST be called after NodeUnpublishVolume is called for each target_path and succeeds.

Ah right. Will add.

Member Author

I am not sure we want to dictate that in the spec and call out "workload" specifically. Or we can just say: the CO MAY call NodePublishVolume multiple times on the node with different target_path. It's up to the CO to decide whether to call this RPC multiple times or not.

Actually, now that I think about it more, I think it should be a requirement on the CO that it calls this once per workload per volume, so that the SP can design against that assumption. Saying that it MAY be called multiple times with different target_path leaves it up to the CO to decide and the SP to handle both cases. Let me know what you think.
If you agree, I'll change it to:

This RPC MUST be called by the CO at least once per volume per workload (in particular, if the same volume is referenced by two different workloads, `NodePublishVolume` MUST be called twice, once for each workload).
Therefore, the SP SHOULD handle multiple `NodePublishVolume` calls for the same volume with possibly different `target_path` and/or auth credentials.

Member

Hi @saad-ali, "workload" might mean different things on different COs. For instance, it might be a container for Docker, and a pod for Kubernetes. For Mesos, I don't even know what "workload" will mean, because it supports both pod-style and container-style workloads. If we want to call out "workload", we must clearly define what it is in the spec.

I think the motivation for this change is to allow the CO to publish a volume on a node in different ways, instead of just one way; for instance, mounting a FUSE filesystem with different options. I think if we say in the spec that the CO MAY call this multiple times, that should suffice to force the SP to be written in such a way that it can handle multiple calls to NodePublishVolume.

We should call out different target_path, I think, because it would be super weird to me for the CO to call NodePublishVolume multiple times with the same target_path but with different parameters.

Member Author

Hi @saad-ali, "workload" might mean different things on different COs. For instance, it might be a container for Docker, and a pod for Kubernetes. For Mesos, I don't even know what "workload" will mean, because it supports both pod-style and container-style workloads. If we want to call out "workload", we must clearly define what it is in the spec.

That makes sense. But we need to do that regardless, since the spec already contains many instances of "workload", for example:

  • Node - A host where the user workload will be running
  • ControllerPublishVolume - This RPC will be called by the CO when it wants to place a workload that uses the volume onto a node
  • ControllerUnpublishVolume - This RPC is typically called by the CO when the workload using the volume is being moved to a different node, or all the workload using the volume on a node has finished.
  • ProbeNode - The CO SHOULD call this RPC for the node at which it wants to place the workload.
  • etc.

I'll add a definition.

I think the motivation for this change is to allow the CO to publish a volume on a node in different ways, instead of just one way; for instance, mounting a FUSE filesystem with different options. I think if we say in the spec that the CO MAY call this multiple times, that should suffice to force the SP to be written in such a way that it can handle multiple calls to NodePublishVolume.

If an SP knows that it will be called every time a workload is scheduled, it may be able to make simplifying assumptions. That said, I can't concretely state what those are, so I'm fine with dropping the CO requirement. I'll change it to:

The SP MUST handle `NodePublishVolume` being called multiple times for the same node for the same volume but with possibly different `target_path` and/or auth credentials.
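
A minimal Go sketch of what that requirement implies on the SP side, tracking published target paths per volume (illustrative types, not generated CSI code):

```go
package main

import "sync"

// nodeServer sketches an SP that tolerates NodePublishVolume being called
// several times for the same volume on the same node, each call with its
// own target_path (and possibly different credentials).
type nodeServer struct {
	mu        sync.Mutex
	published map[string]map[string]bool // volume ID -> set of target_paths
}

func (s *nodeServer) nodePublish(volumeID, targetPath string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.published == nil {
		s.published = make(map[string]map[string]bool)
	}
	paths := s.published[volumeID]
	if paths == nil {
		paths = make(map[string]bool)
		s.published[volumeID] = paths
	}
	if paths[targetPath] {
		return nil // same volume, same path: treat the repeat call as idempotent
	}
	// ... mount/attach work specific to this target_path would go here ...
	paths[targetPath] = true
	return nil
}
```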

Member Author

Done.

@akutz (Contributor) commented Aug 16, 2017

Hi,

If we're introducing the notion of credentials at all, I worry about assuming when they're needed. Would it perhaps make sense to relocate the credentials to gRPC metadata (like an HTTP header) so that the credentials could be made available for all RPCs that require some form of elevated privilege?

Here is an example of where I extract gRPC metadata on the server, as well as where I inject it from the client.

The same transport mechanism could easily be used to pass the credential information.
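
The linked example isn't reproduced here, but a minimal Go sketch of the idea, using the real google.golang.org/grpc/metadata package, might look like this (the metadata keys are hypothetical):

```go
package main

import (
	"context"

	"google.golang.org/grpc/metadata"
)

// Client side: attach credentials to the outgoing context so every RPC
// made with this context carries them as metadata (like HTTP headers).
func withCreds(ctx context.Context, user, pass string) context.Context {
	return metadata.AppendToOutgoingContext(ctx, "csi.user", user, "csi.pass", pass)
}

// Server side: any RPC handler can extract the same metadata.
func credsFromContext(ctx context.Context) (user, pass string, ok bool) {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return "", "", false
	}
	u, p := md.Get("csi.user"), md.Get("csi.pass")
	if len(u) == 0 || len(p) == 0 {
		return "", "", false
	}
	return u[0], p[0], true
}
```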

@saad-ali (Member Author)

Hi,

If we're introducing the notion of credentials at all, I worry about assuming when they're needed. Would it perhaps make sense to relocate the credentials to gRPC metadata (like an HTTP header) so that the credentials could be made available for all RPCs that require some form of elevated privilege?

Here is an example of where I extract gRPC metadata on the server, as well as where I inject it from the client.

The same transport mechanism could easily be used to pass the credential information.

Thanks for the pointer @akutz. I sympathise with your concern about assuming when credentials are needed, but I don't think passing that information over the transport layer would solve that problem, since you'd still need to figure out which calls actually require it. And I'd really like to avoid splitting the information required for CSI between the proto and the transport layer. As for figuring out which calls require credentials, that must be advertised in the SP's documentation. We could consider adding a discovery mechanism (possibly extending capabilities to advertise which calls require credentials), but I think that is unnecessary.

csi.proto Outdated
// This information is sensitive and MUST be treated as such (not logged,
// etc.) by the CO.
// This field is REQUIRED.
map<string, string> Data = 1;
Member

I'd prefer we use map<string, bytes> here and not even mention base64. By default, the bytes type in protobuf maps to a base64-encoded string in the JSON representation (https://developers.google.com/protocol-buffers/docs/proto3#json).

Using map<string, string> here will force all COs to do base64 encoding to convert a binary secret to a string. If the SP expects the binary format anyway, it'll need to do a base64 decoding. I feel that this extra complexity is unnecessary given that protobuf already handles it.

Member Author

map<string, bytes> was what I proposed initially, but I realized that it makes the normal use case harder.

Using map<string, string> here will force all COs to do base64 encoding to convert a binary secret to a string

To be very clear, map<string, string> requires no extra work (base64 encoding) when passing string values. Only if the value contains non-text (i.e. non-string, binary) data will the data need to be base64 encoded.

map<string, bytes>, on the other hand, makes passing string data more complicated: to pass a string via a byte array you must always specify the character encoding (US-ASCII, UTF-8, etc.). Either this character encoding would be published as part of the documentation and clients would have to handle it, or we'd have to create another field to allow it to be set dynamically. Either way, passing string data becomes more complicated, whereas with the alternative, map<string, string>, only binary data needs to be special-cased (documentation indicates that the value for some key is binary and must be base64 encoded).

Because the majority of cases will likely want to pass in a simple username/password string, I strongly prefer optimizing for that use case rather than for the binary-blob use case.


Only if the value contains non-text data will the data need to be base64 encoded.

How will the CO know if the value is base64-encoded binary or regular UTF-8 encoded text? Do you perhaps intend that for a given key (e.g., "password"), the value must always be UTF-8 encoded text?

I prefer encoding text to UTF-8 on its way to another process over encoding binary to base64.

There is always map<string, oneof[string, bytes]> (pseudocode).

Member Author

How will the CO know if the value is base64-encoded binary or regular UTF-8 encoded text?

The CO will not need to know. As far as the CO is concerned, it needs to populate a string field. Only the SP and the end user/cluster admin will need to be aware that binary data needs to be passed in for a particular key; as part of its documentation the SP will specify which binary-to-text encoding to use (base64 or quoted-printable), and when handing the secret over to the CO the end user/cluster admin will ensure the correct encoding is applied so the CO gets string data.

Do you perhaps intend that for a given key (e.g., "password"), the value must always be UTF-8 encoded text?

Nope. UTF-8 is a character encoding; if we operate directly on strings instead of byte arrays, senders don't have to worry about the character encoding, since the underlying protocol, gRPC, takes care of that (serializing/deserializing the string over the wire).

I prefer encoding text to UTF-8 on its way to another process over encoding binary to base64.

Because in the majority of cases this will likely be used to transfer string data (username, password, etc.), we should keep that simple and not require callers to navigate encoding (make it dead simple: you have a string, give me a string, and don't worry about anything else, since gRPC takes care of transport).
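
A small Go sketch of the convention being argued for: text values pass straight through, and only a value the SP documents as binary is base64-encoded before it reaches the CO (the key names are made-up examples):

```go
package main

import "encoding/base64"

// buildCredentialData assembles the map<string, string> payload. Text
// credentials need no encoding; gRPC handles string serialization.
func buildCredentialData(user, pass string, rawToken []byte) map[string]string {
	return map[string]string{
		"username": user,
		"password": pass,
		// Hypothetical binary value: the SP's docs would state that
		// "session-token" must be base64-encoded by the caller.
		"session-token": base64.StdEncoding.EncodeToString(rawToken),
	}
}
```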

spec.md Outdated
@@ -565,15 +565,18 @@ message VolumeMetadata {
}

// A standard way to encode credential data. The total bytes of the values in
// the Data field must be less than MaxSecretSize bytes.
// the Data field must be less than 1 Mebibyte.
Member

This looks like a very randomly picked limit :) I don't have a strong feeling on this. I think having a restriction is better than not having one; removing a restriction is always easier than adding one later.

Just for the sake of consistency, should we impose a restriction on all the fields in this spec?

Member

I don't think we need to address that in this PR. Maybe something to follow up on.

Member Author

Just for the sake of consistency, should we impose a restriction on all the fields in this spec?

Good idea. I might add another commit to this PR; otherwise I'll just open an issue to follow up on it later.

Member Author

Opened #87 to track.

csi.proto Outdated
// End user credentials used to authenticate/authorize volume creation
// request.
// This field is OPTIONAL.
Credentials userCredentials = 6;
Member

Let's follow the protobuf style guide and use snake_case for field names:
https://developers.google.com/protocol-buffers/docs/style

Member Author

Will do

Member Author

Done.

csi.proto Outdated
// This information is sensitive and MUST be treated as such (not logged,
// etc.) by the CO.
// This field is REQUIRED.
map<string, string> Data = 1;
Member

Please follow the protobuf style guide and use lowercase snake_case for field names.

Member Author

Will do

Member Author

Done.

csi.proto Outdated
// This information is sensitive and MUST be treated as such (not logged,
// etc.) by the CO.
// This field is REQUIRED.
map<string, string> Data = 1;

I'm not clear on how the CO will know what credentials to pass to a given plugin.

Some plugins will require username/password, others a JWT, and others nothing at all.

I'd prefer a set of official strategies like UsernamePassword, Username, NoAuth, GenericSecret, etc., and then have the plugin respond to the CO's initial Get*Capabilities() request with its preferred authentication mechanism.

That way the CO only needs to support a small set of authentication strategies that are clearly specified by this spec.

This lets COs integrate with new plugins without changing code or leaking SP-specific details through the CSI plugin or around it (through SP-specific documentation around authentication).
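
A Go sketch of what that strategy set could look like; none of this is in the PR, and the names come straight from the comment:

```go
package main

// AuthStrategy sketches a small, spec-defined set of authentication
// strategies a plugin could advertise via its Get*Capabilities() response.
// These names do not exist in the actual spec; they mirror the comment.
type AuthStrategy int

const (
	NoAuth           AuthStrategy = iota // plugin needs no credentials
	Username                             // a bare username
	UsernamePassword                     // username plus password
	GenericSecret                        // an opaque secret blob
)
```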


One of the best things about the CSI is that the CO doesn't need to write special code for every SP.

Having to read the SP documentation to figure out what keys are expected and how to format the values seems counterproductive.


I added these comments to the issue as that's a more appropriate forum for such discussion.
#17 (comment)

Member Author

Sorry for the back and forth. Let's keep the conversation attached to the PR.

Copying over comments from #17 (comment):

@cpuguy83 commented 6 hours ago
I think we can use a google.protobuf.Any (https://github.com/google/protobuf/blob/master/src/google/protobuf/any.proto#L112): we can pass around both the type information and the details and leave it opaque to the CO.

gpaul commented 5 hours ago • edited
@cpuguy83 Interesting concept, I've never seen that before. That may be useful, although (if I take your meaning) it would appear to require a working internet connection from the cluster to the SP - which, given that the actual data resides at the SP, sounds quite reasonable to me. But still, how will the CO know how to populate whichever fields are declared in the response?

I'm not clear how authentication can be opaque to the CO. It seems that fundamentally the CO needs to get details about the user to the SP through the CSI plugin?

Depending on which SP, the exact details will be different.

I imagine a CO cluster configured with two SPs (SP-1 and SP-2).

Both SPs require the end-user to be authenticated to determine whether the volume may be published RW vs. RO vs. not at all as well as for auditing that user's actions performed on the data once published.

SP-1 requires that the end-user authenticate using a username+password combination. This information is likely stored as a secret by the CO and should accompany any Create/Destroy/*Publish/etc. RPCs to the CSI plugin for SP-1. Perhaps there's some mapping between secret name and SP credential which lets the CO determine which secret is the SP-1 username and which is the SP-1 password?

SP-2 is integrated with the specific CO cluster through some Single Sign-On component and expects the CO to pass it a valid authentication token that represents the end-user.

It seems to me that the CO will need to perform quite a lot of authentication-specific logic from when the user launches the application until some volume is published to a Node.

If the spec provides canonical messages defining credential types for a few common authentication strategies, the CO can build generic authentication flows regardless of which CSI plugin / SP the 'CreateVolume' call gets scheduled to. At least, I hope so.

@cpuguy83 commented 5 hours ago
I think only the user interface needs to know the specifics of the auth protocol. With the Any concept, this can be determined by passing the protobuf type URL (which can be found by querying the SP).

This is of course all really easy to say; I've not attempted to implement something like this before.

@gpaul commented 5 hours ago
I like the direction this is going. It would appear that this way the user could specify an SP for a given application (i.e., specify the Any URL) and pass a list of key/value pairs appropriate for the SP, where the keys would be username, password, token, etc., and the values would be plain text (username), the name of a secret (password), or some blob (token) to be obtained by the CO by sending the user through some SSO flow. Is this what you have in mind?

I would rather not force the user to specify the SP for a given application. I hope to present the user with abstract 'storage profiles' like archive and fast instead of a specific list of SPs. Of course, that indirection comes at the cost of additional complexity.

@gpaul commented 4 hours ago
Perhaps users can be required to enter their SP-specific credentials once (as secrets) for every SP, instead of once per action. These could be opaque sets of k/v pairs as specified by the SP's authentication documentation for static credentials, or the user could select "Share Auth token" for an SP that requires authn via SSO.

It sounds like the CO could build such flows given PR 83.

Member Author

One of the best things about the CSI is that the CO doesn't need to write special code for every SP.

Completely agreed, that is a goal, and is possible with what is proposed in the PR.

Having to read the SP documentation to figure out what keys are expected and how to format the values seems counterproductive.

Like @cpuguy83 said, "only the user interface needs to know the specifics of the auth protocol."

To expand on that: COs wouldn't have to special-case anything; a CO would remain unaware of the particulars of any given credential parameter. Instead, users or cluster admins would be responsible for reading the documentation for the particular storage plugin and figuring out what the required credentials are, and then using some opaque mechanism that the CO exposes in its API to pass through the required info (in Kubernetes this would be a list of secrets). This is very similar to passing opaque parameters to CreateVolume. The reason we want credentials to be different from the generic parameters is that they will likely require special handling: for example, whereas generic parameters are passed through as far as the CO is concerned (end to end, from the CO API to the call to the SP), credential values may be pulled from a secret store. So an example flow would be:

  1. The SP publishes the requirements for the credentials it expects in its documentation.
  2. The cluster admin/user reads the docs and then puts the required credentials in their CO secret store.
  3. The cluster admin/user references the secrets in the CO's secret store that should be passed in on, for example, volume provisioning.
  4. The CO doesn't care what the requirements were; it assumes the user took care of that, and simply fetches the secrets and populates them into the request to the SP.
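
A hedged Go sketch of step 4: the CO resolves each user-supplied secret reference and passes the value through untouched. SecretStore and the reference map are hypothetical stand-ins for CO-specific machinery:

```go
package main

// SecretStore is a stand-in for the CO's secret storage (e.g. Kubernetes
// Secrets); the CO never interprets what it fetches.
type SecretStore interface {
	Get(name string) (string, error)
}

// resolveCredentials turns user-supplied references of the form
// credential key -> secret name into the map sent to the SP.
func resolveCredentials(store SecretStore, refs map[string]string) (map[string]string, error) {
	creds := make(map[string]string, len(refs))
	for key, secretName := range refs {
		value, err := store.Get(secretName)
		if err != nil {
			return nil, err
		}
		creds[key] = value // opaque pass-through; only the SP knows the meaning
	}
	return creds, nil
}
```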


Thanks for the detailed flow.

I am hoping that the user will not have to care about SP details. I don't want them to couple their application definition to a specific SP. Instead I'd like them to specify archive storage or fast storage and have the same application work on-prem, on their laptop, and across cloud providers.

As with all problems, this one can be solved by another level of indirection. I expect the cluster admin to maintain a mapping (called a "storage profile") from archive and fast to HDD-/SSD-backed EBS volumes when deploying in the cloud; slow HDD and fast HDD when deploying on-prem (testing lab?); and map both profiles to the same HDD for developing on their laptop (in this latter case, the user happens to also be the cluster admin).

Proceeding from your list of steps, this seems possible given this PR: the per-SP credentials (configured cluster-wide or per user) can be looked up (for secrets, perhaps according to naming convention) or generated (for SSO auth tokens) by the CO before performing RPCs.

Sounds like this works for me, thanks.

Member

@saad-ali that all sounds good. I do have one question: if a secret store only stores binary data (because it does not care about the type of the secret it stores), then when the CO retrieves the secret from the secret store, it needs to convert the binary secret to string format. What does the CO need to do here?

Or do we dictate that the secret store the CO uses must support string format, so that the CO can just pass those secrets to the plugin?

Member Author

I am hoping that the user will not have to care about SP details. I don't want them to couple their application definition to a specific SP. Instead I'd like them to specify archive storage or fast storage and have the same application work on-prem, on their laptop, and across cloud providers.

As with all problems, this one can be solved by another level of indirection. I expect the cluster admin to maintain a mapping (called a "storage profile") from archive and fast to HDD-/SSD-backed EBS volumes when deploying in the cloud; slow HDD and fast HDD when deploying on-prem (testing lab?); and map both profiles to the same HDD for developing on their laptop (in this latter case, the user happens to also be the cluster admin).

Proceeding from your list of steps, this seems possible given this PR: the per-SP credentials (configured cluster-wide or per user) can be looked up (for secrets, perhaps according to naming convention) or generated (for SSO auth tokens) by the CO before performing RPCs.

Yep, another level of indirection is the key. It'll be up to the way the CO exposes storage. Kubernetes is able to achieve this by keeping the (application developer) user definitions portable and putting the non-portable (implementation-specific) definitions in configs created by the cluster admin (i.e. StorageClasses, PVs, etc.). This will work very nicely with what is proposed here.

Sounds like this works for me, thanks.

Great!

@saad-ali that all sounds good. I do have one question: if a secret store only stores binary data (because it does not care about the type of the secret it stores), then when the CO retrieves the secret from the secret store, it needs to convert the binary secret to string format. What does the CO need to do here?

Or do we dictate that the secret store the CO uses must support string format, so that the CO can just pass those secrets to the plugin?

We discussed this offline yesterday, and it's basically what you said: either the CO will need to handle the conversion, or the CO will need to expose a way to set string secrets (this is what k8s ended up doing). If the CO decides to do the conversion, it would still need to document the expected character encoding: the CO could tell end users that if they store strings in the binary secret store, they must use a specific character-set encoding; the CO could then use that character set to decode the data from bytes to a string and pass the resulting string into CSI.
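
A minimal Go sketch of that conversion, assuming the CO documents UTF-8 as the required encoding for string secrets kept in a binary store:

```go
package main

import (
	"errors"
	"unicode/utf8"
)

// secretToString decodes a raw secret into the string the CSI field
// expects, rejecting bytes that are not valid UTF-8 (such values would
// instead need a binary-to-text encoding like base64, per the SP's docs).
func secretToString(raw []byte) (string, error) {
	if !utf8.Valid(raw) {
		return "", errors.New("secret is not valid UTF-8 text")
	}
	return string(raw), nil
}
```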

@saad-ali (Member Author)

Feedback addressed. PTAL.

As mentioned before, I added more commits with my new changes, to make it easier to code review. Once the PR is approved, I will squash these commits down before merge.

@jdef (Member) commented Aug 19, 2017

@saad-ali is it worth adding commentary on the possible need to cache credentials for symmetric RPCs, as per #83 (review)?

Can we call out this strategy in the docs for clarity? Something about the fact that the SP is expected to cache the credentials if they're needed for the symmetric unpublish/delete RPC.

@gpaul left a comment

Thanks @saad-ali. SGTM.

@saad-ali (Member Author)

@saad-ali is it worth adding commentary on the possible need to cache credentials for symmetric RPCs, as per #83 (review)

@jdef added. PTAL.

@jdef (Member) left a comment

Thanks @saad-ali. LGTM once the .proto is regenerated and CI passes.

@cpuguy83 (Contributor)

While I like the simplicity that map<string,string> brings, I'm not sure how this field is expected to be populated.
For instance, how does one take a value from a secret store and apply it to this field?
It seems like it puts a lot of burden on the SP to know every type of value it can get and to try different things to decode it.

@jdef (Member) commented Aug 22, 2017 via email

@cpuguy83 (Contributor)

@jdef But if I'm taking data from a secret store, I can't just try to force it into a literal map<string,string>; I would have to (as the CO) come up with a key and send the data as a value.

I'm sure I'm missing something here, just want to make sure I understand the whole flow here.

@jieyu (Member) left a comment

Patch LGTM with some wording suggestions.

spec.md Outdated
@@ -936,6 +976,8 @@ message NodePublishVolumeResponse {

A Node Plugin MUST implement this RPC call.
This RPC is a reverse operation of `NodePublishVolume`.
This RPC MUST remove any mounts setup by the corresponding `NodePublishVolume`.
Member

I probably wouldn't mention the word mount here, because there might be things other than mounts that this call needs to clean up. For instance, in the iSCSI case, detach will happen after the last NodeUnpublishVolume.

I would probably say:

This RPC MUST undo the work by the corresponding `NodePublishVolume`.

Member Author

Ack.

Member Author

Done.

spec.md Outdated
@@ -936,6 +976,8 @@ message NodePublishVolumeResponse {

A Node Plugin MUST implement this RPC call.
This RPC is a reverse operation of `NodePublishVolume`.
This RPC MUST remove any mounts setup by the corresponding `NodePublishVolume`.
This Plugin SHALL assume that this RPC will be executed at least once for each successful `NodePublishVolume` call.
Member

I think we should call out that for each target_path, there should be at least one call to NodePublishVolume by the CO.

Member Author

Ack.

Member Author

Done.

spec.md Outdated
@@ -936,6 +976,8 @@ message NodePublishVolumeResponse {

A Node Plugin MUST implement this RPC call.
This RPC is a reverse operation of `NodePublishVolume`.
This RPC MUST remove any mounts setup by the corresponding `NodePublishVolume`.
This Plugin SHALL assume that this RPC will be executed at least once for each successful `NodePublishVolume` call.
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability, the CO MUST guarantee that this RPC is called before `ControllerUnpublishVolume` is called for the given node and the given volume.
Member

I think we should call out that ControllerUnpublishVolume will be called by the CO after NodeUnpublishVolume is called for each target_path and is successful.

Member Author

Ack.

Member Author

Done.

@saad-ali (Member Author)

@jdef But if I'm taking data from a secret store, I can't just try to force it into a literal map<string,string>; I would have to (as the CO) come up with a key and send the data as a value.

I'm sure I'm missing something here, just want to make sure I understand the whole flow here.

@akutz If your question is about string vs bytes, the assumption you may be missing is that the CO's secret store is assumed to expose a mechanism for users to store secret string data, not just raw binary data (Kubernetes allows both, and I believe Mesos intends to support both as well).

If your question is more about what the key and values should be, since the CO has no way to programmatically know that, take a look at this thread: #83 (comment)

@jdef (Member) left a comment

Also, might it be worth addressing #49 as part of this ticket?

spec.md Outdated
@@ -936,7 +976,9 @@ message NodePublishVolumeResponse {

A Node Plugin MUST implement this RPC call.
This RPC is a reverse operation of `NodePublishVolume`.
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability, the CO MUST guarantee that this RPC is called before `ControllerUnpublishVolume` is called for the given node and the given volume.
This RPC MUST undo the work by the corresponding `NodePublishVolume`.
This RPC SHOULD be called by the CO at least once for each `target_path` that was successfully setup via `NodePublishVolume`.
Member

Is there a particular reason that this says SHOULD vs. SHALL? Are there any valid use cases for a CO to successfully invoke NodePublishVolume prior to a workload... and never call NodeUnpublishVolume, ever?

Member Author

Nope, changed.

@saad-ali (Member Author)

Also, might it be worth addressing #49 as part of this ticket?

Done. PTAL.

@cpuguy83 (Contributor)

Does that make sense?

I think I've got it now. Thanks.
LGTM

@jieyu (Member) commented Aug 24, 2017

LGTM!

@akutz (Contributor) commented Aug 24, 2017

Please don't jump on me for the late comment, and I apologize if it's already been discussed, but isn't the cred caching on the SP side a possible security concern? Wouldn't it be safer to pass the creds to the calls as they (the creds) are required? That way there's no cache in the SP that could potentially be compromised.

@akutz (Contributor) commented Aug 24, 2017

Hi @saad-ali,

You replied to a comment above as if I was the person who asked the question, but I wasn't. I think that made you think you'd answered me. I'm only pointing this out so that you don't think I'm raising some similar issue twice that you've already answered for me.

@saad-ali (Member Author)

Hi @saad-ali, You replied to a comment above as if I was the person who asked the question, but I wasn't. I think that made you think you'd answered me. I'm only pointing this out so that you don't think I'm raising some similar issue twice that you've already answered for me.

Ah yep, I mentioned you by accident when replying to @cpuguy83, thanks for the clarification.

Please don't jump on me for the late comment, and I apologize if it's already been discussed, but isn't the cred caching on the SP side a possible security concern? Wouldn't it be safer to pass the creds to the calls as they (the creds) are required? That way there's no cache in the SP that could potentially be compromised.

No worries! Best to have all concerns addressed.

My experience with credentials in the Kubernetes API has been that if credentials are required for cleanup calls (like unpublish and delete), then we are very likely to run into scenarios where the credentials are no longer available and we get stuck, unable to clean up. I'm going to assume that if the volume plugin is compromised, all bets are off, since it will likely have elevated privileges to the storage system. So while I agree that credential caching on the SP side could be a security concern, I'd prefer to optimize for a more automated user experience.

@akutz (Contributor) commented Aug 24, 2017

Hi @saad-ali,

I'm just concerned about the fact that the SP has to cache multiple sets of credentials (one per volume that was created/published), requiring some type of stateful store on the SP side. And if there is a crash, how is the SP going to recover and be equipped to perform a Delete or Unpublish that might require creds, if the creds were only cached in memory?

@saad-ali (Member Author)

@akutz That's a valid concern. However, if we don't have this requirement, the exact same problems apply to the CO: it can lose credentials, for many reasons, and then have no way to delete or unpublish. On the CO side, at least with k8s, the primary way for us to persist credentials is controlled by the end user (secrets), and end users can inadvertently delete those (e.g. by deleting a namespace) at any point, leaving the system in a state where it can't clean up. To prevent that, I'd like the SPs to store enough state to be able to clean up in that case. I agree that making the plugin behavior robust across crash scenarios would be challenging, but I think a plugin crashing is less likely than users nuking credentials randomly (this happens a lot!).

@akutz (Contributor) commented Aug 24, 2017

I personally feel all security-related matters should be the domain of the CO and not of each of the potential plug-ins created for the multiple COs that can use them. The companies and teams behind the COs are likely to be larger and more capable of navigating these issues and implementing any required caching than the person or persons creating each plug-in.

@akutz (Contributor) commented Aug 24, 2017

Hi @saad-ali,

One of my concerns is that you'll end up with an ecosystem of dozens of SPs, each with their own configuration and persistence requirements (etcd, Consul, libkv, etc.) instead of relying on the COs to supply configuration and credential information. In the end I think it will slow adoption of CSI if many plug-ins require one or more external dependencies just to meet the requirements of CSI. Especially as there is nothing that is going to enforce all the possible SP authors to utilize similar or compatible means of distributed configuration or caching.

@saad-ali (Member Author)

Good points @akutz! I agree that we should keep the bar for writing a plugin low, and lean towards adding complexity to the CO instead of plugins where possible.

@jdef @cpuguy83 @jieyu are you ok with that? If so I'll make the change to add credentials on both sides.

@akutz (Contributor) commented Aug 24, 2017

Hi @saad-ali,

To play my own devil's advocate, I totally understand your position too. I wonder if a compromise isn't to introduce, in a future update, a recommended list of safe, secure, and distributed configuration tools that SPs should consider using. That way if K8s or Docker or Mesos are using one or more plug-ins, an administrator only has to configure and manage a single instance of etcd, for example, to support all of the SPs.

@akutz (Contributor) commented Aug 24, 2017

Hi @saad-ali,

In line with the above suggestion, instead of credentials, the COs could then pass a secret along with the RPCs that is used to access the config/credential store. Does that make sense?

@saad-ali (Member Author)

To play my own devil's advocate, I totally understand your position too. I wonder if a compromise isn't to introduce, in a future update, a recommended list of safe, secure, and distributed configuration tools that SPs should consider using. That way if K8s or Docker or Mesos are using one or more plug-ins, an administrator only has to configure and manage a single instance of etcd, for example, to support all of the SPs.

In line with the above suggestion, instead of credentials, the COs could then pass a secret along with the RPCs that is used to access the config/credential store. Does that make sense?

At this point I'm convinced that it would be best to just let the COs figure it out instead of making the SPs so complicated.

@jdef (Member) commented Aug 24, 2017

@saad-ali I'm fine w/ credentials on both sides

Modify NodePublishVolume so that it is expected to be called multiple
times even for the same volume/node combo to handle the case where a
volume is being referenced by multiple workloads and each workload does
not have equal access to the volume.
@saad-ali (Member Author)

Introduced credentials on delete and unpublish calls and removed the requirement for the SP to cache credentials.

PTAL

@jdef (Member) left a comment

The 2nd commit description has a typo:

Introduce the ability to pass through end user credentials for authentication/authorization on calls that create and publish calls.

Should probably end with "publish volumes"

csi.proto Outdated
@@ -565,6 +600,11 @@ message NodePublishVolumeRequest {
// Whether to publish the volume in readonly mode. This field is
// REQUIRED.
bool readonly = 7;

// End user credentials used to authenticate/authorize node publish request.
// These Credentials are not passed in on NodeUnpublishVolume calls.
Member

Why the "These Credentials are not passed in on NodeUnpublishVolume calls." statement? It's confusing because the controller unpublish call DOES have Credentials now

Member Author

Whoops. Good catch. Forgot to delete that.

Member Author

Done

Introduce the ability to pass through end user credentials for
authentication/authorization on calls that create and publish volumes.
@saad-ali (Member Author)

PTAL

@jdef (Member) left a comment

LGTM

@saad-ali (Member Author)

It sounds like we have consensus. I'll give it until EOD to allow for more feedback before I merge.

@saad-ali (Member Author)

Merging.

@saad-ali saad-ali merged commit 8dbb732 into container-storage-interface:master Aug 30, 2017