Skip to content

flexvolume: rework for the new volume controller design#26926

Closed
mcluseau wants to merge 1 commit into
kubernetes:masterfrom
mcluseau:wip-issue-20262
Closed

flexvolume: rework for the new volume controller design#26926
mcluseau wants to merge 1 commit into
kubernetes:masterfrom
mcluseau:wip-issue-20262

Conversation

@mcluseau
Copy link
Copy Markdown
Contributor

@mcluseau mcluseau commented Jun 7, 2016

* This commit redesigns the flexvolume plugin and exposes changes to the internal storage API including the Attacher interface.
* WARNING: The flexvolume plugin API has changed with this release. Existing Flex drivers will have to be modified to work with the new interface (see Flex volume documentation for details).

This change is Reviewable

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Jun 7, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

2 similar comments
@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Jun 7, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Jun 7, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Jun 7, 2016
@pmorie
Copy link
Copy Markdown
Contributor

pmorie commented Jun 7, 2016

@k8s-bot ok to test

@saad-ali saad-ali self-assigned this Jun 7, 2016
@saad-ali
Copy link
Copy Markdown
Member

saad-ali commented Jun 7, 2016

Thanks for jumping on this @MikaelCluseau
I'll try to take a look at this tmr.

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 7, 2016

A few notes:

  • I'm splitting commits in small change steps (with hope it helps review);
  • we need the pod namespace to get the secrets from the secretRef, maybe the namespace should be in volume.Spec?
  • I don't know what to do of the hostname for now.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 7, 2016
@pmorie
Copy link
Copy Markdown
Contributor

pmorie commented Jun 7, 2016

we need the pod namespace to get the secrets from the secretRef, maybe the namespace should be in volume.Spec?

That's a bit of a nasty tangle. You'll need to know the namespace this is being attached for.

@pmorie
Copy link
Copy Markdown
Contributor

pmorie commented Jun 7, 2016

Let me elaborate on my last comment:

  1. We currently do not allow cross-namespace references to secret
  2. PersistentVolumes are cluster-scoped (ie, non-namespaced)
  3. I would like to avoid exposing namespace into volume spec if at all possible

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 7, 2016

On 06/08/2016 03:33 AM, Paul Morie wrote:

Let me elaborate on my last comment:

  1. We currently do not allow cross-namespace references to secret
  2. PersistentVolumes are cluster-scoped (ie, non-namespaced)
  3. I would like to avoid exposing namespace into volume spec if at
    all possible

I understand, but FlexVolume currently have this feature of putting a
secret's keys/values in the options sent to its plugin. And you cannot
get a secret without its namespace. I see the following possibilities:

  1. add the namespace to volume.Spec (not what you want);
  2. add the namespace to the parameters of the Attacher interface
    (probably not what you want too);
  3. add the namespace to the SecretRef of the FlexVolume before calling
    the Attach or Mount (probably meaning specific code in a generic
    part + a hidden field in an API object);
  4. discard this feature (breaking the API, but it's tagged alpha anyway).

History showed that I don't always have all to alternatives in mind, but
for now it looks like a "choose your evil".

@chakri-nelluri
Copy link
Copy Markdown
Contributor

Thanks for jumping in @MikaelCluseau. I was on a short vacation. Just catching up on github. Since you already started this exercise, I can help you review and verify it.

@saad-ali
Copy link
Copy Markdown
Member

saad-ali commented Jun 8, 2016

Plugins inferring namespace from pod to fetch secrets is problematic.

Between the options @MikaelCluseau listed:

1 add the namespace to volume.Spec (not what you want)

Agreed that sticking the namespace in volume.Spec is not ideal, but volume.Spec is entirely internal so it doesn't require any API changes which is nice.

2 add the namespace to the parameters of the Attacher interface (probably not what you want too);

This is equally bad/good as 1.

3 add the namespace to the SecretRef of the FlexVolume before calling the Attach or Mount (probably meaning specific code in a generic part + a hidden field in an API object);

This would work but it is super hacky. I say no.

4 discard this feature (breaking the API, but it's tagged alpha anyway).

Discarding the feature is an option for Flex for the reason you stated. But other plugins like rbd have the same issue, so we need to solve it.

There is a 5th option:

Modify the API by to have SecretRef be a *ObjectReference type instead of *LocalObjectReference. *ObjectReference contains the namespace as well as the name. This seems like the "correct" solution--when you reference a secret you must provide the full reference (including the namespace). But, it requires backwards compatibility considerations. Let's discuss this in the 9 AM PST sig-storage meeting. I'll add it to the agenda.

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 8, 2016

On 06/09/2016 08:40 AM, Chakravarthy Nelluri wrote:

I can help you review and verify it.

Thanks, that's good for me because I'm an occasionnal contributor and
the last I want is to introduce bad code.

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 8, 2016

On 06/09/2016 08:48 AM, Saad Ali wrote:

Between the options @MikaelCluseau https://github.com/MikaelCluseau
listed:

 1. add the namespace to volume.Spec (not what you want)

Agreed that sticking the namespace in |volume.Spec| is not ideal, but
volume.Spec is entirely internal so it doesn't require any API changes
which is nice.

[...]

There is a 5th option:

Modify the API by to have |SecretRef| be a |_ObjectReference| type
instead of |_LocalObjectReference|. |*ObjectReference| contains the
namespace as well as the name. This seems like the "correct"
solution--when you reference a secret you must provide the full
reference (including the namespace). But, it requires backwards
compatibility considerations. Let's discuss this in the 9 AM PST
sig-storage meeting. I'll add it to the agenda.

I agree it looks like the correct way. The only drawback I can see here
is that users will have to specify the namespace, and that namespace
will be constrained to the pod's namespace. That said, the API server
can do the grunt work here (fill it with the pod's namespace by default
and verify the constraint). An practical path could be to add the
namespace in volume.Spec as a first step, and move to the 5th option
afterwards.

@saad-ali
Copy link
Copy Markdown
Member

saad-ali commented Jun 8, 2016

I spoke with @thockin offline. He had a good point which is because these secret references are inside a volume definition that is ultimately referenced by a Pod, LocalObjectReference is the correct way to reference the secret. The user should not have to respecify the namespace it should be the same namespace as the pod. Also the user should not be able to reference a secret from a different namespace than the pod namespace. I agree with these points.

Therefore, I say let's go with #2 for now. @pmorie, speak up if you disagree.

@saad-ali
Copy link
Copy Markdown
Member

saad-ali commented Jun 8, 2016

agree it looks like the correct way. The only drawback I can see here
is that users will have to specify the namespace, and that namespace
will be constrained to the pod's namespace. That said, the API server
can do the grunt work here (fill it with the pod's namespace by default
and verify the constraint). An practical path could be to add the
namespace in volume.Spec as a first step, and move to the 5th option
afterwards.

Just saw your comment @MikaelCluseau. Yes, we are in agreement. Either 1 or 2 are fine with me. If @pmorie is ok with it, could you implement one of these approaches?

@chakri-nelluri
Copy link
Copy Markdown
Contributor

Thanks @saad-ali I like it too. The final consumer is Pod and works for our flex volume use case too.

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 8, 2016

On 06/09/2016 10:04 AM, Saad Ali wrote:

Just saw your comment @MikaelCluseau
https://github.com/MikaelCluseau. Yes, we are in agreement. Either 1
or 2 are fine with me. If @pmorie https://github.com/pmorie is ok
with it, could you implement one of these approaches?

Sure, I'll do whatever is needed to get at least FlexVolume to this
point. I prefer the 1st approach thought, because it specifies
information available to plugins in one central place.

@saad-ali
Copy link
Copy Markdown
Member

saad-ali commented Jun 8, 2016

Sure, I'll do whatever is needed to get at least FlexVolume to this point. I prefer the 1st approach thought, because it specifies information available to plugins in one central place.

Awesome, 1 is fine with me. Let's go with that. If @pmorie has beef, we'll deal with it 👊 😆

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 9, 2016

On 06/09/2016 10:34 AM, Saad Ali wrote:

Sure, I'll do whatever is needed to get at least FlexVolume to
this point. I prefer the 1st approach thought, because it
specifies information available to plugins in one central place.

Awesome, 1 is fine with me. Let's go with that. If @pmorie
https://github.com/pmorie has beef, we'll deal with it 👊 😆

okay ;-) I'm adding PodNamespace and PodName as the pod's name was used
in error logging un FlexVolume and there's a strong link with the Pod
anyway (may be useful for other plugins too).

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Jun 9, 2016

On 06/09/2016 02:00 PM, Mikaël Cluseau wrote:

okay ;-) I'm adding PodNamespace and PodName as the pod's name was
used in error logging un FlexVolume and there's a strong link with the
Pod anyway (may be useful for other plugins too).

and the PodUID... because of

func (f *flexVolumeMounter) GetDeviceMountPath(spec *volume.Spec) string {
name := f.driverName
return f.plugin.host.GetPodVolumeDir(f.podUID,
utilstrings.EscapeQualifiedNameForDisk(name), spec.Name())
}

@mcluseau mcluseau force-pushed the wip-issue-20262 branch 2 times, most recently from 4a9b0cd to ee932f4 Compare June 9, 2016 05:17
@saad-ali
Copy link
Copy Markdown
Member

We've decided to punt on this PR for 1.4. There are some open questions that we need to address.

  1. How attach calls get access to secrets. Passing in the namespace seems like a hack we should address it correctly.
  2. With the attach/detach controller running on master inside a container, a directory containing the flex volume plugins from the node would need to be mounted into the container. This makes an already poor deployment story worse (some users may not even have access to the master).
  3. Introducing the attacher plugin in flex means the attach/detach controller will call out to the flex volume plugin for all flex volumes (regardless of if that driver implements an attacher or not) and if that driver isn't installed on the master, attach will fail. We need capabilities or plugin probing to be able to address this more holistically.

If we were to get this change in, it would also require a rewrite of existing flex drivers. Because of this and the issues above, we've decided to skip this release and hopefully more holistically address the problem space for the next release.

@mcluseau
Copy link
Copy Markdown
Contributor Author

mcluseau commented Aug 20, 2016

I will also remove the references to Attacher from the Mounter (we used it to get asrc directory in SetUpAt).
[edit] pushing it in the default implementation of SetUpAt, in fact, where it may make sense (if we need a plugin that does attach the device and then let the default mount happen).

@gr4701
Copy link
Copy Markdown

gr4701 commented Aug 20, 2016

sorry to hear that. since this gates other PR's perhaps we should have a weekly sync up on this ?
i would like us to target this then early for 1.5 ; so please let us know what help we can provide.

thanks,

--gopal

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Aug 21, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@chakri-nelluri
Copy link
Copy Markdown
Contributor

Opened #31298 to figure out the mechanism to access secrets in attach & detach calls.

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Sep 5, 2016

GCE e2e build/test passed for commit 94b493549cc4d21d358d7398c7daceec6026af31.

@k8s-github-robot
Copy link
Copy Markdown

This PR hasn't been active in 30 days. It will be closed in 59 days (Dec 4, 2016).

cc @MikaelCluseau @rootfs @saad-ali

You can add 'keep-open' label to prevent this from happening, or add a comment to keep it open another 90 days

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Details If you have questions or suggestions related to this bot's behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GCE e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GCI GKE smoke e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins kops AWS e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot kops aws e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins unit/integration failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GCI GCE e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins Kubemark GCE e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GCE etcd3 e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins CRI GCE Node e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot cri node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GKE smoke e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins GCE Node e2e failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Jenkins verification failed for commit a4a7bae. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot
Copy link
Copy Markdown

@MikaelCluseau PR needs rebase

@k8s-github-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @smarterclayton
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@mcluseau
Copy link
Copy Markdown
Contributor Author

Will be merged as part of #41804.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.