Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for handling multiple credential secrets #2403

Merged
merged 9 commits into from
Feb 5, 2021

Conversation

carlisia
Copy link
Contributor

@carlisia carlisia commented Apr 7, 2020

Signed-off-by: Carlisia carlisia@vmware.com

@carlisia carlisia mentioned this pull request Apr 7, 2020
7 tasks
@skriss skriss added the Area/Design Design Documents label Apr 8, 2020
@carlisia
Copy link
Contributor Author

Hey @vmware-tanzu/velero-maintainers, would love to get a yay or nay on this. If we want this change, I'll open a ticket, maybe for v1.6?


## Goals

- To allow Velero to create and store multiple secrets for provider credentials
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this breaks down into two sub-goals:

  1. allow multiple secrets to exist for a single provider, i.e. two separate GCP service account credentials files, each of which could be used for a different BSL (this is basically not possible today)
  2. allow multiple provider plugins to be used at a time (e.g. both AWS and GCP backup locations), each of which has its own 1+ secrets (this is technically possible today but not through the CLI, requires manually editing the YAML)

Does that match how you were thinking about it?

Copy link
Contributor Author

@carlisia carlisia Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is precisely it. Another way to say it: there would be a 1.1 secret to BSL, and each BSL (therefore secret) would be associated with a provider/plugin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The velero backup-location (create|set) will have a new --credentials mapStringString flag which sets the name of the corresponding credentials secret for a provider. Format is provider:credentials-secret-name.

To clarify, and I think I'm backtracking here: a plugin will have n secrets. A BSL for that plugin can use any of those secrets. There might be > 1 BSL using the same secret for that plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me.

The 1:1 mapping would be new. 1:M mapping (secret:BSL) would be like our current behavior, where 1 secret applies to multiple (or all) BSLs/VSLs.

Copy link
Contributor Author

@carlisia carlisia Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point about allowing multiple plugins to be used at the same time is a good point tho. I did not address this in this proposal but with this feature added, it only makes sense and it'll be beneficial.

@carlisia
Copy link
Contributor Author

carlisia commented Jul 7, 2020

This plan to allow the AWS Cluster API Provider to support multiple sets of AWS credentials/roles from a single controller for the cluster-api project might be helpful:

https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/040406e90dd5f67e840cad00b3ec01c89ffbc24c/docs/proposal/20200506-single-controller-multitenancy.md

@nrb nrb changed the base branch from master to main July 17, 2020 22:42
@nrb nrb added this to the v1.6 milestone Aug 13, 2020
@nrb
Copy link
Contributor

nrb commented Aug 13, 2020

Assigning a tentative milestone of v1.6 for visibility.

Copy link
Contributor

@nrb nrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need this ability, but one thing that I think this proposal needs to also address is rotation/short-lived credentials. This isn't exactly multiple credentials, but it's related to how we handle secrets and I think it will be worthwhile to think about at the same time.

These are important because some systems will use these to minimize blast radius should an attack occur. If a short-lived credential is acquired, it can only be exploited for a limited amount of time. Ideally this would be Velero's default, but I also don't think we should make that change immediately without a transition period.

Some things I can think of off the top of my head that will help support the short-lived credentials, though not an exhaustive list:

  • Re-trying failed uploads at the end of a backup, in case the credential expired during backup. Pair this with exponential back-off, so we're not retrying forever?
  • Re-trying snapshot operations if we get a permission denied error on taking a snapshot, in case the credential expired during the snapshot operation.
  • Making sure rotated credentials are supplied to restic, so that its own transfer operations are carried out completely. This will require some research into restic itself, to see exactly what it supports

In terms of getting them into the pod, I need to do more research to see if the files that secrets get mounted as are updated quickly or not.

@nrb
Copy link
Contributor

nrb commented Sep 14, 2020

Upon further consideration, my previous comment is a completely different feature. The data model should allow supporting rotation, but I think implementing these separately makes sense.

@carlisia
Copy link
Contributor Author

This might be pertinent: #1655.

@nrb nrb added this to To do in v1.6.0 Nov 2, 2020
@zubron
Copy link
Contributor

zubron commented Nov 16, 2020

I've spent some time looking at this design proposal and the related areas of the code and I have some thoughts.

The following is quite verbose, but a lot of it is me trying to make sure I understand things :) I've written this under the assumption that the proposal will be updated to reflect the comments here, that it will allow multiple credentials to be set per plugin so that, for example, different credentials could be used for different BSLs which are both on AWS.

If we want to move away from having a single credential object and instead have a more dynamic approach to creating and using credentials (adding them when adding a new plugin or creating a new BSL or VSL), we will need to modify how plugins are invoked so that the correct credential is used. Currently, there is a single secret, which is mounted into every pod deployed by Velero (theVelero Deployment and the Restic DaemonSet). This secret contains a single data key (cloud) which is accessible within the pods at the path /credentials/cloud. This path is made known to all plugins through provider specific environment variables. All possible provider environment variables are set to this path.

The examples in the current design proposal show different secrets being mounted however, my understanding is that is not possible to modify volume mounts in a running pod/container. Changes to secrets (such as adding new data keys) are automatically updated within the volume mount. This means that any update to the cloud-credentials secret would automatically be available to pods where that secret is mounted. This would allow us to work around the restriction of modifying volume mounts to instead still use a single secret, but add more keys to it for new sets of credentials.

Instead of using fixed environment variables which reference a single secret when creating the pods when installing velero, we will need to modify how we invoke plugins so that a different set of credentials can be selected for use. There are two ways which came to mind:

  • Instead of setting an environment variable, pass the path of the necessary credentials file through to the plugins and make any plugin responsible for using that credentials file when creating a client to interact with a storage provider. This has the benefit of removing provider specific information from the core velero codebase however this would require all plugins to be updated.
  • Modify restartableProcess to set the environment variables before running a plugin process. Each plugin process would still have the same set of environment variables set, however the value used for each of these variables would instead be a different path within the /credentials directory. Taking this approach would not require any changes from plugins as the credentials information would be made available to them in the same way.

We will also need to ensure that the restic controllers are updated in the same way so that correct credentials are used (when creating a ResticRepository or processing PodVolumeBackup/PodVolumeRestore).

When we need to add a new set of credentials (either with a new plugin, or with a new BSL/VSL), we will read in the credentials file and add it as a new key to the existing cloud-credentials secret. The key can be named in such a way that associates it with the corresponding plugin/BSL/VSL. The credential to use can be selected based on the name, or a reference to the credential can be stored with the corresponding object (BSL/VSL), and then made available to a plugin in one of the ways described above.

@zubron
Copy link
Contributor

zubron commented Nov 19, 2020

Converting this to a draft PR to remove it from the review queue while I am making some changes to it.

@zubron zubron marked this pull request as draft November 19, 2020 20:01
Signed-off-by: Carlisia <carlisia@vmware.com>
zubron added a commit to carlisia/velero that referenced this pull request Dec 1, 2020
The changes here are based on [this comment](vmware-tanzu#2403 (comment))
and a discussion with @carlisia.
The changes here are based on [this comment](vmware-tanzu#2403 (comment))
and a discussion with @carlisia.

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@zubron zubron marked this pull request as ready for review December 1, 2020 17:02
@zubron
Copy link
Contributor

zubron commented Dec 1, 2020

This still needs some further updates to discuss the impact on upgrades and Helm but marking as ready for review now to get some input.

## Goals

- To allow Velero to create and store multiple secrets for provider credentials, even multiple credentials for the same provider
- To improve the UX for configuring the velero deployment with multiple plugins/providers, and corresponding IAM secrets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line mentions IAM secrets, though we don't discuss them anywhere else in the document.

What can we do to make that experience better? Often, IAM credentials are applied at the node level and may not be Kubernetes secrets.

Based on https://github.com/vmware-tanzu/velero-plugin-for-aws#option-2-set-permissions-using-kube2iam, it looks like that would be up to kube2iam or similar. It may also be the case that IAM scenarios are not covered by this particular design.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of discussion on IAM credentials is my fault. The initial set of goals was written by @carlisia and I didn't expand on that topic. Now that I look at the link you posted, I don't think the kube2iam approach is compatible with the case of having multiple credentials for AWS managed by Velero. An IAM role could be created which supports multiple sets of credentials, but that isn't something that is controllable through Velero. And I'm not sure if we would want to introduce such specific AWS functionality into Velero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather have the IAM case be handled separately from this design change if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I wrote that down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree on not introducing any specific infra provider functionality into it; I was using AWS as a popular example. However, GCP has a similar feature, and so does Azure.

They're called different things, but I think the idea is the same in that they all three tie the auth to the nodes/pods instead of via a credentials file. A brief skim of the READMEs says that the Azure one is the only one that has K8s resources associated directly in this flow, but the others may have some behind the scenes that aren't covered in the readme.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather have the IAM case be handled separately from this design change if possible.

100% fine with that. I think we can call that out explicitly in the non-goals section. Something like "Node-based auth features" or similar so that we're not just focused on the AWS solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the Azure one looks similar to the existing workflow, the outcome of using AAD pod identity is to create a credentials file which is then used with velero install. It's not clear to me if multiple roles/identities can be handled through one file in that method.

The others though are managed outside of velero though, and I think would continue to work as before. If you had a BSL that used a plugin which included those methods of specifying credentials, you would say that no credentials are needed and then no credentials would be passed through in the environment. I think if you needed to access other accounts using those plugins, it might be possible to add other credentials, and use them in the environment for the plugin process (overriding the credentials for the pod/deployment).

Copy link
Contributor Author

@carlisia carlisia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can modify restartableProcess to set the environment variables before running a plugin process

@zubron would it be possible to give us a code sample of what this could look like? Particularly, how would this work w/o making a change to the plugin itself. This is what I'm trying to figure out, if you have thought this far and could give an example it would be helpful.

Everything else is making sense to me.

@carlisia
Copy link
Contributor Author

One other thing that I'm thinking is how we are thinking of the cases below:

Case (1)

  • A BSL is configured for provider X
  • The plugin for provider X is configured with many credentials.
  • The BSL will try to validate itself against each until one is found to be valid.

Case (2)

  • A BSL is configured for provider X
  • The plugin for provider X is configured with credentials A, B, C and D
  • The BSL is configured to use credentials A and D
  • Try both credentials until one of the two is found to be valid

Case (3)

  • A BSL is configured for provider X
  • The plugin for provider X is configured with credentials A, B, C and D
  • The BSL can only be configured to use one of these credentials

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@zubron zubron self-assigned this Dec 16, 2020
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@zubron
Copy link
Contributor

zubron commented Dec 16, 2020

@vmware-tanzu/velero-maintainers I've updated the doc to include more of the details discussed during the meeting on Wednesday - please take a look :)

The doc does not make a decision on how the credentials will be made available to the plugins, just outlines the different approaches. I'm personally leaning towards option 2 which is to serialize the contents of the secret to a tmp file and pass it through via config to the plugin. Although this requires modifications in the plugins, it is ultimately more flexible as it removes the need to Velero to have to know about the different env vars needed by each plugin.

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
Copy link
Contributor

@jenting jenting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the option 2 approach is:

  1. The Velero server pod writes the secret content to the Velero pod's or node's disk.
  2. The plugin uses the credentials from pod or host disk.

But as far as I know, the plugin is mounted as initContainer (before Velero server pod). How can we guarantee that the disk credentials (generated by Velero server) could be created before the plugin (initContainer) created?

design/secrets.md Outdated Show resolved Hide resolved
@nrb nrb added kind/release-blocker Must fix issues for the coming release (milestone) P1 - Important labels Feb 3, 2021
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@github-actions github-actions bot requested a review from jenting February 3, 2021 19:51
Copy link
Contributor

@nrb nrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor change and one big one - we need a title before accepting ;)

design/secrets.md Outdated Show resolved Hide resolved
design/secrets.md Outdated Show resolved Hide resolved
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@github-actions github-actions bot requested a review from nrb February 3, 2021 21:05
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
@nrb nrb dismissed ashish-amarnath’s stale review February 5, 2021 18:15

Review is stale at this point.

v1.6.0 automation moved this from Review in progress to Reviewer approved Feb 5, 2021
@nrb nrb merged commit 23ebf00 into vmware-tanzu:main Feb 5, 2021
v1.6.0 automation moved this from Reviewer approved to Done Feb 5, 2021
dharmab pushed a commit to dharmab/velero that referenced this pull request May 25, 2021
* Proposal for handling multiple credential secrets

Signed-off-by: Carlisia <carlisia@vmware.com>

* Update mulitple credentials design

The changes here are based on [this comment](vmware-tanzu#2403 (comment))
and a discussion with @carlisia.

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update multiple credentials design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Clarify points around node-based auth

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add more details around setting env vars

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Fix spelling

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update design to detail selected approach

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add title for design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Remove usage of AIM

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

Co-authored-by: Bridget McErlean <bmcerlean@vmware.com>
ywk253100 pushed a commit to ywk253100/velero that referenced this pull request Jun 29, 2021
* Proposal for handling multiple credential secrets

Signed-off-by: Carlisia <carlisia@vmware.com>

* Update mulitple credentials design

The changes here are based on [this comment](vmware-tanzu#2403 (comment))
and a discussion with @carlisia.

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update multiple credentials design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Clarify points around node-based auth

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add more details around setting env vars

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Fix spelling

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update design to detail selected approach

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add title for design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Remove usage of AIM

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

Co-authored-by: Bridget McErlean <bmcerlean@vmware.com>
gyaozhou pushed a commit to gyaozhou/velero-read that referenced this pull request May 14, 2022
* Proposal for handling multiple credential secrets

Signed-off-by: Carlisia <carlisia@vmware.com>

* Update mulitple credentials design

The changes here are based on [this comment](vmware-tanzu#2403 (comment))
and a discussion with @carlisia.

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update multiple credentials design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Clarify points around node-based auth

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add more details around setting env vars

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Fix spelling

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Update design to detail selected approach

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Add title for design doc

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

* Remove usage of AIM

Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>

Co-authored-by: Bridget McErlean <bmcerlean@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area/Design Design Documents kind/release-blocker Must fix issues for the coming release (milestone)
Projects
No open projects
v1.6.0
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

7 participants