Add design doc for Custom CA support for S3 BSLs and Velero Installation #2259

Merged
merged 3 commits into from
Feb 27, 2020
97 changes: 97 additions & 0 deletions design/custom-ca-support.md
@@ -0,0 +1,97 @@
# Custom CA Bundle Support for S3 Object Storage

Velero should perform SSL verification on the Object Storage endpoint
(BackupStorageLocation), but it is not guaranteed that the Velero container has
the endpoint's CA bundle in its system store. Velero needs to let a user
specify custom CA bundles at installation time, and it needs a mechanism in the
BackupStorageLocation Custom Resource that lets a user specify a custom CA
bundle. This mechanism must also allow Restic to access and use the custom CA
bundle.

## Goals
Contributor

Does this support adding the certs on the client side? I know velero client-side commands such as velero (backup|restore) (describe|logs) on AWS/GCP/Azure get a 1-time use URL for retrieving files from object store, and on Minio they use a BSL's Spec.Config.publicUrl value in order to get non-encrypted values.

Contributor Author

It currently does not but I wouldn't rule that out. Are you thinking that the velero install command would install the cert on the client's machine? Or that the velero client commands would have a new flag --cacert which a user can optionally specify?

Contributor

I'm not exactly sure, honestly. I think either would work, so long as it's documented. A --cacert flag would probably be easier since certs can be managed in several different ways depending on client OS.

Mostly, I'd like to avoid users hitting issues like this one often. #2256


- Enable Velero to be configured with a custom CA bundle at installation
- Enable Velero support for custom CA bundles with S3 API BackupStorageLocations
- Enable Restic to use the custom CA bundles whether they are configured at installation time or on the BackupStorageLocation

## Non-Goals

- Support non-S3 providers

## Background

Currently, in order for Velero to perform SSL verification of the object
storage endpoint, the user must manually set the `AWS_CA_BUNDLE` environment
variable on the Velero deployment. If the user is using Restic, the user has to
either:
1. Add the certs to the Restic container's system store
1. Modify Velero to pass in the certs as a CLI parameter to Restic, which
requires a custom Velero deployment

## High-Level Design

On the Velero deployment at install time, we can set the AWS environment variable
`AWS_CA_BUNDLE`, which will allow Velero to use the proper certs when
communicating with the S3 bucket over HTTPS. This means we will add the
ability to specify a custom CA bundle at installation time. For more
information, see "Install Command Changes".

On the Restic daemonset, we will also want to mount the secret that holds the
CA bundle at a pre-defined location. In the `restic` package, the command that
invokes restic will need to be updated to pass the path of the mounted cert
file if one is specified in the config.

This covers a single CA bundle set at installation time, but it doesn't allow
us to specify different certs when `BackupStorageLocation` resources are
created.

In order to support custom certs for object storage, Velero will add an
additional field to the `BackupStorageLocation`'s provider `Config` resource: a
secretRef that contains the coordinates of a secret holding the relevant cert
file for object storage. Then, in order for Restic to be able to consume and
use this cert, Velero will need to modify the Restic daemonset
Contributor Author

Is this an accepted path for Velero? It feels odd to me to update the daemonset for Restic on the fly when creating BSLs as this could interrupt some activity Restic is undertaking.

However, without modifying the restic daemonset I don't see a feasible way to get the custom CA bundle into Restic at runtime. In our PoC we had to use an initcontainer on Restic which would symlink a secret that is mounted into the Restic container. This would allow us to modify a secret from Velero and have the associated changes populated in the Restic container. I do not think that's a viable approach moving forward though.

Member

One idea: take a look at Velero's code for managing restic repo credentials, i.e. the encryption key. The way this works today is that the key is stored in a secret in the velero namespace, and each time we exec a restic command, we read the contents of the secret, write it out to a temp file, pass the path of the temp file to restic, and then remove the temp file afterwards. This has worked pretty well in practice*.

We might be able to take a similar approach for the CA bundles, where they're stored in secrets, the BSL has a reference to them, and they're projected into a temp file just-in-time for restic invocations.

Here are a few links to relevant sections of the current code:

https://github.com/vmware-tanzu/velero/blob/master/pkg/restic/repository_manager.go#L238-L245
https://github.com/vmware-tanzu/velero/blob/master/pkg/restic/common.go#L168-L203

\* as an aside, yes, this is over-complicated given that we're currently using a static/non-secret encryption key for restic repos, but this was implemented before we decided to go with that approach instead of using unique, secret keys per repo.

Contributor Author

@skriss thanks a lot for the links. I definitely think we could follow a similar pattern for the CA bundle here, since restic allows for a --cacert command flag which we can specify like we do the password file here: https://github.com/vmware-tanzu/velero/blob/master/pkg/restic/command.go#L86.

> as an aside, yes, this is over-complicated given that we're currently using a static/non-secret encryption key for restic repos

Makes sense. I have faced similar problems like this in the past and it seems like this is a common pattern used to solve the issue of needing data stored somewhere in memory of a pod that a controller will manage.

I will try something like this out and update the document.

to include the secret and mount it at a pre-defined path. This path will then
be passed to restic at command invocation.

## Detailed Design

The `AWS_CA_BUNDLE` environment variable works for the Velero deployment
because it is read by the AWS SDK, which is used in the [plugin][1] to build up
the config object. This means that a user can configure the CA bundle on the
deployment through an env var. To install Velero with a custom cert, the bundle
can be stored in a secret that is created at installation time and mounted into
the container, with `AWS_CA_BUNDLE` set to the path of the mounted file (the
AWS SDK expects this variable to hold a file path, not the certificate
contents). I recommend using a secret as it makes the Restic integration easier
as well.

At installation time, if a user has specified a custom cert, the Restic
daemonset should be updated to include the secret mounted at a predefined path.
We could optionally use the system store for all custom certs added at
installation time. Restic supports using the custom certs [in addition][3] to
the root certs.
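
To illustrate what trusting custom certs "in addition to" the root certs means, here is a minimal Go sketch using only the standard library. It is not restic's or Velero's code; the function names `newCertPool` and `newHTTPClient` are hypothetical, and the sketch assumes the custom CA bundle is already available as PEM bytes.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
)

// newCertPool (hypothetical helper) returns the system root pool with the
// PEM-encoded certificates in customCA appended to it.
func newCertPool(customCA []byte) (*x509.CertPool, error) {
	pool, err := x509.SystemCertPool()
	if err != nil || pool == nil {
		// Fall back to an empty pool if the system store cannot be loaded.
		pool = x509.NewCertPool()
	}
	if !pool.AppendCertsFromPEM(customCA) {
		return nil, errors.New("no valid PEM certificates found in custom CA bundle")
	}
	return pool, nil
}

// newHTTPClient (hypothetical helper) builds an HTTP client whose TLS
// verification trusts both the system roots and the custom CA bundle.
func newHTTPClient(customCA []byte) (*http.Client, error) {
	pool, err := newCertPool(customCA)
	if err != nil {
		return nil, err
	}
	return &http.Client{
		Transport: &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}},
	}, nil
}

func main() {
	// The input is deliberately not a certificate, so only the error path runs.
	if _, err := newHTTPClient([]byte("not a certificate")); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because the custom certs are appended to the system pool rather than replacing it, endpoints signed by public CAs keep working alongside endpoints signed by the custom CA.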

When a BSL is created with a secret reference, the secret will need to be
consumed at runtime: it will be read and its CA bundle applied to the AWS
`session` object. The `getSession()` function will need to be updated to take
in the custom CA bundle so it can be passed [here][4].

The Restic daemonset will need to be updated to include the secret mounted as a
volume in the containers at a defined path. The restic [command invocation][2]
will need to be updated to pass the path to the file as an argument to restic
using `--cacert`. When a user defines a custom cert on the BSL, Velero will be
responsible for updating the daemonset to include the secret mounted as a
volume at a predefined path.

Where we mount the secret is a fine detail, but I recommend mounting the certs
to `/certs` to keep it in line with the other volume mount paths being used.
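
As a rough illustration of the command change described above, the following Go sketch appends `--cacert` to a restic argument list only when a cert file is configured. It is a simplified stand-in for Velero's command builder, not a copy of it: the function `resticArgs`, the file name `ca-bundle.crt`, and the example repository URL are assumptions made for this sketch, while `/certs` is the mount path recommended above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// certMountDir is the mount path recommended in the design text.
const certMountDir = "/certs"

// resticArgs (hypothetical helper) builds the argument list for a restic
// invocation. When caCertFile is non-empty, a --cacert flag pointing at the
// file under certMountDir is added; otherwise the flag is omitted.
func resticArgs(command, repo, caCertFile string, extra ...string) []string {
	args := []string{command, "--repo=" + repo}
	if caCertFile != "" {
		// path.Join always uses forward slashes, matching container paths.
		args = append(args, "--cacert="+path.Join(certMountDir, caCertFile))
	}
	return append(args, extra...)
}

func main() {
	// Example values only; the repository URL and file name are placeholders.
	args := resticArgs("snapshots", "s3:https://minio.example.com/bucket/restic", "ca-bundle.crt")
	fmt.Println("restic " + strings.Join(args, " "))
}
```

Leaving the flag out when no cert is configured keeps existing installations on their current behavior.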

### Install command changes

The installation flags should be updated to include the ability to pass in a
Contributor

@carlisia This is relevant to #2202. My guess is that it would probably be part of the plugin add command there, with a new option for certs that would create the secret and do the updating on the deployment and restic DaemonSet. I think this is only valid for ObjectStore plugins, at least the restic DaemonSet part. I'm envisioning the command would take a path to the cert file, create the Secret in the cluster, then put a secretRef in the Velero Deployment/restic DaemonSet.

We'd also need to have the backup-location (set|create) commands have an argument to take a reference to a secret name (which is the cert) in the same namespace. Possibly the snapshot-location (set|create) commands, if it's relevant to them. The command would create or edit the BSL (or VSL) to have a reference to the Secret within the same namespace.

Contributor

Copy that! I'll add it to the doc, thank you.

Contributor

Seems it will work the same way as with the credentials for the provider (as far as adding it to the cluster).

Contributor

Yep, I think it'll work the same way, it's just a Secret with different contents.

cert file. Then the install command would do the heavy lifting of creating a
secret and updating the proper fields on the deployment and daemonset to mount
the secret at a well-defined path.

[1]: https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/master/velero-plugin-for-aws/object_store.go#L135
[2]: https://github.com/vmware-tanzu/velero/blob/master/pkg/restic/command.go#L47
[3]: https://github.com/restic/restic/blob/master/internal/backend/http_transport.go#L81
[4]: https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/master/velero-plugin-for-aws/object_store.go#L154