Pluggable secret backend #2239

Merged
diogomonica merged 1 commit into docker:master from twistlock:pluggable_secret_backend on Jul 14, 2017

7 participants
@liron-l
Contributor

commented Jun 11, 2017

This commit extends SwarmKit secret management with pluggable secret
backends support. The solution uses the existing docker plugin
framework for loading plugins and the existing SwarmKit data backend for
storing them.

The approach is to add a new driver parameter to existing secrets,
which defines whether the values are taken as is or fetched from one of
the secret plugins. The loading of secrets is done using the standard
docker plugin infrastructure, which is already accessible in SwarmKit
and used in other flows (e.g., networking).
The fetched values are stored as regular SwarmKit secrets.

Remarks:

  • I've added support for mocking the plugin subsystem when setting up
    the controlapi server.
    I preferred this approach over loading the full plugin subsystem in UT.

Work still needed in this CR:

  • More unit tests (pending initial iteration)
  • Customized error handling (e.g., customize error string for Not
    Found)

Work still needed to complete this feature:

  • Inject secrets as part of plugin initialization
  • CLI support in docker
  • Docs
  • Support scheduling plugins in swarm
    moby/moby#33575

Signed-off-by: liron <liron@twistlock.com>

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch 3 times, most recently from 23a2ee6 to f966946 Jun 11, 2017

@codecov

commented Jun 11, 2017

Codecov Report

Merging #2239 into master will increase coverage by 0.03%.
The diff coverage is 81.25%.

@@            Coverage Diff            @@
##           master   #2239      +/-   ##
=========================================
+ Coverage   61.07%   61.1%   +0.03%     
=========================================
  Files         128     128              
  Lines       20556   20579      +23     
=========================================
+ Hits        12554   12575      +21     
+ Misses       6627    6619       -8     
- Partials     1375    1385      +10
@@ -386,6 +386,9 @@ message SecretSpec {

// Data is the secret payload - the maximum size is 500KB (that is, 500*1024 bytes)
bytes data = 2;

// Driver is the name of the secret driver that is used to store the specified secret
string driver = 3;

@aaronlehmann

aaronlehmann Jun 12, 2017

Collaborator

I wonder if it makes sense to use the Driver type here, to allow the future possibility of passing secret-specific options to the driver.

const MaxSecretSize = 500 * 1024 // 500KB
const (
// SecretsPluginAPI is the endpoint for fetching secrets from plugins
SecretsPluginAPI = "/SecretsDriver.GetSecret"

@cpuguy83

cpuguy83 Jun 12, 2017

Contributor

SecretProvider? Same with the capability name.

@@ -157,12 +169,16 @@ func (s *Server) ListSecrets(ctx context.Context, request *api.ListSecretsReques
// or if the secret data is too long or contains invalid characters.
// - Returns an error if the creation fails.
func (s *Server) CreateSecret(ctx context.Context, request *api.CreateSecretRequest) (*api.CreateSecretResponse, error) {
err := s.populateSecretFromPlugin(ctx, request.Spec)

@aaronlehmann

aaronlehmann Jun 12, 2017

Collaborator

Another alternative would be to put this resolution in the dispatcher, so the secret is fetched at the time the task is sent to the node where it will run. This is what I suggested earlier. The advantages would be:

  • swarmkit would not be modifying the secret Spec. Specs are meant to be under the control of the user, and so far swarmkit never changes them. We're considering signing specs with a user-controlled key in the future.
  • Calling UpdateSecret (for example, to change the secret's labels) would not have side effects on the payload, if it happened to change in the backend.
  • We would avoid storing a copy of the secret payload inside the Raft datastore, which people may not want to do for security reasons.

The disadvantages would be:

  • Inability to access the secrets backend could block deploying tasks, not just creating/updating secrets.
  • To avoid redundant queries to the backend for each of N tasks that reference a secret, it would make sense to have an in-memory cache of secret payloads with a time-to-live, but this would add complexity.
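
To make the caching trade-off above concrete, here is a minimal sketch of an in-memory payload cache with a time-to-live. It is not code from this PR: `ttlCache`, `fetchFunc`, and `newTTLCache` are hypothetical names, and the backend is modelled as a plain function. For simplicity the lock is held while the backend is queried, which a real implementation would likely want to avoid.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchFunc is a hypothetical stand-in for a query to the external secrets backend.
type fetchFunc func(name string) ([]byte, error)

// cacheEntry holds one payload and the time after which it is considered stale.
type cacheEntry struct {
	payload []byte
	expires time.Time
}

// ttlCache is a hypothetical in-memory cache of secret payloads.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock
	fetch   fetchFunc
	entries map[string]cacheEntry
}

func newTTLCache(ttl time.Duration, fetch fetchFunc) *ttlCache {
	return &ttlCache{ttl: ttl, now: time.Now, fetch: fetch, entries: make(map[string]cacheEntry)}
}

// Get returns the cached payload while it is fresh; otherwise it queries the
// backend and stores the result with a new expiry time.
func (c *ttlCache) Get(name string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[name]; ok && c.now().Before(e.expires) {
		return e.payload, nil
	}
	payload, err := c.fetch(name)
	if err != nil {
		return nil, err
	}
	c.entries[name] = cacheEntry{payload: payload, expires: c.now().Add(c.ttl)}
	return payload, nil
}

func main() {
	calls := 0
	cache := newTTLCache(time.Minute, func(name string) ([]byte, error) {
		calls++
		return []byte("payload-for-" + name), nil
	})
	// Three tasks referencing the same secret trigger a single backend query.
	for i := 0; i < 3; i++ {
		if _, err := cache.Get("db-password"); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("backend calls:", calls)
}
```

With a one-minute TTL, the three lookups in `main` reach the backend once. The cost is the extra state and the staleness window, which is the complexity the comment above refers to.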

@liron-l

liron-l Jun 12, 2017

Author Contributor

@diogomonica @cpuguy83 what do you think?
@aaronlehmann I think caching should be the plugin's responsibility. WDYT?

@cpuguy83

cpuguy83 Jun 12, 2017

Contributor

SGTM @aaronlehmann

One way around these issues is to have a separate call that the manager runs here to ensure that the secret is accessible before sending it for dispatch. It doesn't need to store anything.

@cpuguy83

cpuguy83 Jun 12, 2017

Contributor

I also think we should create a driver interface that has implementations for the built-in store and for plugins.
Then it's a simple d := GetDriver(spec.Driver); d.<method>
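
The suggestion above could look roughly like the following sketch. All names here (`SecretDriver`, `builtinDriver`, `pluginDriver`, `registry`) are hypothetical and chosen for illustration only; they are not the types that were merged. An empty driver name is assumed to select the built-in store.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretDriver is a hypothetical interface shared by all secret sources.
type SecretDriver interface {
	Get(secretName string) ([]byte, error)
}

// builtinDriver serves payloads that the manager already holds.
type builtinDriver struct{ store map[string][]byte }

func (d builtinDriver) Get(name string) ([]byte, error) {
	v, ok := d.store[name]
	if !ok {
		return nil, errors.New("secret not found: " + name)
	}
	return v, nil
}

// pluginDriver delegates to an external provider, modelled here as a function.
type pluginDriver struct {
	call func(name string) ([]byte, error)
}

func (d pluginDriver) Get(name string) ([]byte, error) { return d.call(name) }

// registry maps driver names to implementations.
type registry struct {
	builtin SecretDriver
	plugins map[string]SecretDriver
}

// GetDriver returns the built-in driver for an empty name, or the named plugin.
func (r registry) GetDriver(name string) (SecretDriver, error) {
	if name == "" {
		return r.builtin, nil
	}
	d, ok := r.plugins[name]
	if !ok {
		return nil, fmt.Errorf("unknown secret driver %q", name)
	}
	return d, nil
}

func main() {
	reg := registry{
		builtin: builtinDriver{store: map[string][]byte{"api-key": []byte("stored-locally")}},
		plugins: map[string]SecretDriver{
			"example-plugin": pluginDriver{call: func(name string) ([]byte, error) {
				return []byte("fetched-for-" + name), nil
			}},
		},
	}
	for _, driverName := range []string{"", "example-plugin"} {
		d, err := reg.GetDriver(driverName)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		payload, err := d.Get("api-key")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("driver %q -> %s\n", driverName, payload)
	}
}
```

The caller only sees `GetDriver(...)` followed by `Get(...)`, so the built-in path and the plugin path need no separate branches at the call site.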

@liron-l

liron-l Jun 13, 2017

Author Contributor

Thanks @aaronlehmann, the assignment solution still means that we need to modify the secret spec payload inside Assignment_Secret before we send the assignment (that is, after the secret query in addTaskDependencies). Is this a valid approach?
I think your first concern might cause debugging issues with secret plugins, @diogomonica WDYT?

@liron-l

liron-l Jun 13, 2017

Author Contributor

@cpuguy83 I've abstracted the plugin/driver initialization and setup with a new drivers package. This package can be used to load any type of plugin in Swarmkit. Let me know what you think.

@aaronlehmann

aaronlehmann Jun 13, 2017

Collaborator

> Thanks @aaronlehmann, the assignment solution still means that we need to modify the secret spec payload inside Assignment_Secret before we send the assignment (that is, after the secret query in addTaskDependencies). Is this a valid approach?

That's a fair point, however I think it's preferable to modify the spec for last-mile delivery versus storing a modified version in the data store. One possibility to avoid modifying the spec at all would be to introduce another Data field outside of Spec that could be freely modified by the manager. When the worker receives a secret, it would check both fields to see which one is populated. @diogomonica WDYT?
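
As a rough illustration of the "second Data field outside of Spec" idea, the sketch below uses simplified, hypothetical types. `ResolvedData` is an invented field name and `effectivePayload` an invented helper; neither comes from the SwarmKit API. The manager-controlled field is assumed to take precedence when it is populated.

```go
package main

import "fmt"

// SecretSpec is a simplified stand-in for the user-controlled spec.
type SecretSpec struct {
	Name string
	Data []byte
}

// Secret pairs the untouched spec with a hypothetical manager-controlled field.
type Secret struct {
	Spec         SecretSpec
	ResolvedData []byte // filled in by the manager for externally backed secrets
}

// effectivePayload is what a worker would use: the manager-populated value if
// present, otherwise the payload carried in the spec.
func effectivePayload(s Secret) []byte {
	if len(s.ResolvedData) > 0 {
		return s.ResolvedData
	}
	return s.Spec.Data
}

func main() {
	internal := Secret{Spec: SecretSpec{Name: "internal", Data: []byte("from-spec")}}
	external := Secret{Spec: SecretSpec{Name: "external"}, ResolvedData: []byte("from-driver")}
	fmt.Printf("%s\n%s\n", effectivePayload(internal), effectivePayload(external))
}
```

Because the spec is never rewritten in this arrangement, it could still be signed by the user while the resolved value travels alongside it.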

@diogomonica

diogomonica Jun 18, 2017

Contributor

@aaronlehmann I like that approach. Would still allow us to sign the spec itself, and if the secret is external, that component would be unsigned.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch 13 times, most recently from 6d827fc to 1ac46c1 Jun 13, 2017

@liron-l

Contributor Author

commented Jun 18, 2017

@aaronlehmann @cpuguy83, @diogomonica I moved the secret resolution to the assignmentSet flow. Let me know if this makes sense and I will add dedicated UT.

}

// Get gets a secret from the secret provider
func (d *SecretDriver) Get(spec *api.SecretSpec) ([]byte, error) {

@diogomonica

diogomonica Jun 18, 2017

Contributor

Do we want to provide an api.SecretSpec here? It would be great if the driver got information about which service is requesting this secret.

@liron-l

liron-l Jun 19, 2017

Author Contributor

@diogomonica @cpuguy83 @aaronlehmann, I can think of two options for passing service parameters to the plugin in a deterministic, backward-compatible way:

  1. Send the Task as a binary blob inside the request
  2. Use the ServiceAnnotations

Do you have other ideas?

@diogomonica

diogomonica Jul 7, 2017

Contributor

Riyaz and I came up with a list. For now let's not worry about passing more stuff and agree on the general API.

@liron-l

liron-l Jul 8, 2017

Author Contributor

Great, thanks!


// populateSecretFromDriver populates the secret value for the given specification using the secret plugin subsystem.
func (a *assignmentSet) populateSecretFromDriver(spec *api.SecretSpec) error {
if spec == nil || spec.Driver.Name == "" {

@diogomonica

diogomonica Jun 18, 2017

Contributor

Why not use validateSecretSpec

@liron-l

liron-l Jun 19, 2017

Author Contributor

Thanks, I've added shared code to validate the secret payload, after the value is populated.

}

// SecretsProviderRequest is the request specification for retrieving secrets from plugins.
type SecretsProviderRequest struct {

@diogomonica

diogomonica Jun 18, 2017

Contributor

It would be great to have extra metadata about the caller of the request, so that an external plugin can issue customized secrets.

@liron-l

liron-l Jun 19, 2017

Author Contributor

I passed the ServiceAnnotations, I hope this makes sense.

@riyazdf

riyazdf Jun 22, 2017

From the discussion in Slack: it might be more useful to pass the full ServiceSpec, though that would require converting it into JSON.

@diogomonica

Contributor

commented Jun 18, 2017

Overall I think this is a good start. We need to figure out the best way of providing external metadata on the ultimate service this secret is being requested for, such that an external plugin can issue secrets for a specific service.

An example would be a TLS certificate that gets issued on-the-fly by the secrets plugin, and needs to include the service name/other metadata on the x509 certificate that depends on the service itself.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 1ac46c1 to 4a4f588 Jun 19, 2017

@liron-l

Contributor Author

commented Jun 19, 2017

@diogomonica @cpuguy83 @aaronlehmann
I published another iteration with the following changes:

  1. Moved the secret resolution to the dispatcher next to fetching the secret values from raft store.
  2. Added the ServiceAnnotations metadata to the driver request
  3. Added a simple dispatcher UT for resolving the secret values
assert.NoError(t, err)
defer stream.CloseSend()

time.Sleep(500 * time.Millisecond)

@aaronlehmann

aaronlehmann Jun 19, 2017

Collaborator

Is the sleep necessary? Recv is a blocking function.

@liron-l

liron-l Jun 22, 2017

Author Contributor

Thanks, fixed

@@ -238,16 +238,26 @@ func (s *Server) RemoveSecret(ctx context.Context, request *api.RemoveSecretRequ
}
}

// ValidateSecretPayload validates the secret payload size
func ValidateSecretPayload(data []byte) error {

@aaronlehmann

aaronlehmann Jun 19, 2017

Collaborator

I'd prefer to move this to a subpackage like api/secret or api/validation, so that dispatcher doesn't import controlapi.

@liron-l

liron-l Jun 22, 2017

Author Contributor

Moved to api/validation/secrets.go, I kept the prefix validation in the function name, let me know if you prefer validation.SecretPayload

@@ -15,16 +15,23 @@ import (
"google.golang.org/grpc/codes"
"google.golang.org/grpc/credentials"

"encoding/json"

@aaronlehmann

aaronlehmann Jun 19, 2017

Collaborator

nit: Standard library imports such as this are typically put in the top section of the import statement, sorted alphabetically.

@liron-l

liron-l Jun 22, 2017

Author Contributor

Thanks fixed

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 4a4f588 to 8b88006 Jun 22, 2017

@liron-l

Contributor Author

commented Jun 22, 2017

Thanks @aaronlehmann I've updated the review according to your comments.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 1ab8d83 to 8533e89 Jul 2, 2017

@liron-l

Contributor Author

commented Jul 2, 2017

Thanks for all comments, @aaronlehmann, @diogomonica, @cpuguy83.
I've updated the review according to the comments, please take a look.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 8533e89 to 84bd7cc Jul 5, 2017

@liron-l

Contributor Author

commented Jul 5, 2017

Per discussion with @diogomonica, I removed ServiceSpec from the plugin request. Additional service properties will be added if required.


if len(spec.Data) >= MaxSecretSize || len(spec.Data) < 1 {
return grpc.Errorf(codes.InvalidArgument, "secret data must be larger than 0 and less than %d bytes", MaxSecretSize)
if spec.Driver.Name != "" {

@cpuguy83

cpuguy83 Jul 5, 2017

Contributor

I assume the built-in backend must have a name, or is likely to have a name at some point just to be explicit... maybe this check is not sufficient?

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @cpuguy83, I made this type nullable, and if defined, validation will throw an error if no name is specified.

@@ -393,6 +393,9 @@ message SecretSpec {
// The currently recognized values are:
// - golang: Go templating
Driver templating = 3;

// Driver is the secret driver that is used to store the specified secret
Driver driver = 4 [(gogoproto.nullable) = false];

@cpuguy83

cpuguy83 Jul 5, 2017

Contributor

Are we sure this shouldn't be nullable?
I see a lot of checks for if spec.Driver.Name != "" {}

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @cpuguy83, I made this nullable. I now distinguish between a nil driver (OK) and an initialized driver, which must have a name (added validation).
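
A minimal sketch of the validation rule described here, using simplified stand-in types rather than the generated `api` package. `validateSecretSpec` is modelled on the name mentioned earlier in this review, but its body is an assumption: a nil driver means the built-in store and requires a payload within the size limit, while a non-nil driver must carry a name. Skipping the payload check for driver-backed specs at this point is also an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// MaxSecretSize matches the 500KB limit quoted in the diff.
const MaxSecretSize = 500 * 1024

// Driver and SecretSpec are simplified stand-ins for the api types.
type Driver struct{ Name string }

type SecretSpec struct {
	Data   []byte
	Driver *Driver // nil selects the built-in store
}

// validateSecretSpec checks the driver first; only specs without a driver
// must carry a payload within the size limit.
func validateSecretSpec(spec *SecretSpec) error {
	if spec == nil {
		return errors.New("secret spec must not be nil")
	}
	if spec.Driver != nil {
		if spec.Driver.Name == "" {
			return errors.New("secret driver must have a name")
		}
		return nil // assumption: the payload is validated after the driver populates it
	}
	if len(spec.Data) >= MaxSecretSize || len(spec.Data) < 1 {
		return fmt.Errorf("secret data must be larger than 0 and less than %d bytes", MaxSecretSize)
	}
	return nil
}

func main() {
	specs := []*SecretSpec{
		{Data: []byte("inline-value")},            // valid: built-in store
		{},                                        // invalid: no driver, no data
		{Driver: &Driver{Name: "example-plugin"}}, // valid: named driver
		{Driver: &Driver{}},                       // invalid: driver without a name
	}
	for i, s := range specs {
		fmt.Printf("spec %d: %v\n", i, validateSecretSpec(s))
	}
}
```

Making the field a pointer is what lets the code tell "no driver requested" apart from "driver requested but left unnamed", which a non-nullable struct with an empty name cannot express.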

@@ -245,3 +260,24 @@ func (a *assignmentSet) message() api.AssignmentsMessage {

return message
}

// populateSecretFromDriver populates the secret value for the given specification using the secret plugin subsystem.
func (a *assignmentSet) populateSecretFromDriver(spec *api.SecretSpec, readTx store.ReadTx) error {

@cpuguy83

cpuguy83 Jul 5, 2017

Contributor

This function still confuses me a bit. How is the secret populated for the built-in raft store?

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @cpuguy83, the secret spec was populated by the calling method. I merged everything into a single function (secret), which fetches the value from the raft store and populates the secret value if needed. To keep the flow consistent, I've changed the Debug message to Error for the case where the secret is not found in the raft store. I hope this makes more sense now.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 56b9b2f to dc7d83a Jul 5, 2017

@liron-l

Contributor Author

commented Jul 5, 2017

Thanks for the comments @cpuguy83, I've updated the review based on your feedback.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from dc7d83a to e776b92 Jul 5, 2017

@@ -63,4 +69,5 @@ var createCmd = &cobra.Command{

func init() {
createCmd.Flags().StringP("file", "f", "", "Rather than read the secret from STDIN, read from the given file")
createCmd.Flags().StringP("driver", "d", "", "The secret driver")

@riyazdf

riyazdf Jul 5, 2017

nit: I think we should note that it is STDIN if not specified

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @riyazdf, I changed it according to the file flag format.

// Check if secret driver is defined
if spec.Driver != nil {
// Ensure secret driver has a name
if spec.Driver.Name == "" {

@riyazdf

riyazdf Jul 5, 2017

I might be tracing the code incorrectly but it seems that we could have a non-nil driver with an empty name? See: https://github.com/docker/swarmkit/pull/2239/files#diff-b7cdf7ddfbe8b31d75bc99e8d2d0fa78R58

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @riyazdf, you are right, I modified the create flow accordingly.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from e776b92 to 7df28e3 Jul 5, 2017

@liron-l

Contributor Author

commented Jul 5, 2017

Thanks @riyazdf, let me know if you think we should make the Driver non-nullable type.


// secret populates the secret value from raft store. For external secrets, the value is populated
// from the secret driver.
func (a *assignmentSet) secret(secretID string, readTx store.ReadTx) (*api.Secret, error) {

@cpuguy83

cpuguy83 Jul 5, 2017

Contributor

Just a nit, I like to pass the transaction first.

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks, makes sense, I didn't notice the order.

// secret populates the secret value from raft store. For external secrets, the value is populated
// from the secret driver.
func (a *assignmentSet) secret(secretID string, readTx store.ReadTx) (*api.Secret, error) {
secret := store.GetSecret(readTx, secretID)

@cpuguy83

cpuguy83 Jul 5, 2017

Contributor

Should we only do this if the driver is nil?

@liron-l

liron-l Jul 5, 2017

Author Contributor

Thanks @cpuguy83. As far as I understand, no, for two reasons:

  1. We need to fetch the actual Driver object in case additional Driver metadata is needed to initialize the plugin.
  2. We need the api.Secret to create the assignment. Since I have the secret ID, name and value, I could probably generate this object, but I feel that using the raft store is more robust.

WDYT?

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 7df28e3 to b400d18 Jul 5, 2017

@riyazdf

commented Jul 5, 2017

@liron-l: thanks! I think nullable makes sense, though I'll let you know if I think of a reason we should make it non-nullable 👍

@liron-l

Contributor Author

commented Jul 8, 2017

Thanks @aaronlehmann, @riyazdf and @cpuguy83, I hope the last iteration satisfies all requirements.

// from the secret driver.
func (a *assignmentSet) secret(readTx store.ReadTx, secretID string) (*api.Secret, error) {
secret := store.GetSecret(readTx, secretID)
if secret == nil {

@cpuguy83

cpuguy83 Jul 10, 2017

Contributor

Looks like this is returning any time the local store doesn't have the secret.

I still think it's better to check the local store only if driver was not specified.

@liron-l

liron-l Jul 10, 2017

Author Contributor

@cpuguy83 but if a driver was specified, I still need to query the api.Secret object to fetch the api.Driver (to correctly materialize the secret).

@cpuguy83

cpuguy83 Jul 10, 2017

Contributor

Oh, I see now.
Thanks.

@riyazdf

riyazdf Jul 10, 2017

I'm not sure I follow: doesn't this function return if the store.GetSecret returns nil?

@cpuguy83

cpuguy83 Jul 10, 2017

Contributor

The secret metadata is still stored in the raft store and must exist there.

@liron-l

liron-l Jul 10, 2017

Author Contributor

@riyazdf, just to clarify: following @cpuguy83's comments, I consolidated all secret fetching functionality into a single function.
The flow:

  1. Fetch the secret (since the secret is required by the task, return an error if it is not found)
  2. If a secret driver is defined, fetch the secret value from the driver (otherwise return the secret value)

Let me know if additional refactoring is needed.
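
The two-step flow above can be sketched as follows. This is an illustration with simplified, hypothetical types: the raft store is modelled as a map, the driver as a function, and `resolveSecret` and `provider` are invented names rather than the functions in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the api types discussed in this thread.
type Driver struct{ Name string }

type SecretSpec struct {
	Name   string
	Data   []byte
	Driver *Driver
}

type Secret struct {
	ID   string
	Spec SecretSpec
}

// provider models a secret driver as a plain function.
type provider func(spec SecretSpec) ([]byte, error)

// resolveSecret first loads the secret record, then, if a driver is set,
// replaces the payload on a copy with the value fetched from that driver.
func resolveSecret(store map[string]Secret, providers map[string]provider, id string) (Secret, error) {
	secret, ok := store[id]
	if !ok {
		// Step 1: the record (including driver metadata) must exist in the store.
		return Secret{}, fmt.Errorf("secret %s not found", id)
	}
	if secret.Spec.Driver == nil {
		return secret, nil // internal secret: the stored payload is used as is
	}
	// Step 2: external secret, so ask the named driver for the value.
	fetch, ok := providers[secret.Spec.Driver.Name]
	if !ok {
		return Secret{}, fmt.Errorf("secret driver %q is not available", secret.Spec.Driver.Name)
	}
	data, err := fetch(secret.Spec)
	if err != nil {
		return Secret{}, err
	}
	if len(data) == 0 {
		return Secret{}, errors.New("secret driver returned an empty payload")
	}
	secret.Spec.Data = data // secret is a copy; the stored record is unchanged
	return secret, nil
}

func main() {
	store := map[string]Secret{
		"s1": {ID: "s1", Spec: SecretSpec{Name: "internal", Data: []byte("raft-value")}},
		"s2": {ID: "s2", Spec: SecretSpec{Name: "external", Driver: &Driver{Name: "example-plugin"}}},
	}
	providers := map[string]provider{
		"example-plugin": func(spec SecretSpec) ([]byte, error) {
			return []byte("value-for-" + spec.Name), nil
		},
	}
	for _, id := range []string{"s1", "s2", "s3"} {
		s, err := resolveSecret(store, providers, id)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", s.Spec.Name, s.Spec.Data)
	}
}
```

The lookup for "s3" fails at step 1, which matches the point made in this thread that the secret's metadata must exist in the store even when the value comes from a driver.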

@riyazdf

riyazdf Jul 10, 2017

got it. Thanks @cpuguy83 and @liron-l for clarifying. :)


// SecretsProviderRequest is the request specification for retrieving secrets from plugins.
type SecretsProviderRequest struct {
Name string `json:"name"` // Name is the name of the secret plugin

@cpuguy83

cpuguy83 Jul 10, 2017

Contributor

s/plugin//

@liron-l

liron-l Jul 10, 2017

Author Contributor

Thanks fixed.
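
For readers unfamiliar with how such a request reaches a plugin, the sketch below shows the JSON body that the `json:"name"` tag would produce. Only the `Name` field is taken from the diff; any other fields of the real request type are not shown in this thread and are left out. `encodeRequest` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SecretsProviderRequest reproduces only the field visible in the diff.
type SecretsProviderRequest struct {
	Name string `json:"name"` // Name is the name of the secret
}

// encodeRequest is a hypothetical helper that serializes a request body.
func encodeRequest(secretName string) (string, error) {
	b, err := json.Marshal(SecretsProviderRequest{Name: secretName})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodeRequest("db-password")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body) // prints {"name":"db-password"}
}
```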

@cpuguy83
Contributor

left a comment

Not all that familiar with the swarmkit side of things, but this code LGTM.

Not a fan of plugingetter, but it's what's available from docker right now.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from b400d18 to 184a88a Jul 10, 2017

@riyazdf

left a comment

LGTM, thank you @liron-l for the hard work!

@liron-l

Contributor Author

commented Jul 11, 2017

Thanks so much @riyazdf @cpuguy83 @aaronlehmann, @diogomonica who else needs to review this commit before merging?

@@ -1,9 +1,12 @@
package dispatcher

import (
"fmt"

@aaronlehmann

aaronlehmann Jul 11, 2017

Collaborator

minor nit: normally there would be a blank line here

@liron-l

liron-l Jul 12, 2017

Author Contributor

Thanks @aaronlehmann fixed.

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 184a88a to 93a76cb Jul 12, 2017

Pluggable secret backend
This commit extends SwarmKit secret management with pluggable secret
backends support. The solution uses the existing docker plugin
framework for loading plugins and the existing SwarmKit data backend for
storing them.

The approach is to add a new `driver` parameter to existing secrets,
which defines whether the values are taken as is or fetched from one of
the secret plugins. The loading of secrets is done using the standard
docker plugin infrastructure, which is already accessible in SwarmKit
and used in other flows (e.g., networking).
The fetched values are evaluated before assigning them to worker nodes,
so the payload is not stored in the raft store.

Remarks:
* I've added support for mocking the plugin subsystem when setting up
the controlapi server.
I preferred this approach over loading the full plugin subsystem in UT.

Work still needed in this CR:
- [ ] More unit tests (pending initial iteration)
- [ ] Customized error handling (e.g., customize error string for Not
Found)

Work still needed to complete this feature:
- [ ] Inject secrets as part of plugin initialization
- [ ] CLI support in docker
- [ ] Docs
- [ ] Support scheduling plugins in swarm
moby/moby#33575

Signed-off-by: liron <liron@twistlock.com>

@liron-l liron-l force-pushed the twistlock:pluggable_secret_backend branch from 93a76cb to e9a7bc0 Jul 12, 2017

@liron-l

Contributor Author

commented Jul 14, 2017

Thanks @aaronlehmann, @cpuguy83, @riyazdf, @diogomonica.
What is the process of merging this change?

@diogomonica diogomonica merged commit eebac27 into docker:master Jul 14, 2017

3 checks passed

ci/circleci: Your tests passed on CircleCI!
codecov/project: 61.1% (target 0%)
dco-signed: All commits are signed