Vault provider #1
Signed-off-by: Oracle Public Cloud User <opc@k8ssecvm3.sub04190007390.secwest.oraclevcn.com>
This reverts commit b71835a.
Signed-off-by: Oracle Public Cloud User <opc@k8ssecvm3.sub04190007390.secwest.oraclevcn.com>
Does this need the target branch updated?
It would be nice to keep this branch synced with Kubernetes/kubernetes:master to avoid large merge conflicts at a later stage.
    client *api.Client
    encryptPath string
    decryptPath string
    authPath string
A go fmt would be good as there's some mixed/misaligned indentation here and a couple of other places.
Corrected the code using gofmt -w; the local verification script hack/make-rules/../../hack/verify-gofmt.sh passes now.
@vineet-garg My initial reaction: this commit is too large (110 files). Let's figure out offline how to break this up into smaller commits.
    }
    wrapper := &clientWrapper{
        client: client,
        encryptPath: "/v1/" + transit + "/encrypt/",
You should use path.Join for all the URL building. That will let you remove all the extra trim/slash checking above. Right now, if config.AuthPath isn't set, authPath below will be auth//. path.Join handles all that for you.
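A minimal sketch of the behaviour the reviewer is relying on, using only the standard library. The helper name `loginPath` and its arguments are hypothetical and are not taken from the PR; the point is only that `path.Join` skips empty elements and collapses repeated slashes.

```go
package main

import (
	"fmt"
	"path"
)

// loginPath builds a login path from an optional, user-supplied auth mount.
// The name and parameters are illustrative only.
func loginPath(authPath, method string) string {
	// path.Join ignores empty elements and cleans the result, so an unset
	// authPath cannot produce a double slash such as "auth//".
	return path.Join("auth", authPath, method, "login")
}

func main() {
	fmt.Println(loginPath("", "cert"))         // auth/cert/login
	fmt.Println(loginPath("/custom/", "cert")) // auth/custom/cert/login
}
```

Note that `path.Join` also drops any trailing slash, so code that appends further segments should join them as well rather than concatenating strings.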
    func newVaultClient(config *EnvelopeConfig) (*api.Client, error) {
        vaultConfig := api.DefaultConfig()
        vaultConfig.Address = config.Address
What are your thoughts on supporting ReadEnvironment() out of the box? In my experience, this is a feature almost everyone wants, since it lets you integrate seamlessly with existing Vault deployments that depend on the env vars.
You can see them all here.
It's a good idea, though we need more configuration parameters than just the client parameters, such as authentication parameters, key names, etc.
@jhorwit2 perhaps as a follow-on PR. I'd like to get the basic support in first.
        err = c.appRoleToken(config)
    default:
        err = fmt.Errorf("invalid authentication configuration %+v", config)
    }
Is this validation required? The config is validated after it is initially deserialized. Nothing that I saw modifies these values either.
It is not validation, just error handling in a default case. The default case is not expected to be reachable, but it's still good to have it.
Should add a comment then to say this is not an expected code path. The fact that it's there currently makes it seem like it's a possible outcome.
    result, ok := resp.Data["plaintext"].(string)
    if !ok {
        return result, fmt.Errorf("failed type assertion of vault decrypt response to string")
should add the type and/or value to the error
Values are secrets and cannot be put in error messages. The type "string" is already mentioned in the error message.
Sorry, I meant add the type of the response that was not a string: reflect.TypeOf(resp.Data["plaintext"])
fixed
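One way to report the unexpected type without leaking the secret value is the `%T` verb, which prints the same dynamic type that `reflect.TypeOf` would return. This is a sketch only; the helper `plaintextFromResponse` and the plain map argument are assumptions standing in for the PR's actual response handling.

```go
package main

import "fmt"

// plaintextFromResponse extracts the "plaintext" field from a decoded
// response map. The function name is hypothetical.
func plaintextFromResponse(data map[string]interface{}) (string, error) {
	raw := data["plaintext"]
	result, ok := raw.(string)
	if !ok {
		// Report only the dynamic type; the value may be secret material.
		return "", fmt.Errorf("failed type assertion of vault decrypt response to string: got %T", raw)
	}
	return result, nil
}

func main() {
	_, err := plaintextFromResponse(map[string]interface{}{"plaintext": 42})
	fmt.Println(err)
}
```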
    if resp != nil {
        defer resp.Body.Close()
    }
    if resp.StatusCode == 403 {
The way this is structured can cause this to panic if err is non-nil. It should be something like:
    resp, err := c.client.RawRequest(req)
    if err != nil {
        return nil, err
    }
    // resp is known to be non-nil here, so closing the body is safe.
    defer resp.Body.Close()
    if resp.StatusCode == 403 {
        return nil, &forbiddenError{version: c.version, err: err}
    }
    )

    func init() {
        KMSPluginRegistry.Register("vault", vault.KMSFactory)
This constant should be a variable in the vault package and used as the key when stripping / adding it to data.
Stripping and adding the provider name prefix on stored data is done in the core kmstransformer code, which we do not own. It takes the provider name from here and uses it for that particular provider.
Separately, there is also prefix processing in client.go, which just strips and adds the prefix used by Vault itself.
The two prefixes can vary separately.
@jhorwit2 Thanks for reviewing the code; it was very helpful. I have made most of the changes you suggested. Does the PR look good to you? Once the internal review is over, I will create a new branch that has only 2 commits (one for code changes, one for dependencies and build), sync it with the latest upstream master, and run all local verify/test/integration-test again.
        return err
    }

    func (c *clientWrapper) tlsToken(config *EnvelopeConfig) error {
I feel these token methods should return (string, error) and the token should only be set in refreshToken.
changed
    }

    // The function type for clientWrapper.encrypt and clientWrapper.decrypt.
    type encryptOrDecryptFunc func(*clientWrapper, string, string) (string, error)
The pattern here and withRefreshToken seems odd to me. Why doesn't the clientWrapper store the config that it needs to do authentication? That way you could just call encrypt/decrypt on the clientWrapper and it handles the refreshToken retry logic for you. It appears to me you always want to do the encrypt/decrypt action with the refresh token, so it should go down one level into the client wrapper.
Yes, putting the retry logic in the client is another option. The current implementation expects vault.go to handle provider logic (for example, parsing the config and handling keyNames) and client.go to handle all the details of communicating with the Vault server. I think "refresh token and retry" belongs to provider logic, because whether to retry, and how many times, can be configured by the provider if we want.
    // Get token by login and set the value to api.Client.
    func (c *clientWrapper) refreshToken(config *EnvelopeConfig, version uint) error {
        c.rwmutex.Lock()
I'm fairly certain there is a race condition here. If the token can only be used once before having to be renewed that would cause encrypt/decrypt calls to fail since they only retry once and multiple goroutines may block waiting on the same refresh token.
Yes, for tokens with a small number of uses, encrypt/decrypt may fail. Even with more retries (for example, 5), the failure may still happen. We avoided multiple retries only for simplicity. We can make it configurable if needed.
Is this planned to be addressed prior to opening upstream? That seems like a pretty severe issue.
We will address this. There will be a performance impact if the number of uses or the expiry time is small, but it will not fail.
    // We may update token for api.Client, but there is no sync for api.Client.
    // Read lock for encrypt/decrypt requests, write lock for login requests which
    // will update token for api.Client.
    rwmutex sync.RWMutex
Is it possible for these tokens to be renewable? If so, shouldn't we be able to do all this w/o locks by calling Renew on the client?
Not all Vault tokens are renewable, so we did not consider calling Renew on the Vault client.
Should it support renewable tokens as "first class" though instead of defaulting to a refresh approach? That seems like the best performance especially in a large cluster and the least complexity. (no locking required). If not now, I imagine we would want to in the future, so thinking about that in the current design would help make that easier. If you moved the refresh logic into the client you could have a refreshClientWrapper and in the future we could add a renewalClientWrapper with the same interface that's constructed dynamically based on info returned on the original token.
I would defer this to a follow-up PR.
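The design suggested above could look roughly like this: two wrappers behind one interface, chosen from metadata about the initial token. Everything here is a hypothetical sketch. The interface name `transitClient`, the `tokenInfo` struct, and the stubbed method bodies are assumptions, and only the names refreshClientWrapper and renewalClientWrapper come from the review comment.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotImplemented marks the stubbed-out operations in this sketch.
var errNotImplemented = errors.New("not implemented in this sketch")

// transitClient is a hypothetical interface shared by both strategies.
type transitClient interface {
	encrypt(keyName, plaintext string) (string, error)
	decrypt(keyName, ciphertext string) (string, error)
}

// tokenInfo is a hypothetical subset of the metadata returned at login.
type tokenInfo struct {
	renewable bool
}

// refreshClientWrapper would log in again when the token stops working.
type refreshClientWrapper struct{}

func (refreshClientWrapper) encrypt(keyName, plaintext string) (string, error) {
	return "", errNotImplemented
}

func (refreshClientWrapper) decrypt(keyName, ciphertext string) (string, error) {
	return "", errNotImplemented
}

// renewalClientWrapper would renew the token before it expires.
type renewalClientWrapper struct{}

func (renewalClientWrapper) encrypt(keyName, plaintext string) (string, error) {
	return "", errNotImplemented
}

func (renewalClientWrapper) decrypt(keyName, ciphertext string) (string, error) {
	return "", errNotImplemented
}

// newTransitClient picks a strategy from the initial token's metadata.
func newTransitClient(info tokenInfo) transitClient {
	if info.renewable {
		return renewalClientWrapper{}
	}
	return refreshClientWrapper{}
}

func main() {
	fmt.Printf("%T\n", newTransitClient(tokenInfo{renewable: true}))
	fmt.Printf("%T\n", newTransitClient(tokenInfo{renewable: false}))
}
```

Because callers depend only on the interface, a renewal-based implementation could be added later without touching the provider code.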
    client: client,
    encryptPath: "/" + path.Join("v1", transit, "encrypt") + "/",
    decryptPath: "/" + path.Join("v1", transit, "decrypt") + "/",
    authPath: "/" + path.Join(auth) + "/",
You don't need to prefix or suffix this with slashes; it can just be path.Join(stuff...).
fixed
    }

    func (c *clientWrapper) tlsToken(config *EnvelopeConfig) (string, error) {
        resp, err := c.client.Logical().Write(c.authPath+"cert/login", nil)
Use path.Join here too; the same goes for appRoleToken, decrypt, and any other request calls.
fixed
        }
        return secret, nil
    }
    return nil, fmt.Errorf("Unexpected response code: %v received for POST request to %v", resp.StatusCode, path)
Error strings should start with a lowercase letter (this goes for other errors too, like the one below): https://github.com/golang/go/wiki/CodeReviewComments#error-strings
fixed
        return result, err
    }

    secret, requestErr := f(s.client, key, data)
To me this still appears like it can fail due to race conditions if the token has a limited number of uses and/or there are enough goroutines making calls to encrypt/decrypt.
Oh, I see what you changed here. This shouldn't technically error anymore; however, it could perform badly when multiple goroutines all fail during the read-lock calls and then block on the write mutex. I think in its current state it's fine, but I feel there may be pushback upstream.
Given the usage, contention is unlikely:
- The framework code caches the DEK, so most decrypt operations will not require a call to the KMS provider.
- Encrypt operations will not be frequent, as encryption happens only when a secret is created or updated.
- Multiple secrets are not expected to be created or updated concurrently on a frequent basis.
- If the token policy is chosen to have a reasonable number of uses and expiry time, 403 responses will not be that frequent.
As long as it is technically guaranteed to work in the rare worst-case scenario, at degraded performance, I think it should be good. I can put this explanation in comments so that someone can read it and validate the assumptions.
👍 great explanation. Thanks!
    }

    for _, testCase := range invalidConfigs {
        _, err := serviceTestFactory(testCase.config, server.URL, key)
Should use https://golang.org/pkg/testing/#T.Run
fixed
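A minimal sketch of a table-driven test that uses T.Run so each invalid config is reported as its own subtest. The `config` struct and `validate` function are hypothetical stand-ins for EnvelopeConfig and serviceTestFactory. In a real package the test function would live in a _test.go file; `main` is included here only so the file runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

// config is a hypothetical, trimmed-down stand-in for the provider config.
type config struct {
	address string
	token   string
}

// validate is a hypothetical check used only to give the test a subject.
func validate(c config) error {
	if c.address == "" {
		return errors.New("address is required")
	}
	if c.token == "" {
		return errors.New("token is required")
	}
	return nil
}

// TestInvalidConfigs runs one named subtest per table entry.
func TestInvalidConfigs(t *testing.T) {
	invalidConfigs := []struct {
		name   string
		config config
	}{
		{name: "missing address", config: config{token: "x"}},
		{name: "missing token", config: config{address: "https://vault.example"}},
	}
	for _, testCase := range invalidConfigs {
		testCase := testCase // capture the loop variable for the closure
		t.Run(testCase.name, func(t *testing.T) {
			if err := validate(testCase.config); err == nil {
				t.Errorf("expected an error for %+v", testCase.config)
			}
		})
	}
}

func main() {
	fmt.Println(validate(config{}))
}
```

With subtests, a single failing case can be rerun by name using the -run flag of go test.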
@vineet-garg I still have some worries about the locking: race conditions under enough load could cause errors, since it only retries once. nit: I feel the tests could be cleaned up and consolidated using table-driven tests, which are the de facto standard in k8s. For example, Also, the tests are fairly integration-testy. Edit: the example above was just something I threw together. I don't know if it's correct, but those tests appear very similar.
@jhorwit2
Closing this pull request, as it has a large number of unorganized commits. The review comments have been incorporated and the commits organized in a new pull request: #4
Adding recent upstream changes to k8s.
update from kubernetes master
What this PR does / why we need it: Implements an encryption provider based on Vault-based KMS, as described in the proposal: PR:888
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes # 49817
Special notes for your reviewer:
Release note: encryption provider based on Vault based KMS