---
title: Out-of-Tree Credential Providers
authors:
- "@mcrute"
- "@nckturner"
owning-sig: sig-cloud-provider
participating-sigs:
- sig-node
- sig-auth
reviewers:
- "@andrewsykim"
- "@cheftako"
- "@tallclair"
- "@mikedanese"
approvers:
- "@andrewsykim"
- "@cheftako"
- "@tallclair"
- "@mikedanese"
editor: TBD
creation-date: 2019-10-04
last-updated: 2019-12-10
status: implementable
---

# Out-of-Tree Credential Providers

## Table of Contents

<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [External Credential Provider](#external-credential-provider)
- [Example](#example)
- [Alternatives Considered](#alternatives-considered)
- [API Server Proxy](#api-server-proxy)
- [Sidecar Credential Daemon](#sidecar-credential-daemon)
- [Bound Service Account Token Flow](#bound-service-account-token-flow)
- [Pushing Credential Management into the CRI](#pushing-credential-management-into-the-cri)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Graduation Criteria](#graduation-criteria)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Version Skew Strategy](#version-skew-strategy)
- [Implementation History](#implementation-history)
- [Infrastructure Needed](#infrastructure-needed)
<!-- /toc -->

## Release Signoff Checklist

- [x] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [x] KEP approvers have set the KEP status to `implementable`
- [x] Design details are appropriately documented
- [x] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [x] Graduation criteria is in place
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

## Summary

This KEP replaces the existing in-tree container image registry credential providers with an external and pluggable credential provider mechanism and removes the in-tree credential providers.

## Motivation

Kubelet uses cloud provider specific SDKs to obtain credentials when pulling container images from cloud provider specific registries. The use of cloud provider specific SDKs from within the main Kubernetes tree is deprecated by [KEP-0002](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/20180530-cloud-controller-manager.md) and all existing uses need to be migrated out-of-tree. This KEP supports that migration process by removing this SDK usage.

### Goals
> **Review thread (on Goals):** Is it a goal to continue to support programmatic access?
>
> The current credentialprovider package is depended on by ggcr (github.com/google/go-containerregistry) and, indirectly, by the consumers of ggcr. This package has become difficult to consume, hence kubernetes/kubernetes#82396.
>
> If programmatic access is dropped, ggcr and the projects based on it will need to include cloud-provider-specific binaries in their releases, which will complicate the packaging of those releases and make them much less convenient to install.
>
> It may be tempting to limit this KEP to kubelet's needs, but some of the above projects provide important infrastructure in support of Kubernetes. For example, an image relocation project based on ggcr is used to copy images to a private registry, which is useful when running Kubernetes in disconnected environments.

> **Reviewer:** FWIW, both Knative and Tekton build on this.

> **Reviewer:** Isn't this exactly why we discourage depending on kubernetes/kubernetes? We want to be able to make design decisions based on what is best for Kubernetes, not the library. This seems like a non-goal to me. That said, we could take steps to make this less painful: by providing a common interface or reference implementation, it would be a lot easier to pull the credential providers together into a single library (assuming they're open source, implemented in Go, and use the provided interface).

> **@nckturner:** I think I agree with @tallclair. I added "Continuing to support projects that import the credential provider package." as a non-goal. Once the cloud-provider-specific code is outside kubernetes/kubernetes, those projects can choose to import each cloud provider package or use the exec-based interface kubelet will use. We can publish an interface and encourage providers to use it to make programmatic usage easier, but I don't know that that needs to be included in the KEP.

> **Reviewer:** Regarding "we can publish an interface and encourage providers to use it to make programmatic usage easier": for what it's worth, this would satisfy my needs.

* Develop/test/release an interface for kubelet to obtain registry credentials from a cloud provider specific binary
* Update/test/release the credential acquisition logic within kubelet
* Build user documentation for out-of-tree credential providers
* Support migration from existing in-tree credential providers to the new credential provider interface, along with dynamic rollback.
* Remove in-tree credential provider code from Kubernetes core

### Non-Goals

* Broad removal of cloud SDK usage falls under the [KEP for removing in-tree providers](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/2019-01-25-removing-in-tree-providers.md).
* Continuing to support projects that import the credential provider package.

## Proposal

### External Credential Provider

An executable capable of providing container registry credentials will be pre-installed on each node so that it exists when kubelet starts running. This binary will be executed by the kubelet to obtain container registry credentials in a format compatible with container runtimes. Credential responses may be cached within the kubelet.

This architecture is similar to the exec-based credential plugin approach already present in client-go and CNI, and is a well-understood pattern. The API types are modeled after ExecConfig and ExecCredential in client-go, which define exec-based credential retrieval for similar use cases.

`RegistryCredentialConfig` and `RegistryCredentialProvider` configuration API types (similar to [clientauthentication](https://github.com/kubernetes/kubernetes/tree/0273d43ae9486e9d0be292c01de2dd4143522b86/staging/src/k8s.io/client-go/pkg/apis/clientauthentication/v1beta1)) will be added to Kubernetes:

```go
type RegistryCredentialConfig struct {
	metav1.TypeMeta `json:",inline"`

	Providers []RegistryCredentialProvider `json:"providers"`
}

// RegistryCredentialProvider is used by the kubelet container runtime to match the
// image property string (from the container spec) with exec-based credential provider
// plugins that provide container registry credentials.
type RegistryCredentialProvider struct {
	metav1.TypeMeta `json:",inline"`

	// ImageMatchers is a list of strings used to match against the image property
	// (sometimes called "registry path") to determine which images to provide
	// credentials for. If one of the strings matches the image property, then the
	// RegistryCredentialProvider will be used by kubelet to provide credentials
	// for the image pull.
	//
	// The image property of a container supports the same syntax as the docker
	// command does, including private registries and tags. A registry path is
	// similar to a URL, but does not contain a protocol specifier (https://).
	//
	// Each ImageMatcher string is a pattern which can optionally contain
	// a port and a path, similar to the image spec. Globs can be used in the
	// hostname (but not the port or the path).
	//
	// Globs are supported as subdomains (*.k8s.io) or (k8s.*.io), and
	// top-level domains (k8s.*). Matching partial subdomains is also supported
	// (app*.k8s.io). Each glob can only match a single subdomain segment, so
	// *.io does not match *.k8s.io.
	//
	// The image property matches when it has the same number of parts as the
	// ImageMatcher string, and each part matches. Additionally, the path of the
	// ImageMatcher must be a prefix of the target URL. If the ImageMatcher
	// contains a port, then the port must match as well.
	ImageMatchers []string `json:"imageMatchers"`

	// Exec specifies a custom exec-based plugin. This type is defined in
	// https://github.com/kubernetes/client-go/blob/62f256057db7571c5ed1aba47eea291f72dd557a/tools/clientcmd/api/types.go#L184
	Exec clientcmd.ExecConfig
}
```

> **Review question:** Is the credential provider binary assumed to be on the `$PATH` seen by the kubelet, or do we pass absolute paths in `ExecConfig`? If the latter, do we need to add flags to the kubelet to discover these binaries, or assume a default path?
>
> **@nckturner:** We could use a kubelet flag to determine its search path for binaries (and have a default path). What were you thinking?
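
To make the matching semantics concrete, here is a minimal sketch of the logic described in the `ImageMatchers` comment above. It is illustrative only, not the kubelet implementation, and all helper names are hypothetical:

```go
package matcher

import (
	"path/filepath"
	"strings"
)

// matchImage reports whether an image reference (e.g.
// "012345678910.dkr.ecr.us-east-1.amazonaws.com/my-image") matches an
// ImageMatcher pattern (e.g. "*.dkr.ecr.*.amazonaws.com").
func matchImage(pattern, image string) bool {
	patHost, patPort, patPath := splitImage(pattern)
	imgHost, imgPort, imgPath := splitImage(image)

	// If the pattern specifies a port, it must match exactly.
	if patPort != "" && patPort != imgPort {
		return false
	}
	// The pattern's path must be a prefix of the image's path.
	if !strings.HasPrefix(imgPath, patPath) {
		return false
	}
	// Hostnames must have the same number of dot-separated parts, and a
	// glob matches within a single part only.
	patParts := strings.Split(patHost, ".")
	imgParts := strings.Split(imgHost, ".")
	if len(patParts) != len(imgParts) {
		return false
	}
	for i := range patParts {
		if ok, _ := filepath.Match(patParts[i], imgParts[i]); !ok {
			return false
		}
	}
	return true
}

// splitImage separates a registry path into host, optional port, and path.
// Intentionally simplistic, for illustration.
func splitImage(s string) (host, port, path string) {
	hostport := s
	if i := strings.Index(s, "/"); i >= 0 {
		hostport, path = s[:i], s[i:]
	}
	if i := strings.Index(hostport, ":"); i >= 0 {
		return hostport[:i], hostport[i+1:], path
	}
	return hostport, "", path
}
```

With this logic, `*.dkr.ecr.*.amazonaws.com` matches `012345678910.dkr.ecr.us-east-1.amazonaws.com/my-image`, while `*.io` does not match `registry.k8s.io` because the hostnames have a different number of segments.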

The `RegistryCredentialConfig` will be encoded as YAML in a file on disk. The exact path of this configuration file will be passed to the kubelet via a new configuration option, `RegistryCredentialConfigPath`.
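
For illustration, if this option were surfaced through the kubelet configuration file, enabling it might look like the following. The field name here simply mirrors the proposed option and is an assumption, not a settled API:

```yaml
# Hypothetical KubeletConfiguration fragment; the exact field name for
# the proposed RegistryCredentialConfigPath option is illustrative only.
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
registryCredentialConfigPath: /etc/kubernetes/registry-credential-config.yaml
```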
We will create new types `RegistryCredentialPluginRequest` and `RegistryCredentialPluginResponse` which will define the interface between the plugin and the kubelet runtime. After the kubelet matches the image property string to a RegistryCredentialProvider, the kubelet will exec the plugin binary, and pass the JSON encoded request to the plugin via stdin. This includes the image that is to be pulled. The plugin will report back the response, which includes the credentials that kubelet needs to pull the image.

In the in-tree implementation, the docker keyring, which aggregates N credential providers, returns an `[]AuthConfig` from a `Lookup(image string)` call. With this change, that struct will be populated by the plugin rather than by an in-tree provider.
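
For reference, the in-tree keyring interface being replaced looks roughly like the sketch below, simplified from `pkg/credentialprovider` (treat the exact field set as an approximation):

```go
// AuthConfig holds registry credentials, simplified from the in-tree type.
type AuthConfig struct {
	Username      string
	Password      string
	IdentityToken string
	RegistryToken string
}

// DockerKeyring aggregates the in-tree credential providers; Lookup
// returns the credentials matching an image, plus whether any provider
// matched at all.
type DockerKeyring interface {
	Lookup(image string) ([]AuthConfig, bool)
}
```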

```go
// RegistryCredentialPluginRequest is passed to the plugin via stdin, and
// includes the image that will be pulled by kubelet.
type RegistryCredentialPluginRequest struct {
	// Image is used when passed to registry credential providers as part of
	// an image pull.
	Image string `json:"image"`
}
```

> **Review thread (on `Image`):** Is this intended to match the `pod.spec.containers[*].image` string, or is it a transformed/resolved value?
>
> **Reviewer:** @smarterclayton @mikedanese, was it a goal to have service account name/namespace info available here for dynamic image pull credentials, or am I misremembering?
>
> **Reviewer:** I'm also OK with deferring that, if that information is not available to global credential providers today; moving forward with just the image would unblock dropping the cloud provider dependencies.
>
> **Reviewer:** Re kubernetes/kubernetes#68810: if we choose to implement that as an image credential provider, a service account token would be ideal. I was envisioning that flow being requested via the API, though (e.g. something like an image pull secret). I don't think it's feasible to expect something like that to be enabled on a dime via node-level configuration.
>
> **Reviewer:** I'm +1 for deferring.
>
> **Reviewer:** If the CRI implementation were able to validate (and make authorization decisions based on) a bound SA token, then we could just have the kubelet pass such a token and skip the exec logic altogether. The CRI implementation would need to make kube API calls, or we would have to encode more claims within the token to allow for authz checks.

> **Review comment (on the response type):** All of the current credential providers already return a response of type `DockerConfig`:
>
> https://github.com/kubernetes/kubernetes/blob/master/pkg/credentialprovider/provider.go#L30-L39
>
> https://github.com/kubernetes/kubernetes/blob/master/pkg/credentialprovider/config.go#L50-L57
>
> It may be easier to keep `DockerConfig` as the response type for all providers during the transition, instead of introducing a new response type and expecting every provider to accommodate the change. Thoughts?

```go
// RegistryCredentialPluginResponse holds credentials for the kubelet runtime
// to use for image pulls. It is returned from the plugin as stdout.
type RegistryCredentialPluginResponse struct {
	metav1.TypeMeta `json:",inline"`

	// +optional
	ExpirationTimestamp *metav1.Time `json:"expirationTimestamp,omitempty"`

	// +optional
	Username *string `json:"username,omitempty"`
	// +optional
	Password *string `json:"password,omitempty"`

	// IdentityToken is used to authenticate the user and get
	// an access token for the registry.
	// +optional
	IdentityToken *string `json:"identitytoken,omitempty"`

	// RegistryToken is a bearer token to be sent to a registry.
	// +optional
	RegistryToken *string `json:"registrytoken,omitempty"`
}
```

> **Review question:** If all fields are present, should the kubelet make a request with all of the credentials (username, password, identity token, and registry token)?
>
> **Drive-by comment** (since I dug into this recently), my understanding of these fields:
>
> * `RegistryToken`: use this directly as Bearer auth against the registry (skip the handshake).
> * `IdentityToken`: use this to perform an OAuth2 handshake.
> * `Username`/`Password`: use these to perform a token handshake.
>
> This logic doesn't really seem like something the kubelet should handle, though, so it may be sufficient to document that this gets passed through to the CRI as `AuthConfig`.
>
> **Reviewer:** I'm in favor of making the kubelet as simple as possible; pass-through to the CRI auth config sounds good to me. Would this be usable in combination with dockershim? Given the work in progress to remove dockershim, I would lean toward saying external credential providers are only used with external container runtimes. Also, how would this interact with `imagePullSecrets`? Would the presence of an image pull secret override a matching `RegistryCredentialProvider` for a given image pull?
>
> **Reviewer:** My understanding is that image pull secrets can include a configuration specifying when they should be used (similar to the registry credential provider configuration). Is the ordering well defined when multiple credential providers match?
>
> **@nckturner:** No plan for dockershim, so that sounds good to me. On `imagePullSecrets`: I think that override is how it works today with the in-tree providers, and here, too, I was thinking we would model the current behavior.
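
To illustrate the plugin side of this protocol, the sketch below is a complete (if trivial) plugin binary: it decodes a `RegistryCredentialPluginRequest` from stdin and writes a `RegistryCredentialPluginResponse` to stdout. The local type mirrors keep the sketch self-contained, and `fetchCredentials` is a hypothetical stand-in for provider-specific logic:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Local mirrors of the proposed types, so the sketch is self-contained.
type RegistryCredentialPluginRequest struct {
	Image string `json:"image"`
}

type RegistryCredentialPluginResponse struct {
	metav1.TypeMeta     `json:",inline"`
	ExpirationTimestamp *metav1.Time `json:"expirationTimestamp,omitempty"`
	Username            *string      `json:"username,omitempty"`
	Password            *string      `json:"password,omitempty"`
}

func main() {
	// The kubelet passes the JSON-encoded request on stdin.
	var req RegistryCredentialPluginRequest
	if err := json.NewDecoder(os.Stdin).Decode(&req); err != nil {
		fmt.Fprintf(os.Stderr, "decoding request: %v\n", err)
		os.Exit(1)
	}

	// Provider-specific credential lookup would happen here; this
	// stand-in just returns static values for the requested image.
	user, pass := fetchCredentials(req.Image)
	expiry := metav1.NewTime(time.Now().Add(15 * time.Minute))

	resp := RegistryCredentialPluginResponse{
		TypeMeta: metav1.TypeMeta{
			Kind:       "RegistryCredentialPluginResponse",
			APIVersion: "v1alpha1",
		},
		ExpirationTimestamp: &expiry,
		Username:            &user,
		Password:            &pass,
	}

	// The kubelet reads the JSON-encoded response from stdout.
	if err := json.NewEncoder(os.Stdout).Encode(&resp); err != nil {
		fmt.Fprintf(os.Stderr, "encoding response: %v\n", err)
		os.Exit(1)
	}
}

// fetchCredentials is a hypothetical stand-in for cloud-specific logic,
// e.g. calling a registry token service for the given image.
func fetchCredentials(image string) (username, password string) {
	return "example-user", "example-password"
}
```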

> **@andrewsykim:** I think it would be super helpful to show real examples of RegistryCredentialProvider, RegistryCredentialPluginSpec and RegistryCredentialPluginStatus that we would use for one of the existing providers (ECR?).

### Example

A registry credential provider configuration for Amazon ECR could look like the following:

```yaml
kind: credentialprovider
apiVersion: v1alpha1
providers:
- imageMatchers:
  - "*.dkr.ecr.*.amazonaws.com"
  - "*.dkr.ecr.*.amazonaws.com.cn"
  exec:
    command: ecr-creds
    args:
    - token
    apiVersion: v1alpha1
```

Here `ecr-creds` is a binary that vends ECR credentials. For the image `012345678910.dkr.ecr.us-east-1.amazonaws.com/my-image`, the first matcher matches, so kubelet would execute the binary `ecr-creds` with the argument `token`.
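
On the kubelet side, that execution amounts to running the configured command with the JSON request on stdin and decoding the JSON response from stdout. A minimal sketch of that invocation, with a hypothetical helper rather than the actual kubelet code:

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"os/exec"
	"time"
)

// execPlugin runs a configured credential provider plugin (here,
// "ecr-creds token") and returns its decoded stdout. A real caller would
// also honor expirationTimestamp in the response for caching.
func execPlugin(ctx context.Context, command string, args []string, image string) (map[string]interface{}, error) {
	request := map[string]interface{}{
		"kind":       "RegistryCredentialPluginRequest",
		"apiVersion": "v1alpha1",
		"image":      image,
	}
	stdin, err := json.Marshal(request)
	if err != nil {
		return nil, err
	}

	cmd := exec.CommandContext(ctx, command, args...)
	cmd.Stdin = bytes.NewReader(stdin)
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("running %s: %w", command, err)
	}

	var response map[string]interface{}
	if err := json.Unmarshal(out, &response); err != nil {
		return nil, err
	}
	return response, nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	creds, err := execPlugin(ctx, "ecr-creds", []string{"token"},
		"012345678910.dkr.ecr.us-east-1.amazonaws.com/my-image")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(creds)
}
```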

### Alternatives Considered

#### API Server Proxy

The API server would act as a proxy to an external container registry credential provider that may support multiple cloud providers. The credential provider service would return container-runtime-compatible responses of the type currently used by the credential provider infrastructure within the kubelet, along with credential expiration information that allows the API server to cache credential responses for a period of time.

This limits the cloud-specific privileges required on each node for fetching credentials, and centralized caching helps avoid cloud-specific rate limits on credential acquisition by consolidating it within the API server.

We chose not to follow this approach: although fewer privileges on each node and centralized caching are desirable, we have not seen enough evidence that users commonly request these features, and they fall outside the stated goals of this KEP. Designing such a system would also likely take long enough to push back the date by which the in-tree cloud providers could be completely extracted from Kubernetes.

#### Sidecar Credential Daemon

Each node would run a sidecar credential daemon that can obtain cloud-specific container registry credentials and may support multiple cloud providers. This service would be available to the kubelet on the local host and would return container runtime responses compatible with those currently used by the credential provider infrastructure within the kubelet. Each daemon would cache credentials for the node on which it runs.

The added complexity of running a daemon, compared with executing a binary, made this option less desirable to us. If a daemon is necessary for a particular cloud provider, the exec plugin binary can talk to one to retrieve credentials on each execution.

#### Bound Service Account Token Flow

Suggested in https://github.com/kubernetes/kubernetes/issues/68810, an image pull flow built on bound service account tokens would provide kubelet with credentials to pull images for pods running as a specific service account.

This approach might be better suited as a future enhancement to either the credential provider or ImagePullSecrets, but is out of scope for extracting the cloud provider specific code.

#### Pushing Credential Management into the CRI

Another possibility is moving the credential management logic into the CRI, so that the kubelet does not provide any credentials for image pulls. This approach is also out of scope for extracting cloud provider code, because it would be a more significant redesign, but it should be considered for a future enhancement.

### Risks and Mitigations

Credential acquisition is a critical kubelet feature: pods cannot start if it does not work correctly. This functionality will be labeled alpha and hidden behind a feature gate in v1.18. It will use DynamicKubeletConfig so that it can be disabled at runtime if any problems occur.

## Design Details

### Test Plan

* Unit tests for image matching logic.
* E2E tests for image pulls from cloud providers.

### Graduation Criteria

Successful Alpha Criteria
* Multiple plugin implementations created.
* One E2E test implemented.

### Upgrade / Downgrade Strategy

Upgrading
* Install on your worker nodes the cloud provider plugin binaries for any image registries you use.
* Enable this feature in kubelet with a feature flag.

Downgrading
* Disable this feature in kubelet with a feature flag.

### Version Skew Strategy

TODO

## Implementation History

TODO

## Infrastructure Needed

* New GitHub repos for existing credential providers (AWS, Azure, GCP)