
Transition ServiceAccount admission controller to improved service account token volumes #70679

Open · mikedanese opened this issue Nov 6, 2018 · 24 comments


@mikedanese (Member) commented Nov 6, 2018

Over the next N releases, we would like to transition the ServiceAccount admission controller to injecting service account token volumes based on the TokenRequest API and volume projection designed here:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/bound-service-account-tokens.md
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/svcacct-token-volume-source.md

I propose we use the feature mechanism, with plenty of lead time and noisy announcements, to phase this change in.

Proposed Plan

The following is pretty close to what the announcement will look like.

With the enablement of a new feature, SecureServiceAccountTokenVolume, we will switch the ServiceAccount admission controller to injecting service account token volumes that use the new TokenRequest API to provision tokens. We will loudly announce this feature, which will be Alpha in 1.13.

The new service account token volumes, sketched below after this list, are preferred over the old volumes because:

  • Tokens provisioned into the new volumes are only valid for a short TTL and are rotated automatically, minimizing the impact of a token exfiltration.
  • Tokens provisioned into the new volumes are bound to the pod, which greatly reduces the difficulty of a forced credential rotation. Operators need only redeploy their application (e.g. trigger a deployment rolling update). This operation is complex to execute without application disruption with the old volumes, and we don’t have any official guidance on the process.
  • Tokens provisioned into the new volumes are audience-bound to the API server, which should dissuade integrators from accepting them as identity proof elsewhere, thus improving the security posture of authentication to the Kubernetes API server.
  • Tokens provisioned into the new volumes do not depend on data stored in Kubernetes secrets and are not persisted in any database, reducing the impact and value of a read compromise of the secrets API.
  • Tokens provisioned into the new volumes do not depend on data stored in Kubernetes secrets that must be replicated per service account instance, which will solve long-standing scalability issues with the implementation of service accounts once the migration is complete. See #48408
  • Tokens provisioned into the new volumes are not referable in environment variables via the Downward API.
  • Tokens provisioned into the new volumes have stricter file permissions: 0600, or 0640 when using the "fsGroup" feature. They do not allow overrides to more permissive permissions.
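For illustration, a minimal sketch of a pod carrying the new-style volume, assuming the projected-volume serviceAccountToken projection from the design proposals above. The volume/ConfigMap names and the expirationSeconds value are placeholders, not the final output of the admission controller:

```yaml
# Hypothetical pod excerpt showing the projected volume that could replace the
# legacy secret-based service account volume. Names and values are illustrative
# only; field names follow the svcacct-token-volume-source design.
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  serviceAccountName: default
  containers:
  - name: app
    image: example.invalid/app:latest
    volumeMounts:
    - name: kube-api-access
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      readOnly: true
  volumes:
  - name: kube-api-access
    projected:
      sources:
      - serviceAccountToken:
          path: token               # short-lived, pod-bound token minted via TokenRequest
          expirationSeconds: 3600   # placeholder; the kubelet rotates the token automatically
      - configMap:                  # CA bundle, replacing ca.crt from the legacy secret
          name: kube-root-ca.crt    # assumed name; the exact CA bundle source is not fixed here
          items:
          - key: ca.crt
            path: ca.crt
      - downwardAPI:                # namespace file, preserving the old filesystem API
          items:
          - path: namespace
            fieldRef:
              fieldPath: metadata.namespace
```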

The SecureServiceAccountTokenVolume feature will:

  • Switch the service account admission controller to using a projected volume (with a token projection) to implement the old service account volume’s filesystem API.

  • Make --service-account-issuer a required flag, thus requiring that TokenRequest be enabled (see the example flags below).
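For reference, a sketch of the kube-apiserver flags involved, shown as a static-pod excerpt. File paths, image version, and the issuer/audience values are placeholders; the exact set of flags this feature would require is what the proposal itself decides:

```yaml
# kube-apiserver static pod excerpt with the flags that enable the TokenRequest API.
# Paths and URLs are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:v1.13.0
    command:
    - kube-apiserver
    - --service-account-issuer=https://kubernetes.default.svc        # would become required
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key  # signs the issued tokens
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub          # verifies the issued tokens
    - --api-audiences=https://kubernetes.default.svc                 # audiences accepted by the authenticator
```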

Users who use the alpha release track to canary what’s coming down the pike will encounter this and will notice that:

  • Their API servers won’t start until they add the flags required to enable TokenRequest
  • In-cluster clients of the kubernetes apiserver that don’t reload service account tokens will start failing an hour after deployment. They will need to update these clients.
  • PodSecurityPolicies that allowed secret volumes but not projected volumes will no longer be usable with newly created pods that auto-mount service account volumes. They will need to allow projected volumes now.
  • Pre-1.11 Kubelets (assuming they also enable alpha features) will no longer run new pods that mount service account volumes.
  • Pods running as non-root need an fsGroup or they will not have permission to read the token (see the sketches after this list).
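To make the last two bullets concrete, rough sketches of the adjustments; the policy and pod names are made up:

```yaml
# PodSecurityPolicy excerpt: "projected" must now be among the allowed volume
# types for pods that auto-mount service account volumes.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - configMap
  - secret
  - downwardAPI
  - projected          # newly required alongside (or instead of) "secret"
---
# Pod excerpt: a non-root pod sets fsGroup so the 0640 token file is readable
# via the supplemental group.
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-example
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000      # without this, a non-root container cannot read the token
  containers:
  - name: app
    image: example.invalid/app:latest
```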

We will continue to announce this new behavior each release. We will leave this feature in alpha for a release or more, then graduate it to beta and enable it by default. At this point, users will either have to make the changes to upgrade or explicitly disable the beta feature. After a long deprecation, we will promote the feature to GA and remove the ability for clusters to opt out.

Notified Parties

  • sig-apimachinery to consult on plan
  • sig-arch to consult on plan
  • kubernetes-dev to notify of intent to make this change
  • kubernetes-announce to notify of intent to make this change

/kind feature
/sig auth
/sig api-machinery

@kubernetes/sig-auth-feature-requests
@kubernetes/sig-api-machinery-feature-requests
@kubernetes/sig-architecture-feature-requests
@liggitt @tallclair @cjcullen @smarterclayton

@WanLinghao (Member) commented Nov 6, 2018

track

@zjj2wry (Member) commented Nov 7, 2018

/cc

@aasmall commented Nov 7, 2018

/cc

@fejta-bot commented Feb 14, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@mikedanese (Member, Author) commented Feb 15, 2019

This is done.

@mikedanese closed this Feb 15, 2019

@mikedanese (Member, Author) commented Feb 28, 2019

Eh, I don't know why I said that.

@enj (Member) commented May 1, 2019

/priority important-longterm

@dims (Member) commented May 14, 2019

@mikedanese I'll remove arch from this now. Please re-add when there is a KEP or some other plan for us to look at. Is this ok?

/remove-sig architecture

@mikedanese (Member, Author) commented May 14, 2019

@dims will do. Thanks.

@linki (Member) commented Jan 17, 2020

We recently started experimenting with expiring BoundServiceAccountTokens and reported #87028. Thanks for providing a fix so quickly!

We're now thinking of how we'd roll this out to our clusters in a safe manner, knowing that there are probably applications that rely on the fact that ServiceAccount tokens didn't expire in the past. A common pattern we see out there is reading a token at application startup and then never reading it again, or using an older version of client-go, etc.

Proposed Solution

One way to solve this would be the following: Kubernetes rotates ServiceAccount tokens at least once per day. During the transition to BoundServiceAccountTokens we could provision ServiceAccount tokens with a longer expiration time than 1h, say 30 days. We could then monitor the tokens received at the API server and if we see ServiceAccount tokens that are older than 1h we know that there's an application not reading a fresh token from the mounted volume. We can then track it down and update it.

Meanwhile, that application would keep working just fine for at least the next 30 days, since the token is considered valid by the API server. Depending on the time window you need to upgrade all your applications, you would pick a suitable expiration time for the automatically mounted tokens. Since previous tokens were valid for eternity, even a very long expiration time wouldn't compromise security.

Once you're certain all relevant clients refresh their tokens you can reduce the expiration time of the automatically mounted tokens to the proposed default of 1h.

To achieve that, all we need is to be able to configure the currently hard-coded value via a flag or a field in the kubelet config.

There are at least three alternatives to this approach, but I feel they are inferior:

  • Disable rejection of expired ServiceAccount tokens in the auth subsystem of the apiserver itself. (basically allowing expired tokens during the transition)
  • Disable the built-in AdmissionPlugin and rebuild the same functionality as an out-of-tree AdmissionController but using a different expiration time
  • Changing the hard-coded value in-place and running a fork for a while

/cc @liggitt @mikedanese @mikkeloscar @szuecs

@liggitt (Member) commented Jan 20, 2020

One way to solve this would be the following: Kubernetes rotates ServiceAccount tokens at least once per day. During the transition to BoundServiceAccountTokens we could provision ServiceAccount tokens with a longer expiration time than 1h, say 30 days. We could then monitor the tokens received at the API server and if we see ServiceAccount tokens that are older than 1h we know that there's an application not reading a fresh token from the mounted volume. We can then track it down and update it.

That is similar to an approach we discussed in sig-auth, and I think I'm in favor of it (I don't see another realistic way to safely roll this out). I would actually go further and make the "stale token used" metric trigger when a token older than the minimum bound token age (10 minutes) is used, since that means it is a client that would have problems with the default injected configuration.

Offhand, the specific pieces we need for this would probably be:

  1. An option to the kubelet to mint projected token volumes for longer than requested
  2. A metric in the API server that counts use of legacy service account tokens (would indicate presence of clients that are extracting/using tokens from Secret objects)
  3. A metric in the API server that counts use of pod-bound service account tokens older than 10 minutes (would indicate presence of clients using injected tokens and not refreshing them)
  4. Addition of audit annotations for both scenarios so that audit logs can be swept to locate the specific namespace/serviceaccount being used with legacy tokens or stale bound tokens (@logicalhan can confirm, but I would not expect to put the service account name or namespace in the metrics because of cardinality concerns)

Then, our instructions to cluster admins would be:

  1. Enable projected tokens, and enable the "mint longer than requested" option in the kubelet
  2. Monitor your clusters for legacy token use or stale token use metrics
  3. On clusters with either of those metrics present, sweep audit logs to identify the offending application (an audit policy sketch follows this list)
  4. Repeat 2-3 until metrics are clear
  5. Disable the "mint longer than requested" option in the kubelet
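A minimal audit policy sketch for step 3: Metadata-level events are enough to carry audit annotations, which a log sweep could then filter on once the annotation keys for legacy/stale token use exist (they are not defined yet, so none are shown here):

```yaml
# Minimal audit policy: record request metadata cluster-wide so that the
# proposed legacy/stale-token audit annotations would appear in the audit log.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```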
@liggitt (Member) commented Jan 20, 2020

@linki - for 1.18, I'm focusing on getting the CSR API positioned to graduate to v1, but if you wanted to work with @mikedanese to update the bound service account token KEP with a description of this rollout, and help to get these additions in place, that could help accelerate progress here

@lavalamp (Member) commented Jan 21, 2020

(Drive by idle thought: the suggestions to provide visibility are very similar to what we'd need for ratcheting validation.)

@k8s-ci-robot (Contributor) commented Jan 23, 2020

@mikedanese: The label(s) sig/arch cannot be applied, because the repository doesn't have them

In response to this: [the original issue description, quoted in full above]


@mikedanese (Member, Author) commented Jan 23, 2020

Thanks for writing this up. This approach can probably be implemented entirely inside the kube-apiserver.

I'm happy to tackle this portion of the problem independently but we'll need a decision on what to do about file permissions as well before we can turn this on by default.

@happinesstaker commented Mar 4, 2020

I'm going to draft a proposal on safely rolling out projected token while not breaking in-cluster clients.

Offhand, the specific pieces we need for this would probably be:

  1. An option to the kubelet to mint projected token volumes for longer than requested
  2. A metric in the API server that counts use of legacy service account tokens (would indicate presence of clients that are extracting/using tokens from Secret objects)
  3. A metric in the API server that counts use of pod-bound service account tokens older than 10 minutes (would indicate presence of clients using injected tokens and not refreshing them)

I believe for step 1, we could implement the exception in the kube-apiserver by providing a flag to treat expired tokens as valid for up to 30 days. Keeping all metric/exception code in one place would make it easier to manage.
I would rather define "stale token" as an expired token being used. To include more potentially breaking cases, tokens older than 24 hours or older than 80% of their lifetime could also be considered.

Please let me know if implementing this option in the kube-apiserver instead of the kubelet would involve an essential tradeoff.

  4. Addition of audit annotations for both scenarios so that audit logs can be swept to locate the specific namespace/serviceaccount being used with legacy tokens or stale bound tokens (Han Kang can confirm, but I would not expect to put the service account name or namespace in the metrics because of cardinality concerns)

I agree with not exposing the SA name or namespace in metrics; maybe we could log this info for stale tokens in case we want to trace down which in-cluster client is causing the problem.

@tedyu (Contributor) commented Mar 5, 2020

 to treat expired token as valid for up to 30 days.

Doesn't this (treating expired tokens as valid) expose some kind of security hole?

@liggitt (Member) commented Mar 5, 2020

I would rather define "stale token" as expired token used.

I would not expect us to intentionally honor/validate tokens past their expiration

@happinesstaker commented Mar 5, 2020

Doesn't this (serving expired token) expose some kind of security hole ?

Well, this option serves as a temporary transition state from the current SA tokens to projected tokens. The current tokens are not time-bound, so either way (issuing a 30-day token, or treating expired tokens as valid for up to 30 days) this does not reduce the security level.

I would not expect us to intentionally honor/validate tokens past their expiration

Essentially this should have the same effect as issuing a token for longer than expected in the kubelet (either way the token is valid for longer than customers thought), unless there are tokens other than those of in-cluster clients being sent to the kube-apiserver for authentication.
Though making an exception in authentication might look worse, I must agree.
By issuing tokens for longer than requested, do you mean always adding (for example) 30 days to what the customer requested? I thought people already using the alpha feature would have set some expiration time longer than 10 minutes.

@liggitt (Member) commented Mar 5, 2020

By issue token longer than requested, do you mean always +30 days (for example) to what customer requested? I thought people already using alpha feature would already set some expiration time longer than 10 minutes.

The expiration time for an admission-injected token is currently 1 hour.

When the API server is running in this compatibility/warning mode, if a TokenRequest is made by a kubelet for a pod-bound token with an expiration time matching the admission-injected projected token volume (which we could make more unique if we wanted... e.g. 1 hour + n seconds), the server can mint the token for a longer duration and embed the original requested expiration time in the token as a private "kubernetes.io":{"warn-after":...} claim.

The service account token authenticator can annotate the audit log and increment the warning metric if it encounters a token with a warn-after claim used past the warning time.
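Decoded, such a token's payload might look roughly like the sketch below (shown as YAML for readability). All values are placeholders, the surrounding claim layout follows the existing bound-token format, and the final name of the warn-after claim could differ:

```yaml
# Sketch of decoded JWT claims for a token minted in compatibility/warning mode:
# "exp" is the extended expiry actually enforced, while "warn-after" preserves
# the originally requested expiry and only drives the warning metric/audit entry.
iss: "https://kubernetes.default.svc"
aud: ["https://kubernetes.default.svc"]
sub: "system:serviceaccount:default:my-app"
iat: 1583366400
exp: 1585958400              # extended expiry, e.g. requested time + 30 days
kubernetes.io:
  namespace: default
  serviceaccount:
    name: my-app
    uid: 0c0e2d3a-placeholder
  pod:
    name: my-app-5d9c7f
    uid: 7b1f4e6c-placeholder
  warn-after: 1583370007     # original request: 1 hour + n seconds after iat
```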

@mikedanese (Member, Author) commented Mar 5, 2020

I'm 👍 on anything that can help us minimize kubelet changes so we can get this out there. The expirationSeconds magic number sounds reasonably worthwhile.

@happinesstaker commented Mar 5, 2020

Sounds good to mint a longer token in the kube-apiserver if we see the specific expiration time requested; looks less hacky.

@liggitt (Member) commented Mar 6, 2020

Note that to make the kubelet refresh on schedule, we'd want the server to set status.expiration on the token response to the original 1-hour timeframe, despite having minted the token for longer.
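In TokenRequest terms, a sketch of what that could look like; field values are placeholders, and the response field is status.expirationTimestamp in the current API:

```yaml
# Sketch of a TokenRequest in compatibility mode: the spec carries the kubelet's
# original request, the status reports the original ~1h expiry so the kubelet
# refreshes on schedule, and only the embedded JWT carries the extended "exp".
apiVersion: authentication.k8s.io/v1
kind: TokenRequest
spec:
  audiences: ["https://kubernetes.default.svc"]
  expirationSeconds: 3607                        # the "magic" 1 hour + n seconds request
  boundObjectRef:
    kind: Pod
    apiVersion: v1
    name: my-app-5d9c7f
    uid: 7b1f4e6c-placeholder
status:
  token: "<JWT whose exp is extended and which carries the warn-after claim>"
  expirationTimestamp: "2020-03-05T01:00:07Z"    # original one-hour window, not the extended one
```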
