Explain life cycle of automatically created secrets in ServiceAccount #24928

Closed
streamnsight opened this issue Apr 28, 2016 · 10 comments
Assignees
Labels
kind/documentation: Categorizes issue or PR as related to documentation.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/docs: Categorizes an issue or PR as relevant to SIG Docs.

Comments

@streamnsight

Hello,

I ran into issue #24829 with DNS, and it turned out to be a problem with my ServiceAccount secret token, which was invalid.

I had 3 secrets in my kube-system ServiceAccount for some reason.

I understand that when secrets are deleted, a new token is generated and associated with the ServiceAccount.

I just don't understand how I ended up with 3 secrets, and how one (maybe 2?) was invalid.

  • How are secrets invalidated?
  • How are they created, other than when there are none? Does that happen on apiserver start? On kubelet start?
  • It's obviously possible to have multiple tokens for a service account, but how can this happen automatically?

When invalidated, shouldn't these automatically generated tokens also be automatically removed from the ServiceAccount?

Thanks

@mikedanese added the kind/documentation and team/control-plane labels on Apr 29, 2016
@erictune self-assigned this on May 3, 2016
@erictune
Member

erictune commented May 3, 2016

I think this is what happens:

token controller (pkg/controller/serviceaccount/tokens_controller.go) lists all the secrets referenced by a service account, follows those references, and checks whether each secret is a "service account token" (sketched after this list). A secret IsServiceAccountToken if:

  • type: kubernetes.io/service-account-token
  • AND it has the annotation kubernetes.io/service-account.name set to the name of the service account in question, which is default in your example.
  • AND it has the annotation kubernetes.io/service-account.uid equal to the uid of the service account in question.
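
A minimal sketch of that check, with the API objects reduced to plain structs (illustrative only, not the actual controller code):

```go
package main

import "fmt"

// Simplified stand-ins for the fields the check looks at; these are not the
// real Kubernetes API types.
type Secret struct {
	Type        string
	Annotations map[string]string
}

type ServiceAccount struct {
	Name string
	UID  string
}

// isServiceAccountToken mirrors the three conditions listed above: the secret
// type, the service-account name annotation, and the service-account uid
// annotation must all match the ServiceAccount in question.
func isServiceAccountToken(s Secret, sa ServiceAccount) bool {
	if s.Type != "kubernetes.io/service-account-token" {
		return false
	}
	if s.Annotations["kubernetes.io/service-account.name"] != sa.Name {
		return false
	}
	if s.Annotations["kubernetes.io/service-account.uid"] != sa.UID {
		return false
	}
	return true
}

func main() {
	current := ServiceAccount{Name: "default", UID: "new-uid"}
	oldToken := Secret{
		Type: "kubernetes.io/service-account-token",
		Annotations: map[string]string{
			"kubernetes.io/service-account.name": "default",
			// uid left over from a deleted-and-recreated ServiceAccount
			"kubernetes.io/service-account.uid": "old-uid",
		},
	}
	// Prints false: same name but a different uid, so the controller treats the
	// old secret as not belonging to the current ServiceAccount.
	fmt.Println(isServiceAccountToken(oldToken, current))
}
```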

So, if you happened to delete your ServiceAccount (or your upgrade or startup script did; not sure what happened) and then created one identical to the original, the new one would have a new uid, the old secrets wouldn't match it, and the token controller would want to make a new one.

I think if you check your system, you may see that each of your 3 secrets has a different value for the annotation key kubernetes.io/service-account.uid. Can you confirm that is what is going on?

I'm not sure what the right thing to do here is. Say the system deletes the secret while there are still pods that reference that secret. Things that could maybe go wrong:

  • If a node now reboots, it will try to pull the Pod again and run it again, but it also needs to pull the secret again, which will have been auto-deleted. So, the pod will fail. I think this is okay, but I want to check with @dchen1107 whether it is okay for a kubelet to fail to pull a pod's secrets after a reboot.
  • Also, @liggitt: could the token controller, or a generic controllerRef garbage collection scheme, delete a secret that is for an "old" UID of the "current" service account? Is there a race where you can't tell whether it is "older" or "newer"? Also a low-priority FYI to @bgrant0607 because he likes controllerRef.

@liggitt
Member

liggitt commented May 3, 2016

I had 3 secrets in my kube-system ServiceAccount for some reason.

What version of kubernetes was this cluster running? Prior to 1.2.0, there were conditions around startup or cache-relists where the token controller could accidentally create duplicate secrets. (fixed in #21706 and #22160)

How are secrets invalidated?

  • By invalidating the signature, i.e. by changing the keys used to verify the tokens (which your bring-up scripts might have been doing without realizing it... the server TLS key is used if no --service-account-key-file is given to the api server)
  • By deleting the associated secrets (if the api server is configured to verify that the service account and secret still exist, with --service-account-lookup=true); see the sketch after this list
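
Both invalidation paths, reduced to a toy decision function (the fields and helpers here are hypothetical stand-ins, not the apiserver's actual code):

```go
package main

import "fmt"

// A toy model of the two invalidation paths described above.
type tokenReview struct {
	signatureOK  bool // does the token verify against the current key set?
	secretExists bool // does the backing token secret still exist?
	saExists     bool // does the referenced ServiceAccount still exist?
}

func tokenIsValid(t tokenReview, lookupEnabled bool) bool {
	// Path 1: rotating the verifying keys (e.g. the apiserver's TLS key when no
	// --service-account-key-file is set) breaks the signature check.
	if !t.signatureOK {
		return false
	}
	// Path 2: with --service-account-lookup=true, deleting the secret (or the
	// ServiceAccount) also invalidates the token.
	if lookupEnabled && (!t.secretExists || !t.saExists) {
		return false
	}
	return true
}

func main() {
	// A token whose backing secret was deleted: still accepted without lookup,
	// rejected once --service-account-lookup=true is set.
	t := tokenReview{signatureOK: true, secretExists: false, saExists: true}
	fmt.Println(tokenIsValid(t, false)) // true
	fmt.Println(tokenIsValid(t, true))  // false
}
```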

Also, @liggitt could token controller, or a generic controllerRef garbage collection scheme, delete a secret that is for an "old" UID of the "current" service account? Is there a race where you can't tell whether it is "older" or "newer".

the token controller removes service account token secrets which reference a service account name+uid that doesn't exist. older/newer doesn't matter
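
A rough sketch of that cleanup rule, again with plain structs and a hypothetical getServiceAccount lookup standing in for the real controller machinery:

```go
package main

import "fmt"

type Secret struct {
	Type        string
	Annotations map[string]string
}

type ServiceAccount struct {
	Name string
	UID  string
}

// shouldDeleteTokenSecret returns true when a service account token secret
// references a ServiceAccount name+uid that no longer exists. getServiceAccount
// is a hypothetical lookup that returns nil when the name is not found.
func shouldDeleteTokenSecret(s Secret, getServiceAccount func(name string) *ServiceAccount) bool {
	if s.Type != "kubernetes.io/service-account-token" {
		return false // not a token secret; leave it alone
	}
	sa := getServiceAccount(s.Annotations["kubernetes.io/service-account.name"])
	if sa == nil {
		return true // the named ServiceAccount is gone
	}
	// Same name but a different uid means the ServiceAccount was recreated,
	// so the old secret is removed regardless of which one is "older".
	return sa.UID != s.Annotations["kubernetes.io/service-account.uid"]
}

func main() {
	// The only "default" ServiceAccount that exists now has uid "new-uid".
	lookup := func(name string) *ServiceAccount {
		if name == "default" {
			return &ServiceAccount{Name: "default", UID: "new-uid"}
		}
		return nil
	}
	stale := Secret{
		Type: "kubernetes.io/service-account-token",
		Annotations: map[string]string{
			"kubernetes.io/service-account.name": "default",
			"kubernetes.io/service-account.uid":  "old-uid",
		},
	}
	fmt.Println(shouldDeleteTokenSecret(stale, lookup)) // true: the controller would delete it
}
```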

@bgrant0607
Member

cc @caesarxuchao @lavalamp @gmarek re. cascading deletion

@caesarxuchao
Member

the token controller removes service account token secrets which reference a service account name+uid that doesn't exist. older/newer doesn't matter

If that's the case, then @erictune's theory doesn't explain why there are 3 secrets, since the invalidated ones would have been deleted by the token controller.

Regarding cascading deletion, if the token controller sets the OwnerReference of the Secret to point to the ServiceAccount, then the garbage collector will delete the Secret when the ServiceAccount is deleted. There will be no need to note down the name and uid in the annotation.
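
A minimal sketch of that idea, assuming (hypothetically) that the token controller were changed to set an owner reference when creating the secret; the import paths are the present-day k8s.io/api and k8s.io/apimachinery packages, which postdate this thread:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newTokenSecretFor builds a service account token secret whose OwnerReference
// points at the owning ServiceAccount. With such a reference, the garbage
// collector deletes the Secret when the ServiceAccount is deleted, so the
// name/uid annotations would no longer be needed for cleanup.
func newTokenSecretFor(sa *v1.ServiceAccount) *v1.Secret {
	return &v1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: sa.Name + "-token-",
			Namespace:    sa.Namespace,
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: "v1",
				Kind:       "ServiceAccount",
				Name:       sa.Name,
				UID:        sa.UID,
			}},
		},
		Type: v1.SecretTypeServiceAccountToken,
	}
}

func main() {
	sa := &v1.ServiceAccount{ObjectMeta: metav1.ObjectMeta{
		Name:      "default",
		Namespace: "kube-system",
		UID:       "1234-abcd",
	}}
	secret := newTokenSecretFor(sa)
	fmt.Println(secret.OwnerReferences[0].Kind, secret.OwnerReferences[0].Name)
}
```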

@liggitt
Member

liggitt commented May 3, 2016

I'd like clarification about this statement:

I had 3 secrets in my kube-system ServiceAccount for some reason.

I am guessing there were three secrets generated and added to the same service account, which is the bug fixed by the referenced PRs

@streamnsight
Author

@liggitt yes, I had 3 secrets, and all 3 of them were listed in the ServiceAccount description.
I have upgraded this cluster a few times now. I tend to do this manually, so I don't think scripts overwrote anything.
I do insert my registry credentials in the ServiceAccount: I would get the ServiceAccount (kubectl get serviceaccount default -o yaml), remove the uid ref, add the registry secret info, and then kubectl replace -f the ServiceAccount. That is the instruction I found at the time for inserting my registry credentials in the ServiceAccount, as explained here: http://kubernetes.io/docs/user-guide/service-accounts/ under imagePullSecrets.

Would replacing the ServiceAccount that way cause this issue?

After I deleted the secrets, the ServiceAccount got a new one, and I inserted my registry creds in the ServiceAccount again without creating new secrets.

@0xmichalis
Contributor

/sig docs

@k8s-ci-robot added the sig/docs label on Jun 20, 2017
@k8s-github-robot removed the needs-sig label on Jun 20, 2017
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 29, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 28, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
