
Use TokenRequest API instead of calico-nodes service account token for CNI kubeconfig. #5910

Conversation

@ScheererJ (Contributor) commented Apr 14, 2022

Description

Use TokenRequest API instead of calico-nodes service account token for CNI kubeconfig.

With projected service account tokens in Kubernetes, a pod's service account token is bound to the lifetime of that pod. This can cause problems when an external process reuses a pod's token.
The CNI binaries used calico-node's token. When calico-node was stopped, that token lost its validity and could no longer be used for CNI operations. Usually this resolves automatically over time, but there are edge cases where it cannot: e.g., if calico-node is autoscaled in terms of resources and the new resource requests require preemption/eviction of an existing pod, the CNI operation to delete that pod's network sandbox will fail due to the no-longer-valid token (calico-node having been stopped beforehand).
This change switches to the TokenRequest API instead, i.e. it creates new tokens with limited validity. It would have been preferable to bind the token to an object, e.g. the corresponding node, but as of now only secrets and pods are supported as binding targets. Hence, the tokens are only limited in time and not bound to any other Kubernetes object.
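
For illustration, requesting such a token through client-go looks roughly like this. This is a sketch of the mechanism, not the code in this PR; the namespace, service account name, and 24-hour duration are example values:

// Sketch: request a time-limited token for the calico-node service
// account via the TokenRequest API.
package cni

import (
  "context"
  "fmt"

  authenticationv1 "k8s.io/api/authentication/v1"
  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
  "k8s.io/client-go/kubernetes"
  "k8s.io/client-go/rest"
)

func requestCNIToken(ctx context.Context) (string, error) {
  cfg, err := rest.InClusterConfig()
  if err != nil {
    return "", err
  }
  clientset, err := kubernetes.NewForConfig(cfg)
  if err != nil {
    return "", err
  }
  expiry := int64(24 * 60 * 60) // requested validity in seconds; the issuer may grant less
  tr := &authenticationv1.TokenRequest{
    Spec: authenticationv1.TokenRequestSpec{
      ExpirationSeconds: &expiry,
      // No BoundObjectRef: only secrets and pods can be bound,
      // so this token is limited in time only.
    },
  }
  resp, err := clientset.CoreV1().ServiceAccounts("kube-system").
    CreateToken(ctx, "calico-node", tr, metav1.CreateOptions{})
  if err != nil {
    return "", err
  }
  fmt.Println("token expires at:", resp.Status.ExpirationTimestamp)
  return resp.Status.Token, nil
}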

Related issues/PRs

fixes #4857
fixes #5712

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Calico now uses the TokenRequest API to generate and refresh a token for the CNI plugin. This ensures that the token remains valid even when the calico-node daemonset is restarted.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@ScheererJ requested a review from a team as a code owner April 14, 2022 13:24
@marvin-tigera added this to the Calico v3.24.0 milestone Apr 14, 2022
@marvin-tigera added the docs-pr-required (Change is not yet documented) and release-note-required (Change has user-facing impact, no matter how small) labels Apr 14, 2022
@lwr20 (Member) commented Apr 14, 2022

/sem-approve

@fasaxc added the docs-not-required (Docs not required for this change) label and removed the docs-pr-required (Change is not yet documented) label Apr 14, 2022
@fasaxc (Member) commented Apr 14, 2022

Do you know when the TokenRequest API was added to k8s? I'm wondering if this will break compatibility with older releases.

@ScheererJ (Contributor, Author)

Do you know when the TokenRequest API was added to k8s? I'm wondering if this will break compatibility with older releases.

I do not know the exact release, but it looks like even v1.10 had the relevant logic:
https://github.com/kubernetes/kubernetes/blob/v1.10.0/pkg/registry/core/serviceaccount/storage/token.go

How far back would compatibility need to go for you?

@ScheererJ (Contributor, Author)

Please note that I have not yet adapted the calico-node clusterrole, which is necessary to be able to create new tokens. (I tested the code locally with an admin kubeconfig.)
I will be off for a week and will add the required clusterrole change to this pull request afterwards, i.e. at the beginning of calendar week 17.
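
For reference, the rule the clusterrole will need is small; a minimal sketch, assuming the token should only be creatable for the calico-node service account:

# Sketch of the ClusterRole rule needed so calico-node may call the
# TokenRequest API for its own service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: calico-node
rules:
  - apiGroups: [""]
    resources: ["serviceaccounts/token"]
    resourceNames: ["calico-node"]
    verbs: ["create"]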

@caseydavenport (Member) left a comment

A few comments - let me know what you think about them. Feel free to riff on my suggestions.

The main thing is that we need to be paying attention to the returned expiry time, not just the requested expiry time, and it would be nice to package this up to share some code.

@@ -496,3 +504,46 @@ func setSuidBit(file string) error {

return nil
}

func createCNIToken(kubecfg *rest.Config) string {
Member

Looks like we're duplicating very similar logic here and within the node directory.

Probably would be better to do this in one place - we can use a single constant for the validity duration, as well as a single function for refreshing the token.

I'd recommend creating a new structure to handle this logic, as there is a little bit of complexity that we need to deal with. Specifically, the returned expiry time may not be the time that we requested, so we need to cache that and update our ticker based on the value that's returned (not just the value we requested).

ExpirationSeconds is the requested duration of validity of the request. The token issuer may return a token with a different validity duration so a client needs to check the 'expiration' field in a response.

From here: https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/

e.g., something like this:

type TokenRefresher struct {
  cfg *rest.Config
  clientset *kubernetes.Clientset
  ...
}

type TokenUpdate struct {
  Token string
  Error error
}

func (t *TokenRefresher) TokenChan() chan TokenUpdate {}
func (t *TokenRefresher) UpdateToken() TokenUpdate {}

The code in this file (install.go) can just call UpdateToken() to get a new token.

The code in node that updates the token can do something like this:

tokenChan := t.TokenChan()
for {
  select {
    case <- tokenChan:
      // Update CNI config with new token.
  }
}
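
A sketch of what UpdateToken could do with the granted expiry (imports as in the sketch above; the namespace, serviceAccount, and expiration fields on TokenRefresher are assumptions for illustration):

func (t *TokenRefresher) UpdateToken() TokenUpdate {
  expiry := int64(24 * 60 * 60) // requested validity; a shared constant in practice
  tr := &authenticationv1.TokenRequest{
    Spec: authenticationv1.TokenRequestSpec{ExpirationSeconds: &expiry},
  }
  resp, err := t.clientset.CoreV1().ServiceAccounts(t.namespace).
    CreateToken(context.TODO(), t.serviceAccount, tr, metav1.CreateOptions{})
  if err != nil {
    return TokenUpdate{Error: err}
  }
  // Cache the expiry actually granted: the issuer may return a shorter
  // validity than requested, and the refresh ticker must follow it.
  t.expiration = resp.Status.ExpirationTimestamp.Time
  return TokenUpdate{Token: resp.Status.Token}
}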

Member

Another thing to think about.. would be whether or not the install.go code should bother generating a token at all - we need to have the node code running, and updating tokens anyway, and it needs to do it ASAP because otherwise it won't know what the returned expiry time is on token generated by the CNI plugin installer.

I'll need to think about this one a bit more - LMK what you think.

Contributor Author

Looking at the actual validity period is a good point. I will adapt the code accordingly. When should we try to replace the token? In the initial change I statically assumed 1/3 of the (expected) validity period, so that we still have enough time for another attempt if the token request fails.
Concerning the code duplication, I completely agree, but I did not want to touch/change too many things at once. Writing the kubeconfig is in any case completely duplicated, in slightly different forms. It would certainly be easier to have the code only in calico-node, but I understand it originated in the CNI installation and only moved to calico-node once tokens got restricted validity.
In case you want to continue to have the kubeconfig writing logic in two places, where should I put the token update code?

Member

When should we try to replace the token? In the initial change I just statically assumed 1/3 of the (expected) validity period so that we still have enough time for another attempt if the token request fails.

I think 1/3 of the period is a fine enough way to do it - I might even do a token refresh every hour for the base case, and then switch to doing 1/3 of the period if the returned expiry is less than an hour long.

That actually makes me realize that these tokens are valid cluster-wide, not within the context of a particular node, so putting the logic to refresh the token within calico/node might not be right - it means we might get 5000 nodes all trying to refresh their token at the same time.

Think we might need a controller in kube-controllers which refreshes the token, and then watchers within calico/node that spot when the token has changed and update the config for that node.

Writing the kubeconfig is anyway completely duplicated in slightly different form

Yep, longer term I'd like to consolidate this as well. It's remnant of cni-plugin and node being separate repositories in the past. I'd like to see us consolidate all of this into one place or the other eventually.

In case you want to continue to have the kubeconfig writing logic in two places, where should I put the token update code?

If the above is true, I think we probably want the token refreshing code to live in kube-controllers, and then all calico/node and the CNI plugin code needs to do is query the generated token via a Get()
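
For concreteness, the refresh-interval heuristic discussed above could be expressed like this (a sketch; the one-hour base period is the value suggested in this thread, not code from the PR):

package cni

import "time"

// nextRefresh returns how long to wait before requesting a new token,
// based on the validity the TokenRequest API actually granted.
func nextRefresh(grantedValidity time.Duration) time.Duration {
  if grantedValidity < time.Hour {
    // Refresh at a third of the validity period, leaving time for
    // retries if a token request fails.
    return grantedValidity / 3
  }
  return time.Hour
}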

Contributor Author

I think the main argument against having the token code in kube-controllers is that kube-controllers runs in the pod network, as far as I know. Therefore, it needs a running CNI to set up its network sandbox and hence cannot be a precondition for CNI. Not sure how to follow up here.

Contributor Author

@caseydavenport I changed the code as suggested to have just one location (in calico-node) and reuse it from within the CNI installer. Let me know your feedback.

Contributor Author

(Also included the check of the returned token's validity.)

Member

I think the main argument against having the token code in kube-controllers is that kube-controllers is running in the pod network as far as I know.

Yeah, good call. This would be a problem. I'll think on this some more.

One option would be to start running kube-controllers as host networked, but I'd really prefer not to do that.

Review threads on cni-plugin/pkg/install/install.go and node/pkg/cni/token_watch.go: outdated, resolved
@ScheererJ (Contributor, Author)

Please note that I have not yet adapted the calico-node clusterrole, which is necessary to be able to create new tokens. (I tested the code locally with an admin kubeconfig.) I will be off for a week and will add the required clusterrole change to this pull request afterwards, i.e. at the beginning of calendar week 17.

The clusterrole change is now included. I am not sure if I got all occurrences as the clusterrole seems to be duplicated several times.

@ScheererJ force-pushed the enhancement/unbound-token-requests branch from 74f749b to 9079de3 April 26, 2022 14:56
@caseydavenport (Member) left a comment

Looking great, thanks again for working on this.

Had a few more comments / thoughts / questions, so let me know what you think.

node/pkg/cni/token_watch.go: six review threads, outdated and resolved
@ScheererJ force-pushed the enhancement/unbound-token-requests branch 2 times, most recently from e375136 to c9ecda7 April 27, 2022 12:39
@caseydavenport (Member) left a comment

One minor comment but this is looking pretty good! Let's see how the tests do.

We should probably have some unit tests for this logic as well. We run some tests with a real API server so we could probably just have the test code in this package create a token refresher with a short expiry time and make sure it gives us a few tokens as a quick and dirty functional verification. (as part of this suite:

calico/node/Makefile

Lines 278 to 284 in faa1d3c

## Run the ginkgo FVs
fv: run-k8s-apiserver
$(DOCKER_RUN) \
-v $(CERTS_PATH):/home/user/certs \
-e KUBECONFIG=/go/src/github.com/projectcalico/calico/hack/test/certs/kubeconfig \
-e ETCD_ENDPOINTS=http://$(LOCAL_IP_ENV):2379 \
$(CALICO_BUILD) ginkgo -cover -r -skipPackage vendor pkg/lifecycle/startup pkg/allocateip $(GINKGO_ARGS)
)


clientset, _ := InClusterClientSet()
Member

We should handle this error rather than ignore it (static checks will fail for this anyway)

Contributor Author

I am not really sure if the handling I added is what you had in mind.
The TokenRefresher cannot work without the clientset. Hence, I used a fatal error in that case.

Please let me know what you think.
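
The handling described amounts to something like this fragment (InClusterClientSet is the helper reviewed above; logrus is assumed as the logger, as elsewhere in calico/node):

clientset, err := InClusterClientSet()
if err != nil {
  // The TokenRefresher cannot work without a clientset, so treat a
  // failure to build one as fatal rather than silently ignoring it.
  logrus.WithError(err).Fatal("Failed to create in-cluster client set")
}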

@caseydavenport (Member)

/sem-approve

@ScheererJ force-pushed the enhancement/unbound-token-requests branch from c9ecda7 to 46a5026 April 28, 2022 08:19
@ScheererJ (Contributor, Author)

One minor comment but this is looking pretty good! Let's see how the tests do.

We should probably have some unit tests for this logic as well. We run some tests with a real API server so we could probably just have the test code in this package create a token refresher with a short expiry time and make sure it gives us a few tokens as a quick and dirty functional verification. (as part of this suite:

calico/node/Makefile

Lines 278 to 284 in faa1d3c

## Run the ginkgo FVs
fv: run-k8s-apiserver
$(DOCKER_RUN) \
-v $(CERTS_PATH):/home/user/certs \
-e KUBECONFIG=/go/src/github.com/projectcalico/calico/hack/test/certs/kubeconfig \
-e ETCD_ENDPOINTS=http://$(LOCAL_IP_ENV):2379 \
$(CALICO_BUILD) ginkgo -cover -r -skipPackage vendor pkg/lifecycle/startup pkg/allocateip $(GINKGO_ARGS)

)

@caseydavenport I took a look at the failed cni-plugin tests in semaphoreci before looking into further tests. It looks like the tests fail because of the missing namespace file (/var/run/secrets/kubernetes.io/serviceaccount/namespace). This can easily be fixed in cni-plugin/pkg/install/install_test.go by creating a corresponding file. However, after doing so the tests proceed only as far as trying to create a token, and then fail as there is no valid kube-apiserver endpoint configured. How should this be handled?

@ScheererJ force-pushed the enhancement/unbound-token-requests branch from 46a5026 to 90b5cd8 April 28, 2022 13:12
@ScheererJ (Contributor, Author)

I added simple tests for the code using the suggested approach.
Please note that the tests only cover the creation of one token. The minimum token validity allowed by the kube-apiserver seems to be 10 minutes.

@ScheererJ force-pushed the enhancement/unbound-token-requests branch from 90b5cd8 to af199c8 April 28, 2022 13:34
@ScheererJ (Contributor, Author)

Now there is also a small test checking the creation of multiple tokens in a series.
@caseydavenport Let me know if this is sufficient.
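
A sketch of the shape such an FV test can take (Ginkgo/Gomega, as in the calico/node suite; NewTokenRefresher and its signature are assumptions for illustration):

package cni_test

import (
  "os"

  . "github.com/onsi/ginkgo"
  . "github.com/onsi/gomega"
  "k8s.io/client-go/kubernetes"
  "k8s.io/client-go/tools/clientcmd"
)

var _ = Describe("TokenRefresher FV", func() {
  It("hands out a series of tokens", func() {
    // The fv target runs a real API server and exports KUBECONFIG.
    cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
    Expect(err).NotTo(HaveOccurred())
    clientset, err := kubernetes.NewForConfig(cfg)
    Expect(err).NotTo(HaveOccurred())

    // Hypothetical constructor; the real one lives in node/pkg/cni.
    tr := NewTokenRefresher(clientset, "kube-system", "calico-node")
    for i := 0; i < 3; i++ {
      update := tr.UpdateToken()
      Expect(update.Error).NotTo(HaveOccurred())
      Expect(update.Token).NotTo(BeEmpty())
    }
  })
})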

@caseydavenport (Member)

/sem-approve

@ScheererJ force-pushed the enhancement/unbound-token-requests branch from b95cdf0 to 2b3469b May 17, 2022 07:00
@ScheererJ (Contributor, Author)

Failed to create kubernetes config: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Getting these errors in the new tests. I think it's because there are two suites of tests (fv, which run an apiserver, and ut, which do not). Right now the new tests are being run in both, but we don't want to run them in the UTs (because they require an apiserver).

I think we just need to update this line:

UT_PACKAGES_TO_SKIP?=pkg/lifecycle/startup,pkg/allocateip,pkg/status

to:

UT_PACKAGES_TO_SKIP?=pkg/lifecycle/startup,pkg/allocateip,pkg/status,pkg/cni

@caseydavenport Thanks for the hint. I excluded the new test from the unit tests as suggested. At least locally, that part now succeeds.

@caseydavenport (Member)

/sem-approve

@caseydavenport merged commit 020b09c into projectcalico:master May 17, 2022
@caseydavenport (Member)

@ScheererJ 🎉 it's merged!

Thanks for the great work on this.

@skmatti (Contributor) commented Jun 3, 2022

@caseydavenport Is there a plan to backport the fix to v3.23?

@caseydavenport (Member)

@skmatti mulling on that at the moment. We definitely want to let it soak a bit more before release, but it's also a substantial change with implications for permissions, etc., which would normally be a no-no for a patch release, so it's a hard sell as a backport.

caseydavenport added a commit that referenced this pull request Jun 14, 2022: [release-v3.23] Auto pick #5910: Use TokenRequest API instead of calico-nodes service account
@ralgozino

Hi @caseydavenport,

Sorry for commenting on a merged PR. I just wanted to add my 2 cents on backporting the fix to 3.23.

Expiring tokens were introduced by default in Kubernetes v1.21, with a default expiration time of 1h + <extend_period>, where extend_period defaults to 1 year (it can be changed or disabled). In practice this means that after a year, calico pods won't be able to talk to the API server any more. Given that we are close to a year after the release of Kubernetes 1.21, one could argue that there may be clusters close to having their calico pods' tokens expire.

You can test this by deploying calico on a 1.21 cluster, enabling audit logs in the API server, waiting more than an hour, and creating a pod. You should then see a message like this in the audit log:

{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "auditID": "4b3aa577-8407-4383-adff-c10f5954fd01",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/namespaces/default/pods/bad-pod/status",
    "verb": "patch",
    "user": {
        "username": "system:serviceaccount:kube-system:calico-node",
        "uid": "049fd9b3-f537-4fd3-b5ba-59350407e999",
        "groups": [
            "system:serviceaccounts",
            "system:serviceaccounts:kube-system",
            "system:authenticated"
        ],
        "extra": {
            "authentication.kubernetes.io/pod-name": [
                "calico-node-c7jgk"
            ],
            "authentication.kubernetes.io/pod-uid": [
                "adef62fc-d593-4687-bf78-47ccf44f8785"
            ]
        }
    },
    "sourceIPs": [
        "10.7.255.126"
    ],
    "userAgent": "calico/v0.0.0 (linux/amd64) kubernetes/$Format",
    "objectRef": {
        "resource": "pods",
        "namespace": "default",
        "name": "bad-pod",
        "apiVersion": "v1",
        "subresource": "status"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestReceivedTimestamp": "2022-06-21T14:09:07.779004Z",
    "stageTimestamp": "2022-06-21T14:09:07.786404Z",
    "annotations": {
        "authentication.k8s.io/stale-token": "subject: system:serviceaccount:kube-system:calico-node, seconds after warning threshold: 12006",
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"calico-node\" of ClusterRole \"calico-node\" to ServiceAccount \"calico-node/kube-system\""
    }
}

From the milestone it seems that 3.24 is not going to be released soon, but a patch release of 3.23 including proper management of expiring tokens could be a quick win.

Of course, I'm assuming a lot, and not being familiar with your internal processes, maybe this doesn't make sense. Just sharing an end-user POV.

Thanks a lot!

@ScheererJ (Contributor, Author)

Hi @ralgozino,

This pull request was backported to v3.23 with #6218, but there is no release yet.

Best regards,
Johannes.

@lwr20 (Member) commented Jun 22, 2022

As you can see from #6218, this fix is already backported to the v3.23 branch. As soon as it's soaked in our testing for a bit, we will release it in a new v3.23 patch.

This means in practice that after a year all calico pods won't be able to talk to the API server

This is a good point, thank you for raising.

@ralgozino

Thank you both for the quick answer and sorry that I missed #6218 in my research 🙌
We'll be waiting for the 3.23 patch release :-)
Have a nice day!

@caseydavenport (Member)

Thanks for the pokes everyone - we are aiming to cut a v3.23 with this included soon, so stay tuned!

@hakman (Contributor) commented Jun 29, 2022

I am trying to upgrade kOps to use v3.23.2 and getting the following error for Canal:

==== START logs for container install-cni of pod kube-system/canal-7r5dd ====
...
W0629 06:52:14.819520       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2022-06-29 06:52:14.824 [ERROR][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=
  serviceaccounts "calico-node" is forbidden: User "system:serviceaccount:kube-system:canal" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system"
2022-06-29 06:52:14.824 [FATAL][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=
  serviceaccounts "calico-node" is forbidden: User "system:serviceaccount:kube-system:canal" 

Looking at the code, it seems that the service account name is hardcoded to calico-node.
For Canal it should probably be canal, based on https://docs.projectcalico.org/v3.23/manifests/canal.yaml.

Any thoughts @caseydavenport @lwr20 @ScheererJ ?

@ScheererJ (Contributor, Author)

@hakman You are correct that the service account name is hardcoded. I was not aware of the Canal use case. Apparently, the tests also missed it.

It would be fairly straightforward to adapt the code to use an environment variable, if set, instead of the default (calico-node). The environment variable could be set in the calico-node daemonset like this:

...
spec:
  template:
    spec:
      ...
      containers:
      - ...
        env:
        - name: SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              fieldPath: spec.serviceAccountName
...

If that would be fine I could open a corresponding pull request for it.

@caseydavenport / @lwr20 What do you think?
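
The corresponding lookup in the installer would be trivial; a minimal sketch, assuming the SERVICE_ACCOUNT variable name from the manifest above and keeping today's hardcoded value as the fallback:

package install

import "os"

// serviceAccountName returns the account to request CNI tokens for,
// preferring the downward-API-provided env var over the default.
func serviceAccountName() string {
  if sa := os.Getenv("SERVICE_ACCOUNT"); sa != "" {
    return sa
  }
  return "calico-node"
}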

@lwr20 (Member) commented Jun 29, 2022

@caseydavenport this is the bug we spotted in our end-to-end tests yesterday. I'll leave you to assist @ScheererJ to fix.

@hakman (Contributor) commented Jun 29, 2022

Cool. Thanks for the quick response!

@caseydavenport (Member)

Yep, this was my bad for not spotting this in the code review!

I think an adaptation to allow specification via an env var would be appropriate. It can be populated via the k8s downward API quite easily:

        - name: MY_POD_SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              fieldPath: spec.serviceAccountName

ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/

@caseydavenport (Member) commented Jun 30, 2022

I'm working on a fix for this ^

EDIT: PR here: #6302
