
Login and secret sync pattern per pod in replica set #151

Closed
lukpep opened this issue Mar 23, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

lukpep commented Mar 23, 2022

Related to #150
With auto-rotation enabled, this login-plus-secret-GET pattern is repeated every rotation-poll-interval (2 minutes) for every pod in a replica set, which is extremely suboptimal in my opinion.

  • it's not hard to imagine a replica set of 100 pods... and every 2 minutes we will be fetching the same secrets 100 times over
  • the rotation-poll-interval timer starts with each pod's lifecycle, so different portions of this 100-pod replica set poll at different points within the 2-minute window. This leaves the deployment with an inconsistent secret state: some pods use the old secret while others use the new one, with up to a full rotation-poll-interval of skew in the worst case (see the config sketch after this list for where the interval comes from)
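
For context, the interval above comes from the rotation feature of the upstream secrets-store-csi-driver rather than from the Vault provider itself. A minimal sketch of the relevant Helm values, assuming the upstream secrets-store-csi-driver chart (value names are illustrative; check them against the chart version you actually deploy):

# secrets-store-csi-driver Helm values (illustrative; verify against your chart version)
enableSecretRotation: true   # turns on the periodic re-sync loop in the driver
rotationPollInterval: 2m     # each pod's mount is re-synced on this interval,
                             # so N pods mean N logins + N secret reads per interval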

Expected behaviour:
I believe secrets should be synced once for the entire deployment/replica set. The current state of things combined with #150 means a new lease object every 2 minutes for every pod in the cluster, which in our case leaves the Vault server at over 10 GB of memory usage.

Tested on 0.4.0 and on 1.0.0 - same behaviour
This issue is a result of a discussion from here

tomhjp (Contributor) commented Jun 21, 2022

I've commented on #150, which is fairly straightforward in terms of design. It's pretty obvious the TTL should be respected and we'll fix that - I think in most deployment scenarios, that fix is going to be the more significant one in terms of reducing the number of open leases.

The design decision around per-pod vs per-deployment secrets is a little more nuanced though. If we have the same dynamic secret in every pod, then revoking the lease for one revokes it for all of them, and similarly an expired lease will affect all pods simultaneously.

In terms of identity, the CSI driver token we can receive from kubelet comes with claims that look something like this:

{
  "aud": [
    "vault"
  ],
  "exp": 1655601118,
  "iat": 1655597518,
  "iss": "https://kubernetes.default.svc.cluster.local",
  "kubernetes.io": {
    "namespace": "test",
    "pod": {
      "name": "nginx-db",
      "uid": "547578bd-ffc5-4cba-a479-111a2e2523bf"
    },
    "serviceaccount": {
      "name": "nginx-db",
      "uid": "40ec4d69-1152-4b46-b126-45a568a6e4c6"
    }
  },
  "nbf": 1655597518,
  "sub": "system:serviceaccount:test:nginx-db"
}

i.e. it's directly tied to both the pod and the service account, but not the owner of the pod (e.g. deployment). If we wanted to drop the pod-specific reference, we'd have to revert to generating our own token again. Done naively, that could get a little messy if you have multiple deployments running with the same service account. As such, it feels like a more natural fit for any secrets derived from our Kubernetes token to also be tied to both the pod and service account. But I can certainly see the appeal of fetching a single set of credentials per deployment too.
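
For comparison, a token the provider minted for itself via the TokenRequest API (i.e. not requested by kubelet on behalf of a specific pod) would carry claims roughly like the sketch below, which reuses the values from the block above but has no kubernetes.io.pod entry:

{
  "aud": [
    "vault"
  ],
  "exp": 1655601118,
  "iat": 1655597518,
  "iss": "https://kubernetes.default.svc.cluster.local",
  "kubernetes.io": {
    "namespace": "test",
    "serviceaccount": {
      "name": "nginx-db",
      "uid": "40ec4d69-1152-4b46-b126-45a568a6e4c6"
    }
  },
  "nbf": 1655597518,
  "sub": "system:serviceaccount:test:nginx-db"
}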

WDYT about the above?

Freyert commented Jul 11, 2022

@tomhjp I'm really glad to find this issue and your interest in it. I'm highly invested in having per-deployment credentials, and here's why: some dynamic secret plugins can handle O(N) requests, but some can't.

For example, I'm very interested in using vault-csi-provider to manage MongoDB Atlas Database credentials. However, every time a new pod comes up, it requests a new database credential from Atlas, which in turn runs some blocking automation against all N clusters in a project. This automation can block other more critical operations such as scaling or backups.

Another example is the static GCP Service Account plugin. GCP Service accounts can only have 10 keys.

I think these are great data points to consider because they show that a deployment-scoped CSI driver is not just a preference; it's a technical requirement for many plugins.


Regarding the following:

i.e. [The CSI token is] directly tied to both the pod and the service account, but not the owner of the pod (e.g. deployment)

I think another way around CSI tokens being pod-scoped is to add a service account field to the CSI CRD, and perhaps a flag that tells the CSI driver to return the same credentials for each request (roughly as sketched below).

I think attaching a service account to a particular CSI CRD makes a lot of sense, as most people (I would imagine) are using k8s service accounts to authn/authz with Vault. Attaching the service account to the CSI CRD also helps us practice the principle of least privilege by isolating Vault access to the CSI driver rather than the O(N) application pods that also use the service account.
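
To make that concrete, here is a purely hypothetical sketch of what it could look like on a SecretProviderClass. The serviceAccountName and sharedIdentity fields do not exist today and are only there to illustrate the proposal; provider, roleName and objects follow the existing vault-csi-provider parameters:

# Hypothetical illustration only: serviceAccountName and sharedIdentity are not
# real SecretProviderClass fields today.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-db-creds
  namespace: test
spec:
  provider: vault
  # Hypothetical: the identity the provider would log in with, instead of the
  # per-pod token handed over by kubelet.
  serviceAccountName: vault-csi-reader
  # Hypothetical: serve the same set of credentials to every pod mounting this class.
  sharedIdentity: true
  parameters:
    roleName: db-role
    objects: |
      - objectName: "db-password"
        secretPath: "database/creds/db-role"
        secretKey: "password"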

I'm probably getting a lot of things wrong here because I haven't looked at this project in ages, but I hope this is useful!

tomhjp (Contributor) commented Apr 28, 2023

Thanks very much for the feedback here. I'll be releasing 1.4.0 later today, which wraps up the work on respecting the TTLs of returned tokens and leases.

For the other component of this ask, I'm going to close it in favour of #149 as they are both asking for the same "1 lease per deployment" pattern.
