Skip to content

KEP-5284: Constrained Impersonation #5285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 19, 2025

Conversation

qiujian16
Copy link
Contributor

  • One-line PR description: Introduce authorization rules to restrict impersonating on specified resource with specified actions.
  • Other comments:

Signed-off-by: Jian Qiu <jqiu@redhat.com>
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 7, 2025
@k8s-ci-robot k8s-ci-robot requested review from deads2k and enj May 7, 2025 02:49
@k8s-ci-robot
Copy link
Contributor

Hi @qiujian16. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 7, 2025
@enj enj added this to SIG Auth May 7, 2025
@enj enj moved this to Needs Triage in SIG Auth May 7, 2025
@liggitt liggitt mentioned this pull request May 7, 2025
4 tasks
@liggitt
Copy link
Member

liggitt commented May 7, 2025

cc @ahmedtd @vinayakankugoyal

@aramase aramase moved this from Needs Triage to In Review in SIG Auth May 12, 2025
@aramase
Copy link
Member

aramase commented May 12, 2025

/assign enj

-->

## Alternatives

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of alternative ideas come to mind:

• permission boundaries, AWS style

In that story, a principal is constrained and cannot perform some actions. Done right - which isn't trivial - the principal can create new principals, grant permission to those principals, and impersonate (act as) those principals, but only within a bounding set of allowed actions.


CEL. The constraint on impersonation could be done through an access-granting mechanism that evaluates an expression before permitting it. Clients would explicitly declare which access rule they intend to rely on.

For example, the SubjectAccessReview might look like:

{
  "apiVersion": "authorization.k8s.io/v1beta1",
  "kind": "SubjectAccessReview",
  "spec": {
    "resourceAttributes": {
      "namespace": "default",
      "verb": "get",
      "group": "example.org",
      "resource": "something"
    },
    "accessRule": {
      "kind": "ClusterAccessRule",
      "group version": "example.org/v1",
      "name": "elevation-of-privilege"
    },
    "user": "lmktfy",
    "group": []
    ]
  }
}

In this story, the ClusterRoleBinding allows impersonate only with a particular Access Rule name (similar, or identical to, the existing resourceNames mechanism).

However an access rule might day you can impersonate that principal but only to perform deletes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We file alternatives we think of, so long as they are reasonable, even when we plan to pick the original proposal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see us expanding SAR / authz for this KEP. The approach described in the KEP is far simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this as an alternative.

Copy link
Member

@enj enj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass.

Comment on lines 966 to 969
Some concerns on using this approch:
- impersonating on any APIGroup is hard to describe.
- core APIGroup has to be specially treated.
- the APIGroup might be too long for a CRD, e.g. cluster.x-k8s.io.imperonsation.k8s.io
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am fine with the verb based approach described in this KEP, I don't really find any of these concerns to be compelling. At the end of the day we are just talking about different string based options here.

I would say the most compelling reason is that the existing impersonation flows are verb based, so making the new flow verb based as well feels more consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, updated

- `impersonate:user` limits the impersonator to impersonate users with
certain names/groups/userextras. The resources must be `users`/`groups`/`userextras`.
The resource names must be user names, group names or values in the user extras accoringly.
- `impersonate:scheduled-node` that limits the impersonator to impersonate the node the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to describe how the workload learns the node that it is scheduled on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more details in Design Details section

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is outstanding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That section describes what the header is supposed to look like, I was mostly saying that the workload would need to use the downward API via an env var to figure out its node name.

      env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see. Updated.

- `impersonate:user` limits the impersonator to impersonate users with
certain names/groups/userextras. The resources must be `users`/`groups`/`userextras`.
The resource names must be user names, group names or values in the user extras accoringly.
- `impersonate:scheduled-node` that limits the impersonator to impersonate the node the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include the exact set of impersonation headers the workload must set for the node case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more details in Design Details section

-->

## Alternatives

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see us expanding SAR / authz for this KEP. The approach described in the KEP is far simpler.

@github-project-automation github-project-automation bot moved this from In Review to Changes Requested in SIG Auth May 14, 2025

## Proposal

Introduce a set of verbs with prefix of `impersonate-on:`, e.g. `impersonate-on:create` and
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I am wary of making verbs that you have to parse.

For example, the impersonate:user verb sets the scene for other verbs, such as list:metadata-only or proxy:port:443.

We should we sure what direction of travel we want if we are to establish this kind of precedent, and clearly indicate why the alternatives are overall less viable.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
@qiujian16 qiujian16 changed the title KEP-5284: Limit impersonate permissions KEP-5284: Constrained Impersonation May 16, 2025
- The impersonator cannot impersonate on updating nodes

- SAR check on impersonate scheduled node with permissions. The impersonator has the
user extra info of `"authentication.kubernetes.io/node-name": "node1"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can any client provide this? What protects against abuse?

The important examples here are actually:
• a compromised node
• an attacker aiming to elevate privileges by impersonating a node

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an attacker intends to elevate privileges by impersonating a node, it needs to have an user extra "authentication.kubernetes.io/node-name": "someNode" in the userInfo, and it needs to have both permissions with verbs "impersonated:scheduled-nodes" and "impersonated:someVerb".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What protects against abuse?

only authenticators can indicate this value. If you have the power of the authenticator, you can add system:masters.

in the proposed apporch, the impersonator's permission is clearer that it can only perform
the action when impersonating.

### Expand RBAC/SAR
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cluster does not use RBAC, is this alternative still feasible?


# The following PRR answers are required at beta release
metrics:
- my_feature_metric
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(placeholder - please change)

Signed-off-by: Jian Qiu <jqiu@redhat.com>
Comment on lines 196 to 200
- `impersonate:user-info` limits the impersonator to impersonate users with
certain names/groups/userextras. The resources must be `users`/`groups`/`userextras`/`uids`.
The resource names must be user names, group names or values in the user extras accoringly.
- `impersonate:serviceaccount` that limits the impersonator to impersonate the serviceaccount with
the certain name/namespace. The resources must be `serviceaccounts`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does impersonate:serviceaccount for some resourceName imply access to the the system:serviceaccounts: groups associated with those serviceaccounts as well?
Does this verb behave differently than the current impersonate verb when serviceaccounts is supplied as the resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same as impersonate verb with serviceaccount as the resource, but it will require additional impersonate-on:{action} verb. The reason that not to reuse the original impersonate verb is to make it unchanged for backward compatible.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
Signed-off-by: Jian Qiu <jqiu@redhat.com>
@qiujian16 qiujian16 requested a review from enj June 13, 2025 03:15
Signed-off-by: Jian Qiu <jqiu@redhat.com>
that might indicate a serious problem?
-->

Authz latency for impersonation is too long.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lookup the actual metric please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.


Authz latency for impersonation is too long.

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs answer. Without any persisted state, there is no lasting history to clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok added some test plan.


The impersonator does not need the permission for the target action.

#### workflow when the feature is enabled
Copy link
Member

@enj enj Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the mermaid diagram to show all the checks, i.e. impersonate:user-info is a different check than impersonate:node, etc.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
Comment on lines 1028 to 1029
authorization_attempts_total shows greatly increased number.
authorization_duration_seconds_bucket shows greatly increased number of request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tell you about the WithAuthorization request handler, not actual authz calls. Those are only handled by the InstrumentedAuthorizer which only has apiserver_authorization_decisions_total today. We will need to enhance that to include apiserver_authorization_duration_seconds. Maybe we would add labels to distinguish between impersonation calls based on the verb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, it seems that we have to introduce a new metrics with verb label? And added the part when webhook authorizer is used.

@deads2k
Copy link
Contributor

deads2k commented Jun 18, 2025

Once Mo's comments are addressed, I'm happy with this for sig and PRR.

/approve

I'll leave lgtm with Mo since he has minor comments outstanding.

/assign @enj

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2025
Copy link
Member

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the KEP @qiujian16! I'm excited to see more access controls in general, especially around impersonation 👍

One alternative approach that could be done, would be to use conditional authorization for this use-case, as that is more generic. With regards to other impersonation-related issues we have, conditional authorization would also allow us to express the following limitations we have today:

  1. Inability to say "principal P is able to impersonate username U and group G only together". Today, if we give principal P ability to impersonate U and G, they can do so independently as well, or in other permutations. Basically, we could enforce an AND here.
  2. Inability to perform suffix/prefix matching. Before one argues that this is not a good idea (ref: kubernetes/kubernetes#56582, yes, it is possible to use poorly), the absence of suffix matching is why impersonate:node here is proposed as a special case
  3. Inability to combine data from the impersonator (principal) and impersonated user (resource), which is why we have impersonate:scheduled-node as a hard-coded special case.

I just finished my MSc thesis devoted to the topic of conditional authorization, and I presented it two weeks ago at the SIG Auth meeting, but I haven't had time to make a KEP out of it yet. I can/will do so if/as SIG Auth is interested in it.

In the context of authorizing an impersonation, we have the following data, roughly:

{
  "principal": {
     "username": "system:serviceaccount:foo:bar",
     "groups": ["system:serviceaccounts", "system:serviceaccounts:foo"],
     "uid": "123",
     "extra": {"authentication.k8s.io/nodeName": ["node-1"]}
  },
  "action": "impersonate",
  "resource": {
     "username": "system:node:node-1",
     "groups": ["system:nodes"],
     "uid": "456",
     "extra": {"baz": ["1234"]}
  },
  "context": {
     "requestAttr": {
         "apiGroup": "",
         "resource": "pods",
         "verb": "get",
         "namespace": "default",
         "name": "pod-1"
     }
  }
}

If conditional authorization is supported in SAR, then the condition for a SAR actually ending up being allowed is returned from the authorizer. So to express the first example in another way:

apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  resourceAttributes:
    resource: users
    name: someUser
    verb: impersonate
  user: impersonator

would return (if allowed as per this proposal)

Allow, only if (CEL expr): context.requestAttr.resource == "pods" && context.requestAttrs.verb in ["get", "list"]

From this condition, the k8s API server can quickly check that the CEL expression evaluates to true, and does not have to do two SAR requests.

Impersonating a node would resemble a policy like (CEL):

(principal conditions, e.g. that it should be a given SA) && action == "impersonate" && resource.username.startsWith("system:node:") && "system:nodes" in resource.groups && (add here whatever conditions one want to restrict on what type of request it should be)

One common out-of-core use-case for impersonation of prefixed userinfo, is for an impersonating front-proxy authenticator, where structured auth config is not available (e.g. some public clouds). The proxy is often internet-accessible, and you want to limit the attack surface, and make sure that it cannot impersonate system:masters by prefixing all userinfo strings. But that requires better expression support in authz here. E.g. https://github.com/TremoloSecurity/kube-oidc-proxy

Impersonating the scheduled node would be as the last one, but in addition:

resource.username = "system:node:" + principal.extra["authentication.k8s.io/nodeName"][0]

And on top of this, we can return conditions to say that "you can only impersonate username U and group G together, not separately".

My worry here is that we're building powerful features (like suffix match for CSR signers and here, node names, and the node authorizer) for Kubernetes, because we need them, but at the same time we tell our users that they can't have them. I understand the complexity of all of this (I devoted 60 pages to it here), and know it is far from trivial, but this is another step in the direction of building something that gets closer and closer to a generic query/expression language, without actually being one.

If we turn it around; if we had a fully generic authorization expression language that we could analyze; we could maybe instead provide users with good tooling that avoids them encountering the pitfalls/gotchas we are afraid of them doing.

I know that this kind of thing is a bit like boiling the ocean, but as I start writing a KEP on where conditional authorization is useful, I wanted to bring all these use-cases up in the design phase of this one. One thing to point out also, is that while I showed CEL expressions there, and that being the intermediate representation for condition expressions, the user experience does not have to be low-level like that: the UX can (and should) still give users user-friendly options like "node with name node-1" and "scheduled node"; however, that user-friendly "frontend" can just compile into the CEL intermediate representation, while allowing many other non-k8s core-oriented use-cases at the same time.

For example, I think that these "special" impersonate:<type> and impersonate-on:<verb> RBAC representations could be also easily turned into conditional authorization rules by the RBAC authorizer, allowing this exact "frontend", but through only one SAR (not two), and keeping the door open for more expressiveness in the future, without having to rebuild the impersonation chain again.

[issue](https://github.com/kubernetes/kubernetes/issues/27152)

There are use cases that needs a controller to be able to impersonate:
1. A controller impersonates a node the controller is running on. This happens when
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why we want others to be able to take advantage of the node authorizer's special features, but an alternative towards this end (which is fairly orthogonal to this KEP) is to generalize the features of the node authorizer so the controller can take advantage of it directly, instead of having to go through impersonation in the first place; which has the side-effect that the controller's identity is more or less "lost", so admission controllers have a hard time understanding if a request with username=system:node:foo really is node foo or some agent acting with its permissions.

I see below that it is considered a feature that the admission chain runs against the impersonated principal and not the impersonator, but yeah. However, this is a net-new add, there are still other, complementary ways to achieve the goal, even with this feature existing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which has the side-effect that the controller's identity is more or less "lost", so admission controllers have a hard time understanding if a request with username=system:node:foo really is node foo or some agent acting with its permissions

This is intentional. We could embed the original user-info or otherwise make it available in the authz/admission layer, but we are choosing not to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand that this is desired as per the KEP, just highlighted that it's not suitable for all use-cases.

The litmus test I was thinking of here, is that if it was possible to give the controller SA direct access to exactly the resources (pods/PVs/secrets/etc.) e.g. referenced from the node the controller is running on, would you still recommend this impersonating path, or would that generalized node authorizer solve the issue faced (of locking down what the node-scoped controller can do)?

1. A controller impersonates a node the controller is running on. This happens when
per-node agents (CNI plugins for instance), want to read pods on a given node instead
of unrestricted pod access.
2. A controller runs as a deputy, receives request from the user, and impersonates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have some more concrete plan of how this will work in practice? How do we recommend people to store the "ownership" of the object? Simply using an owner label?

In the past, we have recommended to not make objects owned by specific people, as usually there can be multiple writers. I guess SSA helps us here in some regard though, it the field manager can be tied to an authenticated identity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think ownership relates to the deputy mode here. The owner will be the target user not the impersonator. In deputy mode, controller just is impersonating as the target user, but not own the object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was more thinking what ownership model of the user you had in mind. How does the controller (impersonator) know which user to impersonate when performing a request? A special label called owner on the object? A field on the CRD?

Introduce verbs `impersonate:user-info`, `impersonate:serviceaccount`, `impersonate:node` and
`impersonate:scheduled-node`:
- `impersonate:user-info` limits the impersonator to impersonate users with
certain names/groups/userextras. The resources must be `users`/`groups`/`userextras`/`uids`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note that the existing API surface for impersonation is inconsistent, as for the normal impersonate verb we have the following resources and groups

apiGroup resource
"" users
"" groups
authentication.k8s.io uids
authentication.k8s.io userextras

Good that we now want to make it uniformly under the authentication.k8s.io API group, but just noting it might need some extra documentation / explanation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept of API groups did not exist when the original impersonation for users and groups was written...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know, it's unfortunate.

I was more getting at how we communicate it to the users. Do we send a warning if we see that someone tries to do impersonate authentication.k8s.io users and/or impersonate:user-info "" users, i.e. applying the new way to the old verb, or old way to the new verb?

Or do we strive to unify this by actually sending a SAR with the new, consistent format (impersonate authentication.k8s.io users) by default, and then falling back to the legacy (impersonate "" users) for backwards compability, such that we can make documentation uniform?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be the the latter. When the feature is enabled, the impersonate authentication.k8s.io users will be done, and fallback to the legacy way for backward compatible.


### Verb `impersonate:serviceaccount`

Same as legacy impersonation, when the request has the header of `Impersonate-User`, and the value of the header
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we discussed with @micahhausler the other day, if this check succeeds, group system:serviceaccounts (and the namespaced variant) is added automatically, without extra authorization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would continue to do that, and likely do the same for system:nodes for impersonated nodes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify if the impersonator required to fully-specify the serviceaccount+groups(+uid?), or forbidden from specifying any impersonation headers other than the Impersonate-User: "system:serviceaccounts:$namespace:$name", and if the latter, clarify that the service account groups and uid (which I guess requires looking up the service account?) are auto-added into the impersonated user info

Copy link
Member

@enj enj Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we would do the same as we do for SAs today (for both SAs and nodes) -> no lookup to get UID (so no UID in the request), groups are hard coded. Client is only allowed to set the user header, else we won't try the new flow at all (this does mean SA and node impersonation cannot use custom groups, which seems okay to me).


#### Story 1

As a controller, I want to impersonate node I am running on to list/get pods on the node. The service account
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really wish we solved this in a way that we did not require a code change for this scenario, there are a lot of daemonsets that need to do this and now we would have to update their code where the caller would need to be aware which context they need to make the call in. IMHO the more "ideal" approach would be to node restrict the service account so that application code does not have to be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can directly set the impersonation headers in the kubeconfig, and a daemonset running all or some of the operations as the node already has to have logic to read the node's kubeconfig. So this is effectively a configuration change.

@enj
Copy link
Member

enj commented Jun 18, 2025

One alternative approach that could be done, would be to use conditional authorization for this use-case

To be clear, I am aware of this - Anish and I did the initial POCs of RBAC++ and we played around with conditional authz bits there as well. One of the alternate names for this KEP was "conditional impersonation."

I know that this kind of thing is a bit like boiling the ocean

While I don't agree with all of the user stories you stated, I am aware of the set of changes needed to make something like conditional authorization work. That will take multiple KEPs and releases to iron out, so its not going to happen anytime soon. I am okay making incremental progress here with this KEP as-is. It does not prevent us from doing conditional authorization in the future, or enhancing impersonation again to leverage it.

@qiujian16
Copy link
Contributor Author

One alternative approach that could be done, would be to use conditional authorization for this use-case

Added this in the alternatives section.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
@luxas
Copy link
Member

luxas commented Jun 19, 2025

Agreed @enj, I'm all for moving ahead with this now in parallel to the other, bigger topics that will take (far) longer due to their complexity. This KEP helps the most common use-cases indeed. 👍

I know you looked at conditional authz as well, but I was unsure of the omission of that alternative in the KEP here was intentional or not. Thanks @qiujian16 for adding it 👍

I plan to start drafting a KEP for conditional authz after summer holiday, we can continue this discussion there then.

Comment on lines 1440 to 1444
Conditional authorization is the emerging work to provide more complicated authorization policy
with CEL expressions. Potentially it would be able to reduce the number of permission checks for the impersonation
in this proposal. The work is still in very early stage, and will bring many changes in the exising authorization
model. It is possible to enhance constrained impersonation in this proposal with conditional authorization in the
future.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think) we should be wary of merging code for the solution in this KEP, even as alpha, without developing our thoughts more around this alternative.

We can still pick this option, but I'd like more than one paragraph about the alternative direction. Or we move forward with alpha and commit to revisit the alternatives before beta; that works too.


Here's an example of an API object we might use to constrain impersonation.

apiVersion: authorization.k8s.io/v1alpha0
kind: ClusterAccessRule
metadata:
  name: example
rule:
  filterExpression: >-
    (request.attributes.verb in ["impersonate"]) &&
    (group == "principals.external-authz.example")
  entries:
  - effect: Deny
    description: >-
      Restrict impersonation to real users
    condition: >-
      request.attributes.resource == "groups" ||
      (request.attributes.resource == "users" && 
       request.impersonation.target.type in ["ServiceAccount"])
  - effect: Allow
    description: >-
      Alice can impersonate Bob or Corai
    condition: >-
      (request.attributes.resource == "users" && 
       request.impersonation.target.name in ["bob","corai"])

I'm mentioning this not to advocate for this particular API, but to shape thinking about what to put in the second paragraph here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strive to start some of the more "formal" KEP design work around this in July, after my vacation. What I have now is https://speakerdeck.com/luxas/conditional-authorization-for-kubernetes-sig-auth-presentation, which I presented during the June 4 SIG Auth meeting. When the recording is up, you can check the discussion out.

I think it's key to balance expressiveness with analyzability (like this), such that we can build good introspecting / debugging / semantic analysing tools around this, to aid people in doing the right thing. E.g. we could issue warnings if the user does something potentially dangerous, and based on admin policy either block, or say "are you sure, this might be a bad idea?"

- Determine if additional tests are necessary
- Ensure reliability of existing tests
- Determine if some mechanism should be introduced to reduce the extra permission checks

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Alternatives have been reviewed, and we have consensus not to switch to an alternative
approach.

Do we want anything about checking this works with non-RBAC authz?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you mean other authorizer like webhook authorizer, it is in the alpha scope as the subjectaccessreview will be sent to the authorizer.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
@enj
Copy link
Member

enj commented Jun 19, 2025

From Tim

(I think) we should be wary of merging code for the solution in this KEP, even as alpha, without developing our thoughts more around this alternative.

We can still pick this option, but I'd like more than one paragraph about the alternative direction. Or we move forward with alpha and commit to revisit the alternatives before beta; that works too.

I do not consider that a true alternative, as it involves a dramatic change to Kube authz (3+ KEPs easily). Taken to the extreme, that alternative could be used to generically build something like the node authorizer via config instead of being hard coded into the API server - but even if that existed does not mean we would not prefer the purpose built solution via the hard coded node authorizer. I expect this KEP and the future work to exist in parallel indefinitely into the future - both will be usable, likely in combination with each other.


From David

Once Mo's comments are addressed, I'm happy with this for sig and PRR.

...

I'll leave lgtm with Mo since he has minor comments outstanding.

This KEP is more than good enough for an alpha in its current state.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 19, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, enj, qiujian16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 02d2850 into kubernetes:master Jun 19, 2025
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.34 milestone Jun 19, 2025
@github-project-automation github-project-automation bot moved this from Changes Requested to Closed / Done in SIG Auth Jun 19, 2025

### Verb `impersonate:serviceaccount`

Same as legacy impersonation, when the request has the header of `Impersonate-User`, and the value of the header
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify if the impersonator required to fully-specify the serviceaccount+groups(+uid?), or forbidden from specifying any impersonation headers other than the Impersonate-User: "system:serviceaccounts:$namespace:$name", and if the latter, clarify that the service account groups and uid (which I guess requires looking up the service account?) are auto-added into the impersonated user info

name: someNode
verb: impersonate:nodes
user: impersonator
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the impersonator required to fully-specify the node+groups(+uid?), or forbidden from specifying any impersonation headers other than the Impersonate-User: "system:nodes:$name"? If the latter, are the node groups auto-added into the impersonated user info?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


`impersonate:scheduled-node` is a special verb to check when two conditions are met:
1. The impersonator is impersonating a node by setting the header `Impersonate-User` and the value has a prefix
of `system:node:` on the request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the impersonator required to fully-specify the node+groups(+uid?), or forbidden from specifying any impersonation headers other than the Impersonate-User: "system:nodes:$name"? If the latter, are the node groups auto-added into the impersonated user info?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`impersonate:scheduled-node` is a special verb to check when two conditions are met:
1. The impersonator is impersonating a node by setting the header `Impersonate-User` and the value has a prefix
of `system:node:` on the request.
2. The user info of the impersonator has an extra with key `authentication.kubernetes.io/node-name`, and the value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the impersonator required to be a service account, or just have the matching authentication.kubernetes.io/node-name extra info value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required to be SA.

namespace: deputy-ns
```

### Risks and Mitigations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's potential for more than 2 extra checks:

If impersonation headers are for non-serviceaccount/non-node username with N username+groups+uid+extra attributes, there are now up to 2N + 1 authz checks the filter has to do (N+1 more than before)?

  • impersonate:user-info (N checks) && impersonate-on:...
  • impersonate (legacy) (N checks)

If impersonation headers are for a service account, there are now up to 3 authz checks the filter has to do (2 more than before)?

  • impersonate:serviceaccount && impersonate-on:...
  • impersonate (legacy)

If impersonation headers are for a node, there are now up to 4 authz checks the filter has to do (3 more than before)?

  • (impersonate:scheduled-node || impersonate:node) && impersonate-on:...
  • impersonate (legacy) - N checks for the various facets

Are we just documenting the extra load, or do we have benchmarks / targets that we consider acceptable for this that would inform whether we need to optimize authz paths (rbac indexing or batch authz or something) as part of graduating this?

Currently, all existing working impersonation configurations are using the legacy system. If this KEP switches to do constrained authz checks first, it immediately doubles the impersonation authorization cost for all consumers. As adoption of constrained impersonation increases, I'm not sure how we'll decide which check to place first (constrained or unconstrained) to be the most efficient.

```go
kubeConfig, _ := clientcmd.BuildConfigFromFlags("", "")
kubeConfig.Impersonate = rest.ImpersonationConfig{
UserName: "system:node:" + os.Getenv("MY_NODE_NAME"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would they also have to add Groups: ["system:authenticated","system:nodes"] or not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ok since all these permission for constrained impersonation are strictly additive which
means it should be added when certain impersonation requests are needed.

## Design Details
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because impersonate-on:... permissions are granted independent of impersonate:user-info / impersonate:serviceaccount / impersonate:node, an impersonator cannot be granted independent impersonate:... + impersonate-on:... combinations.

Instead, their abilities expand to the union of all impersonate:... identities, constrained by union of all impersonate-on:... permissions.

@deads2k @enj is that well understood / considered acceptable? I didn't see any discussion of that in the design.

Copy link
Member

@liggitt liggitt Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

practically, this means if you granted something like this to the same service account:

Node pod status update stuff
* impersonate:scheduled-node
* impersonate-on:update "pods/status"

Controller secret read stuff
* impersonate:serviceaccount "myns/my-ingress-controller"
* impersonate-on:get "secrets"

what that would actually grant is read access to every secret the scheduled-node had access to

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the union reasonable and not surprising. It's similar to the union that happens for resources and verbs in a single rule.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I thought about the union case. Seems like it would be better to further scope the -on check such that you would need to do:

Node pod status update stuff
* impersonate:scheduled-node
* impersonate-on:scheduled-node:update "pods/status"

Controller secret read stuff
* impersonate:serviceaccount "myns/my-ingress-controller"
* impersonate-on:serviceaccount:get "secrets"

@deads2k WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it is still not a true intersection, as this causes a union still:

Controller pod read stuff
* impersonate:serviceaccount "myns/my-pod-controller"
* impersonate-on:serviceaccount:get "pods"

Controller secret read stuff
* impersonate:serviceaccount "myns/my-ingress-controller"
* impersonate-on:serviceaccount:get "secrets"

Maybe we move the name into verb itself like impersonate[-on]:serviceaccount:<namespace/name>?

Controller pod read stuff
* impersonate:serviceaccount:myns/my-pod-controller
* impersonate-on:serviceaccount:myns/my-pod-controller:get "pods"

Controller secret read stuff
* impersonate:serviceaccount:myns/my-ingress-controller
* impersonate-on:serviceaccount:myns/my-ingress-controller:get "secrets"

The implementation is no more difficult, but the RBAC is kind of ugly to write.

Introduce verbs `impersonate:user-info`, `impersonate:serviceaccount`, `impersonate:node` and
`impersonate:scheduled-node`:
- `impersonate:user-info` limits the impersonator to impersonate users with
certain names/groups/userextras. The resources must be `users`/`groups`/`userextras`/`uids`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the flowchart of authz checks that will be done, make it clear what happens if a request that impersonates multiple attributes arrives.

For example:

GET /api/v1/namespaces/ns/pods
Headers:
  Impersonate-User: a
  Impersonate-Uid: b
  Impersonate-Group: c
  Impersonate-Group: d
  Impersonate-Extra-foo: e
  Impersonate-Extra-foo: f
  Impersonate-Extra-bar: g
  Impersonate-Extra-bar: h

For that request, I think would result in these checks:

  • constrained: impersonate:user-info (8 authz checks) && impersonate-on:list pods in namespace ns (1 authz check)
  • unconstrained: impersonate (8 authz checks)

Questions I wasn't sure about:

  • What order are constrained / unconstrained checks done in?
  • For constrained, do we do the 8 impersonate:user-info checks or the single impersonate-on check first?
  • If any of the 8 impersonate:user-info or impersonate checks fail, do we short-circuit the others (I hope we do)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the single impersonate-on check first

Seems like we should do this part first for each of the new cases.

Copy link
Contributor Author

@qiujian16 qiujian16 Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any of the 8 impersonate:user-info or impersonate checks fail, do we short-circuit the others (I hope we do)

yes if any fails, it will fallback to legacy mode.

If any of the 8 impersonate:user-info or impersonate checks fail, do we short-circuit the others (I hope we do)

yes

This is the worst case:

  1. enable the featuregate
  2. the legacy impersonate permission is set with UserExtra as the resource.
  3. all impersonate-as header is set
    And with such, comparing with legacy mode, it will introduce N+1 checks.

What order are constrained / unconstrained checks done in?

The plan is do constrained at first than unconstrained. The reason is that if user enables this feature, they should understand that they need to use the constrained permission. And if they does that, unconstrained checks will fail for every request even if it passes constrained checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that if user enables this feature, they should understand that they need to use the constrained permission. And if they does that, unconstrained checks will fail for every request even if it passes constrained checks.

That doesn't sound correct / compatible… can you describe what you mean by that? The feature will be enabled by default as it graduates to stable, and we won't break the ability to grant unconstrained impersonation

Copy link
Contributor Author

@qiujian16 qiujian16 Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't mean we will break the ability to grant unconstrained impersonation.

I think we should check the constrained impersonation first and if fails, fallback to unconstrained impersonation. And the reason is based on the assumption that user will likely to use constrained permission with this feature enabled and remove the legacy unconstrained permission. In such case, checking constrained permission would be appropriate. Because, the unconstrained permission check will fail given user removes unconstrained permission, and even if user creates constrained permission already, each request will still go through unconstrained permission check at first.

If the assumption is most likely user enables this feature but still use legacy permission, checking unconstrained permission would be more appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: Closed / Done
Development

Successfully merging this pull request may close these issues.