-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create KEP to harden the default RBAC discovery clusterrolebindings #720
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the KEP. I think this is very likely to break any clients out there that are depending on access to these APIs, and the rollout needs very careful consideration.
Can you include more details about how these APIs might have been used before, and how to reconcile those use cases?
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
### Risks and Mitigations | ||
|
||
1. If a user upgrades a cluster which has already limited access to the `GET /healthz`, `/version` or `/version/` APIs, auto-reconciliation will re-enable universal access via `system:public-info-viewer` | ||
1. We'll explicitly call this out as an `ACTION REQUIRED` with a sample `kubectl apply` which can be used to effectively disable the `system:public-info-viewer` ClusterRoleBinding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would those roles be reset everytime the apiserver restarts though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the bootstrap code (https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/rbac/rest/storage_rbac.go), I think it does reconciliation which would mean that modifications would be clobbered on restart. @liggitt can probably give a more definitive answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if the rbac.authorization.kubernetes.io/autoupdate
is set to true
1. If a user upgrades a cluster which has already limited access to the `GET /healthz`, `/version` or `/version/` APIs, auto-reconciliation will re-enable universal access via `system:public-info-viewer` | ||
1. We'll explicitly call this out as an `ACTION REQUIRED` with a sample `kubectl apply` which can be used to effectively disable the `system:public-info-viewer` ClusterRoleBinding | ||
1. Some use-cases might require the existing permissions to be preserved for unauthenticated calls | ||
1. In the release notes, we can include an easy 'escape hatch' to re-enable unauthenticated access to the APIs, e.g. `kubectl create clusterrolebinding public-discovery --clusterrole=system:discovery --group=system:unauthenticated` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might need to follow the deprecation policy, and roll out over multiple releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guidance accepted...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the proposed approach lets upgraded clusters work as-is without any behavior change, I think this can be implemented immediately.
|
||
### Risks and Mitigations | ||
|
||
1. If a user upgrades a cluster which has already limited access to the `GET /healthz`, `/version` or `/version/` APIs, auto-reconciliation will re-enable universal access via `system:public-info-viewer` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about downgrades? Any concerns there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downgrades would result in a redundant binding (i.e. system:public-info-viewer
being a strict subset of the permissions granted by system:discovery
). This can also be called out, and remedied with a one-line kubectl delete
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
1. If a user upgrades a cluster which has already limited access to the `GET /healthz`, `/version` or `/version/` APIs, auto-reconciliation will re-enable universal access via `system:public-info-viewer` | ||
1. We'll explicitly call this out as an `ACTION REQUIRED` with a sample `kubectl apply` which can be used to effectively disable the `system:public-info-viewer` ClusterRoleBinding | ||
1. Some use-cases might require the existing permissions to be preserved for unauthenticated calls | ||
1. In the release notes, we can include an easy 'escape hatch' to re-enable unauthenticated access to the APIs, e.g. `kubectl create clusterrolebinding public-discovery --clusterrole=system:discovery --group=system:unauthenticated` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the proposed approach lets upgraded clusters work as-is without any behavior change, I think this can be implemented immediately.
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
|
||
### Non-Goals | ||
|
||
Disabling unauthenticated and/or anonymous access to all APIs by default. There are several non-sensitive and legitimate use-cases for unauthenticated calls, such as loadbalancer or kubelet liveliness checks and `kubectl version` that we don't wish to break. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or kubelet liveliness checks
What is this use case?
kubectl version
If you're using kubectl, you're probably authenticated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have an apiserver Pod managed by a kubelet, running an https liveness check on :443/healthz
, I don't think there's a way to make that authenticated... right?
I can't remember why there was a push to keep /version widely open. @liggitt?
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
In the interest of getting a review from non-Googlers, and since you have context with the CVE, PTAL @deads2k |
At a first read, this seems like it would produce severe usability issues. Instead of trying to limit access to discovery out of fear, let's fix the discovery call itself. The idea of caching it instead of proxying it has merit. The permissions changes being proposed here do not address our underlying causes. Any distribution that wishes to do this can do it already. I'd rather see progress on our underlying issues than trying to segregate a class of user when we know that clusters today are multi-tenant and have multiple classes of authenticated users. If we fix discovery proxying, we can still do something like this, but we'll have made progress on the actual problem. |
Can you give examples?
I'm in favor of doing both. An unauthenticated escalation is categorically worse than an authenticated one, and is simpler to protect against.
I don't understand the "we can still do something like this" statement. Are you proposing doing this, but after changing discovery proxying? |
If we break this KEP into pieces, I see two assertions
To the first I would say that there are better solutions for managing potential discovery problems that don't require changing permission. The notion of caching discovery for instance. To the second, I would say that if we want to explore the idea of having a server with more controlled unauthenticated access, let's create a KEP about that. This appears to be starting at the conclusion of such a KEP. I see three different levels of unauthenticated access restriction.
This KEP appears to be a middle ground that I'm not sure anyone is at. It remains very easy for an unprivileged user to grant unauthenticated access, but that same user cannot make it possible for unauthenticated to actually use the permission that is granted. it's possible, just not easy. Rather than introduce this asymmetry without consideration of how to build out a usable system, let's consider the problem as a whole and figure out how to create a usable system that satisfies the need. @liggitt I think I answered your question |
Current guidance explicitly discourages users from modifying the default ClusterRoleBindings, which allow unrestricted access to these APIs. If I removed language re: the CVE and expounded further on why we considered removing anonymous auth, unbinding system:unauthenticated by default, or otherwise semantically forking the behavior of discovery to be more problematic than the proposed solution, would that be acceptable? |
I'm still looking for something that explains the considering of the different levels of unauthenticated restriction. The core of this proposal creates a situation where unauthenticated permissions can still be granted by namespace level admins, creating a case where someone has permission, but can't easily use it. How about considering the problem of unauthenticated access more completely (like I described above)? I don't see case as a good outcome, let's give them the tools to make effective use of changes to default policy before we changing it in a way that promotes weird/bad states. |
I'm not opposed to also pursuing caching discovery, but I consider the default exposure of discovery to unauthenticated users problematic. When the current policy was determined, custom resources did not exist, so knowing types in a cluster was not considered confidential. It now can be. This addresses that issue, and limits the blast radius for discovery-related handler errors to authenticated users. I agree more can (and should) be done there, but I think this is an important and worthwhile step.
s/unprivileged/low privilege/?
Yes, and we're reconsidering what those permissions should be. The project's posture has always been that the default permissions granted are ones we consider safe (both in terms of escalation and in terms of information disclosure), and individual clusters were free to open up those permissions wider if they wanted. Recent events proved the discovery APIs could be an escalation path, and they (by design) expose information that custom resources make more and more cluster-specific and potentially confidential.
I disagree. The most common question I was asked over the last two months was "how do I limit anonymous access safely"? My only answer was to tell them to override the default configuration and detach that aspect of their cluster policy from auto-reconciliation, which seems backwards to me. Clusters that want to open themselves up to anonymous API users are certainly free to do so, but should not be the default. |
The idea that the presence of CRDs is sensitive suggests levels of discovery access and this proposal doesn't address that. It creates two tiers, but there's no reason to believe there are only two. Claiming this solves that is spurious. Is it so hard to consider how to control unauthenticated access that we're trying to avoid it? There are existent examples of how to control exactly that. This really appears to address a single narrow case that produces the potential for conflict desires. |
spoke via slack. If the intro to this KEP can clearly indicate what is and is not solved here like
Then I'd feel better about taking this approach. RBAC was built for additive permissions, but we need to be sure that people aren't confused about what this actually protects them from and the challenges with usability. |
I clarified & elaborated on the goals, non-goals, risks, and mitigations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally makes sense to me. Would be good to hear of anyone relying on those permissions today, e.g., for permission checks for system:basic-user.
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
@liggitt done! |
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
keps/sig-auth/0034-20190123-harden-default-discovery-bindings.md
Outdated
Show resolved
Hide resolved
I still find the distinction between the "ok" endpoints and "not-ok" endpoints highly artificial (really, why is access to the /hold cancel |
rebase |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dekkagaijin, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@liggitt @tallclair PTAL
See also: kubernetes/kubernetes#72115