RoleBindings deleted by namespace finalizer before other objects using those permissions finish stopping/finalizing #115070

Closed
karlkfi opened this issue Jan 14, 2023 · 13 comments
Labels
`kind/bug` · `needs-triage` · `sig/auth`

Comments

@karlkfi
Contributor

karlkfi commented Jan 14, 2023

What happened?

When you delete a Namespace, Kubernetes automatically deletes all the objects in that namespace.

However, if a workload in that namespace (e.g. a custom resource) has a finalizer, AND the controller that handles that finalizer is granted its permissions by a RoleBinding in the same namespace, then when the Namespace is deleted the workload object and the RoleBinding are deleted at roughly the same time. The finalizer controller can then (and often does) deadlock: its permissions have been revoked, so it can neither finish its cleanup nor remove the finalizer from the workload object. This in turn blocks the namespace finalizer and namespace deletion.
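
For concreteness, a minimal sketch of the kind of setup that hits this (the API group, kind, names, and finalizer are illustrative, not the exact Config Sync objects):

```yaml
# Illustrative workload: a custom resource whose finalizer is handled by a
# controller running outside this namespace.
apiVersion: example.com/v1
kind: Workload
metadata:
  name: my-workload
  namespace: tenant-a
  finalizers:
    - example.com/cleanup   # removed by the controller after cleanup finishes
---
# The controller's only grant in tenant-a. The namespace finalizer deletes this
# at roughly the same time as the Workload above, so the controller loses the
# permissions it needs to finish cleanup and remove the finalizer.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: workload-controller
  namespace: tenant-a
subjects:
  - kind: ServiceAccount
    name: workload-controller
    namespace: controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: workload-controller
```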

The only way to recover from this situation is to manually remove the finalizer from the workload object. But removing finalizers manually is often unsafe and can cause memory leaks or worse. On top of that, namespace tenants won't be able to remove the finalizer either, because their RoleBindings will have been deleted too; a cluster admin has to do it.

What did you expect to happen?

This behavior is expected, given the current implementation, but it could be improved and the problem solved by delaying the deletion of RoleBindings (and probably also Secrets, ServiceAccounts, and Roles) until after all other objects in the namespace are fully deleted (Not Found).

How can we reproduce it (as minimally and precisely as possible)?

This problem was observed while testing a new Config Sync feature called deletion propagation, which deletes previously applied managed objects when the RootSync or RepoSync that manages them is deleted. Reproducing that exactly would require deploying Config Sync from HEAD of main.

But conceptually, you would follow these steps:

  1. Create a Namespace "X"
  2. Create a RepoSync in namespace X
  3. Create a Secret, Role, and RoleBinding as required for the RepoSync's "reconciler" (a Deployment spun up in another namespace by Config Sync to manage the objects specified for this namespace, as configured by the RepoSync object)
  4. Configure the RepoSync to watch a Git repo
  5. Add some object manifests to that Git repo & wait for Config Sync to apply them into namespace X
  6. Apply the `configsync.gke.io/deletion-propagation-policy: Foreground` annotation to the RepoSync (see the sketch after this list)
  7. Delete Namespace X
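
For reference, step 6 amounts to something like the following; the RepoSync spec fields here are a rough sketch based on the Config Sync docs and may not match the exact version in use:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RepoSync
metadata:
  name: repo-sync
  namespace: x
  annotations:
    # Tells Config Sync to delete the objects it manages before allowing the
    # RepoSync itself to be deleted (its reconciler handles the finalizer).
    configsync.gke.io/deletion-propagation-policy: Foreground
spec:
  sourceFormat: unstructured
  git:
    repo: https://example.com/tenant-x-config.git  # illustrative repo URL
    branch: main
    auth: none
```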

The namespace finalizer will delete all the objects in the namespace, and the RepoSync deletion will block, because the "reconciler" no longer has permission to delete the objects in namespace X, check that they are gone, or even update the RepoSync to remove its finalizer.

Anything else we need to know?

An alternative solution would be to enhance ClusterRoleBinding to allow scoping to a namespace without being IN that namespace (like a RoleBinding). That way the RBAC would exist outside the namespace and not be subject to deletion when the namespace is deleted.
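
Purely as an illustration of that proposal (this field does not exist today; the name is made up):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tenant-a-reconciler
# Hypothetical field illustrating the proposal: the binding lives outside the
# namespace (so it survives namespace deletion) but only grants access within it.
targetNamespace: tenant-a
subjects:
  - kind: ServiceAccount
    name: reconciler
    namespace: controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tenant-reconciler
```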

It might also be possible to reproduce this problem more simply with a Deployment that blocks its own deletion with a finalizer until it can do some cleanup (though that's not a very realistic use case).

Kubernetes version

Any

Cloud provider

Any

OS version

Any

Install tools

No response

Container runtime (CRI) and version (if applicable)

No response

Related plugins (CNI, CSI, ...) and versions (if applicable)

No response

@karlkfi added the `kind/bug` label on Jan 14, 2023
@k8s-ci-robot added the `needs-sig` label on Jan 14, 2023
@k8s-ci-robot
Contributor

@karlkfi: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the `needs-triage` label on Jan 14, 2023
@neolit123
Member

/sig auth

@k8s-ci-robot added the `sig/auth` label and removed the `needs-sig` label on Jan 15, 2023
@enj
Member

enj commented Jan 23, 2023

@karlkfi since you are already using finalizers, what prevents you from setting those on the RBAC objects themselves?
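
A minimal sketch of that suggestion, with an illustrative finalizer name (the controller would remove it only after its cleanup in the namespace finished):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: reconciler
  namespace: tenant-a
  finalizers:
    # Illustrative name; the same finalizer the controller uses on its
    # workload objects, so the binding outlives the cleanup that needs it.
    - example.com/cleanup
subjects:
  - kind: ServiceAccount
    name: reconciler
    namespace: controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: reconciler
```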

My general recommendation would be to avoid having the cleanup code live in the same namespace, or to use things like owner refs to handle cleanup. Avoiding finalizers and having some kind of external GC is also an option.

RBAC makes no attempt at protecting your access (you can delete the bindings that give you access for example). RBAC is not required or even enabled by default in Kube, so I certainly do not see core bits like namespace deletion as treating it as "special." With that I am going to close this issue (since I do not think it is actionable).

/close

@k8s-ci-robot
Contributor

@enj: Closing this issue.

In response to this:

@karlkfi since you are already using finalizers, what prevents you from setting those on the RBAC objects themselves?

My general recommendation would be to avoid having the cleanup code live in the same namespace, or to use things like owner refs to handle cleanup. Avoiding finalizers and having some kind of external GC is also an option.

RBAC makes no attempt at protecting your access (you can delete the bindings that give you access for example). RBAC is not required or even enabled by default in Kube, so I certainly do not see core bits like namespace deletion as treating it as "special." With that I am going to close this issue (since I do not think it is actionable).

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lavalamp
Member

@enj other auth systems besides RBAC don't have this problem since they don't reside in the namespace. I think it is reasonable to view this as a deficiency in RBAC. People who do happen to use RBAC and run the rules and the workloads out of the same namespace (which is reasonable and/or encouraged by RBAC) seem likely to have a bad time.

It is not obvious how to solve this by adding finalizers and/or owner refs to the RBAC bindings. Can you expand that comment? If it's trivial then maybe that's an OK solution, if it's not trivial at the very least users deserve documentation.

I don't think the namespace deletion code is very well thought out and it wouldn't offend me very much to make it delete RBAC rules last, and I don't think it'd be that hard.

@karlkfi
Contributor Author

karlkfi commented Jan 24, 2023

It is not obvious how to solve this by adding finalizers and/or owner refs to the RBAC bindings. Can you expand that comment? If it's trivial then maybe that's an OK solution, if it's not trivial at the very least users deserve documentation.

There are three strategies I can think of:

Client-side dependency apply/delete:
cli-utils implements dependency ordering for applies and deletes. This allows us to do things like apply a ServiceAccount & RoleBinding before applying the Deployment that uses them. On deletion the order reverses: the Deployment is deleted first, cli-utils watches/polls to confirm deletion (delete watch event or Not Found), and then the dependencies are deleted. These dependencies are specified using an annotation on the dependent object (ex: the Deployment) which references one or more dependency objects by GKNN (group, kind, namespace, name).
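
For illustration, the depends-on annotation on the dependent object looks roughly like this (object names are made up, and the value format is recalled from the cli-utils docs):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reconciler
  namespace: controller-system
  annotations:
    # cli-utils dependency hint: apply the referenced RoleBinding before this
    # Deployment, and delete it only after this Deployment is fully deleted.
    config.kubernetes.io/depends-on: rbac.authorization.k8s.io/namespaces/tenant-a/RoleBinding/reconciler
# (spec omitted)
```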

Server-side dependency deletion enforcement:
It should be possible to make a webhook (or controller) that looks for a dependency annotation on any object, then patches the referenced dependency to add a finalizer and an annotation listing the objects that depend on it. When that dependency is deleted, a controller would check whether all of those dependent objects have been deleted, and only remove the finalizer once they have. The controller wouldn't perform those deletions itself, just block the dependency's deletion until the desired state is true.
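
For illustration, the state such a webhook/controller might maintain on the dependency could look like this (the finalizer and annotation names are hypothetical, not an existing API):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: reconciler
  namespace: tenant-a
  finalizers:
    # Hypothetical: added by the webhook when another object declares a
    # dependency on this RoleBinding; removed by the controller only once all
    # of the dependent objects listed below are gone (Not Found).
    - dependencies.example.com/block-deletion
  annotations:
    # Hypothetical bookkeeping: the objects that depend on this RoleBinding.
    dependencies.example.com/dependents: apps/namespaces/controller-system/Deployment/reconciler
subjects:
  - kind: ServiceAccount
    name: reconciler
    namespace: controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: reconciler
```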

Simpler standalone option:
The namespace garbage collector could simply delete all objects EXCEPT RBAC first and watch/poll for confirmation (delete watch event or Not Found), then delete all the RBAC objects. This wouldn't solve the more general dependency management problem, but it would solve the issue with namespace deletion and RBAC specifically.

@lavalamp
Member

/reopen

I think this deserves another triage pass :)

@k8s-ci-robot
Contributor

@lavalamp: Reopened this issue.

In response to this:

/reopen

I think this deserves another triage pass :)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot reopened this on Jan 28, 2023
@enj
Member

enj commented Jan 28, 2023

I think this deserves another triage pass :)

Sure. I got busy and was not able to respond, will try again next week.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the `lifecycle/stale` label on Apr 28, 2023
@aramase
Member

aramase commented May 1, 2023

/remove-lifecycle stale

/assign @enj
(assigning so we can triage this again, sorry!)

@deads2k
Contributor

deads2k commented Jan 17, 2024

If a particular resource is required to complete a finalizer task, it is logical to place the same finalizer on the other resource. In a world with CRDs, cleanup (and other) dependencies can logically exist between many different resources. Additionally, such a solution would at least make a client aware of the dependency should that client attempt to delete a RoleBinding directly, regardless of namespace cleanup status.

It's also worth noting that liens is a good design for improving the handling of dependent resources if you're looking for something even better than finalizers, though the SIG isn't actively pursuing it at this time.

/close

@k8s-ci-robot
Contributor

@deads2k: Closing this issue.

In response to this:

If a particular resource is required to complete a finalizer task, it is logical to place the same finalizer on the other resource. In a world with CRDs, cleanup (and other) dependencies can logically exist between many different resources. Additionally, such a solution would at least make a client aware of the dependency should that client attempt to delete a RoleBinding directly, regardless of namespace cleanup status.

It's also worth noting that liens is a good design for improving the handling of dependent resources if you're looking for something even better than finalizers, though the SIG isn't actively pursuing it at this time.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
