
deleting namespace stuck at "Terminating" state #60807

Closed
shean-guangchang opened this issue Mar 5, 2018 · 180 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@shean-guangchang

I am using v1.8.4 and I am having a problem where a deleted namespace stays in the "Terminating" state forever. I already ran "kubectl delete namespace XXXX".

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Mar 5, 2018
@dims
Member

dims commented Mar 7, 2018

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 7, 2018
@nikhita
Member

nikhita commented Mar 10, 2018

@shean-guangchang Do you have some way to reproduce this?

And out of curiosity, are you using any CRDs? We faced this problem with TPRs previously.

@nikhita
Member

nikhita commented Mar 10, 2018

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 10, 2018
@oliviabarrick

oliviabarrick commented Mar 14, 2018

I seem to be experiencing this issue with a rook deployment:

➜  tmp git:(master) ✗ kubectl delete namespace rook
Error from server (Conflict): Operation cannot be fulfilled on namespaces "rook": The system is ensuring all content is removed from this namespace.  Upon completion, this namespace will automatically be purged by the system.
➜  tmp git:(master) ✗ 

I think it does have something to do with their CRD; I see this in the API server logs:

E0314 07:28:18.284942       1 crd_finalizer.go:275] clusters.rook.io failed with: timed out waiting for the condition
E0314 07:28:18.287629       1 crd_finalizer.go:275] clusters.rook.io failed with: Operation cannot be fulfilled on customresourcedefinitions.apiextensions.k8s.io "clusters.rook.io": the object has been modified; please apply your changes to the latest version and try again

I've deployed rook to a different namespace now, but I'm not able to create the cluster CRD:

➜  tmp git:(master) ✗ cat rook/cluster.yaml 
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook-cluster
spec:
  dataDirHostPath: /var/lib/rook-cluster-store
➜  tmp git:(master) ✗ kubectl create -f rook/
Error from server (MethodNotAllowed): error when creating "rook/cluster.yaml": the server does not allow this method on the requested resource (post clusters.rook.io)

Seems like the CRD was never cleaned up:

➜  tmp git:(master) ✗ kubectl get customresourcedefinitions clusters.rook.io -o yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  creationTimestamp: 2018-02-28T06:27:45Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2018-03-14T07:36:10Z
  finalizers:
  - customresourcecleanup.apiextensions.k8s.io
  generation: 1
  name: clusters.rook.io
  resourceVersion: "9581429"
  selfLink: /apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/clusters.rook.io
  uid: 7cd16376-1c50-11e8-b33e-aeba0276a0ce
spec:
  group: rook.io
  names:
    kind: Cluster
    listKind: ClusterList
    plural: clusters
    singular: cluster
  scope: Namespaced
  version: v1alpha1
status:
  acceptedNames:
    kind: Cluster
    listKind: ClusterList
    plural: clusters
    singular: cluster
  conditions:
  - lastTransitionTime: 2018-02-28T06:27:45Z
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: 2018-02-28T06:27:45Z
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  - lastTransitionTime: 2018-03-14T07:18:18Z
    message: CustomResource deletion is in progress
    reason: InstanceDeletionInProgress
    status: "True"
    type: Terminating
➜  tmp git:(master) ✗ 

@oliviabarrick

I have a fission namespace in a similar state:

➜  tmp git:(master) ✗ kubectl delete namespace fission
Error from server (Conflict): Operation cannot be fulfilled on namespaces "fission": The system is ensuring all content is removed from this namespace.  Upon completion, this namespace will automatically be purged by the system.
➜  tmp git:(master) ✗ kubectl get pods -n fission     
NAME                          READY     STATUS        RESTARTS   AGE
storagesvc-7c5f67d6bd-72jcf   0/1       Terminating   0          8d
➜  tmp git:(master) ✗ kubectl delete pod/storagesvc-7c5f67d6bd-72jcf --force --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (NotFound): pods "storagesvc-7c5f67d6bd-72jcf" not found
➜  tmp git:(master) ✗ kubectl describe pod -n fission storagesvc-7c5f67d6bd-72jcf
Name:                      storagesvc-7c5f67d6bd-72jcf
Namespace:                 fission
Node:                      10.13.37.5/10.13.37.5
Start Time:                Tue, 06 Mar 2018 07:03:06 +0000
Labels:                    pod-template-hash=3719238268
                           svc=storagesvc
Annotations:               <none>
Status:                    Terminating (expires Wed, 14 Mar 2018 06:41:32 +0000)
Termination Grace Period:  30s
IP:                        10.244.2.240
Controlled By:             ReplicaSet/storagesvc-7c5f67d6bd
Containers:
  storagesvc:
    Container ID:  docker://3a1350f6e4871b1ced5c0e890e37087fc72ed2bc8410d60f9e9c26d06a40c457
    Image:         fission/fission-bundle:0.4.1
    Image ID:      docker-pullable://fission/fission-bundle@sha256:235cbcf2a98627cac9b0d0aae6e4ea4aac7b6e6a59d3d77aaaf812eacf9ef253
    Port:          <none>
    Command:
      /fission-bundle
    Args:
      --storageServicePort
      8000
      --filePath
      /fission
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /fission from fission-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from fission-svc-token-zmsxx (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  fission-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  fission-storage-pvc
    ReadOnly:   false
  fission-svc-token-zmsxx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fission-svc-token-zmsxx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
➜  tmp git:(master) ✗ 

Fission also uses CRDs; however, they appear to have been cleaned up.

@barakAtSoluto

@shean-guangchang - I had the same issue. I deleted everything under the namespaces manually, deleted and purged everything from Helm, and restarted the master nodes one by one, and that fixed the issue.

I imagine what I've encountered has something to do with Ark, Tiller and Kubernetes all working together (I bootstrapped using Helm and backed up using Ark), so this may not be a Kubernetes issue per se. On the other hand, it was pretty much impossible to troubleshoot because there are no relevant logs.

@xetys

xetys commented Mar 23, 2018

If it is the rook one, take a look at this: rook/rook#1488 (comment)
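
For anyone who cannot open the link, one common workaround for a CRD stuck on the customresourcecleanup finalizer (a hedged sketch, not necessarily what the linked comment says) is to clear that finalizer once you are sure no instances of the custom resource remain:

# Only do this after confirming no Cluster objects are left anywhere.
# A JSON merge patch replaces the whole finalizers list, so this clears it.
kubectl patch customresourcedefinition clusters.rook.io --type=merge -p '{"metadata":{"finalizers":[]}}'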

@oliviabarrick

I guess that makes sense, but it seems buggy that it's possible to get a namespace into an undeletable state.

@OguzPastirmaci

I have an environment similar to @barakAtSoluto's (Ark & Helm) and have the same issue. Purging and restarting the masters didn't fix it for me, though. Still stuck at terminating.

@barakAtSoluto

I had that too when trying to recreate the problem. I eventually had to create a new cluster....
Excluding default, kube-system, kube-public and all Ark-related namespaces from backup and restore prevents this from happening...

@jaxxstorm

I'm also seeing this, on a cluster upgraded from 1.8.4 to 1.9.6. I don't even know which logs to look at.

@adampl

adampl commented Jun 27, 2018

The same issue on 1.10.1 :(

@ghost

ghost commented Jun 28, 2018

Same issue on 1.9.6

Edit: The namespace couldn't be deleted because some pods were hanging. I deleted them all with --grace-period=0 --force, and after a couple of minutes the namespace was deleted as well.
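
For reference, a minimal sketch of that cleanup, with my-ns standing in for the stuck namespace:

# Force-delete every pod still hanging in the namespace; once they are gone,
# the namespace controller can usually finish the deletion on its own.
kubectl delete pods --all -n my-ns --grace-period=0 --force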

@xetys

xetys commented Jun 28, 2018

Hey,

I've run into this over and over again, and most of the time it's some trouble with finalizers.

If a namespace is stuck, try kubectl get namespace XXX -o yaml and check whether there is a finalizer on it. If so, edit the namespace and remove the finalizer (by passing an empty array), and the namespace then gets deleted.
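
Roughly, as a sketch (with stuck-ns standing in for the namespace):

# Check whether any finalizers are still set on the namespace
kubectl get namespace stuck-ns -o jsonpath='{.spec.finalizers}'
kubectl get namespace stuck-ns -o jsonpath='{.metadata.finalizers}'

# If so, edit the namespace and replace the finalizers list with an empty array [].
# (As later comments note, the change may only stick when applied through the
# namespace's /finalize subresource.)
kubectl edit namespace stuck-ns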

@adampl

adampl commented Jun 29, 2018

@xetys Is it safe? In my case there is only one finalizer, named "kubernetes".

@xetys

xetys commented Jul 2, 2018

That's strange, I've never seen such a finalizer. I can only speak from my own experience; I did that several times in a production cluster and it's still alive.

@andraxylia

Same issue on 1.10.5. I tried all advice in this issue without result. I was able to get rid of the pods, but the namespace is still hanging.

@andraxylia

Actually, the namespace got deleted too after a while.

@andraxylia

It would be good to understand what causes this behavior. The only finalizer I had was kubernetes. I also have dynamic admission webhooks; can these be related?

@adampl

adampl commented Jul 3, 2018

@xetys Well, finally I used your trick on the replicas inside that namespace. They had some custom finalizer that probably no longer existed, so I couldn't delete them. When I removed the references to that finalizer, they disappeared and so did the namespace. Thanks! :)

@bobhenkel

Same issue on an EKS 1.10.3 cluster:

Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-28T20:13:43Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

@ManifoldFR

Having the same problem on a bare metal cluster:

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

My namespace looks like so:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"name":"creneaux-app","namespace":""}}
  name: creneaux-app
spec:
  finalizers:
  - kubernetes

It's actually the second namespace I've had with this problem.

@adampl

adampl commented Jul 23, 2018

Try this to get the actual list of all things in your namespace: kubernetes/kubectl#151 (comment)

Then for each object do kubectl delete or kubectl edit to remove finalizers.
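
The linked comment boils down to enumerating every namespaced resource type; roughly (a sketch, with stuck-ns standing in for your namespace):

# Print every object of every namespaced resource type that still exists in the
# namespace; whatever shows up here is what is blocking the deletion.
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n stuck-ns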

@pauloeliasjr

Removing the initializer did the trick for me...

@ManifoldFR

When I do kubectl edit namespace annoying-namespace-to-delete and remove the finalizers, they get re-added when I check with a kubectl get -o yaml.

Also, when trying what you suggested @adampl I get no output (removing --ignore-not-found confirms no resources are found in the namespace, of any type).

@slassh

slassh commented Jul 28, 2018

@ManifoldFR, I had the same issue as yours and I managed to make it work by making an API call with a JSON file.
kubectl get namespace annoying-namespace-to-delete -o json > tmp.json
Then edit tmp.json and remove "kubernetes" from the finalizers list, and run:

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

and it should delete your namespace.
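
For the edit step, a non-interactive equivalent (a sketch, assuming jq is installed) is:

# Produce tmp.json with the "kubernetes" entry removed from spec.finalizers
kubectl get namespace annoying-namespace-to-delete -o json | jq '.spec.finalizers = []' > tmp.json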

@whyvez

whyvez commented Jul 15, 2020

In my case, I had to manually delete my ingress load balancer from the GCP Network Service console. I had manually created the load balancer frontend directly in the console. Once I deleted the load balancer the namespace was automatically deleted.

I suspect that Kubernetes didn't want to delete it because the state of the load balancer was different from the state in the manifest.

I will try to automate the ingress frontend creation using annotations next to see if I can resolve this issue.

@salluu

salluu commented Jul 25, 2020

Sometimes just editing the resource manifest in place (i.e. removing the finalizers field and saving) does not work very well.
So here is another way I picked up from others.

kubectl get namespace linkerd -o json > linkerd.json

# Where:/api/v1/namespaces/<your_namespace_here>/finalize
kubectl replace --raw "/api/v1/namespaces/linkerd/finalize" -f ./linkerd.json

After running that command, the namespace should now be absent from your namespaces list. It worked for me.

This works not only for namespaces but for other resources as well.

You are a star, it worked!

@Navaneeth-pk

Sometimes just editing the resource manifest in place (i.e. removing the finalizers field and saving) does not work very well.
So here is another way I picked up from others.

kubectl get namespace linkerd -o json > linkerd.json

# Where:/api/v1/namespaces/<your_namespace_here>/finalize
kubectl replace --raw "/api/v1/namespaces/linkerd/finalize" -f ./linkerd.json

After running that command, the namespace should now be absent from your namespaces list. It worked for me.

This works not only for namespaces but for other resources as well.

Tried a lot of solutions but this is the one that worked for me. Thank you!

@alexcpn

alexcpn commented Aug 7, 2020

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

Better: https://stackoverflow.com/a/59667608/429476

@oze4

oze4 commented Aug 21, 2020

This should really be the "accepted" answer - it completely resolved the root of this issue!

Taken from the link above:

This is not the right way, especially in a production environment.

Today I got into the same problem. By removing the finalizer you’ll end up with leftovers in various states. You should actually find what is keeping the deletion from completing.

See #60807 (comment)

(Also, unfortunately, ‘kubectl get all’ does not report everything; you need to use commands like the ones in the link.)

My case: deleting the ‘cert-manager’ namespace. In the output of ‘kubectl get apiservice -o yaml’ I found the APIService ‘v1beta1.admission.certmanager.k8s.io’ with status=False. This apiservice was part of cert-manager, which I had just deleted. So, about 10 seconds after I ran ‘kubectl delete apiservice v1beta1.admission.certmanager.k8s.io’, the namespace disappeared.

Hope that helps.
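
In shell terms, the check described above is roughly (a sketch):

# Find aggregated API services that are unavailable; each one can block namespace finalization
kubectl get apiservice | grep False

# Then delete the stale entry left behind by the uninstalled component
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io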


With that being said, I wrote a little microservice to run as a CronJob every hour that automatically deletes Terminating namespaces.

You can find it here: https://github.com/oze4/service.remove-terminating-namespaces


@savealive

savealive commented Aug 22, 2020

Yet another oneliner:

for ns in $(kubectl get ns --field-selector status.phase=Terminating -o jsonpath='{.items[*].metadata.name}'); do  kubectl get ns $ns -ojson | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -; done

But deleting stuck namespaces is not a good solution. The right way is to find out why it's stuck. A very common reason is an unavailable API service, which prevents the cluster from finalizing namespaces.
For example, here I hadn't deleted Knative properly:

$ kubectl get apiservice|grep False
NAME                                   SERVICE                             AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io          knative-serving/autoscaler          False (ServiceNotFound)   278d

Deleting it solved the problem

k delete apiservice v1beta1.custom.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.custom.metrics.k8s.io" deleted
$  k create ns test2
namespace/test2 created
$ k delete ns test2
namespace "test2" deleted
$ kgns test2
Error from server (NotFound): namespaces "test2" not found  

@ciiiii

ciiiii commented Aug 31, 2020

I wrote a little microservice to run as a CronJob every hour that automatically deletes Terminating namespaces.

You can find it here: https://github.com/oze4/service.remove-terminating-namespaces

good job.

@nmarus

nmarus commented Sep 10, 2020

I had a similar issue on 1.18 in a lab k8s cluster and am adding a note to maybe help others. I had been working with the metrics API, and with custom metrics in particular. After deleting those k8s objects in order to recreate them, deleting the namespace stalled with an error that the metrics API endpoint could not be found. After putting that back in place in another namespace, everything cleared up immediately.

This was in the namespace under status.conditions.message:

Discovery failed for some groups, 4 failing: unable to retrieve the
complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently
unable to handle the request, custom.metrics.k8s.io/v1beta2: the server is currently
unable to handle the request, external.metrics.k8s.io/v1beta1: the server is
currently unable to handle the request, metrics.k8s.io/v1beta1: the server is
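
A quick way to see which API groups are failing (a sketch, with my-stuck-ns standing in for the namespace):

# The failing groups show up in the namespace's deletion conditions
kubectl get namespace my-stuck-ns -o jsonpath='{range .status.conditions[*]}{.type}{": "}{.message}{"\n"}{end}'

# Cross-check which aggregated APIs are actually unavailable
kubectl get apiservice | grep False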

@oze4

oze4 commented Sep 14, 2020

Yet another oneliner:

for ns in $(kubectl get ns --field-selector status.phase=Terminating -o jsonpath='{.items[*].metadata.name}'); do  kubectl get ns $ns -ojson | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -; done

But deleting stuck namespaces is not a good solution. The right way is to find out why it's stuck. A very common reason is an unavailable API service, which prevents the cluster from finalizing namespaces.
For example, here I hadn't deleted Knative properly:

$ kubectl get apiservice|grep False
NAME                                   SERVICE                             AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io          knative-serving/autoscaler          False (ServiceNotFound)   278d

Deleting it solved the problem

k delete apiservice v1beta1.custom.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.custom.metrics.k8s.io" deleted
$  k create ns test2
namespace/test2 created
$ k delete ns test2
namespace "test2" deleted
$ kgns test2
Error from server (NotFound): namespaces "test2" not found  

Definitely the cleanest one liner! It's important to note that none of these "solutions" actually solve the root issue.

See here for the correct solution

That is the message we should be spreading 😄 not "yet another one liner".

@thyarles

Definitely the cleanest one liner! It's important to note that none of these "solutions" actually solve the root issue.

This solution addresses only one of the possible causes. To look for all possible root causes and fix them, I use this script: https://github.com/thyarles/knsk

@oze4

oze4 commented Sep 14, 2020

@thyarles very nice!

@chinazj

chinazj commented Sep 21, 2020

Please do not delete a namespace by just modifying its finalizers. That will cause errors.

Please find out the cause of the namespace staying in Terminating instead. Currently known troubleshooting directions (a rough starting point is sketched below):

  • pods stuck in Terminating
  • a cert-manager webhook blocking deletion of a secret
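
A rough starting point for both directions (a sketch, not exhaustive):

# 1. Look for pods stuck in Terminating inside the namespace
kubectl get pods -n <namespace> | grep Terminating

# 2. List admission webhook configurations (e.g. cert-manager's) that may be blocking cleanup
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations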

@wendaotao

wendaotao commented Oct 3, 2020

I encountered the same problem:

# sudo kubectl get ns
NAME                   STATUS        AGE
cattle-global-data     Terminating   8d
cattle-global-nt       Terminating   8d
cattle-system          Terminating   8d
cert-manager           Active        8d
default                Active        10d
ingress-nginx          Terminating   9d
kube-node-lease        Active        10d
kube-public            Active        10d
kube-system            Active        10d
kubernetes-dashboard   Terminating   4d6h
local                  Active        8d
p-2sfgk                Active        8d
p-5kdx9                Active        8d
# sudo kubectl get all -n kubernetes-dashboard
No resources found in kubernetes-dashboard namespace.
# sudo kubectl get namespace kubernetes-dashboard  -o json 
{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "annotations": {
            "cattle.io/status": "{\"Conditions\":[{\"Type\":\"ResourceQuotaInit\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2020-09-29T01:15:46Z\"},{\"Type\":\"InitialRolesPopulated\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2020-09-29T01:15:46Z\"}]}",
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Namespace\",\"metadata\":{\"annotations\":{},\"name\":\"kubernetes-dashboard\"}}\n",
            "lifecycle.cattle.io/create.namespace-auth": "true"
        },
        "creationTimestamp": "2020-09-29T01:15:45Z",
        "deletionGracePeriodSeconds": 0,
        "deletionTimestamp": "2020-10-02T07:59:52Z",
        "finalizers": [
            "controller.cattle.io/namespace-auth"
        ],
        "managedFields": [
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            "f:cattle.io/status": {},
                            "f:lifecycle.cattle.io/create.namespace-auth": {}
                        },
                        "f:finalizers": {
                            ".": {},
                            "v:\"controller.cattle.io/namespace-auth\"": {}
                        }
                    }
                },
                "manager": "Go-http-client",
                "operation": "Update",
                "time": "2020-09-29T01:15:45Z"
            },
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            ".": {},
                            "f:kubectl.kubernetes.io/last-applied-configuration": {}
                        }
                    }
                },
                "manager": "kubectl-client-side-apply",
                "operation": "Update",
                "time": "2020-09-29T01:15:45Z"
            },
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:status": {
                        "f:phase": {}
                    }
                },
                "manager": "kube-controller-manager",
                "operation": "Update",
                "time": "2020-10-02T08:13:49Z"
            }
        ],
        "name": "kubernetes-dashboard",
        "resourceVersion": "3662184",
        "selfLink": "/api/v1/namespaces/kubernetes-dashboard",
        "uid": "f1944b81-038b-48c2-869d-5cae30864eaa"
    },
    "spec": {},
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2020-10-02T08:13:49Z",
                "message": "All resources successfully discovered",
                "reason": "ResourcesDiscovered",
                "status": "False",
                "type": "NamespaceDeletionDiscoveryFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All legacy kube types successfully parsed",
                "reason": "ParsedGroupVersions",
                "status": "False",
                "type": "NamespaceDeletionGroupVersionParsingFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content successfully deleted, may be waiting on finalization",
                "reason": "ContentDeleted",
                "status": "False",
                "type": "NamespaceDeletionContentFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content successfully removed",
                "reason": "ContentRemoved",
                "status": "False",
                "type": "NamespaceContentRemaining"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content-preserving finalizers finished",
                "reason": "ContentHasNoFinalizers",
                "status": "False",
                "type": "NamespaceFinalizersRemaining"
            }
        ],
        "phase": "Terminating"
    }
}

#  sudo kubectl version

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

@chinazj

chinazj commented Oct 11, 2020

You can use etcdctl to find undeleted resources

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
get /registry --prefix | grep <namespace>

@grebois
Contributor

grebois commented Nov 4, 2020

Just copy and paste in your terminal

for NS in $(kubectl get ns 2>/dev/null | grep Terminating | cut -f1 -d ' '); do
  kubectl get ns $NS -o json > /tmp/$NS.json
  sed -i '' "s/\"kubernetes\"//g" /tmp/$NS.json
  kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f /tmp/$NS.json
done

@1vanzamarripa

This worked for me; I ran it after verifying there were no dangling k8s objects in the namespace. Thanks!

@survivant

I used this to remove a namespace stuck at Terminating.

Example:

kubectl get namespace openebs -o json | jq -j '.spec.finalizers=null' > tmp.json 
kubectl replace --raw "/api/v1/namespaces/openebs/finalize" -f ./tmp.json

@gondaz

gondaz commented Dec 3, 2020

For all the googlers who bumped into namespaces stuck at Terminating in Rancher-specific namespaces (e.g. cattle-system), the following modified command (based on grebois's original) worked for me:

for NS in $(kubectl get ns 2>/dev/null | grep Terminating | cut -f1 -d ' '); do
  kubectl get ns $NS -o json > /tmp/$NS.json
  sed -i "s/\"controller.cattle.io\/namespace-auth\"//g" /tmp/$NS.json
  kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f /tmp/$NS.json
done

@lavalamp
Member

lavalamp commented Dec 3, 2020

Folks, just FYI, when the video for this kubecon talk is out I plan to link to it and some of the helpful comments above, and lock this issue.

@lavalamp
Member

lavalamp commented Dec 4, 2020

I recorded a 10 minute explanation of what is going on and presented it at this SIG Deep Dive session.

Here's a correct comment with 65 upvotes

Mentioned several times above, this medium post is an example of doing things the right way. Find and fix the broken api service.

All the one-liners that just remove the finalizers on the namespace do not address the root cause and leave your cluster subtly broken, which will bite you later. So please don't do that. The root-cause fix is usually easier anyway. It seems that people like to post variations on this theme even though there are numerous correct answers in the thread already, so I'm going to lock the issue now, to ensure that this comment stays at the bottom.

@kubernetes kubernetes locked as resolved and limited conversation to collaborators Dec 4, 2020