Pod stuck in terminating #8

Closed
GregoryVds opened this issue Jan 20, 2020 · 5 comments

@GregoryVds

Hi,

First of all, thanks for this nice script :-) (even though it is sad that we need such a hack...)
I am using knsk in my CI pipeline and it helped for stuck namespaces.

I have another issue though: the namespace is gone, but I still have a Pod stuck in "Terminating" that belonged to the now-deleted namespace.

Is this something that other people have experienced?

I was able to delete the Pod with --force (command shown after the output below).
I am wondering if this is something that could be handled by knsk?

~/git
❯ kubectl get namespace
NAME              STATUS   AGE
default           Active   4d22h
kube-node-lease   Active   4d22h
kube-public       Active   4d22h
kube-system       Active   4d22h

~/git
❯ kubectl get pod -n dataplane
NAME                             READY   STATUS        RESTARTS   AGE                               
mypod-daemon-7f85d5ffc9-kj62j   0/2     Terminating   0          2d12h    
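
For the record, the force delete was along these lines (exact flags approximate; --grace-period=0 is the usual companion to --force):

❯ kubectl delete pod mypod-daemon-7f85d5ffc9-kj62j -n dataplane --force --grace-period=0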
@thyarles thyarles self-assigned this Jan 20, 2020
@thyarles thyarles added the enhancement New feature or request label Jan 20, 2020
@thyarles
Owner

Hi @GregoryVds,

Thanks for reporting. Late last week we made a big change to the script: it now tries to delete the pod with kubectl's force mode and, if that does not work, by patching the finalizers of the stuck resource. We can add more retries to avoid orphaned resources.

Could you please give us the output of kubectl get pod -n dataplane -o json?
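
Roughly, the new logic is equivalent to these manual steps (a sketch, not the script's exact code; pod name taken from your output above):

# First attempt: force delete, skipping graceful termination
kubectl delete pod mypod-daemon-7f85d5ffc9-kj62j -n dataplane --force --grace-period=0

# Fallback: clear the finalizers so the API server can remove the object
kubectl patch pod mypod-daemon-7f85d5ffc9-kj62j -n dataplane --type=merge -p '{"metadata":{"finalizers":null}}'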

@GregoryVds
Author

Hi @thyarles,

Thanks for the quick answer! Good to know, I will update to the latest version of knsk in my CI build to get the latest improvements.

When it happened I was actually running commit 257a664 of knsk, which is 3 weeks old, so I certainly didn't have those latest improvements.

I then tried running the master version by hand on the cluster, and it didn't clean up the Pod. But maybe that is just because the 257a664 version had already wiped the Namespace itself, so the master version couldn't track resources in a Namespace that was already gone?

Anyway, I am emailing you the JSON dump.

@thyarles
Owner

@GregoryVds ,

Looking into your JSON I saw:

[...]
"deletionTimestamp": "2020-01-21T00:30:22Z",
[...]
"blockOwnerDeletion": true,
[...]

I think there is something blocking the deletion. What do you think?
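
If you want to see exactly what is holding it, something like this should print the owner references and finalizers (pod name as in your output):

kubectl get pod mypod-daemon-7f85d5ffc9-kj62j -n dataplane -o jsonpath='{.metadata.ownerReferences}{"\n"}{.metadata.finalizers}{"\n"}'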

@thyarles
Owner

Please take a look at https://github.com/thyarles/knsk/tree/%238-terminating-resources. Don't use the --delete-orphans option yet; it is not intensively tested.
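
To try the branch by hand, something like this should work (assuming the entry point is knsk.sh; the branch name needs quoting because of the # character, and --dry-run is the option referenced in the commit linked to this issue):

git clone --branch '#8-terminating-resources' https://github.com/thyarles/knsk.git
cd knsk
./knsk.sh --dry-run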

@GregoryVds
Author

@thyarles Thanks for the help.

"I think there is something blocking the deletion. What do you think?"

=> To be honest, I have no idea :-) I didn't have time to investigate much further...

I updated my CI script with the latest Master version of knsk.
I will close for now and make sure to reopen this issue in case I see the problem again.

Thanks again for knsk, that's very helpful :-)

thyarles added a commit that referenced this issue Jan 27, 2020
Check for terminating and orphan resources / add a dry-run option