Deleted pod stuck in "Terminating" state #39

Closed
myechuri opened this issue Mar 10, 2020 · 3 comments

@myechuri
Contributor

myechuri commented Mar 10, 2020

Workflow:

  • Create nginx deployment
  • Delete nginx deployment
  • Delete the VK pod (the system spins up a new VK pod to honor the deployment's replicas=1)
  • Create nginx deployment

kubectl get pods shows one nginx pod (nginx-deployment-66f967f649-nlzr4) stuck in the Terminating state. kubectl get cells shows no cell corresponding to this pod.

$ kubectl get pods
NAME                                READY   STATUS        RESTARTS   AGE
nginx-deployment-66f967f649-nlzr4   1/1     Terminating   0          42m
nginx-deployment-66f967f649-twqz9   1/1     Running       0          6m17s

$ kubectl get cells
NAME                                   POD NAME                            POD NAMESPACE   NODE              LAUNCH TYPE   INSTANCE TYPE   INSTANCE ID           IP
1581bec3-4551-47d7-8229-b3f31c780e69   nginx-deployment-66f967f649-twqz9   default         virtual-kubelet   On-Demand     t3.nano         i-09ff4cb4d55c7391b   10.0.28.184
4009e78e-bf75-4148-9e3b-409e80afeca3   registry-creds-gqkm9                kube-system     virtual-kubelet   On-Demand     t3.nano         i-09fc50a0280bc3be9   10.0.26.188
d0f95a9f-88a7-4b3f-82b2-6b9ba6324de4   kube-proxy-tw4k5                    kube-system     virtual-kubelet   On-Demand     t3.nano         i-08d26143b91ad4c94   10.0.22.36

VK + KIP log:

I0310 20:13:45.449670       1 opencensus.go:138] Deleting pod in provider name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default
I0310 20:13:45.449885       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.450001       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.450187       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:45.460426       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.460513       1 opencensus.go:138] Deleting pod in provider phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.460604       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.460669       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.460732       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:45.480927       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.481014       1 opencensus.go:138] Deleting pod in provider reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running
I0310 20:13:45.481106       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.481197       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.481270       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:45.521513       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.521632       1 opencensus.go:138] Deleting pod in provider reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running
I0310 20:13:45.521779       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.521858       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.521940       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:45.602156       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.602261       1 opencensus.go:138] Deleting pod in provider phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.602376       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.602466       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.602563       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:45.762807       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:45.762889       1 opencensus.go:138] Deleting pod in provider reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running
I0310 20:13:45.763047       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:45.763137       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:45.763232       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:46.083583       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:46.083680       1 opencensus.go:138] Deleting pod in provider namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45
I0310 20:13:46.083770       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:46.083840       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:46.083919       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:46.724259       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:46.724378       1 opencensus.go:138] Deleting pod in provider name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default
I0310 20:13:46.724525       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:46.724591       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:46.724653       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:48.004860       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:48.004905       1 opencensus.go:138] Deleting pod in provider uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason=
I0310 20:13:48.004984       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:48.005048       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:48.005112       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:50.565386       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:50.565515       1 opencensus.go:138] Deleting pod in provider namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45
I0310 20:13:50.565632       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:50.565702       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:50.565873       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:13:55.686143       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:13:55.686330       1 opencensus.go:138] Deleting pod in provider namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45
I0310 20:13:55.686431       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:13:55.686500       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:13:55.686562       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:14:05.926792       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:14:05.926885       1 opencensus.go:138] Deleting pod in provider namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45
I0310 20:14:05.927025       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:14:05.927111       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:14:05.927193       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:14:26.407457       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:14:26.407524       1 opencensus.go:138] Deleting pod in provider uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason=
I0310 20:14:26.407605       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:14:26.407668       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:14:26.407738       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:15:07.368161       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:15:07.368299       1 opencensus.go:138] Deleting pod in provider namespace=default name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45
I0310 20:15:07.368418       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:15:07.368508       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:15:07.368576       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
I0310 20:16:29.288918       1 opencensus.go:138] sync handled key=default/nginx-deployment-66f967f649-nlzr4
I0310 20:16:29.289049       1 opencensus.go:138] Deleting pod in provider name=nginx-deployment-66f967f649-nlzr4 phase=Running reason= uid=5fb0f1d5-8b0f-47ed-b952-11dfb91eae45 namespace=default
I0310 20:16:29.289144       1 server.go:604] DeletePod "nginx-deployment-66f967f649-nlzr4"
E0310 20:16:29.289214       1 server.go:613] DeletePod "nginx-deployment-66f967f649-nlzr4": Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
W0310 20:16:29.289280       1 opencensus.go:175] requeuing "default/nginx-deployment-66f967f649-nlzr4" due to failed sync error=failed to delete pod "default/nginx-deployment-66f967f649-nlzr4" in the provider: Could not delete pod default_nginx-deployment-66f967f649-nlzr4: Key not found in store
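
The gaps between retries in the log above roughly double each time (about 10 ms, 20 ms, 40 ms, and so on, up to about 82 s before the last attempt shown), which looks like per-item exponential backoff on the requeue. A minimal Go sketch of that schedule follows. The function name, the 10 ms base, and the cap are assumptions chosen to match the timestamps, not values taken from the VK source.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns base doubled `attempt` times, capped at max.
// Hypothetical helper for illustration; not taken from the VK code.
func backoffDelay(base, max time.Duration, attempt int) time.Duration {
	d := base
	if d > max {
		return max
	}
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	return d
}

func main() {
	base := 10 * time.Millisecond // assumed: matches the first gap in the log
	max := 1000 * time.Second     // assumed cap; the log never reaches it
	// The log shows 15 delete attempts, so 14 gaps between them.
	for attempt := 0; attempt < 14; attempt++ {
		fmt.Printf("gap %2d: %v\n", attempt+1, backoffDelay(base, max, attempt))
	}
}
```

Since every retry fails with the same "Key not found in store" error, backoff only spaces the attempts out; the pod stays in Terminating until the delete is reported as done.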

@justnoise
Contributor

We need to update our DeletePod code in pkg/server/server.go so that it correctly reports back when the pod no longer exists, by returning the virtual-kubelet not-found error type (errordefs/ErrNotFound). We already do this for GetPod but not for DeletePod. I can correct this and also check whether there are other places where we need to return ErrNotFound.
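
As a rough illustration of the idea (not the actual change in the PR), here is a self-contained Go sketch. The notFoundError type, the podStore map, and the deletePod, isNotFound and syncDelete helpers are all hypothetical stand-ins defined locally so the example compiles on its own; the real code would use virtual-kubelet's own not-found error type and the KIP store instead.

```go
package main

import (
	"errors"
	"fmt"
)

// notFoundError is a hypothetical stand-in for virtual-kubelet's not-found
// error type, defined here so the sketch has no external dependencies.
type notFoundError struct{ msg string }

func (e notFoundError) Error() string  { return e.msg }
func (e notFoundError) NotFound() bool { return true }

// isNotFound reports whether err (or an error it wraps) marks a missing resource.
func isNotFound(err error) bool {
	var nf interface{ NotFound() bool }
	return errors.As(err, &nf) && nf.NotFound()
}

// podStore is a hypothetical in-memory registry keyed by "namespace_name",
// the key format that appears in the log messages.
type podStore map[string]struct{}

// deletePod removes the pod from the store. When the key is absent it returns
// a not-found error rather than a generic failure.
func (s podStore) deletePod(namespace, name string) error {
	key := namespace + "_" + name
	if _, ok := s[key]; !ok {
		return notFoundError{msg: fmt.Sprintf("pod %s not found in store", key)}
	}
	delete(s, key)
	return nil
}

// syncDelete mimics a caller that requeues on errors but treats "not found"
// as a finished deletion. It returns true when no further retry is needed.
func syncDelete(s podStore, namespace, name string) bool {
	err := s.deletePod(namespace, name)
	return err == nil || isNotFound(err)
}

func main() {
	store := podStore{"default_nginx-deployment-66f967f649-twqz9": {}}
	// The nlzr4 pod has no entry, mirroring the missing cell in the report.
	fmt.Println(syncDelete(store, "default", "nginx-deployment-66f967f649-nlzr4"))
	fmt.Println(syncDelete(store, "default", "nginx-deployment-66f967f649-twqz9"))
}
```

The point of the typed error is that the caller can tell "already gone" apart from a real failure and stop requeuing, instead of retrying forever as in the log above.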

@justnoise
Contributor

Here's the PR: #42

@myechuri
Contributor Author

Thanks for the quick fix, @justnoise!
