
Flaky "Delete Grace Period should be submitted and removed" with IPv6 #85762

Open
aojea opened this issue Nov 30, 2019 · 12 comments

@aojea (Contributor) commented Nov 30, 2019

Which jobs are failing:

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/85745/pull-kubernetes-conformance-kind-ipv6/1200730330403704832/

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/85727/pull-kubernetes-e2e-kind-ipv6/1200694468873818112

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/85727/pull-kubernetes-conformance-kind-ipv6/1200730455817588736/

Which test(s) are failing:

[k8s.io] [sig-node] Pods Extended [k8s.io] Delete Grace Period should be submitted and removed [Conformance]

Since when has it been failing:

The IPv6 jobs are flaky in general, but since #85745 I have had three flakes in a row with this test.

Testgrid link:

https://testgrid.k8s.io/sig-testing-kind#pull-kubernetes-conformance-kind-ipv6&width=20
https://testgrid.k8s.io/sig-testing-kind#kind%20(IPv6),%20master%20(dev)%20%5Bnon-serial%5D&width=5&sort-by-flakiness=

Reason for failure:

It fails because the pod cannot be deleted within 30 seconds; this kubelet log entry seems to be the cause:

Nov 30 07:41:07 kind-worker kubelet[677]: W1130 07:41:07.240438 677 manager.go:1131] Failed to process watch event {EventType:0 Name:/docker/f35390ea97c6bae92f61d7d297fe7eacb4cfb3f2effafb23f3aca00e1a9d2341/kubepods/besteffort/podd8d966f8-2a03-4529-9d70-9f5d868bafc3/243828b3fed0edf1137a9fa5ade6ffbbe468b79bbf212206480bbd37ab20fc7d WatchSource:0}: task 243828b3fed0edf1137a9fa5ade6ffbbe468b79bbf212206480bbd37ab20fc7d not found: not found
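
For context, here is a minimal client-go sketch of the kind of check that is timing out: delete a pod with an explicit 30-second grace period, then poll until the pod object is gone. This is not the actual conformance test; the namespace, pod name, polling interval, and 60-second budget below are placeholders, and the calls assume a recent client-go with context-aware signatures.

```go
// Minimal sketch only: not the actual conformance test. It assumes a recent,
// context-aware client-go and uses placeholder namespace/pod names and timeouts.
package main

import (
	"context"
	"fmt"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig (placeholder for however the e2e framework builds its client).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	ns, name := "default", "pod-submit-remove" // placeholder names
	grace := int64(30)

	// Ask the API server to delete the pod with an explicit 30-second grace period.
	if err := client.CoreV1().Pods(ns).Delete(context.TODO(), name, metav1.DeleteOptions{
		GracePeriodSeconds: &grace,
	}); err != nil {
		panic(err)
	}

	// Poll until the pod object is fully removed. This is the kind of wait that
	// expires in the flaky runs: the kubelet has not finished tearing the pod
	// down, so the object is still present when the budget runs out.
	err = wait.PollImmediate(2*time.Second, 60*time.Second, func() (bool, error) {
		_, getErr := client.CoreV1().Pods(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if apierrors.IsNotFound(getErr) {
			return true, nil // pod fully removed
		}
		return false, nil // still terminating (or a transient error); keep polling
	})
	if err != nil {
		fmt.Println("pod was not removed in time:", err)
		return
	}
	fmt.Println("pod removed within the expected window")
}
```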

Anything else we need to know:

The job view seems to report the same test failures twice, but the log output shows different timestamps 🤔

@aojea (Contributor, Author) commented Nov 30, 2019

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node and removed needs-sig labels Nov 30, 2019
@aojea (Contributor, Author) commented Nov 30, 2019

This comment has been minimized.

@liggitt (Member) commented Dec 3, 2019

This comment has been minimized.

@liggitt (Member) commented Dec 3, 2019

> this kubelet log entry seems to be the cause:
>
> Nov 30 07:41:07 kind-worker kubelet[677]: W1130 07:41:07.240438 677 manager.go:1131] Failed to process watch event {EventType:0 Name:/docker/f35390ea97c6bae92f61d7d297fe7eacb4cfb3f2effafb23f3aca00e1a9d2341/kubepods/besteffort/podd8d966f8-2a03-4529-9d70-9f5d868bafc3/243828b3fed0edf1137a9fa5ade6ffbbe468b79bbf212206480bbd37ab20fc7d WatchSource:0}: task 243828b3fed0edf1137a9fa5ade6ffbbe468b79bbf212206480bbd37ab20fc7d not found: not found

That message is from cadvisor... @dashpole, does that ring a bell or seem plausible as a reason for the kubelet delete hanging?

@alenkacz (Contributor) commented Dec 4, 2019

This test is failing quite a lot in the 1.17 Windows job, https://testgrid.k8s.io/sig-release-1.17-informing#gce-windows-1.17 - could someone confirm whether it's a bug or a test issue? Thanks

@alenkacz (Contributor) commented Dec 4, 2019

Or is it actually a completely different failure on Windows? It is also being tracked in #84610.

@aojea (Contributor, Author) commented Dec 4, 2019

I will close this issue because the patch addressing it has been merged, and to avoid duplicates.
We can reopen if necessary.
/close

@k8s-ci-robot (Contributor) commented Dec 4, 2019

@aojea: Closing this issue.

In response to this:

> I will close this issue because the patch addressing it has been merged, and to avoid duplicates.
> We can reopen if necessary.
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aojea (Contributor, Author) commented Dec 4, 2019

By the way, @alenkacz, this one is unrelated to the Windows one :-)

@alenkacz (Contributor) commented Dec 4, 2019

@aojea ACK, thank you :)

@liggitt (Member) commented Dec 4, 2019

> I will close this issue because the patch addressing it has been merged, and to avoid duplicates.

Two failures were seen on 896b77e after the related PR merged; that PR was only a cleanup, not a root-cause fix.

/reopen

@k8s-ci-robot (Contributor) commented Dec 4, 2019

@liggitt: Reopened this issue.

In response to this:

> Two failures were seen on 896b77e after the related PR merged
>
> /reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Dec 4, 2019