[Flaky test] When kubelet restarts Should test that a volume mounted to a pod that is deleted while the kubelet is down unmounts when the kubelet returns #75328

Closed
mariantalla opened this issue Mar 13, 2019 · 11 comments
Assignees: @jingxu97
Labels: kind/flake · priority/important-soon · sig/storage
Milestone: v1.14

@mariantalla
Contributor

Which jobs are flaking:

  • ci-kubernetes-e2e-gce-new-master-upgrade-master
  • ci-kubernetes-e2e-gce-new-master-upgrade-cluster
  • ci-kubernetes-e2e-gce-master-new-downgrade-cluster
  • ci-kubernetes-e2e-gce-new-master-upgrade-cluster-new

Which test(s) are failing:
When kubelet restarts Should test that a volume mounted to a pod that is deleted while the kubelet is down unmounts when the kubelet returns.

Testgrid link:

Reason for failure:
Timeout:

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/generic_persistent_volume-disruptive.go:73
Expected pod to be not found.
Expected error:
    <*errors.errorString | 0xc00009b860>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/utils/utils.go:242
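For reference, "timed out waiting for the condition" is the generic timeout error (`wait.ErrWaitTimeout`) from the `k8s.io/apimachinery` wait package, which the e2e utilities use when polling for a condition. Below is a minimal sketch of the kind of poll-for-deletion loop behind the "Expected pod to be not found" assertion; it is illustrative only, not the actual test code. The helper name `waitForPodNotFound`, the poll interval, and the timeout are placeholders, and the `Get` signature shown is the 2019-era client-go one (newer releases add a `context.Context` argument):

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForPodNotFound (hypothetical helper) polls until the API server
// reports the pod as deleted. If the pod never disappears within the
// timeout, wait.PollImmediate returns wait.ErrWaitTimeout, whose message
// is exactly "timed out waiting for the condition".
func waitForPodNotFound(c kubernetes.Interface, ns, name string, timeout time.Duration) error {
	return wait.PollImmediate(2*time.Second, timeout, func() (bool, error) {
		_, err := c.CoreV1().Pods(ns).Get(name, metav1.GetOptions{})
		if errors.IsNotFound(err) {
			return true, nil // pod is gone; condition satisfied
		}
		if err != nil {
			return false, err // unexpected API error aborts the poll
		}
		return false, nil // pod still exists; keep polling
	})
}

func main() {
	// Demonstrates the generic timeout message seen in the failure above.
	fmt.Println(wait.ErrWaitTimeout) // timed out waiting for the condition
}
```

Because every condition that never becomes true surfaces as this same generic error, the failure output carries no detail about why the pod lingered; it only shows the assertion text plus the timeout message.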

Anything else we need to know:

  • This test flakes on average ~10% of the time in the sig-release master-upgrade dashboards

/sig storage
/priority important-soon
/kind flake
/remove-kind failing test

@mariantalla added the kind/failing-test label on Mar 13, 2019
@k8s-ci-robot
Contributor

@mariantalla: Those labels are not set on the issue: kind/failing, kind/test


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the sig/storage, priority/important-soon, and kind/flake labels on Mar 13, 2019
@mariantalla
Contributor Author

@msau42 - sending this your way too for triage.
#75326, #75275, #75196 are higher priority in my head (because they flake more frequently)

Adding it to v1.14 for now.

/milestone v1.14

@k8s-ci-robot k8s-ci-robot added this to the v1.14 milestone Mar 13, 2019
@mariantalla
Contributor Author

Yep, fair enough robot friend.

/remove-kind failing-test

@k8s-ci-robot removed the kind/failing-test label on Mar 13, 2019
@msau42
Member

msau42 commented Mar 13, 2019

This test is different from the CSI reconstruction issues because it's testing the in-tree default StorageClass.

@msau42
Member

msau42 commented Mar 13, 2019

/assign @jingxu97

@jingxu97
Contributor

I think the root cause is issue #75345.
Will try to work on a fix soon.

@athenabot

/sig node

These SIGs are my best guesses for this issue. Please comment /remove-sig <name> if I am incorrect about one.
🤖 I am an (alpha) bot run by @vllry. 👩‍🔬

@k8s-ci-robot added the sig/node label on Mar 16, 2019
@mariantalla
Contributor Author

/remove-sig node

@mariantalla
Contributor Author

Expected fix: #75458

@alejandrox1
Contributor

This failure seems to have been resolved.
/close

@k8s-ci-robot
Contributor

@alejandrox1: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

CI Signal team (SIG Release) automation moved this from "Under investigation (prioritized)" to "Failed-test w/open PR-wait for >5 successes before 'Resolved'" on Apr 19, 2019
@alejandrox1 moved this from "Failed-test w/open PR-wait for >5 successes before 'Resolved'" to "Resolved (week April 15)" in CI Signal team (SIG Release) on Apr 19, 2019
@alejandrox1 moved this from "Resolved (week April 22)" to "Resolved (>2 weeks old)" in CI Signal team (SIG Release) on May 11, 2019
@alejandrox1 moved this from "Resolved (>2 weeks old)" to "Umbrella issues for flaking tests" in CI Signal team (SIG Release) on May 29, 2019