e2e: ginkgo timeouts: use context provided by ginkgo #112923

Merged
merged 2 commits into kubernetes:master
Dec 17, 2022

Conversation

pohly
Contributor

@pohly pohly commented Oct 7, 2022

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Ginkgo can tell code to stop running by canceling the context that it provides to callbacks which accept one. This makes it possible to stop immediately when a run is aborted manually via CTRL-C and to clean up properly after a timeout, because cleanup code then runs with a new context after the main test has stopped.

Special notes for your reviewer:

I started eliminating context.TODO in provisioning.go and then branched out from there: any function which had context.TODO needed an explicit context. The minimal goal for this PR is to clean up all code in test/e2e/framework, because further PRs can then probably be more localized.

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 7, 2022
@k8s-ci-robot k8s-ci-robot requested review from caesarxuchao, cheftako and a team October 7, 2022 17:10
@k8s-ci-robot k8s-ci-robot added area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 7, 2022
@enj enj added this to Needs Triage in SIG Auth Old Oct 10, 2022
@k8s-ci-robot
Contributor

k8s-ci-robot commented Dec 16, 2022

@pohly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-node-e2e-containerd-features-kubetest2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-e2e-containerd-features-kubetest2 |
| pull-kubernetes-e2e-capz-ha-control-plane | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-ha-control-plane |
| pull-kubernetes-node-kubelet-serial-cpu-manager-kubetest2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-cpu-manager-kubetest2 |
| pull-kubernetes-node-kubelet-serial-topology-manager-kubetest2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-topology-manager-kubetest2 |
| pull-kubernetes-node-kubelet-serial-crio-cgroupv2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-crio-cgroupv2 |
| pull-kubernetes-node-kubelet-serial-cpu-manager | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-cpu-manager |
| pull-kubernetes-node-e2e-containerd-alpha-features | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-e2e-containerd-alpha-features |
| pull-kubernetes-node-kubelet-serial-memory-manager | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-memory-manager |
| pull-kubernetes-node-kubelet-serial-topology-manager | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-topology-manager |
| pull-kubernetes-node-kubelet-serial-hugepages | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-hugepages |
| pull-kubernetes-node-e2e-containerd-kubetest2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-e2e-containerd-kubetest2 |
| pull-kubernetes-e2e-capz-azure-file-vmss | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-azure-file-vmss |
| pull-kubernetes-e2e-capz-azure-file | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-azure-file |
| pull-kubernetes-e2e-capz-azure-disk-vmss | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-azure-disk-vmss |
| pull-kubernetes-e2e-capz-azure-disk | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-azure-disk |
| pull-kubernetes-e2e-capz-conformance | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-capz-conformance |
| pull-kubernetes-e2e-gce-device-plugin-gpu | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-node-kubelet-serial-containerd-kubetest2 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-containerd-kubetest2 |
| pull-kubernetes-e2e-containerd-gce | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-containerd-gce |
| pull-kubernetes-node-kubelet-serial-crio-cgroupv1 | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-crio-cgroupv1 |
| pull-kubernetes-e2e-gce-correctness | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-e2e-gce-correctness |
| pull-kubernetes-node-kubelet-serial-containerd | dad35bf07123e207ab20909b036ef19c54e898a0 | link | false | /test pull-kubernetes-node-kubelet-serial-containerd |
| pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2 | 43c8be198f67fa47c58f6af9036987a73e426b2a | link | false | /test pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
@pohly
Contributor Author

pohly commented Dec 16, 2022

/retest

@pohly
Contributor Author

pohly commented Dec 16, 2022

/retest

"pull-kubernetes-e2e-capz-windows-containerd — Pod pending timeout"

@aojea
Member

aojea commented Dec 16, 2022

I tried my best to get to all files, but there are 418 changed files, so that is impossible. Most of them are mechanical changes (s/context.TODO/context from Ginkgo/) and we can always revert.

I think that merging on the weekend will give us some soak time to see if something breaks, so we can revert in that case.

/lgtm
/approve

e2e framework should be able to stop cleanly

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 16, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2022
@k8s-ci-robot k8s-ci-robot merged commit a93eda9 into kubernetes:master Dec 17, 2022
SIG Node CI/Test Board automation moved this from PRs Waiting on Author to Done Dec 17, 2022
SIG Node PR Triage automation moved this from Waiting on Author to Done Dec 17, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Dec 17, 2022
@onsi
Contributor

onsi commented Dec 17, 2022

🙌

ivelichkovich pushed a commit to ivelichkovich/kubernetes that referenced this pull request Dec 20, 2022
This is in preparation for
kubernetes#112923: DeferCleanup will
automatically do the right thing when testCleanup gets changed to require a
context parameter.
@pohly pohly mentioned this pull request Dec 21, 2022
2 tasks
jaehnri pushed a commit to jaehnri/kubernetes that referenced this pull request Jan 3, 2023
k8s-ci-robot pushed a commit that referenced this pull request May 2, 2023
* Fix flaky HPA e2e tests by not failing on context cancelled

Consume requests are sent during test execution in a loop in a separate goroutine. Once the test completes, it is expected that a consumption request may be pending. Cancelling the request during cleanup should not cause test failures.

The tests became flaky after #112923 started passing the test context, which gets cancelled during cleanup.

* Use PollUntilContextTimeout and restructure error ignoring logic
rayowang pushed a commit to rayowang/kubernetes that referenced this pull request Feb 9, 2024
…es#117669)

jkyros pushed a commit to jkyros/kubernetes that referenced this pull request Apr 30, 2024
…ancelled (kubernetes#117669)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/cloudprovider area/code-generation area/conformance Issues or PRs related to kubernetes conformance tests area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/kubeadm area/kubectl area/network-policy Issues or PRs related to Network Policy subproject area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. 
triage/accepted Indicates an issue or PR is ready to be actively worked on.

10 participants