test: Fix flake in node e2e mirror pod tests #117000
Conversation
Please note that we're already in Test Freeze for the release branch. Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Mar 29 22:31:15 UTC 2023.
/kind flake
The fix makes sense in my opinion. However, as this is a Serial e2e test, it is surprising that this is caused by bad cleanup from another test case.
Adding this does no harm to other tests.
LGTM label has been added. Git tree hash: 5787f5c106720bcb959a4e9fc510ef3c09d10b72
/assign @mrunalp @derekwaynecarr @dchen1107
/test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-serial
The issue is that the tests themselves are not responsible for cleaning up their resources; this is the responsibility of the test framework. The test framework creates a new namespace for every test and, when the test finishes, deletes the namespace. But the namespace is deleted asynchronously, i.e. the test framework doesn't wait until all the resources are deleted before moving on to the next test (this is by design). This is fine for parallel tests, but for serial tests that assume they are the only thing running, this can sometimes be an issue. See the related Slack thread about this here. It might be worth considering in the future whether, for node e2e serial tests in particular, we should adapt the test framework to explicitly wait until the namespace is deleted. It may slow down test time, but maybe that is acceptable.
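A minimal sketch of what such an explicit wait could look like, using client-go and the apimachinery polling helpers. The package, function name, interval, and timeout here are illustrative assumptions, not the e2e framework's actual API:

```go
// Hypothetical helper (not the e2e framework's actual API): block until a
// test namespace is fully deleted from the API server before the next
// serial test starts.
package e2eutil

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
)

// waitForNamespaceDeleted polls every 2s until a Get on the namespace
// returns NotFound, or gives up after 5 minutes.
func waitForNamespaceDeleted(ctx context.Context, c clientset.Interface, ns string) error {
	return wait.PollUntilContextTimeout(ctx, 2*time.Second, 5*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			_, err := c.CoreV1().Namespaces().Get(ctx, ns, metav1.GetOptions{})
			if apierrors.IsNotFound(err) {
				return true, nil // namespace (and its resources) are gone
			}
			// err == nil means the namespace is still terminating: keep polling.
			// Any other error aborts the poll.
			return false, err
		})
}
```

The point of waiting on the Get to return NotFound, rather than on the delete call returning, is that it makes the cleanup synchronous from the next serial test's point of view.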
The newly added `MirrorPodWithGracePeriod when create a mirror pod and the container runtime is temporarily down during pod termination` test is currently flaking because in some cases when it is run, there are other pods from other tests that are still in the process of being terminated. This results in the test failing because it asserts metrics that assume there is only one pod running on the node. To fix the flake, prior to starting the test, verify that no pods exist in the API server other than the newly created mirror pod.
Signed-off-by: David Porter <david@porter.me>
/test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-serial
/lgtm
LGTM label has been added. Git tree hash: 715c7784ade78a57dc4626702d6502a7e1e6accb
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bobbypage, oomichi, pacoxu
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
The newly added `MirrorPodWithGracePeriod when create a mirror pod and the container runtime is temporarily down during pod termination` test is currently flaking because in some cases when it is run, there are other pods from other tests that are still in the process of being terminated. This results in the test failing because it asserts metrics that assume there is only one pod running on the node. To fix the flake, prior to starting the test, verify that no pods exist in the API server.
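As a hedged sketch of that pre-test check (not the PR's actual diff), one could poll until the API server reports no pods on the node other than the mirror pod. The helper name, field selector, and timeouts below are assumptions for illustration:

```go
// Hypothetical pre-test guard (all names here are illustrative): wait until
// the mirror pod is the only pod the API server reports for this node.
package e2eutil

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
)

func waitForOnlyMirrorPod(ctx context.Context, c clientset.Interface, nodeName, mirrorPodName string) error {
	return wait.PollUntilContextTimeout(ctx, 2*time.Second, 2*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			// List pods across all namespaces that are scheduled to this node.
			pods, err := c.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
				FieldSelector: "spec.nodeName=" + nodeName,
			})
			if err != nil {
				return false, err
			}
			for _, p := range pods.Items {
				if p.Name != mirrorPodName {
					// A pod from a previous test is still terminating.
					return false, nil
				}
			}
			return true, nil
		})
}
```

Gating on the API server's view before the test body runs is what keeps the later metrics assertions valid, since the single-pod assumption then holds at the start of the test.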
What type of PR is this?
/kind flake
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #116998
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: