node_e2e: refactor RunTogether function #124668

@bart0sh bart0sh commented May 2, 2024

What type of PR is this?

/kind bug
/kind cleanup
/kind flake

What this PR does / why we need it:

This is a follow-up PR to #124645, aiming to fix flaky container lifecycle test cases.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added labels May 2, 2024:
release-note-none - Denotes a PR that doesn't merit a release note.
do-not-merge/work-in-progress - Indicates that a PR should not merge because it is a work in progress.
size/S - Denotes a PR that changes 10-29 lines, ignoring generated files.
kind/bug - Categorizes issue or PR as related to a bug.
kind/cleanup - Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/flake - Categorizes issue or PR as related to a flaky test.
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
do-not-merge/needs-sig - Indicates an issue or PR lacks a `sig/foo` label and requires one.
needs-triage - Indicates an issue or PR lacks a `triage/foo` label and requires one.
needs-priority - Indicates a PR lacks a `priority/foo` label and requires one.
@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/sig node
/sig testing

@k8s-ci-robot k8s-ci-robot added and removed labels May 2, 2024:
added: area/test
added: sig/node - Categorizes an issue or PR as relevant to SIG Node.
added: sig/testing - Categorizes an issue or PR as relevant to SIG Testing.
removed: do-not-merge/needs-sig - Indicates an issue or PR lacks a `sig/foo` label and requires one.
@bart0sh
Contributor Author

bart0sh commented May 2, 2024

@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/test all

@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/test pull-kubernetes-node-swap-fedora

@bart0sh bart0sh force-pushed the PR143-e2e-node-fix-containers-lifecycle branch from db583df to 6ecf0da May 2, 2024 10:42
@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/test pull-kubernetes-node-swap-fedora

@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/test all

@bart0sh
Contributor Author

bart0sh commented May 2, 2024

So far so good: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-node-swap-fedora/1785987645336195072

I'm going to trigger this job repeatedly to see whether it flakes.
/test pull-kubernetes-node-swap-fedora

@k8s-ci-robot k8s-ci-robot requested a review from mtaufen May 2, 2024 20:49
@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added and removed labels May 2, 2024:
added: triage/accepted - Indicates an issue or PR is ready to be actively worked on.
added: priority/important-longterm - Important over the long term, but may not be staffed and/or may need multiple releases to complete.
removed: needs-triage - Indicates an issue or PR lacks a `triage/foo` label and requires one.
removed: needs-priority - Indicates a PR lacks a `priority/foo` label and requires one.
@bart0sh
Contributor Author

bart0sh commented May 2, 2024

/assign @SergeyKanzhelev
for approval

@bart0sh bart0sh moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage May 3, 2024
@bart0sh
Contributor Author

bart0sh commented May 4, 2024

/test pull-kubernetes-node-swap-fedora

@bart0sh
Contributor Author

bart0sh commented May 5, 2024

/test pull-kubernetes-node-swap-fedora

@bart0sh
Contributor Author

bart0sh commented May 6, 2024

/test pull-kubernetes-node-swap-fedora

@bart0sh
Contributor Author

bart0sh commented May 6, 2024

/test pull-kubernetes-node-swap-fedora-serial

	if lhsStart == -1 {
		return fmt.Errorf("couldn't find that %s ever started, got\n%v", lhs, o)
	}

	rhsStart := o.findIndex(rhs, "Started", lhsStart+1)
Member

This will exclude the case where rhs starts before lhs but the two still run together.

Could you explain the reason for this change?
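
To make the concern concrete, here is a minimal sketch over a simplified event log. The `event` type and the free-standing `findIndex` are hypothetical stand-ins written for illustration; they are not the helpers from the test file. It shows that searching for the rhs "Started" event only from `lhsStart+1` onward misses an rhs that started earlier:

```go
package main

import "fmt"

// event is a hypothetical, simplified record of a container lifecycle event.
type event struct {
	name    string // container name
	command string // e.g. "Started" or "Exiting"
}

// findIndex returns the index of the first event matching name and command
// at or after startIdx, or -1 if there is none.
func findIndex(events []event, name, command string, startIdx int) int {
	for i := startIdx; i < len(events); i++ {
		if events[i].name == name && events[i].command == command {
			return i
		}
	}
	return -1
}

func main() {
	// rhs starts first, then lhs; the two lifetimes overlap.
	events := []event{
		{"rhs", "Started"},
		{"lhs", "Started"},
		{"lhs", "Exiting"},
		{"rhs", "Exiting"},
	}
	lhsStart := findIndex(events, "lhs", "Started", 0)
	rhsStart := findIndex(events, "rhs", "Started", lhsStart+1)
	fmt.Println(lhsStart, rhsStart) // prints "1 -1"
}
```

In this log the two containers overlap in time, yet the offset search reports that rhs never started.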

Contributor

This is more reproducible IMHO, and if rhs starts first, you can just swap rhs and lhs (as is done in the other file).

Member

@gjkim42 gjkim42 May 6, 2024

I am a little worried about the ordering part, since we could not use this in cases where the ordering cannot be guaranteed (e.g. the part where regular containers run together?).

How about adding another test function like RunTogetherWithOrdering instead?

Contributor

Well, today we don't use it with regular containers... so we could add another one, RunTogetherWithoutOrdering?
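
One way such a helper could look, purely as a sketch: run an ordered check in both directions and accept either result. The names `runTogether` and `runTogetherWithoutOrdering`, the `event` type, and the rule that a missing exit event counts as "still running" are all assumptions made for illustration; none of this is the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a hypothetical, simplified record of a container lifecycle event.
type event struct {
	name    string
	command string
}

// findIndex returns the index of the first matching event at or after
// startIdx, or -1 if there is none.
func findIndex(events []event, name, command string, startIdx int) int {
	for i := startIdx; i < len(events); i++ {
		if events[i].name == name && events[i].command == command {
			return i
		}
	}
	return -1
}

// runTogether (hypothetical) requires lhs to start first and rhs to start
// before lhs exits. A missing lhs exit is treated as "still running".
func runTogether(events []event, lhs, rhs string) error {
	lhsStart := findIndex(events, lhs, "Started", 0)
	if lhsStart == -1 {
		return fmt.Errorf("couldn't find that %s ever started", lhs)
	}
	rhsStart := findIndex(events, rhs, "Started", lhsStart+1)
	if rhsStart == -1 {
		return fmt.Errorf("couldn't find that %s started after %s", rhs, lhs)
	}
	lhsExit := findIndex(events, lhs, "Exiting", lhsStart+1)
	if lhsExit != -1 && lhsExit < rhsStart {
		return fmt.Errorf("%s exited before %s started", lhs, rhs)
	}
	return nil
}

// runTogetherWithoutOrdering (hypothetical) accepts either start order by
// trying the ordered check both ways.
func runTogetherWithoutOrdering(events []event, a, b string) error {
	errAB := runTogether(events, a, b)
	if errAB == nil {
		return nil
	}
	errBA := runTogether(events, b, a)
	if errBA == nil {
		return nil
	}
	return errors.Join(errAB, errBA)
}

func main() {
	// rhs starts first, then lhs; the two lifetimes overlap.
	events := []event{
		{"rhs", "Started"},
		{"lhs", "Started"},
		{"lhs", "Exiting"},
		{"rhs", "Exiting"},
	}
	fmt.Println(runTogether(events, "lhs", "rhs") != nil)         // prints "true"
	fmt.Println(runTogetherWithoutOrdering(events, "lhs", "rhs")) // prints "<nil>"
}
```

Keeping the ordered check as the building block means callers that can guarantee start order still get the stricter assertion, while the wrapper covers cases where the order is not known.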

Member

Yes, I am OK with that.

Anyway, I am just curious... could you explain why adding the ordering part deflakes the tests?

Contributor

It's to avoid weird situations where containers are restarting in a loop.

Contributor Author

Sure. We expect the first container to start before the second one, since they must be started sequentially.

@gjkim42
Member

gjkim42 commented May 6, 2024

/test pull-kubernetes-node-kubelet-serial-containerd-sidecar-containers
/test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-features
/test pull-kubernetes-cos-cgroupv1-containerd-node-e2e-features

@gjkim42
Member

gjkim42 commented May 6, 2024

/lgtm

Member

@SergeyKanzhelev SergeyKanzhelev left a comment

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bart0sh, matthyx, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2024
@matthyx
Contributor

matthyx commented May 6, 2024

/test pull-kubernetes-node-swap-fedora

@k8s-ci-robot
Contributor

k8s-ci-robot commented May 6, 2024

@bart0sh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-kubernetes-node-swap-fedora-serial | 6ecf0da | link | false | /test pull-kubernetes-node-swap-fedora-serial

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bart0sh
Contributor Author

bart0sh commented May 6, 2024

/skip

@bart0sh
Contributor Author

bart0sh commented May 6, 2024

/skip pull-kubernetes-node-swap-fedora-serial

@k8s-ci-robot k8s-ci-robot merged commit 65f8129 into kubernetes:master May 6, 2024
20 checks passed
SIG Node PR Triage automation moved this from Needs Approver to Done May 6, 2024
@k8s-ci-robot k8s-ci-robot added this to the v1.31 milestone May 6, 2024
Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
area/test
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
kind/bug - Categorizes issue or PR as related to a bug.
kind/cleanup - Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/flake - Categorizes issue or PR as related to a flaky test.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
priority/important-longterm - Important over the long term, but may not be staffed and/or may need multiple releases to complete.
release-note-none - Denotes a PR that doesn't merit a release note.
sig/node - Categorizes an issue or PR as relevant to SIG Node.
sig/testing - Categorizes an issue or PR as relevant to SIG Testing.
size/S - Denotes a PR that changes 10-29 lines, ignoring generated files.
triage/accepted - Indicates an issue or PR is ready to be actively worked on.
5 participants