Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E: {cpu,topology} manager: improve debuggability #107915

Merged

Conversation

ffromani
Copy link
Contributor

@ffromani ffromani commented Feb 2, 2022

/kind cleanup
/kind failing-test

What this PR does / why we need it:

Improve the debuggability of topology manager e2e tests which were failing lately on the crio lane.
xref: #107805

Which issue(s) this PR fixes:

Fixes N/A

Special notes for your reviewer:

Aiming to make the failures clearer in order to deliver the actual fix

Does this PR introduce a user-facing change?

NONE

Make sure to log out the cpu capacity and allocatable for
the node running the tests, to make the troubleshooting
of test failures easier.

Signed-off-by: Francesco Romani <fromani@redhat.com>
The existing cpu/topology manager tests correctly check for the
node resources and skip if the detected resources are not enough
to run the tests, to avoid false negatives.

Unfortunately they do the check against the node capacity, while
the correct approach is to check the allocatable resources.
The existing check is correct only on a narrow set of cases;
otherwise can still lead to false negatives.

This PR fixes that.

Signed-off-by: Francesco Romani <fromani@redhat.com>
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 2, 2022
@k8s-ci-robot
Copy link
Contributor

@fromanirh: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 2, 2022
@ffromani
Copy link
Contributor Author

ffromani commented Feb 2, 2022

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 2, 2022
@ffromani ffromani changed the title E2e tm cpum check node allocatable E2E: {cpu,topology} manager: improve debuggability Feb 2, 2022
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 2, 2022
Even though CI machines _usually_ have at least two cpus,
let's rather not assume this holds true, and let's actually
check the allocatable CPUs, skipping even the simplest
tests if the assumption is broken, to avoid false negatives.

Signed-off-by: Francesco Romani <fromani@redhat.com>
A cpu/topology manager e2e test wants to require one exclusive CPU
and a share of CPU time; let's round up the allocatable CPU requirements
(from 1 to 2) to reduce the chances of false negatives.

Signed-off-by: Francesco Romani <fromani@redhat.com>
@ffromani ffromani force-pushed the e2e-tm-cpum-check-node-allocatable branch from 3924d96 to 7004a71 Compare February 2, 2022 14:17
Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just debuggability updates and more skip logic

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fromanirh, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2022
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Done in SIG Node CI/Test Board Feb 2, 2022
@k8s-ci-robot k8s-ci-robot merged commit 2d0fa78 into kubernetes:master Feb 2, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Feb 2, 2022
@ffromani ffromani deleted the e2e-tm-cpum-check-node-allocatable branch July 17, 2023 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

3 participants