Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[noderesourcetopology] rewrite accounting of numa-affine resources with scope=container #752

Merged
merged 4 commits into from
May 29, 2024

Conversation

ffromani
Copy link
Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:

Rewrite and fix, adding a bunch of tests and better logs along the way, how we treat non-NUMA-affine aka host-level resources. The most prominent example is ephemeral storage, but we aim to cover the general case.

Which issue(s) this PR fixes:

Fixes #751

Special notes for your reviewer:

N/A

Does this PR introduce a user-facing change?

Rewrite handling of ephemeral storage and other non-NUMA-affine resources

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 27, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from Tal-or May 27, 2024 12:01
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 27, 2024
Copy link

netlify bot commented May 27, 2024

Deploy Preview for kubernetes-sigs-scheduler-plugins canceled.

Name Link
🔨 Latest commit 13e9bc7
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-scheduler-plugins/deploys/665704c9c7c84b000836aa46

@ffromani
Copy link
Contributor Author

note: at this stage the integration tests are expected to fail - for the right reasons though :)

@ffromani
Copy link
Contributor Author

note: at this stage the integration tests are expected to fail - for the right reasons though :)

like this, the expected failures:

    --- FAIL: TestTopologyMatchPlugin/[tier1]_multi_containers_all_requiring_ephemeral_storage_with_good_allocation_-_fit (20.03s)
        noderesourcetopology_test.go:1992: Start-topology-match-test [tier1] multi containers all requiring ephemeral storage with good allocation - fit
        noderesourcetopology_test.go:2002: Creating Pod "topology-aware-scheduler-pod-20"
        noderesourcetopology_test.go:2016: pod "topology-aware-scheduler-pod-20" to be scheduled, error: context deadline exceeded
        noderesourcetopology_test.go:2019: Pod topology-aware-scheduler-pod-20 scheduled
        noderesourcetopology_test.go:2028: Pod topology-aware-scheduler-pod-20 is expected on node [fake-node-1], but found on node 
        noderesourcetopology_test.go:2058: Case [tier1] multi containers all requiring ephemeral storage with good allocation - fit finished
    --- FAIL: TestTopologyMatchPlugin/[tier1]_multi_containers_some_requiring_ephemeral_storage_with_good_allocation_-_fit (20.03s)
        noderesourcetopology_test.go:1992: Start-topology-match-test [tier1] multi containers some requiring ephemeral storage with good allocation - fit
        noderesourcetopology_test.go:2002: Creating Pod "topology-aware-scheduler-pod-21"
        noderesourcetopology_test.go:2016: pod "topology-aware-scheduler-pod-21" to be scheduled, error: context deadline exceeded
        noderesourcetopology_test.go:2019: Pod topology-aware-scheduler-pod-21 scheduled
        noderesourcetopology_test.go:2028: Pod topology-aware-scheduler-pod-21 is expected on node [fake-node-1], but found on node 
        noderesourcetopology_test.go:2058: Case [tier1] multi containers some requiring ephemeral storage with good allocation - fit finished
    --- FAIL: TestTopologyMatchPlugin/[tier1]_multi_containers_one_requiring_ephemeral_storage_with_good_allocation_-_fit (20.03s)
        noderesourcetopology_test.go:1992: Start-topology-match-test [tier1] multi containers one requiring ephemeral storage with good allocation - fit
        noderesourcetopology_test.go:2002: Creating Pod "topology-aware-scheduler-pod-22"
        noderesourcetopology_test.go:2016: pod "topology-aware-scheduler-pod-22" to be scheduled, error: context deadline exceeded
        noderesourcetopology_test.go:2019: Pod topology-aware-scheduler-pod-22 scheduled
        noderesourcetopology_test.go:2028: Pod topology-aware-scheduler-pod-22 is expected on node [fake-node-1], but found on node 
        noderesourcetopology_test.go:2058: Case [tier1] multi containers one requiring ephemeral storage with good allocation - fit finished

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 27, 2024
@ffromani
Copy link
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 27, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 28, 2024
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 28, 2024
@ffromani ffromani force-pushed the nrt-hostlevel-redo branch 3 times, most recently from 85ba7e4 to a9cabd4 Compare May 28, 2024 08:11
@ffromani ffromani changed the title WIP: [noderesourcetopology] rewrite accounting of numa-affine resources with scope=container [noderesourcetopology] rewrite accounting of numa-affine resources with scope=container May 28, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2024
@ffromani
Copy link
Contributor Author

/hold

more testing ongoing

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2024
@ffromani
Copy link
Contributor Author

/cc @philsphicas

@k8s-ci-robot
Copy link
Contributor

@ffromani: GitHub didn't allow me to request PR reviews from the following users: philsphicas.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @philsphicas

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ffromani
Copy link
Contributor Author

/hold cancel
ready for review

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 29, 2024
@ffromani
Copy link
Contributor Author

/cc @PiotrProkop

pkg/noderesourcetopology/filter.go Outdated Show resolved Hide resolved
pkg/noderesourcetopology/numaresources.go Outdated Show resolved Hide resolved
pkg/noderesourcetopology/numaresources.go Outdated Show resolved Hide resolved
before kubernetes-sigs#710 and kubernetes-sigs#725, we logged the container being processed alongside
the pod (identified by namespace/name pair).
It was dropped by mistake and not deliberately.
This is useful information when troubleshooting, so let's add it back.

Signed-off-by: Francesco Romani <fromani@redhat.com>
The ephemeral storage resource is not a deciding factor
for noderesourcetopology filtering, but it was incorrectly
accounted causing bad scheduling decisions.
First, we add some integration test coverage to catch
these issues.

Signed-off-by: Francesco Romani <fromani@redhat.com>
@PiotrProkop
Copy link
Contributor

Looks good now, can you just squash 3rd and 4th commit into one?

Rewrite the accounting of NUMA-local resources when
using scope=container. The previous code was too lenient
and worked mostly by side effects when dealing with
non-NUMA affine resources.

A non-NUMA affine resource (aka a hostlevel resource)
is a resource which is not guaranteed to always have
a NUMA affinity. CPU and memory (incl. hugepages) always do,
but devices may or may not, both options are legal for
device plugins.

Similarly, ephemeral storage is a prominent example of resource
which should never have a NUMA affinity.
The accounting in this case was wrong because previously the
resource was considered NUMA affine.

Note: it's legal to configure topology updaters (e.g. NFD)
to not advertise CPU and memory in NRT objects.
Thus is best to treat lack of them as warnings, not
as blocking errors.

However if the per-NUMA affine counters go negative
this is definitely an error condition we need to detect
and be very loud about it.

Signed-off-by: Francesco Romani <fromani@redhat.com>
plugin.go should contain only entry point and
orchestration code. Let's move all the utilties and logic
to other source code files.

Trivial code movement with minimal renames.

Signed-off-by: Francesco Romani <fromani@redhat.com>
@PiotrProkop
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 29, 2024
@k8s-ci-robot k8s-ci-robot merged commit 0834feb into kubernetes-sigs:master May 29, 2024
10 checks passed
@ffromani ffromani deleted the nrt-hostlevel-redo branch May 29, 2024 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[noderesourcetopology] incorrect accounting of hostlevel resources (ephemeral storage) with scope=container
3 participants