@nmn3m
Member

@nmn3m nmn3m commented Oct 3, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR extends pkg/scheduler/framework/autoscaler_contract/lister_contract_test.go with a lister that uses a data structure defined in the k8s.io/dynamic-resource-allocation/structured package.

Currently, changes to that struct do not fall under SIG Autoscaling review unless explicitly requested. This makes it harder to catch issues during code review.

Why this is needed:
To ensure that changes affecting the autoscaler are visible to SIG Autoscaling reviewers and do not slip through unnoticed.

Which issue(s) this PR is related to:

Fixes #133162

Special notes for your reviewer:

One potential solution, applied here, is to move the relevant types into a new package, k8s.io/dynamic-resource-allocation/structured/schedulerapi, as @pohly suggested in the issue.

An OWNERS file is set up there similar to the one under autoscaler_contract, so that the right reviewers are automatically added.

/wg device-management
/sig autoscaling

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added labels Oct 3, 2025: size/L (Denotes a PR that changes 100-499 lines, ignoring generated files); do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels); kind/cleanup (Categorizes issue or PR as related to cleaning up code, process, or technical debt); wg/device-management (Categorizes an issue or PR as relevant to WG Device Management); cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA); sig/autoscaling (Categorizes an issue or PR as relevant to SIG Autoscaling)
@k8s-ci-robot k8s-ci-robot added the needs-triage label (Indicates an issue or PR lacks a `triage/foo` label and requires one) Oct 3, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot k8s-ci-robot added the needs-priority label (Indicates a PR lacks a `priority/foo` label and requires one) Oct 3, 2025
@k8s-ci-robot k8s-ci-robot added labels Oct 3, 2025: sig/node (Categorizes an issue or PR as relevant to SIG Node); sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling)
@k8s-ci-robot k8s-ci-robot added the do-not-merge/invalid-owners-file label (Indicates that a PR should not merge because it has an invalid OWNERS file in it) Oct 3, 2025
@nmn3m
Member Author

nmn3m commented Oct 3, 2025

/verify-owners

@nmn3m nmn3m force-pushed the dra-consumable-capacity-autoscaler-contract branch from 78ddad0 to 0e923e4 Compare October 3, 2025 16:08
# See the OWNERS file documentation: https://git.k8s.io/community/contributors/guide/owners.md

approvers:
- kubernetes/sig-autoscaling-reviewers
Contributor

kubernetes/sig-autoscaling-reviewers should be sig-autoscaling-maintainers

Member Author

@jackfrancis, thanks for your feedback. I fixed them.


approvers:
- kubernetes/sig-autoscaling-reviewers
- kubernetes/wg-device-management
Contributor

There is no wg-device-management in any aliases definition in k/k (as far as I can tell). The common pattern as of now is to list @pohly and @klueska as the WG Device Management approvers.

Member Author

Thanks for your feedback. I fixed them.

reviewers:
- kubernetes/sig-autoscaling-reviewers
- kubernetes/wg-device-management
- kubernetes/sig-scheduling-reviewers
Contributor

kubernetes/sig-scheduling-reviewers can be sig-scheduling-maintainers or just sig-scheduling

Member Author

@jackfrancis, thanks for your feedback. I fixed them.

@nmn3m nmn3m force-pushed the dra-consumable-capacity-autoscaler-contract branch from 0e923e4 to 92fe2c6 Compare October 4, 2025 09:50
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-owners-file label (Indicates that a PR should not merge because it has an invalid OWNERS file in it) Oct 4, 2025
@nmn3m nmn3m requested a review from jackfrancis October 4, 2025 10:46
@pohly pohly moved this from 🆕 New to 👀 In review in Dynamic Resource Allocation Oct 6, 2025

// TestSchedulerContractTypes validates that all scheduler contract types
// are properly defined and accessible.
func TestSchedulerContractTypes(t *testing.T) {
Contributor

I am not sure what exactly these tests are testing. They seem to exercise some of those types, but for what purpose? What kind of guarantee do we get out of them that isn't already covered by "the production code compiles"?
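
To illustrate the question, here is a minimal sketch of what the compiler already guarantees. The names `ClaimLister` and `fakeLister` are hypothetical and not taken from this PR. A compile-time interface assertion breaks the build as soon as a method signature changes, so a test that only instantiates the types adds little on top of it.

```go
package main

import "fmt"

// ClaimLister is a hypothetical contract interface (not from the PR).
type ClaimLister interface {
	ListAllocatedDevices() ([]string, error)
}

// fakeLister is a hypothetical implementation used only for illustration.
type fakeLister struct {
	devices []string
}

// ListAllocatedDevices returns the stored device names.
func (f *fakeLister) ListAllocatedDevices() ([]string, error) {
	return f.devices, nil
}

// Compile-time assertion: the build fails if *fakeLister stops
// satisfying ClaimLister. No test needs to run for this check.
var _ ClaimLister = (*fakeLister)(nil)

func main() {
	l := &fakeLister{devices: []string{"gpu-0"}}
	got, err := l.ListAllocatedDevices()
	fmt.Println(got, err)
}
```

A test earns its keep when it checks behavior or semantics that the type checker cannot see, which is the kind of guarantee being asked about here.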


// DeviceID represents a unique identifier for a device in the DRA system.
// This type is used in the scheduler and autoscaler contract.
type DeviceID = structured.DeviceID
Contributor

What we want to achieve with this package is that any code change which alters the content or semantics of AllocatedState is visible as a scheduler contract change and triggers the "needs autoscaler approval" OWNERS rule.

The way the type aliases are set up right now doesn't achieve this. Code changes can be made in k8s.io/dynamic-resource-allocation/structured without autoscaler review, and the code here continues to work unchanged.

It has to be the other way around: this package needs to contain the actual struct definitions so that it is self-contained. Then k8s.io/dynamic-resource-allocation/structured can have the type aliases to avoid code churn and simplify using the types.
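
The direction of the alias can be sketched as follows. This is a single-file illustration under stated assumptions: in the layout described above the two declarations would live in different packages, the field set of `DeviceID` is simplified to plain strings, and `StructuredDeviceID` and `describe` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// DeviceID is the real definition. It stands in for the type that the
// contract package would own, so edits to it happen under that
// package's OWNERS rules. The string fields are a simplification.
type DeviceID struct {
	Driver string
	Pool   string
	Device string
}

// String is defined once, on the owning type.
func (d DeviceID) String() string {
	return d.Driver + "/" + d.Pool + "/" + d.Device
}

// StructuredDeviceID (hypothetical name) stands in for the alias that
// the structured package would keep so existing callers compile unchanged.
type StructuredDeviceID = DeviceID

// describe accepts the owning type; a value declared through the alias
// is the identical type, so no conversion is needed.
func describe(id DeviceID) string {
	return id.String()
}

func main() {
	id := StructuredDeviceID{Driver: "gpu.example.com", Pool: "node-1", Device: "gpu-0"}
	fmt.Println(describe(id))
}
```

Because an alias is just another name for the same type, whichever package holds the `struct` body is the one whose files change when a field is added or removed. That is why the definition, not the alias, has to sit next to the OWNERS file.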

@github-project-automation github-project-automation bot moved this to Needs Review in SIG Scheduling Oct 6, 2025
Contributor

@pohly pohly left a comment

This is more or less what I was expecting. Let's see what others think about this approach.

"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/sets"
draapi "k8s.io/dynamic-resource-allocation/api"
Contributor

This is only used for UniqueString, which I think is okay to import instead of defining it here.

}

// Clone makes a copy of ConsumedCapacity of each capacity.
func (c ConsumedCapacityCollection) Clone() ConsumedCapacityCollection {
Contributor

Pre-existing, you are just moving it, but I wonder: can we replace the manually written Clone with generated DeepCopy code?

/cc @sunya-ch

Member Author

Good catch. Yes, we can replace the manual Clone methods with generated DeepCopy code.
Would you prefer that I do that in this PR, or should we keep this PR focused on moving the types to establish the contract and handle the DeepCopy generation in a follow-up?

Contributor

A separate PR is probably cleaner.
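
For context on the Clone-versus-DeepCopy question, here is a minimal sketch of what a hand-written deep copy of a nested map looks like. The types are simplified, hypothetical stand-ins (plain `*int64` values and string keys rather than the quantity and ID types used in the PR), so this is not the moved code itself. Generated DeepCopy functions, such as those produced by Kubernetes' deepcopy-gen, would be expected to perform an equivalent element-by-element copy.

```go
package main

import "fmt"

// ConsumedCapacity maps a capacity name to the consumed amount.
// Simplified stand-in: the value type is *int64 for illustration only.
type ConsumedCapacity map[string]*int64

// ConsumedCapacityCollection maps a device name to its consumed capacity.
type ConsumedCapacityCollection map[string]ConsumedCapacity

// Clone copies every entry and every pointed-to value, so the result
// shares no memory with the receiver.
func (c ConsumedCapacity) Clone() ConsumedCapacity {
	out := make(ConsumedCapacity, len(c))
	for name, v := range c {
		if v == nil {
			out[name] = nil
			continue
		}
		cp := *v
		out[name] = &cp
	}
	return out
}

// Clone makes a copy of the ConsumedCapacity of each device.
func (c ConsumedCapacityCollection) Clone() ConsumedCapacityCollection {
	out := make(ConsumedCapacityCollection, len(c))
	for dev, cc := range c {
		out[dev] = cc.Clone()
	}
	return out
}

func main() {
	n := int64(4)
	orig := ConsumedCapacityCollection{"gpu-0": {"memory": &n}}
	cp := orig.Clone()
	*cp["gpu-0"]["memory"] = 8
	// The original is unaffected by the write to the clone.
	fmt.Println(*orig["gpu-0"]["memory"], *cp["gpu-0"]["memory"]) // prints "4 8"
}
```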

@k8s-ci-robot k8s-ci-robot requested a review from sunya-ch October 7, 2025 18:47
@nmn3m nmn3m force-pushed the dra-consumable-capacity-autoscaler-contract branch from f2a1c98 to bb3d60a Compare October 7, 2025 19:47
Contributor

@pohly pohly left a comment

Looks good to me, but I am not the main stakeholder here.

/assign @towca @sunya-ch

@sunya-ch
Contributor

Looks good to me.

The original comment thread recommends keeping this in framework/autoscaler_contract so that it has the same stability guarantees as the rest of the integration surface, because changes to it can break the Cluster Autoscaler/kube-scheduler integration. @nmn3m has already addressed this concern by setting up the OWNERS file. I think @mortent and @towca are the main stakeholders who can confirm whether this is resolved.

@towca
Contributor

towca commented Oct 23, 2025

Looks good to me, thanks for taking care of this!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Oct 23, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2df719839e4fdbe6f039c217cf7343af171fbfe1

@jackfrancis
Contributor

/release-note-none

@k8s-ci-robot k8s-ci-robot added the release-note-none label (Denotes a PR that doesn't merit a release note) and removed the do-not-merge/release-note-label-needed label (Indicates that a PR should not merge because it's missing one of the release note labels) Oct 23, 2025
@nmn3m
Member Author

nmn3m commented Oct 23, 2025

As mentioned in #134404 (comment), can you please take a look?
/cc @klueska @gjtempleton

Contributor

@jackfrancis jackfrancis left a comment

/lgtm

@pohly
Contributor

pohly commented Oct 24, 2025

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, nmn3m, pohly

@k8s-ci-robot k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) Oct 24, 2025
@k8s-ci-robot k8s-ci-robot merged commit ae60c55 into kubernetes:master Oct 24, 2025
21 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.35 milestone Oct 24, 2025
@github-project-automation github-project-automation bot moved this from Waiting on Author to Done in SIG Node: code and documentation PRs Oct 24, 2025
@github-project-automation github-project-automation bot moved this from Needs Review to Done in SIG Scheduling Oct 24, 2025
@pohly pohly moved this from 👀 In review to ✅ Done in Dynamic Resource Allocation Oct 24, 2025
Development

Successfully merging this pull request may close these issues.

DRA Consumable Capacity: scheduler/autoscaler contract definition

6 participants