Conversation

@ichekrygin
Contributor

@ichekrygin ichekrygin commented Jun 5, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR introduces the foundational implementation of WorkloadSlices in Kueue, as proposed in KEP-77. WorkloadSlices enable controlled scaling of admitted workloads (e.g., scale-up) while preserving Kueue's scheduling guarantees and resource tracking semantics.

📌 Summary

  • Introduces WorkloadSlice concept as a transient workload object representing a logical scale-up request.
  • Enables mutable workload behavior using a dual-Workload model:
    • Original admitted workload.
    • A new WorkloadSlice that requests the additional capacity (one job backed by two Workloads during the transition).
  • Admission of the new WorkloadSlice triggers preemption of the original Workload, even if additional capacity is available, to enforce consistent admission-state transitions.
  • Uses Pod scheduling gates (instead of spec.suspend) for gating new pods until the slice is admitted.
  • Defaulting logic enables the feature automatically for supported jobs (e.g., batchv1.Job, RayJob).
  • Ensures all new pods created during the transition are gated until the corresponding Workload is admitted.
  • Aggregates admission state and lifecycle management into the core Workload controller flow.
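
The gating behavior in the summary above can be pictured with a minimal Go sketch. It is illustrative only, not the PR's code: `SchedulingGate` and `PodSpec` are trimmed stand-ins for the Kubernetes pod types, and the gate name `example.com/workload-slice` is a hypothetical placeholder rather than the name Kueue uses.

```go
package main

import "fmt"

// SchedulingGate and PodSpec are simplified stand-ins (assumption) for the
// Kubernetes pod types, trimmed to the single field this sketch needs.
type SchedulingGate struct{ Name string }

type PodSpec struct{ SchedulingGates []SchedulingGate }

// sliceGate is a hypothetical gate name chosen for this example.
const sliceGate = "example.com/workload-slice"

// hasGate reports whether the pod spec carries the named gate.
func hasGate(spec *PodSpec, name string) bool {
	for _, g := range spec.SchedulingGates {
		if g.Name == name {
			return true
		}
	}
	return false
}

// gate adds the named gate if it is not already present (idempotent).
func gate(spec *PodSpec, name string) {
	if !hasGate(spec, name) {
		spec.SchedulingGates = append(spec.SchedulingGates, SchedulingGate{Name: name})
	}
}

// ungate removes the named gate and reports whether anything was removed.
func ungate(spec *PodSpec, name string) bool {
	kept := spec.SchedulingGates[:0]
	removed := false
	for _, g := range spec.SchedulingGates {
		if g.Name == name {
			removed = true
			continue
		}
		kept = append(kept, g)
	}
	spec.SchedulingGates = kept
	return removed
}

func main() {
	pod := &PodSpec{}

	// A pod created during a scale-up stays gated while its slice is pending.
	gate(pod, sliceGate)
	fmt.Println("gated:", hasGate(pod, sliceGate))

	// Once the slice is admitted, the gate is lifted and the pod can schedule.
	ungate(pod, sliceGate)
	fmt.Println("gated:", hasGate(pod, sliceGate))
}
```

Unlike toggling `spec.suspend` on the whole job, a gate applies per pod, so pods that are already running are left untouched while only the newly created pods wait for admission.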

📎 Additional Notes

  • Fully backward-compatible with existing single-Workload flow.
  • Includes tests for:
    • Slice creation logic.
    • Admission/preemption interaction.
    • Scheduling gate behavior.
  • Documentation and KEP link updates to follow in a separate PR.

⚠️ Known Limitations

  • Multi-cluster support for WorkloadSlices is still a work in progress and will be addressed in this or follow-up PRs.

Which issue(s) this PR fixes:

Fixes #5528

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Support for Elastic (Dynamically Sized Jobs) in Alpha as designed in [KEP-77](https://github.com/kubernetes-sigs/kueue/tree/main/keps/77-dynamically-sized-jobs). 
The implementation supports resizing (scale up and down) of batch/v1.Job and is behind the Alpha 
`ElasticJobsViaWorkloadSlices` feature gate. Jobs which are subject to resizing need to have the
`kueue.x-k8s.io/elastic-job` annotation added at creation time.
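
As a rough illustration of the opt-in described in the release note, the Go sketch below combines the feature gate and the job annotation into one check. The gate and annotation names come from the note; the `featureGates` map type and the assumption that the annotation value must be `"true"` are hypothetical, since the note only names the key.

```go
package main

import "fmt"

const (
	// Names taken from the release note.
	elasticJobAnnotation = "kueue.x-k8s.io/elastic-job"
	elasticJobsGate      = "ElasticJobsViaWorkloadSlices"
)

// featureGates is a hypothetical stand-in for a feature-gate registry.
type featureGates map[string]bool

// isElastic reports whether a job opts in to resizing.
// Assumption: the annotation value "true" enables the behavior.
func isElastic(gates featureGates, annotations map[string]string) bool {
	return gates[elasticJobsGate] && annotations[elasticJobAnnotation] == "true"
}

func main() {
	annotated := map[string]string{elasticJobAnnotation: "true"}

	// Gate on and annotation present: the job is treated as elastic.
	fmt.Println(isElastic(featureGates{elasticJobsGate: true}, annotated))

	// Gate off: the annotation alone is not enough.
	fmt.Println(isElastic(featureGates{}, annotated))

	// Gate on but no annotation: the job keeps the regular single-Workload flow.
	fmt.Println(isElastic(featureGates{elasticJobsGate: true}, nil))
}
```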

@k8s-ci-robot k8s-ci-robot added labels do-not-merge/work-in-progress, kind/feature, and do-not-merge/release-note-label-needed Jun 5, 2025
@linux-foundation-easycla

linux-foundation-easycla bot commented Jun 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added labels cncf-cla: no and needs-ok-to-test Jun 5, 2025
@k8s-ci-robot
Contributor

Hi @ichekrygin. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL label Jun 5, 2025
@k8s-ci-robot k8s-ci-robot requested review from mimowo and tenzen-y June 5, 2025 05:33
@netlify

netlify bot commented Jun 5, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit f1d4409
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/68807c94966dc0000839ef27

@ichekrygin ichekrygin force-pushed the wl-slices branch 3 times, most recently from d9f4768 to fe05a9f Compare June 5, 2025 22:27
@k8s-ci-robot k8s-ci-robot added labels do-not-merge/invalid-commit-message and needs-rebase Jun 5, 2025
@k8s-ci-robot k8s-ci-robot added label cncf-cla: yes and removed labels do-not-merge/invalid-commit-message and cncf-cla: no Jun 5, 2025
@k8s-ci-robot k8s-ci-robot added labels release-note and needs-rebase and removed labels needs-rebase and do-not-merge/release-note-label-needed Jun 5, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase label Jun 10, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase label Jun 17, 2025
@mimowo
Contributor

mimowo commented Jul 21, 2025

@ichekrygin I think this is very close to being mergeable. I left a bunch of comments, mostly renames to use the "replacing" terminology consistently rather than "preemption", because the mechanism relies on preemption only marginally.

It would also be great to add integration tests for the happy path. The release is on Friday, so I think we still have a bit of time to address the comments.

Feel free to also squash the commits. There are 33 of them, I highly doubt anyone would like to be traversing them :)

ichekrygin and others added 6 commits July 21, 2025 13:45
Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>
Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>
Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>
…logy.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>
@mimowo
Contributor

mimowo commented Jul 22, 2025

LGTM, but please address the remaining comments

…rom "scheduler" to "workloadslicing" package.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>
…s feature.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>
@mimowo
Contributor

mimowo commented Jul 23, 2025

Let's make the note a bit more user-oriented; I think the workload-slice replacement is more of a technical detail. A link to KEP-77 is probably enough for interested readers.
/release-note-edit

Support for Elastic (Dynamically Sized Jobs) in Alpha as designed in [KEP-77](https://github.com/kubernetes-sigs/kueue/tree/main/keps/77-dynamically-sized-jobs). 
The implementation supports resizing (scale up and down) of batch/v1.Job and is behind the Alpha 
`ElasticJobsViaWorkloadSlices` feature gate. Jobs which are subject to resizing need to have the
`kueue.x-k8s.io/elastic-job` annotation added at creation time.

@mimowo
Contributor

mimowo commented Jul 23, 2025

/lgtm
/approev
Thank you for your relentless work on KEP-77 and this implementation PR. This is one of the oldest and most anticipated KEPs in Kueue. While we still have a long way to go (e.g., support for other Job CRDs, MultiKueue, TAS), this is a huge milestone, and I'm very happy to get this in.

FYI @tenzen-y: Since the release is approaching and all of my comments have been addressed, I am merging this now to avoid potential conflicts with other PRs. I've taken extra care to ensure all new code is behind the alpha feature gate. Please feel free to add any further comments or open a new issue for follow-up items. I'm confident we can address them.

@k8s-ci-robot k8s-ci-robot added the lgtm label Jul 23, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: b632836c18466b71df5bac3e1328c769844678ef

@mimowo
Contributor

mimowo commented Jul 23, 2025

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ichekrygin, mimowo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Jul 23, 2025
@k8s-ci-robot k8s-ci-robot merged commit 388d6de into kubernetes-sigs:main Jul 23, 2025
22 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.13 milestone Jul 23, 2025
@ichekrygin ichekrygin deleted the wl-slices branch July 23, 2025 15:09
kannon92 pushed a commit to openshift-kannon92/kubernetes-sigs-kueue that referenced this pull request Aug 11, 2025
* Add workload slice conditions constants to track workload slice aggregation and deactivation.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Add support for generating unique workload names based on owner object generation.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Add and enhance workload PodSet count manipulation functions

This commit introduces new functions and enhances existing ones for manipulating a Workload's PodSet counts, including:

- Retrieving PodSet counts
- Detecting PodSet count reduction
- Checking for equality of PodSet counts
- Updating PodSet counts

These functions will be used in the upcoming workload-slice implementation. They also replace existing, similar functionality that is now marked for deprecation, promoting code consolidation and maintainability.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Add initial support for WorkloadSlices to support (dynamically) scaled jobs.

This change introduces core support for WorkloadSlices as outlined in KEP-77,
enabling the scheduler to handle dynamically sized jobs through fine-grained
workload subdivision.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Post-rebase update adding "WorkloadReference".

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Refactored prepareWorkload to extract workload slice handling into a separate function and added unit test coverage

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Rename WorkloadSlice feature gate and job annotation to follow DynamicallySizedJob naming convention

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Refactor workload slice scheduling: move capacity calculation to flavor assignment and assert flavor persistence between slices.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Post-rebase update changing "WorkloadReference" -> "Reference"

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Restore removed (by accident) "blank" imports.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Relax workload PodSet validation only when the "DynamicallySizedJob" feature is enabled, and update the associated unit and integration tests.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update DynamicallySizedJobs feature check.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Rebrand DynamicallySizedJobs -> ElasticJobs[ViaWorkloadSlices]

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Address PR review feedback.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* PR: address kubernetes-sigs#5510 (comment)

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* PR: address kubernetes-sigs#5510 (review)

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Refactor workload-slice eviction/preemption replacing it with workload slice aggregation.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Revert change to `ensureWorkload` merging with `ensureOneWorkload` as per PR review.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update `WorkloadPreemptibleSliceNameKey` annotation key per PR feedback.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update apis/kueue/v1beta1/constants.go

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>

* Update `WorkloadSlice` related constants and address additional comments per PR feedback.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update apis/kueue/v1beta1/workload_types.go

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>

* Update to address PR feedback.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Convert replaceable slice target name to use workload.Reference for consistency.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update workload slice deactivation.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update feature gate activation test.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update per PR feedback.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update flavor assignment after rebase.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update per PR review.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Remove obsolete WorkloadSliceReplacementReason constant.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Fix linter errors.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update workload slice related integration tests.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Refactor name WorkloadSliceReplacementForKey -> WorkloadSliceReplacementFor for clarity.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update pkg/scheduler/scheduler.go

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>

* Remove check for old workload-slice name collision.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Update pkg/scheduler/scheduler.go

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>

* Update preemptable workload slice naming to utilize "Replace" terminology.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Refactor workload-slice replacement/preemption target functionality from "scheduler" to "workloadslicing" package.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

* Add integration test for job with enabled ElasticJobsViaWorkloadSlices feature.

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>

---------

Signed-off-by: ichekrygin <illya.chekrygin@gmail.com>
Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>
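
One of the squashed commits above adds helpers for manipulating a Workload's PodSet counts: retrieving counts, detecting a reduction, checking equality, and updating counts. The Go sketch below shows one way such helpers could look. It is a sketch under stated assumptions, not the code from the PR: the `PodSet` and `PodSetCounts` types and all function names are hypothetical.

```go
package main

import "fmt"

// PodSet is a simplified, hypothetical stand-in for a Workload pod set;
// only the name and the replica count matter for this sketch.
type PodSet struct {
	Name  string
	Count int32
}

// PodSetCounts maps a pod set name to its replica count.
type PodSetCounts map[string]int32

// countsOf retrieves the counts of all pod sets.
func countsOf(podSets []PodSet) PodSetCounts {
	counts := make(PodSetCounts, len(podSets))
	for _, ps := range podSets {
		counts[ps.Name] = ps.Count
	}
	return counts
}

// equalCounts reports whether both maps hold the same names and counts.
func equalCounts(a, b PodSetCounts) bool {
	if len(a) != len(b) {
		return false
	}
	for name, n := range a {
		if m, ok := b[name]; !ok || m != n {
			return false
		}
	}
	return true
}

// hasReduction reports whether any pod set present in both maps has a
// lower count in updated than in current (a scale-down).
func hasReduction(current, updated PodSetCounts) bool {
	for name, n := range current {
		if m, ok := updated[name]; ok && m < n {
			return true
		}
	}
	return false
}

// applyCounts updates, in place, every pod set that has an entry in counts.
func applyCounts(podSets []PodSet, counts PodSetCounts) {
	for i := range podSets {
		if n, ok := counts[podSets[i].Name]; ok {
			podSets[i].Count = n
		}
	}
}

func main() {
	podSets := []PodSet{{Name: "main", Count: 3}}
	current := countsOf(podSets)
	scaledDown := PodSetCounts{"main": 2}

	fmt.Println("equal:", equalCounts(current, scaledDown))
	fmt.Println("reduction:", hasReduction(current, scaledDown))

	applyCounts(podSets, scaledDown)
	fmt.Println("count after update:", podSets[0].Count)
}
```

Telling a reduction apart from an increase matters in this design because, per the PR description, a scale-up goes through a new WorkloadSlice that must be admitted, whereas a scale-down only needs the existing counts lowered.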

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Add Initial implementation of KEP-77: Elastic Jobs via WorkloadSlices

3 participants