OSDOCS-18567: KUEUE 1.3 Release Notes #108005

Merged
skopacz1 merged 1 commit into openshift:main from
StephenJamesSmith:OSDOCS-18567
Mar 9, 2026

@StephenJamesSmith StephenJamesSmith commented Mar 5, 2026

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 5, 2026
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 5, 2026

openshift-ci-robot commented Mar 5, 2026

@StephenJamesSmith: This pull request references OSDOCS-18567 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

KUEUE 1.3 Release Notes
Version(s): 4.21, 4.22

Issue: https://issues.redhat.com/browse/OSDOCS-17094

Link to docs preview:

Dev: @kannon92

QE review: @anahas-redhat

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 5, 2026
= Release notes for {kueue-name} version 1.3

[role="_abstract"]
{kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.14.

Suggested change
{kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.14.
{kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.16.

Contributor Author

Changed.

+
When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time.
+
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].

link:link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]

This seems like it wouldn't resolve correctly.

Contributor Author

fixed.
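
A doubled macro prefix like the `link:link:` flagged in this thread is easy to catch mechanically. As a minimal sketch, assuming a plain regex pass over the AsciiDoc source is acceptable (the function names are hypothetical and not part of any existing docs tooling):

```python
import re

# Matches an AsciiDoc inline macro prefix that is immediately repeated,
# for example "link:link:https://..." or "xref:xref:...".
DOUBLED_MACRO = re.compile(r"\b(link|xref):(?=\1:)")


def find_doubled_macros(text: str) -> list[tuple[int, str]]:
    """Return (line_number, macro_name) for each doubled macro prefix."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in DOUBLED_MACRO.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def fix_doubled_macros(text: str) -> str:
    """Collapse each doubled prefix into a single one."""
    return DOUBLED_MACRO.sub("", text)
```

The lookahead keeps the second prefix out of the match, so substituting an empty string removes only the extra copy and leaves the target URL untouched.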

== New features and enhancements

// {lws-operator}::
// {kueue-name} version 1.3 provides for the integration of the {lws-operator} with {kueue-name} so you can leverage the {kueue-name} scheduling and resource management functionality when running LeaderWorkerSets. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/ai_workloads/red-hat-build-of-kueue#integrating-lws[Integrating the {lws-operator}].

So if we don't merge JobSet/LWS yet then these docs would not be present.

Should we just include the release notes on the API? Or just accept that there could be a gap and try to address it as fast as possible?

Contributor Author

For now, I've commented out LWS and JS, as per our meeting discussion. If they can both make it into the Mar09 build, I will remove the comment marks.

ocpdocs-previewbot commented Mar 5, 2026

🤖 Fri Mar 06 16:59:11 - Prow CI generated the docs preview:

https://108005--ocpdocs-pr.netlify.app/openshift-enterprise/latest/ai_workloads/kueue/release-notes.html

@StephenJamesSmith StephenJamesSmith force-pushed the OSDOCS-18567 branch 3 times, most recently from 5da229f to 170fd3f Compare March 5, 2026 19:30
+
However, existing objects are only auto-converted to the new storage version by Kubernetes during a write request. This means that {kueue-name} API objects that rarely receive updates such as Topologies, ResourceFlavors, or long-running Workloads could remain in the older `v1beta1` format indefinitely.
+
For more information, see link:https://issues.redhat.com/browse/OSDOCS-18093[18093].
Contributor

Why link this Jira issue? Perhaps you could link to the OpenShift docs page where this is further documented?

Contributor Author

The problem is that this is not documented in full in any one place in our documentation, so there is no single page that provides clarification. Rather, the API version is just updated (from v1beta1) throughout 9 different topics. So I'm just going to remove the link.
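
The convert-on-write behavior described in the release note under discussion can be illustrated with a toy model. This is only a sketch and not the Kubernetes or Kueue API: `ToyStore`, its methods, and the object names are all hypothetical, and the model assumes nothing beyond what the note states, namely that an object is re-persisted at the new storage version only when it is written.

```python
from dataclasses import dataclass, field

STORAGE_VERSION = "v1beta2"  # version that new writes are persisted as


@dataclass
class ToyStore:
    """Toy stand-in for persisted objects; maps name -> stored API version."""

    stored: dict[str, str] = field(default_factory=dict)

    def write(self, name: str) -> None:
        # Any create or update persists the object at the current storage version.
        self.stored[name] = STORAGE_VERSION

    def read(self, name: str) -> str:
        # Reads return the persisted version and do not rewrite the object.
        return self.stored[name]

    def stale(self) -> list[str]:
        """Names still persisted at something other than the storage version."""
        return sorted(n for n, v in self.stored.items() if v != STORAGE_VERSION)


store = ToyStore({"topology-a": "v1beta1", "flavor-b": "v1beta1", "workload-c": "v1beta1"})
store.read("topology-a")   # reading changes nothing
store.write("workload-c")  # an update rewrites this one object
print(store.stale())       # → ['flavor-b', 'topology-a']
```

In this model, objects that never receive a write stay in the old format indefinitely, which matches the caveat the note raises for Topologies, ResourceFlavors, and long-running Workloads.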


// [id="release-notes-1.3-fixed-issues_{context}"]
// == Fixed issues

Following the pattern of 1.2 and 1.1, this section would list the Known Issues from the previous version, Kueue 1.2, that are now fixed.

Contributor Author

Added Known issues from 1.2.

Upstream progression of the {kueue-name} API to `v1beta2`::
{kueue-name} version 1.3 provides the `v1beta2` version of the {kueue-name} API. This update continues the evolution of the {kueue-name} APIs with the ultimate goal of graduating the API to `v1`.
+
All new Kueue objects created after the upgrade will be stored using the `v1beta2` version. The earlier version of the API, `v1beta1` is deprecated.

Objects can still be created using v1beta1, if needed. However, a deprecation message will be shown.

Kueue 1.3 has a webhook that accepts both API versions.

Contributor Author

Added first sentence.

+
For more information, see link:https://issues.redhat.com/browse/OSDOCS-18093[18093].

// [id="release-notes-1.3-fixed-issues_{context}"]
Contributor Author

I've added that in Fixed issues

@StephenJamesSmith StephenJamesSmith force-pushed the OSDOCS-18567 branch 2 times, most recently from 1f2aef0 to 516608f Compare March 6, 2026 00:13
@StephenJamesSmith
Contributor Author

/retest

openshift-ci-robot commented Mar 6, 2026

@StephenJamesSmith: This pull request references OSDOCS-18567 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

KUEUE 1.3 Release Notes
Version(s): 4.21, 4.22

Issue: https://issues.redhat.com/browse/OSDOCS-17094

Link to docs preview: https://108005--ocpdocs-pr.netlify.app/openshift-enterprise/latest/ai_workloads/kueue/release-notes.html#release-notes-1.3_release-notes

Dev: @kannon92

QE review: @anahas-redhat

@StephenJamesSmith
Contributor Author

/retest

1 similar comment
@StephenJamesSmith
Contributor Author

/retest

@kannon92 kannon92 left a comment

LGTM from SME side.

/assign @anahas-redhat

@StephenJamesSmith
Contributor Author

/retest

+
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].

LeaderWorkerSet validation errors::

@StephenJamesSmith this error is already fixed (https://issues.redhat.com/browse/OCPBUGS-74210).
I guess it should be under Fixed Issues instead of Known Issues.

The fix also covers both LeaderWorkerSet and JobSet (if you can add JobSet to the description). Both operators had the same behavior and the fix is applied to both. Thanks.

Contributor

+1 to this.

Contributor Author

Moved this to Fixed Issues and updated for JobSet.

+
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].

Reconcile jobs only in opt-in namespaces::
Contributor

Given that this is already added to the Fixed issues, it should be removed from Known issues.

Contributor Author

Removed from Known issues

// {kueue-name} version 1.3 provides for the integration of the {lws-operator} with {kueue-name} so you can leverage the {kueue-name} scheduling and resource management functionality when running LeaderWorkerSets.

// {js-operator}::
// {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. The {js-operator} models a distributed batch workload as a group of Kubernetes Jobs. This allows you to easily specify different pod templates for different distinct groups of pods, for example, a leader, workers, parameter servers, and so on.
Contributor

Suggested change
// {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. The {js-operator} models a distributed batch workload as a group of Kubernetes Jobs. This allows you to easily specify different pod templates for different distinct groups of pods, for example, a leader, workers, parameter servers, and so on.
// {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training.

We could keep it simple like we have on the lws-operator introduction

Contributor Author

Deleted.

Contributor Author

Removed.

@StephenJamesSmith StephenJamesSmith force-pushed the OSDOCS-18567 branch 2 times, most recently from 6ed45ee to 25427a2 Compare March 6, 2026 14:36
@anahas-redhat

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 6, 2026
@StephenJamesSmith StephenJamesSmith changed the title from "[WIP]OSDOCS-18567: KUEUE 1.3 Release Notes" to "OSDOCS-18567: KUEUE 1.3 Release Notes" Mar 6, 2026
@StephenJamesSmith
Contributor Author

/label merge-review-needed

@openshift-ci openshift-ci bot added merge-review-needed Signifies that the merge review team needs to review this PR and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Mar 6, 2026
@skopacz1 skopacz1 added merge-review-in-progress Signifies that the merge review team is reviewing this PR branch/enterprise-4.21 branch/enterprise-4.22 labels Mar 6, 2026
@skopacz1 skopacz1 added this to the Continuous Release milestone Mar 6, 2026

@skopacz1 skopacz1 left a comment

One thing that needs to be addressed, otherwise this will be good to merge!

= Release notes for {kueue-name} version 1.3

[role="_abstract"]
{kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.16.
Contributor

Just curious, if this is supported on 4.18+, why is this PR for 4.21+?

+
When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time.
+
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].
Contributor

This link to the OpenShift docs needs to be an xref. Usually you can't put an xref in a module but release note files are the one exception:

Suggested change
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].

(I provided the xref that I think it should be, but please double check before using it)

Contributor Author

updated the xref.

@skopacz1 skopacz1 added ok-to-merge and removed merge-review-in-progress Signifies that the merge review team is reviewing this PR merge-review-needed Signifies that the merge review team needs to review this PR labels Mar 6, 2026
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 6, 2026

openshift-ci bot commented Mar 6, 2026

New changes are detected. LGTM label has been removed.

+
When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time.
+
If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy].
Collaborator

🤖 [error] OpenShiftAsciiDoc.NoXrefInModules: Do not include xrefs in modules, only assemblies (exception: release notes modules).

openshift-ci bot commented Mar 6, 2026

@StephenJamesSmith: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@skopacz1 skopacz1 left a comment

New xref looks good to me!

@skopacz1 skopacz1 merged commit 6571e08 into openshift:main Mar 9, 2026
2 checks passed

skopacz1 commented Mar 9, 2026

/cherrypick enterprise-4.21
/cherrypick enterprise-4.22

@openshift-cherrypick-robot

@skopacz1: new pull request created: #108075

Details

In response to this:

/cherrypick enterprise-4.21
/cherrypick enterprise-4.22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-cherrypick-robot

@skopacz1: new pull request created: #108076

Details

In response to this:

/cherrypick enterprise-4.21
/cherrypick enterprise-4.22

Labels

branch/enterprise-4.21 branch/enterprise-4.22 jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. ok-to-merge size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

9 participants