OSDOCS-18567: KUEUE 1.3 Release Notes #108005
|
@StephenJamesSmith: This pull request references OSDOCS-18567 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. |
modules/kueue-release-notes-1.3.adoc
Outdated
| = Release notes for {kueue-name} version 1.3 | ||
|
|
||
| [role="_abstract"] | ||
| {kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.14. |
There was a problem hiding this comment.
| {kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.14. | |
| {kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.16. |
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time. | ||
| + | ||
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. |
There was a problem hiding this comment.
link:link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]
This seems like it wouldn't resolve correctly because of the doubled `link:` prefix.
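A doubled macro prefix like the one flagged above is easy to miss in review. The following is a minimal sketch of a check that could catch it, assuming the AsciiDoc source is available as a plain string; the function name and the sample input are hypothetical and are not part of this repository's tooling.

```python
import re

# Hypothetical lint: flags a doubled "link:" macro prefix in AsciiDoc text.
DOUBLE_LINK = re.compile(r"link:link:")


def find_doubled_link_macros(text: str) -> list[int]:
    """Return the 1-based numbers of lines that contain 'link:link:'."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if DOUBLE_LINK.search(line)
    ]


if __name__ == "__main__":
    # Illustrative input only; the URLs are placeholders.
    sample = (
        "see link:link:https://example.com/a[Rollout Strategy]\n"
        "see link:https://example.com/b[Another page]\n"
    )
    print(find_doubled_link_macros(sample))
```

The check only reports line numbers; deciding whether to rewrite the macro is left to the author.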
modules/kueue-release-notes-1.3.adoc
Outdated
| == New features and enhancements | ||
|
|
||
| // {lws-operator}:: | ||
| // {kueue-name} version 1.3 provides for the integration of the {lws-operator} with {kueue-name} so you can leverage the {kueue-name} scheduling and resource management functionality when running LeaderWorkerSets. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/ai_workloads/red-hat-build-of-kueue#integrating-lws[Integrating the {lws-operator}]. |
There was a problem hiding this comment.
So if we don't merge JobSet/LWS yet then these docs would not be present.
Should we just include release notes for the API changes? Or accept that there could be a gap and try to address it as fast as possible?
There was a problem hiding this comment.
For now, I've commented out LWS and JobSet, as per our meeting discussion. If they both make it into the March 9 build, I will remove the comment marks.
|
🤖 Fri Mar 06 16:59:11 - Prow CI generated the docs preview: |
5da229f to 170fd3f
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| However, existing objects are only auto-converted to the new storage version by Kubernetes during a write request. This means that {kueue-name} API objects that rarely receive updates such as Topologies, ResourceFlavors, or long-running Workloads could remain in the older `v1beta1` format indefinitely. | ||
| + | ||
| For more information, see link:https://issues.redhat.com/browse/OSDOCS-18093[18093]. |
There was a problem hiding this comment.
Why link to this Jira issue? Perhaps you could link to the OpenShift docs page where this is further documented?
There was a problem hiding this comment.
The problem is that this is not documented in full in any one place in our documentation. Rather, the API version is just updated (from `v1beta1`) throughout nine different topics. So I'm just going to remove the link.
|
|
||
| // [id="release-notes-1.3-fixed-issues_{context}"] | ||
| // == Fixed issues | ||
|
|
There was a problem hiding this comment.
Following the pattern of 1.1 and 1.2, this section would list the known issues from the previous version, Kueue 1.2, that are now fixed.
There was a problem hiding this comment.
Added Known issues from 1.2.
modules/kueue-release-notes-1.3.adoc
Outdated
| Upstream progression of the {kueue-name} API to `v1beta2`:: | ||
| {kueue-name} version 1.3 provides the `v1beta2` version of the {kueue-name} API. This update continues the evolution of the {kueue-name} APIs with the ultimate goal of graduating the API to `v1`. | ||
| + | ||
| All new Kueue objects created after the upgrade will be stored using the `v1beta2` version. The earlier version of the API, `v1beta1` is deprecated. |
There was a problem hiding this comment.
Objects can still be created using `v1beta1` if needed. However, a deprecation message is shown.
Kueue 1.3 has a webhook that accepts both API versions.
There was a problem hiding this comment.
Added first sentence.
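As a rough illustration of the point above that `v1beta1` manifests are still accepted but deprecated, a review-time check could flag them before they are applied. This is a minimal sketch under stated assumptions: the function name and the sample manifests are hypothetical, and the check only compares the `apiVersion` string; it does not talk to a cluster or reproduce the webhook's behavior.

```python
# Assumption: Kueue objects use the upstream API group "kueue.x-k8s.io".
DEPRECATED_API = "kueue.x-k8s.io/v1beta1"
CURRENT_API = "kueue.x-k8s.io/v1beta2"


def uses_deprecated_kueue_api(manifest: dict) -> bool:
    """Return True if the manifest declares the deprecated v1beta1 API version."""
    return manifest.get("apiVersion") == DEPRECATED_API


if __name__ == "__main__":
    # Hypothetical manifests, already parsed into dictionaries.
    old_style = {
        "apiVersion": DEPRECATED_API,
        "kind": "ResourceFlavor",
        "metadata": {"name": "default-flavor"},
    }
    new_style = {
        "apiVersion": CURRENT_API,
        "kind": "ResourceFlavor",
        "metadata": {"name": "default-flavor"},
    }
    for manifest in (old_style, new_style):
        status = "deprecated" if uses_deprecated_kueue_api(manifest) else "ok"
        print(manifest["apiVersion"], status)
```

The sketch deliberately does not rewrite `apiVersion` in place, because field names can differ between API versions.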
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| For more information, see link:https://issues.redhat.com/browse/OSDOCS-18093[18093]. | ||
|
|
||
| // [id="release-notes-1.3-fixed-issues_{context}"] |
There was a problem hiding this comment.
@MaysaMacedo brought up that we should probably have https://issues.redhat.com/browse/OCPBUGS-58205?jql=labels%20%3D%20Kueue-1.3%20AND%20type%20%3D%20bug here.
There was a problem hiding this comment.
I've added that in Fixed issues
1f2aef0 to 516608f
|
/retest |
|
/retest |
kannon92
left a comment
There was a problem hiding this comment.
LGTM from SME side.
/assign @anahas-redhat
|
/retest |
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. | ||
|
|
||
| LeaderWorkerSet validation errors:: |
There was a problem hiding this comment.
@StephenJamesSmith this error is already fixed in https://issues.redhat.com/browse/OCPBUGS-74210.
I think it should be under Fixed issues instead of Known issues.
The fix was applied to both LeaderWorkerSet and JobSet, because both operators had the same behavior, so please also mention JobSet in the description. Thanks.
There was a problem hiding this comment.
Moved this to Fixed Issues and updated for JobSet.
516608f to 2ece973
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. | ||
|
|
||
| Reconcile jobs only in opt-in namespaces:: |
There was a problem hiding this comment.
Given that this is already added to Fixed issues, it should be removed from Known issues.
There was a problem hiding this comment.
Removed from Known issues
modules/kueue-release-notes-1.3.adoc
Outdated
| // {kueue-name} version 1.3 provides for the integration of the {lws-operator} with {kueue-name} so you can leverage the {kueue-name} scheduling and resource management functionality when running LeaderWorkerSets. | ||
|
|
||
| // {js-operator}:: | ||
| // {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. The {js-operator} models a distributed batch workload as a group of Kubernetes Jobs. This allows you to easily specify different pod templates for different distinct groups of pods, for example, a leader, workers, parameter servers, and so on. |
There was a problem hiding this comment.
| // {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. The {js-operator} models a distributed batch workload as a group of Kubernetes Jobs. This allows you to easily specify different pod templates for different distinct groups of pods, for example, a leader, workers, parameter servers, and so on. | |
| // {kueue-name} version 1.3 provides for the integration of the {js-operator} so you can use the {js-operator} to manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. |
We could keep it simple, as we do in the lws-operator introduction.
6ed45ee to 25427a2
|
/lgtm |
|
/label merge-review-needed |
skopacz1
left a comment
There was a problem hiding this comment.
One thing that needs to be addressed, otherwise this will be good to merge!
| = Release notes for {kueue-name} version 1.3 | ||
|
|
||
| [role="_abstract"] | ||
| {kueue-name} version 1.3 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.3 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.16. |
There was a problem hiding this comment.
Just curious, if this is supported on 4.18+, why is this PR for 4.21+?
modules/kueue-release-notes-1.3.adoc
Outdated
| + | ||
| When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time. | ||
| + | ||
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. |
There was a problem hiding this comment.
This link to the OpenShift docs needs to be an xref. Usually you can't put an xref in a module but release note files are the one exception:
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html/nodes/working-with-clusters#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. | |
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. |
(I provided the xref that I think it should be, but please double check before using it)
There was a problem hiding this comment.
Updated the xref.
25427a2 to 8dedbfd
|
New changes are detected. LGTM label has been removed. |
| + | ||
| When any change is made to LeaderWorkerSet pods, a rolling update is triggered. This action gradually replaces the old pods of a deployment with new ones, keeping as many pods alive as possible to avoid downtime. If `MaxUnavailable` is disabled, which is the {product-title} default setting, the pods are updated one at a time. | ||
| + | ||
| If you want to run updates in parallel instead of running them sequentially, `MaxUnavailable` feature gate must be enabled. For more information, see xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-install_nodes-cluster-enabling[Enabling feature sets at installation] and link:https://lws.sigs.k8s.io/docs/concepts/rollout-strategy/[Rollout Strategy]. |
There was a problem hiding this comment.
🤖 [error] OpenShiftAsciiDoc.NoXrefInModules: Do not include xrefs in modules, only assemblies (exception: release notes modules).
|
@StephenJamesSmith: all tests passed! |
skopacz1
left a comment
There was a problem hiding this comment.
New xref looks good to me!
|
/cherrypick enterprise-4.21 |
|
@skopacz1: new pull request created: #108075 |
|
@skopacz1: new pull request created: #108076 |
KUEUE 1.3 Release Notes
Version(s): 4.21, 4.22
Issue: https://issues.redhat.com/browse/OSDOCS-17094
Link to docs preview: https://108005--ocpdocs-pr.netlify.app/openshift-enterprise/latest/ai_workloads/kueue/release-notes.html#release-notes-1.3_release-notes
Dev: @kannon92
QE review: @anahas-redhat