OSDOCS#12034: HCP Kubevirt topology spread constraint#93263

Merged
xenolinux merged 1 commit into openshift:main from xenolinux:add-topology-hcp
May 15, 2025

Conversation

@xenolinux xenolinux commented May 13, 2025

Version(s): 4.18+

Issue: https://issues.redhat.com/browse/OSDOCS-12034

Link to docs preview: https://93263--ocpdocs-pr.netlify.app/openshift-enterprise/latest/hosted_control_planes/hcp-manage/hcp-manage-virt.html#hcp-topology-spread-constraint_hcp-manage-virt

QE review:

  • QE has approved this change.

SME review:

  • SME has approved this change.

Additional information:

This content was QE/SME approved and peer-reviewed, then reverted at QE's suggestion. This PR adds the same content back to the docs. The newly added content has been reviewed by SMEs.

@openshift-ci openshift-ci bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) May 13, 2025
@xenolinux xenolinux added this to the Continuous Release milestone May 13, 2025
ocpdocs-previewbot commented May 13, 2025

- SoftTopologyAndDuplicates // <2>
- EvictPodsWithPVC // <3>
- EvictPodsWithLocalStorage // <4>
- LongLifecycle // <5>
Contributor

I'd suggest:

apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  managementState: Managed
  deschedulingIntervalSeconds: 30
  mode: "Automatic"
  profiles:
    - DevKubeVirtRelieveAndMigrate
    - SoftTopologyAndDuplicates
  profileCustomizations:
    devEnableSoftTainter: true
    devDeviationThresholds: AsymmetricLow
    devActualUtilizationProfile: PrometheusCPUCombined

DevKubeVirtRelieveAndMigrate is an enhanced variant of LongLifecycle for the KubeVirt use case.
With that profile, EvictPodsWithPVC and EvictPodsWithLocalStorage are implicitly enabled.

Contributor Author

✔️

Comment on lines 51 to 52
<3> By default, the {descheduler-operator} prevents the pod eviction with persistent volume claims (PVCs). Use this profile to allow eviction of pods with PVCs.
<4> By default, pods with local storage are not eligible for eviction. Use this profile to allow eviction of your VMs that use the local storage.
Contributor

With DevKubeVirtRelieveAndMigrate, we can avoid those two.

Contributor Author

✔️

<2> This profile evicts pods that follow the soft topology constraint: `whenUnsatisfiable: ScheduleAnyway`.
<3> By default, the {descheduler-operator} prevents the pod eviction with persistent volume claims (PVCs). Use this profile to allow eviction of pods with PVCs.
<4> By default, pods with local storage are not eligible for eviction. Use this profile to allow eviction of your VMs that use the local storage.
<5> This profile balances resource usage between nodes and enables the strategies, such as `RemovePodsHavingTooManyRestarts` and `LowNodeUtilization`.
Contributor

The same consideration is valid for DevKubeVirtRelieveAndMigrate.

Contributor Author

✔️

<3> By default, the {descheduler-operator} prevents the pod eviction with persistent volume claims (PVCs). Use this profile to allow eviction of pods with PVCs.
<4> By default, pods with local storage are not eligible for eviction. Use this profile to allow eviction of your VMs that use the local storage.
<5> This profile balances resource usage between nodes and enables the strategies, such as `RemovePodsHavingTooManyRestarts` and `LowNodeUtilization`.
<6> You must use this setting when performing a live migration so that the descheduler runs in the background during the migration process.
Contributor

This is not needed with DevKubeVirtRelieveAndMigrate

Contributor Author

✔️

@xenolinux xenolinux force-pushed the add-topology-hcp branch 2 times, most recently from 7a76bf4 to ebfb886 May 14, 2025 08:05
@LiangquanLi930
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) May 15, 2025
@xenolinux xenolinux added the peer-review-needed label (signifies that the peer review team needs to review this PR) May 15, 2025

tmalove commented May 15, 2025

/remove-label peer-review-needed
/label peer-review-in-progress

@openshift-ci openshift-ci bot added the peer-review-in-progress label (signifies that the peer review team is reviewing this PR) and removed the peer-review-needed label (signifies that the peer review team needs to review this PR) May 15, 2025
devActualUtilizationProfile: PrometheusCPUCombined
# ...
----
<1> Sets the number of seconds between the descheduler running cycles.
Contributor

Did you mean to also comment out the callout lines? The preview looks good though.

Contributor Author

@xenolinux xenolinux May 15, 2025

Fixed

Added # for call outs


By default, KubeVirt virtual machines (VMs) created by a node pool are scheduled on any available nodes that have the capacity to run the VMs. By default, the `topologySpreadConstraint` constraint is set to schedule VMs on multiple nodes.

In some scenarios, node pool VMs might run on the same node, which can cause availability issues. To avoid distribution of VMs on a single node, use the descheduler to continuously honour the `topologySpreadConstraint` constraint to spread VMs on multiple nodes.
Contributor

Suggested change
- In some scenarios, node pool VMs might run on the same node, which can cause availability issues. To avoid distribution of VMs on a single node, use the descheduler to continuously honour the `topologySpreadConstraint` constraint to spread VMs on multiple nodes.
+ In some scenarios, node pool VMs might run on the same node, which can cause availability issues. To avoid distribution of VMs on a single node, use the descheduler to continuously honor the `topologySpreadConstraint` constraint to spread VMs on multiple nodes.

Contributor Author

Fixed
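
For readers following this thread, the soft constraint referred to above (`whenUnsatisfiable: ScheduleAnyway`) corresponds to the standard Kubernetes `topologySpreadConstraints` pod-spec field. The fragment below is only a minimal sketch of what such a constraint can look like; the `maxSkew` value, the topology key choice, and the label key and value are assumptions for illustration and are not taken from this PR or from what a node pool actually sets.

```yaml
# Illustrative pod-spec fragment only; all values below are hypothetical.
topologySpreadConstraints:
  - maxSkew: 1                              # assumed: tolerate a difference of at most one pod per node
    topologyKey: kubernetes.io/hostname     # spread across individual nodes
    whenUnsatisfiable: ScheduleAnyway       # soft constraint: prefer spreading, but still schedule
    labelSelector:
      matchLabels:
        example.com/node-pool: my-node-pool # hypothetical label identifying the node pool VMs
```

Because a soft constraint is only evaluated when a pod is scheduled, pods can still end up on the same node over time, which is presumably why the thread pairs it with the `SoftTopologyAndDuplicates` descheduler profile.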


tmalove commented May 15, 2025

/remove-label peer-review-in-progress
/label peer-review-done
One small nit, otherwise lgtm!

@openshift-ci openshift-ci bot added the peer-review-done label (signifies that the peer review team has reviewed this PR) and removed the peer-review-in-progress label (signifies that the peer review team is reviewing this PR) May 15, 2025
@openshift-ci openshift-ci bot removed the lgtm label (indicates that a PR is ready to be merged) May 15, 2025

openshift-ci bot commented May 15, 2025

New changes are detected. LGTM label has been removed.


openshift-ci bot commented May 15, 2025

@xenolinux: all tests passed!

@xenolinux xenolinux merged commit 0c980f4 into openshift:main May 15, 2025
2 checks passed
@xenolinux
Contributor Author

/cherrypick enterprise-4.19

@xenolinux
Contributor Author

/cherrypick enterprise-4.18

@xenolinux xenolinux deleted the add-topology-hcp branch May 15, 2025 16:21
@openshift-cherrypick-robot

@xenolinux: new pull request created: #93449

In response to this:

/cherrypick enterprise-4.19

@openshift-cherrypick-robot

@xenolinux: new pull request created: #93450

In response to this:

/cherrypick enterprise-4.18

Labels

branch/enterprise-4.18, branch/enterprise-4.19, peer-review-done (signifies that the peer review team has reviewed this PR), size/M (denotes a PR that changes 30-99 lines, ignoring generated files)
