CNTRLPLANE-2740: Set unhealthyPodEvictionPolicy to AlwaysAllow on all PDBs #7721

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:main from enxebre:fix-CNTRLPLANE-2740
Feb 24, 2026

Conversation

@enxebre
Member

@enxebre enxebre commented Feb 13, 2026

Summary

  • Set unhealthyPodEvictionPolicy: AlwaysAllow on all PDBs managed by the HCP operator (etcd, kube-apiserver, openshift-apiserver, openshift-oauth-apiserver, oauth-openshift, HCP private router, shared ingress router)
  • Remove unused support/util/pdb.go (zero production callers)
  • Add unit tests with idempotency coverage

What this PR does / why we need it

Sets unhealthyPodEvictionPolicy: AlwaysAllow on every PodDisruptionBudget managed by the HostedControlPlane operator.

This allows running-but-unhealthy hosted control-plane pods to be evicted during management cluster node drains. Unready pods are unlikely to be contributing to service availability, so blocking drains on their behalf adds risk without meaningful availability benefit.

The unhealthyPodEvictionPolicy field has been GA since Kubernetes 1.31 / OpenShift 4.18.
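
To make the effect on node drains concrete, here is a minimal sketch of the eviction decision for a running pod covered by a PDB, following the rule described in the Kubernetes documentation. It is a simplified model, not the API server's code: the PDBStatus struct and the CanEvict function are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Policy mirrors the two documented values of unhealthyPodEvictionPolicy.
type Policy string

const (
	IfHealthyBudget Policy = "IfHealthyBudget" // behavior when the field is unset
	AlwaysAllow     Policy = "AlwaysAllow"
)

// PDBStatus is a hypothetical, trimmed-down stand-in for the PDB status
// fields that the eviction decision depends on.
type PDBStatus struct {
	CurrentHealthy     int
	DesiredHealthy     int
	DisruptionsAllowed int
}

// CanEvict is a simplified model of whether evicting a running pod is allowed.
func CanEvict(podReady bool, policy Policy, s PDBStatus) bool {
	if podReady {
		// Healthy pods always consume the disruption budget.
		return s.DisruptionsAllowed > 0
	}
	if policy == AlwaysAllow {
		// Unhealthy pods can be evicted regardless of the budget.
		return true
	}
	// IfHealthyBudget: unhealthy pods can be evicted only while the
	// application as a whole still meets its desired healthy count.
	return s.CurrentHealthy >= s.DesiredHealthy
}

func main() {
	// One of two required replicas is healthy, so no disruptions are allowed.
	degraded := PDBStatus{CurrentHealthy: 1, DesiredHealthy: 2, DisruptionsAllowed: 0}

	fmt.Println(CanEvict(false, IfHealthyBudget, degraded)) // false: the drain blocks on the unready pod
	fmt.Println(CanEvict(false, AlwaysAllow, degraded))     // true: the unready pod can be evicted
	fmt.Println(CanEvict(true, AlwaysAllow, degraded))      // false: ready pods stay protected
}
```

In this model the policy only changes the outcome for unready pods; ready pods remain protected by the budget either way.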

Components affected

All PDBs managed by the HCP operator:

  • etcd
  • kube-apiserver
  • openshift-apiserver
  • openshift-oauth-apiserver
  • oauth-openshift
  • HCP private router
  • Shared ingress router

Changes

  • support/controlplane-component/common.go – AdaptPodDisruptionBudget() now sets UnhealthyPodEvictionPolicy to AlwaysAllow (covers the 6 v2 control-plane components)
  • hypershift-operator/controllers/sharedingress/router.go – ReconcileRouterPodDisruptionBudget() updated for the shared ingress router PDB
  • support/util/pdb.go – removed (zero production callers)
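
The shape of the change can be sketched as follows. This is not the code from the PR: the types below are local stand-ins that mirror the pointer-typed field in k8s.io/api/policy/v1, and adaptPDB is a hypothetical name. The sketch assumes the adapter sets the policy unconditionally, so a second reconciliation yields the same spec and any previously stored value is overwritten.

```go
package main

import "fmt"

// Local stand-ins for the types in k8s.io/api/policy/v1 (assumption: only the
// field relevant to this PR is modeled).
type UnhealthyPodEvictionPolicyType string

const (
	IfHealthyBudget UnhealthyPodEvictionPolicyType = "IfHealthyBudget"
	AlwaysAllow     UnhealthyPodEvictionPolicyType = "AlwaysAllow"
)

type PodDisruptionBudgetSpec struct {
	// Pointer-typed as upstream: nil means the field is unset.
	UnhealthyPodEvictionPolicy *UnhealthyPodEvictionPolicyType
}

type PodDisruptionBudget struct {
	Spec PodDisruptionBudgetSpec
}

// adaptPDB is a hypothetical adapter that mutates the PDB in place at
// reconciliation time. It always assigns a fresh pointer to AlwaysAllow.
func adaptPDB(pdb *PodDisruptionBudget) {
	policy := AlwaysAllow
	pdb.Spec.UnhealthyPodEvictionPolicy = &policy
}

func main() {
	// Idempotency: applying the adapter twice gives the same result.
	pdb := &PodDisruptionBudget{}
	adaptPDB(pdb)
	first := *pdb.Spec.UnhealthyPodEvictionPolicy
	adaptPDB(pdb)
	second := *pdb.Spec.UnhealthyPodEvictionPolicy
	fmt.Println(first, second, first == second)

	// Overwrite: a pre-existing IfHealthyBudget value is replaced.
	old := IfHealthyBudget
	existing := &PodDisruptionBudget{
		Spec: PodDisruptionBudgetSpec{UnhealthyPodEvictionPolicy: &old},
	}
	adaptPDB(existing)
	fmt.Println(*existing.Spec.UnhealthyPodEvictionPolicy)
}
```

Assigning a fresh pointer on every call, rather than writing through the existing one, avoids mutating a value that another object might share.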

Which issue(s) this PR fixes

Fixes CNTRLPLANE-2740

Special notes for your reviewer

The unhealthyPodEvictionPolicy field is set programmatically in the Go adapter function that already mutates PDB specs at reconciliation time. No YAML asset changes are needed. The second commit contains only regenerated fixture files.

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

🤖 Generated with Claude Code

enxebre and others added 2 commits February 13, 2026 02:16
…PDBs

Set the PodDisruptionBudget unhealthyPodEvictionPolicy property to
AlwaysAllow for all PDBs managed by the HostedControlPlane operator.
This allows running-but-unhealthy pods to be evicted during node drains
on the management cluster, without reducing hosted service availability
since unready pods are not contributing to service availability.

Remove unused support/util/pdb.go as ReconcilePodDisruptionBudget had
zero production callers.

This addresses WRKLDS-1490 which asks all PDB maintainers to adopt the
unhealthyPodEvictionPolicy property (GA in Kubernetes 1.31 / OCP 4.18).

Ref: CNTRLPLANE-2740

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Regenerated by: UPDATE=true go test ./control-plane-operator/controllers/hostedcontrolplane/ -run TestControlPlaneComponents

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
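
The UPDATE=true invocation above suggests a golden-file workflow for the fixtures. The repository's actual test helper is not shown in this conversation, so the following is only a generic sketch of that pattern: checkFixture, runFixtureDemo, the file name, and the minAvailable value are all hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// checkFixture compares got with the golden file at path. When the UPDATE
// environment variable is "true", it rewrites the file instead of comparing.
func checkFixture(path string, got []byte) error {
	if os.Getenv("UPDATE") == "true" {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(want, got) {
		return fmt.Errorf("fixture %s is stale; rerun with UPDATE=true", path)
	}
	return nil
}

// runFixtureDemo walks through the workflow in a temporary directory: a stale
// fixture is detected, regenerated, and then passes the comparison.
func runFixtureDemo() (staleDetected, cleanAfterUpdate bool, err error) {
	dir, err := os.MkdirTemp("", "fixtures")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "pdb.yaml")
	oldSpec := []byte("spec:\n  minAvailable: 1\n")
	newSpec := []byte("spec:\n  minAvailable: 1\n  unhealthyPodEvictionPolicy: AlwaysAllow\n")
	if err := os.WriteFile(path, oldSpec, 0o644); err != nil {
		return false, false, err
	}

	// Without UPDATE, the new output no longer matches the old fixture.
	os.Setenv("UPDATE", "")
	staleDetected = checkFixture(path, newSpec) != nil

	// With UPDATE=true, the fixture is regenerated from the new output.
	os.Setenv("UPDATE", "true")
	if err := checkFixture(path, newSpec); err != nil {
		return staleDetected, false, err
	}

	// A normal run now passes against the regenerated fixture.
	os.Setenv("UPDATE", "")
	cleanAfterUpdate = checkFixture(path, newSpec) == nil
	return staleDetected, cleanAfterUpdate, nil
}

func main() {
	stale, clean, err := runFixtureDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(stale, clean)
}
```

Under this pattern, a commit containing only regenerated fixtures is the expected by-product of changing the code that renders the PDB spec.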
@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 13, 2026
@openshift-ci-robot

openshift-ci-robot commented Feb 13, 2026

@enxebre: This pull request references CNTRLPLANE-2740 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai bot commented Feb 13, 2026

Walkthrough

This pull request updates PodDisruptionBudget specifications across multiple control plane components and cloud provider variants to include unhealthyPodEvictionPolicy: AlwaysAllow. Changes span test fixture files, implementation code, and tests to consistently apply this eviction policy. A utility function is refactored with its functionality moved to a new common implementation.

Changes

Cohort / File(s) and Summary

  • PodDisruptionBudget Test Fixtures - etcd
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/etcd/*/zz_fixture_TestControlPlaneComponents_etcd_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants (AROSwift, GCP, IBMCloud, TechPreviewNoUpgrade).
  • PodDisruptionBudget Test Fixtures - kube-apiserver
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/*/zz_fixture_TestControlPlaneComponents_kube_apiserver_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants.
  • PodDisruptionBudget Test Fixtures - oauth-openshift
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/*/zz_fixture_TestControlPlaneComponents_oauth_openshift_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants.
  • PodDisruptionBudget Test Fixtures - openshift-apiserver
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/*/zz_fixture_TestControlPlaneComponents_openshift_apiserver_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants.
  • PodDisruptionBudget Test Fixtures - openshift-oauth-apiserver
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/*/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants.
  • PodDisruptionBudget Test Fixtures - router
    Files: control-plane-operator/controllers/hostedcontrolplane/testdata/router/*/zz_fixture_TestControlPlaneComponents_router_poddisruptionbudget.yaml
    Summary: Added unhealthyPodEvictionPolicy: AlwaysAllow to PodDisruptionBudget spec across all cloud provider variants.
  • HyperShift Router Controller
    Files: hypershift-operator/controllers/sharedingress/router.go, hypershift-operator/controllers/sharedingress/router_test.go
    Summary: Implemented UnhealthyPodEvictionPolicy: AlwaysAllow in router PodDisruptionBudget reconciliation with comprehensive unit tests validating policy application and overwrite behavior.
  • Control Plane Component Support
    Files: support/controlplane-component/common.go, support/controlplane-component/common_test.go
    Summary: Implemented AdaptPodDisruptionBudget function setting UnhealthyPodEvictionPolicy to AlwaysAllow with unit tests covering SingleReplica and HighlyAvailable scenarios.
  • Utility Removal
    Files: support/util/pdb.go
    Summary: Removed exported ReconcilePodDisruptionBudget function; functionality consolidated into AdaptPodDisruptionBudget in common support module.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

No actionable comments were generated in the recent review. 🎉

@openshift-ci openshift-ci bot requested review from devguyio and muraee February 13, 2026 01:25
@openshift-ci openshift-ci bot added the area/control-plane-operator (Indicates the PR includes changes for the control plane operator - in an OCP release), area/hypershift-operator (Indicates the PR includes changes for the hypershift operator and API - outside an OCP release), and approved (Indicates a PR has been approved by an approver from all required OWNERS files.) labels and removed the do-not-merge/needs-area label Feb 13, 2026
@enxebre enxebre marked this pull request as draft February 13, 2026 01:30
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 13, 2026
@enxebre
Member Author

enxebre commented Feb 13, 2026

/test verify
/test unit

@enxebre
Member Author

enxebre commented Feb 13, 2026

/test images
/test security
/test okd-scos-images

Member

@bryan-cox bryan-cox left a comment

/lgtm

/hold
Holding for others' general consensus; feel free to remove the hold whenever you want.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 13, 2026
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 13, 2026
@openshift-ci-robot

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-21
/test e2e-aws-4-21
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@openshift-ci
Contributor

openshift-ci bot commented Feb 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox, enxebre

@csrwng
Contributor

csrwng commented Feb 16, 2026

/lgtm
/retest-required

I don't see a negative to making this the default even if it doesn't fix the problem entirely.

@enxebre
Member Author

enxebre commented Feb 18, 2026

/hold cancel

@enxebre enxebre marked this pull request as ready for review February 18, 2026 16:57
@openshift-ci openshift-ci bot removed the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) and do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.) labels Feb 18, 2026
@openshift-ci openshift-ci bot requested a review from jparrill February 18, 2026 17:05
@openshift-ci openshift-ci bot requested a review from sjenning February 18, 2026 17:05
@enxebre
Member Author

enxebre commented Feb 18, 2026

/test e2e-aks

@enxebre
Member Author

enxebre commented Feb 19, 2026

/retest

@enxebre enxebre added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Feb 19, 2026
@enxebre
Member Author

enxebre commented Feb 19, 2026

/retest

@enxebre
Member Author

enxebre commented Feb 24, 2026

/verified by @jiezhao16

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Feb 24, 2026
@openshift-ci-robot

@enxebre: This PR has been marked as verified by @jiezhao16.

@openshift-ci
Contributor

openshift-ci bot commented Feb 24, 2026

@enxebre: all tests passed!

@openshift-merge-bot openshift-merge-bot bot merged commit 34e30ac into openshift:main Feb 24, 2026
24 checks passed

Labels

  • acknowledge-critical-fixes-only: Indicates if the issuer of the label is OK with the policy.
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria

4 participants