@muraee
Contributor

@muraee muraee commented Jan 8, 2026

Summary

For cluster versions 4.20 and above, use the HyperShift Operator image directly for the Control Plane Operator instead of extracting it from the OCP release payload.

Benefits

  • Faster feature delivery: CPO ships with HO releases instead of being tied to OCP payload
  • Simplified hotfix process: Single HO image bump fixes all 4.20+ clusters (no per-cluster annotation overrides needed)
  • Consistent deployment model: Same approach for both managed services and self-managed

Changes

  1. support/util/util.go: Modified GetControlPlaneOperatorImage() to use HO image for 4.20+ if the CPO binary exists
  2. support/util/util_test.go: Added comprehensive unit tests for the new behavior
  3. Dockerfile & Containerfile.operator:
    • Build and include control-plane-operator and control-plane-pki-operator binaries
    • Add symlinks for ignition-server, konnectivity-socks5-proxy, availability-prober, token-minter
    • Add missing label io.openshift.hypershift.control-plane-operator-supports-kas-custom-kubeconfig=true

Safety Mechanism

The code includes a safety check that verifies /usr/bin/control-plane-operator exists in the HO image before using it. This ensures:

  • Older HO images (without the CPO binary) continue to use the release payload CPO
  • Self-managed users running pre-change HO versions are not affected
  • Graceful fallback to payload CPO if binary check fails
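The safety check and fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from support/util/util.go: `clusterVersion` and `selectCPOImage` are hypothetical names, the image strings are placeholders, and annotation overrides and any other precedence levels are deliberately left out.

```go
package main

import "fmt"

// clusterVersion is a hypothetical stand-in for the parsed release version.
type clusterVersion struct {
	Major, Minor uint64
}

// selectCPOImage is an illustrative helper (not the real function): it
// returns the HO image only when the cluster is 4.20 or newer AND the HO
// image actually ships the CPO binary. Every other case falls back to the
// CPO image from the release payload.
func selectCPOImage(v clusterVersion, hoHasCPOBinary bool, hoImage, payloadImage string) string {
	atLeast420 := v.Major > 4 || (v.Major == 4 && v.Minor >= 20)
	if atLeast420 && hoHasCPOBinary {
		return hoImage
	}
	return payloadImage
}

func main() {
	const ho, payload = "ho-image", "payload-cpo-image" // placeholder image names

	// 4.20+ with the binary present: HO image.
	fmt.Println(selectCPOImage(clusterVersion{Major: 4, Minor: 21}, true, ho, payload))
	// 4.20+ but an older HO image without the binary: payload CPO.
	fmt.Println(selectCPOImage(clusterVersion{Major: 4, Minor: 20}, false, ho, payload))
	// Pre-4.20 cluster: payload CPO regardless of the binary.
	fmt.Println(selectCPOImage(clusterVersion{Major: 4, Minor: 19}, true, ho, payload))
}
```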

Behavior Matrix

| Cluster Version | HO Has CPO Binary | Result |
|---|---|---|
| 4.20+ | Yes | Uses HO image |
| 4.20+ | No | Uses payload CPO (graceful fallback) |
| < 4.20 | Any | Uses payload CPO |

Test plan

  • Unit tests pass for GetControlPlaneOperatorImage
  • E2E test with 4.20+ cluster to verify CPO uses HO image
  • E2E test with pre-change HO image to verify fallback to payload CPO
  • Verify CPO starts correctly with the new image

🤖 Generated with Claude Code

@coderabbitai
Contributor

coderabbitai bot commented Jan 8, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The container build files now include the control-plane-operator and control-plane-pki-operator binaries, along with symlinks and a new metadata label. The image selection logic adds cached CPO binary detection so that the HO image is checked before the release payload is consulted. Tests validate the updated precedence logic across multiple scenarios.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Container build configuration: Containerfile.operator, Dockerfile | Added karpenter-operator, control-plane-operator, and control-plane-pki-operator build targets; copy control-plane-operator and control-plane-pki-operator binaries to /usr/bin/; create symlinks for ignition-server, konnectivity-socks5-proxy, availability-prober, and token-minter pointing to control-plane-operator; add new LABEL annotation io.openshift.hypershift.control-plane-operator-supports-kas-custom-kubeconfig=true. |
| CPO image selection logic: support/util/util.go | Introduced cpoBinaryPath constant and cpoBinaryExists cache variable; added cpoBinaryExistsInHOImage() function for cached binary presence detection; refactored GetControlPlaneOperatorImage precedence to check for CPO binary in HO image (4.20+) before consulting release payload. |
| Image selection test coverage: support/util/util_test.go | Added TestGetControlPlaneOperatorImage with multiple test cases covering CPO annotation override, HO image availability, hypershift payload presence, and CPO binary existence scenarios; introduced testReleaseProvider for mocking release lookups. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label Jan 8, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/needs-area and area/control-plane-operator labels Jan 8, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: muraee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the area/hypershift-operator and approved labels and removed the do-not-merge/needs-area label Jan 8, 2026
@muraee
Contributor Author

muraee commented Jan 8, 2026

/test verify
/test e2e-aws
/test e2e-aws-4-21

@muraee
Contributor Author

muraee commented Jan 8, 2026

/test e2e-aws-4-20

@openshift-ci
Contributor

openshift-ci bot commented Jan 8, 2026

@muraee: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test e2e-aks
/test e2e-aks-4-21
/test e2e-aks-override
/test e2e-aws
/test e2e-aws-4-21
/test e2e-aws-override
/test e2e-aws-upgrade-hypershift-operator
/test e2e-kubevirt-aws-ovn-reduced
/test images
/test okd-scos-images
/test security
/test unit
/test verify
/test verify-deps

The following commands are available to trigger optional jobs:

/test e2e-aws-autonode
/test e2e-aws-metrics
/test e2e-aws-minimal
/test e2e-aws-techpreview
/test e2e-azure-aks-ovn-conformance
/test e2e-conformance
/test e2e-kubevirt-aws-ovn
/test e2e-kubevirt-azure-ovn
/test e2e-kubevirt-metal-conformance
/test e2e-openstack-aws
/test e2e-openstack-aws-conformance
/test e2e-openstack-aws-csi-cinder
/test e2e-openstack-aws-csi-manila
/test e2e-openstack-aws-nfv
/test okd-scos-e2e-aws-ovn
/test reqserving-e2e-aws

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-hypershift-main-e2e-aks
pull-ci-openshift-hypershift-main-e2e-aks-4-21
pull-ci-openshift-hypershift-main-e2e-aws
pull-ci-openshift-hypershift-main-e2e-aws-upgrade-hypershift-operator
pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn-reduced
pull-ci-openshift-hypershift-main-images
pull-ci-openshift-hypershift-main-okd-scos-images
pull-ci-openshift-hypershift-main-security
pull-ci-openshift-hypershift-main-unit
pull-ci-openshift-hypershift-main-verify
Details

In response to this:

/test e2e-aws-4-20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@muraee
Contributor Author

muraee commented Jan 8, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Jan 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @support/util/util.go:
- Line 686: The version check "if version.Major >= 4 && version.Minor >= 20 &&
cpoBinaryExistsInHOImage()" is incorrect for majors > 4: a version such as 5.0
fails the Minor >= 20 test and falls back to the payload CPO. Update the
condition in util.go to test "version >= 4.20" by either constructing a semver
threshold (e.g., create minVersionForHOImage := semver.Version{Major: 4,
Minor: 20, Patch: 0} and use version.GTE(minVersionForHOImage)) or using an
equivalent comparison such as "if (version.Major > 4 || (version.Major == 4 &&
version.Minor >= 20)) && cpoBinaryExistsInHOImage()". The outer parentheses
are required because && binds tighter than || in Go; without them the binary
check would be skipped for majors > 4. Keep the cpoBinaryExistsInHOImage()
check unchanged and reuse the existing version variable.
- Around line 633-647: The package-level cpoBinaryExists is racy; replace the
manual nil-check/write in cpoBinaryExistsInHOImage with a thread-safe
initialization using sync.Once (add a package-level sync.Once, e.g.
cpoBinaryOnce) or a sync.Mutex. Inside cpoBinaryExistsInHOImage, call
cpoBinaryOnce.Do(func() { ... }) with a body that runs os.Stat(cpoBinaryPath)
and sets cpoBinaryExists = &exists, then return *cpoBinaryExists so reads and
writes are synchronized and the value is computed exactly once.
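To make the version-check issue concrete, here is a small self-contained sketch comparing three forms of the condition. The function names (`originalCheck`, `unparenthesizedCheck`, `fixedCheck`) are hypothetical and the version is passed as plain integers rather than a semver value; only the boolean expressions are taken from the comment above.

```go
package main

import "fmt"

// originalCheck reproduces the condition flagged in the review.
// It rejects 5.0 because Minor (0) is below 20.
func originalCheck(major, minor uint64, binaryExists bool) bool {
	return major >= 4 && minor >= 20 && binaryExists
}

// unparenthesizedCheck shows the precedence pitfall: && binds tighter than
// ||, so for major > 4 the binaryExists operand is never consulted.
func unparenthesizedCheck(major, minor uint64, binaryExists bool) bool {
	return major > 4 || (major == 4 && minor >= 20) && binaryExists
}

// fixedCheck groups the version test so the binary check always applies.
func fixedCheck(major, minor uint64, binaryExists bool) bool {
	return (major > 4 || (major == 4 && minor >= 20)) && binaryExists
}

func main() {
	// 5.0 with the binary present: the original condition wrongly says false.
	fmt.Println(originalCheck(5, 0, true), fixedCheck(5, 0, true))
	// 5.0 without the binary: the unparenthesized form wrongly says true.
	fmt.Println(unparenthesizedCheck(5, 0, false), fixedCheck(5, 0, false))
	// 4.19 is below the threshold in every form.
	fmt.Println(originalCheck(4, 19, true), fixedCheck(4, 19, true))
}
```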
🧹 Nitpick comments (2)
support/util/util_test.go (2)

1128-1134: Test setup directly mutates package-level state.

The test sets cpoBinaryExists directly (line 1133), which works with the current pointer-based caching but would break if sync.Once is adopted per the suggestion in util.go. Consider abstracting the binary existence check via a function variable or interface to improve testability.

♻️ Suggestion for improved testability

In util.go, use a function variable that can be replaced in tests:

// cpoBinaryExistsFunc is the function used to check CPO binary existence.
// It can be replaced in tests.
var cpoBinaryExistsFunc = cpoBinaryExistsInHOImage

Then in tests:

cpoBinaryExistsFunc = func() bool { return tc.cpoBinaryExists }
defer func() { cpoBinaryExistsFunc = cpoBinaryExistsInHOImage }()
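One way the sync.Once caching and the replaceable function variable could fit together is sketched below. The identifiers cpoBinaryPath, cpoBinaryOnce, cpoBinaryExistsInHOImage, and cpoBinaryExistsFunc come from the comments above; the function bodies, the plain bool cache (instead of a pointer), and the `useHOImage` helper are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// cpoBinaryPath is the location checked inside the HO image.
const cpoBinaryPath = "/usr/bin/control-plane-operator"

var (
	cpoBinaryOnce   sync.Once
	cpoBinaryExists bool // written once inside cpoBinaryOnce.Do
)

// cpoBinaryExistsInHOImage stats the binary at most once and caches the
// result. Any stat error is treated as "binary absent", which keeps the
// fallback to the payload CPO.
func cpoBinaryExistsInHOImage() bool {
	cpoBinaryOnce.Do(func() {
		_, err := os.Stat(cpoBinaryPath)
		cpoBinaryExists = err == nil
	})
	return cpoBinaryExists
}

// cpoBinaryExistsFunc is the indirection tests can replace, so they never
// need to touch the cached state directly.
var cpoBinaryExistsFunc = cpoBinaryExistsInHOImage

// useHOImage is a hypothetical caller that goes through the indirection.
func useHOImage(atLeast420 bool) bool {
	return atLeast420 && cpoBinaryExistsFunc()
}

func main() {
	// Concurrent callers are safe: sync.Once serializes the first check.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = cpoBinaryExistsInHOImage()
		}()
	}
	wg.Wait()
	fmt.Println("CPO binary present:", cpoBinaryExistsInHOImage())
	fmt.Println("use HO image for 4.20+:", useHOImage(true))
}
```

With this shape, a test only swaps cpoBinaryExistsFunc and restores it afterwards, so the sync.Once state is never reset or mutated from test code.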

1045-1126: Good test coverage, but consider adding a test case for major version > 4.

The test cases comprehensively cover 4.x versions, but given the version comparison bug noted in util.go, adding a test case for version 5.0+ would help catch regressions and validate the fix.

{
    name:                 "When version is 5.0 and CPO binary exists it should use HO image",
    version:              "5.0.0",
    payloadHasHypershift: true,
    cpoBinaryExists:      true,
    expectedImage:        hoImage,
},
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 032f041 and bac4dde.

📒 Files selected for processing (4)
  • Containerfile.operator
  • Dockerfile
  • support/util/util.go
  • support/util/util_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • Containerfile.operator
  • support/util/util.go
  • Dockerfile
  • support/util/util_test.go
🧬 Code graph analysis (1)
support/util/util_test.go (3)
support/releaseinfo/releaseinfo.go (1)
  • ReleaseImage (39-42)
api/hypershift/v1beta1/hostedcluster_types.go (1)
  • ControlPlaneOperatorImageAnnotation (59-59)
support/util/util.go (1)
  • GetControlPlaneOperatorImage (663-698)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
  • GitHub Check: Red Hat Konflux / control-plane-operator-main-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-gomaxprocs-webhook-on-pull-request
🔇 Additional comments (6)
Dockerfile (2)

7-29: LGTM! Build stage and binary packaging changes are well-structured.

The additions correctly build and package the control-plane-operator and control-plane-pki-operator binaries, with appropriate symlinks for the multi-call binary pattern where ignition-server, konnectivity-socks5-proxy, availability-prober, and token-minter all resolve to control-plane-operator.
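For readers unfamiliar with the multi-call binary pattern mentioned here, the general idea is that one executable inspects the name it was invoked under and picks a subcommand accordingly. The sketch below shows only that general technique; how control-plane-operator actually dispatches is not shown in this PR, and `commandFor` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// commandFor maps the invoked program name (argv[0]) to a subcommand.
// The symlink names are the ones listed in the PR; falling back to the
// main operator command for any other name is an assumption of this sketch.
func commandFor(argv0 string) string {
	switch name := filepath.Base(argv0); name {
	case "ignition-server", "konnectivity-socks5-proxy", "availability-prober", "token-minter":
		return name
	default:
		return "control-plane-operator"
	}
}

func main() {
	// When started through a symlink such as /usr/bin/token-minter,
	// os.Args[0] carries the symlink name rather than the target name.
	fmt.Println("dispatching to:", commandFor(os.Args[0]))
}
```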


45-45: LGTM! New capability label added.

The label correctly signals that this image supports the kas-custom-kubeconfig feature.

Containerfile.operator (2)

7-29: LGTM! Changes mirror Dockerfile appropriately.

The build stage and binary packaging changes are consistent with the Dockerfile, ensuring both container build paths produce equivalent images.


54-54: LGTM! Label added consistently with Dockerfile.

support/util/util.go (1)

649-661: LGTM! Documentation accurately reflects the updated precedence logic.

The docstring clearly explains the five-level precedence hierarchy for CPO image resolution.

support/util/util_test.go (1)

1016-1036: LGTM! Clean fake release provider implementation.

The testReleaseProvider correctly constructs a ReleaseImage with the version in the ImageStream name and component images in tags.

@muraee
Contributor Author

muraee commented Jan 9, 2026

/test e2e-aws
/test e2e-aws-4-21

@rtheis
Contributor

rtheis commented Jan 9, 2026

/cc @rtheis

@openshift-ci openshift-ci bot requested a review from rtheis January 9, 2026 12:34
@muraee
Contributor Author

muraee commented Jan 12, 2026

/test e2e-aws

@openshift-ci
Contributor

openshift-ci bot commented Jan 12, 2026

@muraee: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

For cluster versions 4.20 and above, use the HyperShift Operator image
directly for the Control Plane Operator instead of extracting it from
the OCP release payload. This enables:

- Faster feature delivery for CPO (ships with HO releases)
- Simplified hotfix process (single HO image bump fixes all clusters)
- Consistent deployment model between managed and self-managed

The change includes a safety check that verifies the control-plane-operator
binary exists in the HO image before using it. This ensures backward
compatibility with older HO images that don't include the CPO binary -
they will continue to use the release payload CPO.

Dockerfiles are updated to:
- Build and include control-plane-operator and control-plane-pki-operator
- Add symlinks for ignition-server, konnectivity-socks5-proxy, etc.
- Add missing CPO feature discovery labels

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@muraee muraee force-pushed the use-ho-image-for-cpo-420 branch from bac4dde to 1141357 Compare January 12, 2026 17:21

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.

2 participants