enhancement: dual-stream RHEL 9/10 NodePool support in HyperShift by enxebre · Pull Request #2019 · openshift/enhancements

enxebre · 2026-05-21T16:39:53Z

Summary

Add enhancement proposal for OCPSTRAT-3014
Introduces spec.osImageStream on NodePool API for per-NodePool RHEL stream selection (rhel-9 / rhel-10)
Threads OS stream through token secret → ignition server → MCO bootstrap pipeline via OSImageStream CR injection
Handles runc upgrade guard: RHEL 10 does not ship runc, so NodePools with runc MachineConfigs stay on RHEL 9
Updates boot image metadata parsing for the new multi-stream ConfigMap format (streams key)
All stream validation surfaces through NodePoolValidMachineConfigConditionType

Test plan

Unit tests for getRHELStream(), config hash behavior, runc detection, OSImageStream CR generation
E2E tests on TechPreview CI: dual-stream NodePools, validation rejection, runc fallback

🤖 Generated with Claude Code

jparrill

Thanks for putting this together, Alberto, the enhancement looks really nice. I've left some inline comments worth looking at. The main areas I'd like to see addressed:

Boot image availability: What happens when a user requests a stream whose boot image doesn't exist in their region/platform? (Comment on Risks section)
InPlace NodePools: The enhancement doesn't address osImageStream changes on InPlace upgrade-type NodePools, which has significant implications for baremetal and disconnected deployments. (Comment on Upgrade section)
Karpenter: The Karpenter path needs a brief subsection explaining how the default stream flows through the pipeline. (Comment on Non-Goals)
Hash invariant: The separation between spec-value and resolved-value in the config hash is critical for avoiding unintended rollouts and deserves explicit documentation. (Comment on Phase 2)

Everything else is either minor (condition type, pipeline diagram) or informational (CI cost, graduation timeline, status data source).

jparrill · 2026-05-25T10:20:34Z

+
+3. **runc detection false negatives.** A user could configure runc via a mechanism other than CRI-O drop-in files. *Mitigation:* Match the MCO's detection logic exactly. If the MCO considers it sufficient for standalone, it is sufficient for HyperShift.
+
+4. **Mixed-stream clusters.** A HostedCluster with RHEL 9 and RHEL 10 NodePools is a new topology. Component compatibility across RHEL versions must be validated. *Mitigation:* This is the same topology supported by standalone clusters via per-MachineConfigPool stream selection.


The enhancement acknowledges that RHEL 10 AMIs exist in only 2 AWS regions and Azure has no marketplace images. However, it doesn't specify the user-facing behavior when someone requests osImageStream: rhel-10 in a region or platform where the boot image doesn't exist.

A few questions:

Does the NodePool controller reject the request via ValidPlatformImage=False? Does it fall back to rhel-9 silently? Or does it proceed and fail at machine provisioning time?

Which platforms are actually supported for rhel-10 in TechPreview? AWS (2 regions only)? Azure (VHD only, no marketplace)? GCP? KubeVirt? The enhancement should explicitly list which platform+region combinations are supported and what happens for unsupported ones.

For disconnected environments: if dual-stream payloads carry additional images, customers need to mirror both streams. The enhancement should document what additional content needs to be mirrored (two sets of boot images + two sets of node OS container images) and whether IDMS handles this transparently.

ValidPlatformImage=False

yes, this continues current behaviour and signaling. When a boot image doesn't exist for a stream we fail. @yuqi-zhang can you help hcp align with mco/installer expectations/ux on standalone? Should we instead fallback when a boot image is not available for a stream and assume that the appropriate os container image will be discovered and succeed to install via image-references?

jparrill · 2026-05-25T10:22:26Z

+The MCO's `GetBuiltinDefaultStreamName()` already returns a version-based default. HyperShift could skip generating the OSImageStream CR and let the MCO choose.
+
+This was rejected because:
+- It removes per-NodePool stream control — all NodePools would get the same default stream


The test plan lists 5 E2E scenarios, several of which create HostedClusters with dual-stream NodePools. A couple of concerns on CI cost:

Each test with two NodePools (rhel-9 + rhel-10) roughly doubles the node count per test. What's the estimated overhead in time and infrastructure per CI run?

HyperShift doesn't have a dedicated TechPreview CI job — tests that need TechPreview features simply create a HostedCluster with FeatureSet: TechPreviewNoUpgrade (as seen in create_cluster_test.go). These new tests would run alongside existing E2E tests in the same jobs. What's the impact on overall job duration?

Given RHEL 10 AMIs are limited to 2 AWS regions, the tests are geographically constrained. Do the current CI jobs run in those regions?

This doesn't block the enhancement, but the team should have visibility into the CI cost before approving.

hypershift has a techpreview job and also can set any HC to be techpreview itself.
This validations would require about ~6 additional ec2 instances per run.
Update the section to clarify all this.

jparrill · 2026-05-25T10:22:27Z

+
+1. Administrator upgrades a HostedCluster from 4.19 to 5.0. An existing NodePool has no `spec.osImageStream` (would default to `rhel-10`) but has a MachineConfig setting `default_runtime = "runc"`.
+2. NodePool controller's `validMachineConfigCondition` calls `getRHELStream()` which detects runc and returns `"rhel-9"` (fallback).
+3. The condition is set to `NodePoolValidMachineConfigConditionType=True` with message "OS stream defaulted to rhel-9: RHEL 10 is incompatible with default_runtime=runc". The NodePool stays on RHEL 9.


The enhancement routes stream validation errors through NodePoolValidMachineConfigConditionType. This works technically — getRHELStream() errors propagate through NewConfigGenerator() and the existing condition handler catches them.

However, for SRE observability, it might be worth having a separate condition (e.g., ValidOSStream) so that monitoring dashboards can distinguish "malformed MachineConfig" from "incompatible OS stream selection" without parsing the condition message. This is especially relevant for ROSA HCP and ARO HCP where SRE teams monitor conditions at fleet scale.

I'd suggest validating this with the ROSA and ARO SRE teams to see if a dedicated condition would help their workflows.

jparrill · 2026-05-25T10:22:28Z

+   name: "rhel-10"
+       │
+       ▼
+ Resolve stream ────────▶ Token Secret


Small correction on the pipeline description: copyMCOOutputToMCC() is not a separate top-level step in GetPayload(). Looking at local_ignitionprovider.go, it's called inside runMCO() at line 454, not as an independent step between runMCO() and runMCC().

The actual sequence in GetPayload() is:

1. runFeatureGateRender() — line 735 2. runMCO() — line 740 └── copyMCOOutputToMCC() — line 454 (inside runMCO) 3. [NEW: write 99_osimagestream.yaml to mccDir] 4. runMCC() — line 745 5. runMCSAndFetchPayload() — line 750

The injection point between runMCO() and runMCC() (lines 743-745) is correct. Just the diagram that should reflect copyMCOOutputToMCC is internal to runMCO, not a standalone step.

jparrill · 2026-05-25T10:22:30Z

+
+7. **Pass stream through `IgnitionProvider` interface** — add `osStream string` parameter to `GetPayload()` in the `IgnitionProvider` interface (`tokensecret_controller.go`). The `TokenSecretReconciler` reads `os-stream` from the token secret data and passes it to `GetPayload()`. Update the mock `IgnitionProvider` in tests accordingly.
+
+   ```go


This is a critical implementation invariant that I think deserves more emphasis in the enhancement:

The hash must use the value from spec.osImageStream.name directly (empty string when the field is unset), never the resolved value from getRHELStream(). If someone accidentally puts the resolved default (e.g., "rhel-10" for implicit >=5.0) into the hash input, it would change the hash for every existing NodePool without explicit osImageStream set, triggering a mass rollout of all nodes across the fleet.

In Go, "abc" + "" == "abc", so appending an empty string to the hash input is safe — the hash stays identical. But appending a non-empty resolved default would break this property.

I'd suggest documenting this explicitly as a design invariant, something like: "The rhelStream field in rolloutConfig MUST be populated from spec.osImageStream.name, never from the return value of getRHELStream(). This ensures zero-impact on existing NodePools."

jparrill · 2026-05-25T10:22:31Z

+- Boot image ConfigMap parsing supports multi-stream format.
+- Stream plumbed through token secret to ignition server.
+- OSImageStream CR generated in `GetPayload()`.
+- MCO ExternalTopologyMode guard removed (coordinated with MCO team).


The graduation criteria describe TechPreview first, then GA, but don't specify a timeline. A couple of questions:

Is TechPreview targeting 5.0? If so, when does GA land?

If GA targets 5.0 as well (no TechPreview phase), then full E2E coverage is needed before that release. The current test plan only covers TechPreview CI jobs.

The NodePool API field promotion must be coordinated with the MachineConfigPool OSStreams feature gate promotion in openshift/api — is there alignment on timeline with the MCO team?

jparrill · 2026-05-25T10:22:32Z

+
+### Architecture Overview
+
+The OS stream flows through three layers:


The enhancement mentions in Non-Goals that "Karpenter NodePools always use the version-derived default stream" but doesn't elaborate on how this works in practice.

Looking at the current code, OpenshiftEC2NodeClass has no field for OS stream selection. setKarpenterAMILabels() in token.go resolves a single AMI per architecture from StreamMetadata with no stream dimension. The in-memory NodePool created by KarpenterIgnitionReconciler.createInMemoryNodePool() doesn't carry spec.osImageStream.

Could you add a brief "Karpenter" subsection that clarifies:

How the version-derived default stream flows through the Karpenter pipeline (from release image → in-memory NodePool → setKarpenterAMILabels → AMI labels → EC2NodeClass)?

Whether OpenshiftEC2NodeClass will gain a stream field in a future phase, or if this is explicitly a non-goal?

Whether the AMI label scheme (hypershift.openshift.io/ami, hypershift.openshift.io/ami-arm64) needs any changes for dual-stream, given it currently assumes one AMI per architecture.

updated to clarify

jparrill · 2026-05-25T10:22:33Z

+- No manual action required.
+
+**Downgrade to a version without this feature:**
+- `spec.osImageStream` is ignored by older controllers. NodePools revert to the single-stream behavior (always using the `rhel-coreos` image from the release payload).


This section covers upgrade/downgrade for the feature itself but doesn't address what happens when spec.osImageStream changes on a NodePool with upgradeType: InPlace.

For Replace NodePools this is straightforward — a new hash triggers a MachineDeployment rollout, old nodes are replaced with new ones booting the correct stream. But for InPlace NodePools, changing the stream would mean the MCD on the running node attempts an rpm-ostree rebase from RHEL 9 to RHEL 10 (or vice versa) — which is a cross-major-version OS rebase on a live node.

Looking at the current codebase:

The in-place upgrader (inplaceupgrader.go) has no rollback mechanism. If a node enters MachineConfigDaemonStateDegraded, the upgrade stops but does not revert previously upgraded nodes. (Grep for "rollback"/"revert"/"undo" in the inplaceupgrader directory returns zero results.)

E2E tests for in-place upgrades (nodepool_upgrade_test.go, nodepool_lifecycle_test.go) test forward version upgrades only. No test exercises cross-RHEL-version changes or failure/degradation scenarios.

The Agent platform has no restrictions on upgrade type — Agent NodePools can be InPlace. In self-managed baremetal deployments, InPlace is common because re-provisioning hardware is expensive.

This means a baremetal operator could change osImageStream on an InPlace NodePool, triggering an OS rebase across all metal nodes with no rollback path if something goes wrong.

I'd suggest the enhancement explicitly addresses InPlace NodePools. Options:

Reject osImageStream changes on InPlace NodePools via validation (safest for TechPreview)

Document it as supported-but-risky with a clear warning in the condition message

Explicitly list it as a non-goal for the initial implementation

Also worth noting: for disconnected/air-gapped baremetal deployments, dual-stream means customers need to mirror both streams' container images and boot artifacts. The enhancement should document what additional content needs to be mirrored.

version downgrades are not supported. Clarified. Day 2 changes to the streams are supported as they are also in the canonical machineConfigPool API.

@yuqi-zhang does mco does dualstream UX really supports moving back and forward inplace between rhel 9 and 10?

jparrill · 2026-05-25T10:22:35Z

+
+1. Update `DeserializeImageMetadata` (`support/releaseinfo/deserialize.go`) — parse the `streams` key when present, fall back to `stream` for older payloads. Return stream-indexed metadata.
+2. Update `CoreOSStreamMetadata` (`support/releaseinfo/releaseinfo.go`) — support multiple streams (map of stream name to per-arch metadata).
+3. Wire default boot image resolution per platform — accept stream parameter and look up stream-specific boot images from the parsed metadata:


Small verification question: the enhancement says the NodePool controller infers the OS stream from CAPI Machine.Status.NodeInfo.OSImage. The current code in nodeVersionsFromMachines() (version.go) reads KubeletVersion from Machine.Status.NodeInfo but doesn't read OSImage today.

CAPI should copy NodeInfo (including OSImage) from the guest Node to the Machine status, so this should work. But has this been verified across all platforms (AWS, Azure, Agent, KubeVirt)? If any platform's CAPI provider doesn't populate Machine.Status.NodeInfo.OSImage, the status reporting would silently fail.

this is not a provider specific thing, it's core capi bubbling up nodeinfo, that's a contractual boundary we can consider consumable.

enxebre · 2026-05-25T12:37:33Z

/label tide/merge-method-squash

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix pipeline diagram: copyMCOOutputToMCC() is internal to runMCO(), not a standalone step - Add boot image error behavior for missing stream+region (NodePoolValidPlatformImageType=False) - Add Karpenter step (phase 1) clarifying version-derived default flow - Add design invariant for rolloutConfig.rhelStream hash safety - Rewrite E2E section: 8 NodePools in TestNodePool suite covering explicit/implicit streams, validation, runc, and upgrade stream switches (Replace + InPlace) - Add disconnected mirroring risk (IDMS/ITMS handles both streams transparently) - Downgrade section: downgrades are not supported Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci · 2026-05-25T13:33:11Z

@enxebre: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/markdownlint	`97df15f`	link	true	`/test markdownlint`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

sdminonne

This is great. Thanks for this @enxebre

sdminonne · 2026-05-26T12:44:43Z

+
+4. **Mixed-stream clusters.** A HostedCluster with RHEL 9 and RHEL 10 NodePools is a new topology. Component compatibility across RHEL versions must be validated. *Mitigation:* This is the same topology supported by standalone clusters via per-MachineConfigPool stream selection.
+
+5. **Disconnected environments require mirroring both streams.** Dual-stream payloads carry two sets of node OS container images (one per RHEL stream). Disconnected customers using both streams must mirror both sets. Boot images are platform-specific and already handled outside the payload (e.g. pre-uploaded AMIs, VHDs). The OS container images are referenced via `ImageDigestMirrorSet` / `ImageTagMirrorSet` — existing IDMS/ITMS mirroring workflows handle this transparently as long as both stream images are included in the mirror list. No additional HyperShift-specific mirroring tooling is needed.


I can't supply hard values but if the size of two streams is critical in disconnected we should eventually double check with ROSA people for ECR size limit for their zero egress config.

openshift-ci · 2026-05-26T12:45:44Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdminonne
Once this PR has been reviewed and has the lgtm label, please assign enxebre for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

enhancements/hypershift/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

csrwng

Looking good. Just a question after a brief look

csrwng · 2026-05-27T00:39:09Z

+
+## Open Questions
+
+1. Should `spec.osImageStream` be immutable after creation (requiring NodePool replacement to change streams), or mutable (triggering a rolling update)? The current proposal allows mutation with rollout, mirroring how other NodePool spec changes work.


This makes sense for replace strategy. For in-place, mutation would only make sense if you're moving forward (rhel9 -> rhel10), but not the other way, no?

openshift-ci Bot requested review from csrwng and sjenning May 21, 2026 16:40

jparrill reviewed May 25, 2026

View reviewed changes

openshift-ci Bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label May 25, 2026

enxebre and others added 2 commits May 25, 2026 15:19

enhancement: dual-stream RHEL 9/10 NodePool support in HyperShift

b3a7f5a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

enxebre force-pushed the dual-stream-rhel-nodepool branch from c413be6 to 97df15f Compare May 25, 2026 13:19

sdminonne approved these changes May 26, 2026

View reviewed changes

csrwng reviewed May 27, 2026

View reviewed changes


		3. runc detection false negatives. A user could configure runc via a mechanism other than CRI-O drop-in files. Mitigation: Match the MCO's detection logic exactly. If the MCO considers it sufficient for standalone, it is sufficient for HyperShift.

		4. Mixed-stream clusters. A HostedCluster with RHEL 9 and RHEL 10 NodePools is a new topology. Component compatibility across RHEL versions must be validated. Mitigation: This is the same topology supported by standalone clusters via per-MachineConfigPool stream selection.


		7. Pass stream through `IgnitionProvider` interface — add `osStream string` parameter to `GetPayload()` in the `IgnitionProvider` interface (`tokensecret_controller.go`). The `TokenSecretReconciler` reads `os-stream` from the token secret data and passes it to `GetPayload()`. Update the mock `IgnitionProvider` in tests accordingly.

		```go


		### Architecture Overview

		The OS stream flows through three layers:


		4. Mixed-stream clusters. A HostedCluster with RHEL 9 and RHEL 10 NodePools is a new topology. Component compatibility across RHEL versions must be validated. Mitigation: This is the same topology supported by standalone clusters via per-MachineConfigPool stream selection.

		5. Disconnected environments require mirroring both streams. Dual-stream payloads carry two sets of node OS container images (one per RHEL stream). Disconnected customers using both streams must mirror both sets. Boot images are platform-specific and already handled outside the payload (e.g. pre-uploaded AMIs, VHDs). The OS container images are referenced via `ImageDigestMirrorSet` / `ImageTagMirrorSet` — existing IDMS/ITMS mirroring workflows handle this transparently as long as both stream images are included in the mirror list. No additional HyperShift-specific mirroring tooling is needed.


		## Open Questions

		1. Should `spec.osImageStream` be immutable after creation (requiring NodePool replacement to change streams), or mutable (triggering a rolling update)? The current proposal allows mutation with rollout, mirroring how other NodePool spec changes work.

Conversation

enxebre commented May 21, 2026

Summary

Test plan

Uh oh!

jparrill left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

enxebre commented May 25, 2026

Uh oh!

openshift-ci Bot commented May 25, 2026

Uh oh!

sdminonne left a comment

Choose a reason for hiding this comment

Uh oh!

sdminonne May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented May 26, 2026

Uh oh!

csrwng left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

sdminonne May 26, 2026 •

edited

Loading