OCPNODE-4494: Testcase to test runc Upgrade case by asahay19 · Pull Request #31266 · openshift/origin

asahay19 · 2026-06-08T11:43:50Z

This PR adds an end-to-end test that validates the MCO guard that blocks a RHCOS 9→10 OS stream transition when a MachineConfigPool has runc configured as the default container runtime. It also adds runc_upgrade_cases.md, the test plan with all required metadata.

What the test does

- Creates an isolated MachineConfigPool (runc-rhcos10-guard) pinned to spec.osImageStream: rhel-9 and a runc CRI-O drop-in MachineConfig
- Labels one pure worker into the pool and waits for a healthy baseline rollout
- Verifies the node is on RHCOS 9 with runc as the default runtime
- Patches the pool's osImageStream to rhel-10
- Asserts the guard fires: MCP Degraded=True + RenderDegraded=True with a message referencing runc and rhel-10
- Confirms the node remains on RHCOS 9 with runc (rollout was blocked)
- Cleans up: removes node label, waits for pool to drain to zero machines, deletes MC and MCP
- Skips on: MicroShift, HyperShift
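
The guard assertion in the steps above can be sketched as a pure function. This is a minimal illustration, not the PR's helper: the `Condition` struct and `guardFired` are hypothetical names, and the real test reads conditions from the MCO API types rather than from a local struct.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition is a hypothetical, simplified stand-in for a MachineConfigPool
// status condition (an assumption for this sketch).
type Condition struct {
	Type    string
	Status  string
	Message string
}

// guardFired reports whether Degraded and RenderDegraded are both "True" and
// the RenderDegraded message mentions both runc and rhel-10.
func guardFired(conds []Condition) bool {
	var degraded, renderDegraded bool
	for _, c := range conds {
		switch c.Type {
		case "Degraded":
			degraded = c.Status == "True"
		case "RenderDegraded":
			renderDegraded = c.Status == "True" &&
				strings.Contains(c.Message, "runc") &&
				strings.Contains(c.Message, "rhel-10")
		}
	}
	return degraded && renderDegraded
}

func main() {
	conds := []Condition{
		{Type: "Degraded", Status: "True"},
		{
			Type:    "RenderDegraded",
			Status:  "True",
			Message: `MachineConfigPool runc-rhcos10-guard targets OS image stream "rhel-10" where runc is not available.`,
		},
	}
	fmt.Println(guardFired(conds))
}
```

Matching on substrings keeps the check tolerant of wording changes in the guard message, at the cost of being less strict than an exact comparison.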

MCO change PR: openshift/machine-config-operator#5891

Tested locally with a custom MCO image built from the MCO PR above; the test passed:
./openshift-tests run-test "[Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when default runtime is runc"

  ============================================
  Random Seed: 1780995902 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when default runtime is runc
  github.com/openshift/origin/test/extended/node/runc_upgrade_cases.go:72
    STEP: Creating a kubernetes client @ 06/09/26 14:35:06.706
  I0609 14:35:06.707258   98967 discovery.go:214] Invalidating discovery information
  I0609 14:35:09.485040 98967 client.go:293] configPath is now "/var/folders/95/wktd6vvd57g5hgy7prd1sgmh0000gn/T/configfile2276327571"
  I0609 14:35:09.485095 98967 client.go:368] The user is now "e2e-test-runc-rhcos10-guard-sbgbj-user"
  I0609 14:35:09.485123 98967 client.go:370] Creating project "e2e-test-runc-rhcos10-guard-sbgbj"
  I0609 14:35:09.777466 98967 client.go:378] Waiting on permissions in project "e2e-test-runc-rhcos10-guard-sbgbj" ...
  I0609 14:35:10.696738 98967 client.go:407] DeploymentConfig capability is enabled, adding 'deployer' SA to the list of default SAs
  I0609 14:35:10.926882 98967 client.go:422] Waiting for ServiceAccount "default" to be provisioned...
  I0609 14:35:11.484858 98967 client.go:422] Waiting for ServiceAccount "builder" to be provisioned...
  I0609 14:35:12.043524 98967 client.go:422] Waiting for ServiceAccount "deployer" to be provisioned...
  I0609 14:35:12.599974 98967 client.go:432] Waiting for RoleBinding "system:image-pullers" to be provisioned...
  I0609 14:35:12.830033 98967 client.go:432] Waiting for RoleBinding "system:image-builders" to be provisioned...
  I0609 14:35:13.056201 98967 client.go:432] Waiting for RoleBinding "system:deployers" to be provisioned...
  I0609 14:35:14.221083 98967 client.go:469] Project "e2e-test-runc-rhcos10-guard-sbgbj" has been fully provisioned.
  I0609 14:35:14.224261 98967 framework.go:2330] [precondition-check] checking if cluster is MicroShift
  I0609 14:35:14.452293 98967 framework.go:2353] IsMicroShiftCluster: microshift-version configmap not found, not MicroShift
  I0609 14:35:14.912080 98967 runc_upgrade_cases.go:140] Cluster version "5.0.0-0.nightly-2026-06-08-075337" satisfies OCP 5.0+ requirement
  I0609 14:35:15.368033 98967 runc_upgrade_cases.go:190] OSImageStream default="rhel-10" streams=[rhel-10 rhel-9]
    STEP: Labeling one worker into the custom pool @ 06/09/26 14:35:15.368
  I0609 14:35:16.059378 98967 runc_upgrade_cases.go:281] Labeled node ip-10-0-54-203.us-east-2.compute.internal with node-role.kubernetes.io/runc-rhcos10-guard
    STEP: Creating custom MachineConfigPool pinned to rhel-9 and runc MachineConfig @ 06/09/26 14:35:16.059
    STEP: Waiting for node to join the custom pool @ 06/09/26 14:35:16.522
  I0609 14:35:16.749912 98967 runc_upgrade_cases.go:312] MCP runc-rhcos10-guard waiting for machine count 1 (current 0)
  I0609 14:35:26.752298 98967 runc_upgrade_cases.go:309] MCP runc-rhcos10-guard machine count reached 1
    STEP: Waiting for pool rollout on rhel-9 with runc @ 06/09/26 14:35:26.752
  I0609 14:35:26.752430 98967 node_utils.go:522] Waiting for MCP runc-rhcos10-guard to be ready (timeout: 30m0s)...
  I0609 14:35:26.983069 98967 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0609 14:35:36.981462 98967 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: 
  I0609 14:38:16.981428 98967 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0609 14:38:26.980874 98967 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0609 14:38:36.981599 98967 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0609 14:38:46.982475 98967 node_utils.go:561] MachineConfigPool runc-rhcos10-guard is ready: 1/1 machines ready
    STEP: Checking default runtime is runc on RHCOS 9 @ 06/09/26 14:38:46.982
    STEP: Upgrading RHCOS version to RHCOS 10 via osImageStream @ 06/09/26 14:39:00.827
  I0609 14:39:01.521687 98967 runc_upgrade_cases.go:395] MCP runc-rhcos10-guard waiting for runc+rhel-10 guard: degraded=false renderDegraded=false message=""
  I0609 14:39:11.524497 98967 runc_upgrade_cases.go:391] MCP runc-rhcos10-guard is degraded as expected: Failed to render configuration for pool runc-rhcos10-guard: MachineConfigPool runc-rhcos10-guard targets OS image stream "rhel-10" where runc is not available. To unblock, migrate to crun by removing any ContainerRuntimeConfig that sets defaultRuntime to runc, and removing any MachineConfig that sets default_runtime = "runc" in CRI-O configuration under /etc/crio/crio.conf.d/
    STEP: Verifying ClusterVersion remains stable while pool guard is active @ 06/09/26 14:39:11.524
  I0609 14:39:11.755459 98967 runc_upgrade_cases.go:160] ClusterVersion "5.0.0-0.nightly-2026-06-08-075337" is stable (Available=True, Progressing=False, Degraded=False)
    STEP: Verifying node remains on RHCOS 9 with runc after guard blocks rollout @ 06/09/26 14:39:11.755
    STEP: Recovering pool by setting osImageStream back to rhel-9 @ 06/09/26 14:39:17.898
  I0609 14:39:18.591454 98967 runc_upgrade_cases.go:359] MCP runc-rhcos10-guard recovery in progress: degraded=true renderDegraded=true updating=false updated=true machines=1/1
  I0609 14:39:28.592430 98967 runc_upgrade_cases.go:359] MCP runc-rhcos10-guard recovery in progress: degraded=true renderDegraded=false updating=false updated=true machines=1/1
  I0609 14:39:38.592582 98967 runc_upgrade_cases.go:357] MCP runc-rhcos10-guard recovered: 1/1 machines ready
    STEP: Verifying node remains on RHCOS 9 with runc after recovery @ 06/09/26 14:39:38.592
  I0609 14:39:45.185581 98967 client.go:689] Deleted {user.openshift.io/v1, Resource=users  e2e-test-runc-rhcos10-guard-sbgbj-user}, err: <nil>
  I0609 14:39:45.418282 98967 client.go:689] Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-runc-rhcos10-guard-sbgbj}, err: <nil>
  I0609 14:39:45.650136 98967 client.go:689] Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  sha256~8MV96BVphhXAsqhe9odpjMSd6OYyzlbZ3Trpn9lyPdo}, err: <nil>
  I0609 14:39:46.568467 98967 runc_upgrade_cases.go:312] MCP runc-rhcos10-guard waiting for machine count 0 (current 1)
  I0609 14:39:56.569563 98967 runc_upgrade_cases.go:309] MCP runc-rhcos10-guard machine count reached 0
    STEP: Destroying namespace "e2e-test-runc-rhcos10-guard-sbgbj" for this suite. @ 06/09/26 14:39:57.033
  • [290.577 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 290.578 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when default runtime is runc",
    "lifecycle": "blocking",
    "duration": 290577,
    "startTime": "2026-06-09 09:05:06.688042 UTC",
    "endTime": "2026-06-09 09:09:57.265851 UTC",
    "result": "passed",

Summary by CodeRabbit

Release Notes

Tests
- Added a disruptive serial e2e test suite that verifies the RHCOS 9→10 upgrade guard when the cluster's default container runtime is set to runc. Tests ensure pool-scoped blocking behavior, node OS/runtime persistence, and cluster health during the guard. Includes environment skip rules for unsupported/topology-specific setups.
Documentation
- Added a detailed test-plan for the upgrade-guard scenario with prerequisites, skip conditions, test flow, and pass/fail criteria.

openshift-merge-bot · 2026-06-08T11:43:52Z

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

openshift-ci-robot · 2026-06-08T11:43:54Z

@asahay19: This pull request references OCPNODE-4494 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

openshift-ci · 2026-06-08T11:43:55Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2026-06-08T11:44:11Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 89065da1-4c50-46b6-a5ac-525184229998

📥 Commits

Reviewing files that changed from the base of the PR and between 2db0de0 and f6fc984.

📒 Files selected for processing (2)

test/extended/node/runc_upgrade_cases.go
test/extended/node/runc_upgrade_cases.md

✅ Files skipped from review due to trivial changes (1)

test/extended/node/runc_upgrade_cases.md

Walkthrough

Adds a disruptive Ginkgo e2e that verifies MCO blocks RHCOS 9→10 OSImageStream moves when CRI-O default runtime is runc, with helpers to create/label/update/poll/verify/cleanup a dedicated MachineConfigPool and ContainerRuntimeConfig. Adds a companion Markdown test plan describing the UC-1 flow and criteria.

Changes

runc RHCOS 10 upgrade guard test

| Layer / File(s) | Summary |
| --- | --- |
| Test suite setup and gating `test/extended/node/runc_upgrade_cases.go` | Package, constants, main disruptive/serial Ginkgo suite and gating helpers (platform/topology/OpenShift version and OSImageStream availability). |
| MCP and MachineConfig creation `test/extended/node/runc_upgrade_cases.go` | Create MachineConfigPool targeting `rhel-9` and ContainerRuntimeConfig that sets CRI-O `default_runtime` to `runc` (idempotent/AlreadyExists-tolerant). |
| Node selection and labeling utilities `test/extended/node/runc_upgrade_cases.go` | Select a pure worker node, apply and remove pool-specific node-role label for enrollment in the custom MCP. |
| MCP machine count and OSImageStream update helpers `test/extended/node/runc_upgrade_cases.go` | Polling helper to wait for MCP machine count targets and helper to set the MCP `OSImageStream` reference. |
| MCP guard and recovery polling `test/extended/node/runc_upgrade_cases.go` | Poll for Degraded+RenderDegraded guard state (render message must include `runc` and `rhel-10`) and poll recovery to ensure degraded conditions clear and MCP reaches updated/ready. |
| Node verification helpers `test/extended/node/runc_upgrade_cases.go` | Chrooted grep to verify installed CRI-O drop-in contains `runc` and helper to extract node OS major version from `/etc/os-release`. |
| Resource deletion and cleanup `test/extended/node/runc_upgrade_cases.go` | Delete ContainerRuntimeConfig and MachineConfigPool helpers that treat NotFound as success; test AfterEach unlabels node and waits for pool machine count zero. |
| Test documentation `test/extended/node/runc_upgrade_cases.md` | Test plan (UC-1) with requirements, skip conditions, step flow, pass/fail criteria, execution command, suggested CI lanes, and related references. |
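
The node verification helper that extracts the OS major version from `/etc/os-release` could look roughly like the sketch below. The function name `rhelMajorFromOSRelease` and the choice of the `VERSION_ID` field are assumptions; the PR's helper is not shown in this thread, and some RHCOS releases may carry the RHEL version in a different field.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// rhelMajorFromOSRelease returns the major part of VERSION_ID from os-release
// content. Reading VERSION_ID is an assumption made for this sketch.
func rhelMajorFromOSRelease(content string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		key, value, ok := strings.Cut(strings.TrimSpace(sc.Text()), "=")
		if !ok || key != "VERSION_ID" {
			continue
		}
		// Values may be quoted, e.g. VERSION_ID="9.6".
		value = strings.Trim(value, `"'`)
		major, _, _ := strings.Cut(value, ".")
		if major == "" {
			return "", fmt.Errorf("empty VERSION_ID")
		}
		return major, nil
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("VERSION_ID not found")
}

func main() {
	// Illustrative content only, not captured from a real node.
	sample := "NAME=\"Example OS\"\nVERSION_ID=\"9.6\"\n"
	major, err := rhelMajorFromOSRelease(sample)
	fmt.Println(major, err)
}
```

Keeping the parsing separate from the chrooted command that reads the file makes it possible to unit test the parsing without a cluster.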

Sequence Diagram(s)

sequenceDiagram
  participant TestRunner
  participant API_Server
  participant MachineConfigOperator
  participant Node
  participant ClusterVersion
  TestRunner->>API_Server: create MachineConfigPool (rhel-9) & MachineConfig (runc drop-in)
  API_Server->>MachineConfigOperator: notify new MCP/MC
  MachineConfigOperator->>Node: render and apply ignition + drop-in (runc)
  Node-->>MachineConfigOperator: node reports installed config and OS version (rhel-9)
  TestRunner->>API_Server: patch MCP OSImageStream -> rhel-10
  API_Server->>MachineConfigOperator: new desired OSImageStream (rhel-10)
  MachineConfigOperator->>MachineConfigOperator: detect runc + rhel-10 -> set Degraded & RenderDegraded with message
  MachineConfigOperator->>ClusterVersion: no change (CV remains Available=true)
  TestRunner->>API_Server: patch MCP OSImageStream -> rhel-9 (recovery)
  MachineConfigOperator->>Node: reconcile to rhel-9 stream, clear degraded

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

BhargaviGudi
deads2k

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 15.79% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (14 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title references the Jira issue (OCPNODE-4494) and describes the main change: adding a test case for the runc upgrade scenario. It accurately reflects the core functionality being added to validate MCO's guard behavior for RHCOS 9→10 upgrades with runc configured. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names in runc_upgrade_cases.go use static, deterministic strings. The Describe and It blocks contain no variable interpolation, fmt.Sprintf, concatenation, or dynamic identifiers. |
| Test Structure And Quality | ✅ Passed | Test meets all quality criteria: single cohesive responsibility, proper setup/cleanup, appropriate timeouts, meaningful assertion messages, consistent with codebase. |
| Microshift Test Compatibility | ✅ Passed | Test uses MicroShift-unavailable APIs (MachineConfigPool, ClusterOperator, ClusterVersion) but properly guards with IsMicroShiftCluster() check in BeforeEach. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | Test has proper SNO protection via runtime topology check in BeforeEach: skips if controlPlaneTopology == SingleReplicaTopologyMode using exutil.GetControlPlaneTopology(). |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds only test code (test/extended/node/runc_upgrade_cases.go and .md documentation), not deployment manifests, operator code, or controllers. No scheduling constraints are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | No process-level stdout writes detected. Uses standard Ginkgo v2 pattern with all code in test blocks. All logging via framework.Logf. No fmt.Print/klog/os.Stdout issues found. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Test contains no IPv4 hardcoded addresses, external connectivity requirements, or IP family assumptions. All calls are cluster-internal (Kubernetes APIs, kubectl debug via chroot). |
| No-Weak-Crypto | ✅ Passed | PR adds test code (runc_upgrade_cases.go/md) with no weak crypto, custom crypto implementations, or unsafe secret comparisons found. |
| Container-Privileges | ✅ Passed | No privileged container settings (privileged: true, hostPID, hostNetwork, hostIPC, SYS_ADMIN, allowPrivilegeEscalation) found in either the Go test file or Markdown documentation. |
| No-Sensitive-Data-In-Logs | ✅ Passed | No sensitive data in logs. All framework.Logf and fmt.Errorf calls log only standard resource names, conditions, and non-sensitive metadata. No passwords, tokens, keys, or PII are exposed. |

openshift-ci · 2026-06-08T11:44:14Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: asahay19
Once this PR has been reviewed and has the lgtm label, please assign cpmeadors for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

test/extended/node/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/node/runc_upgrade_cases.go`:
- Around line 59-63: The test fails on SingleReplica (SNO) clusters because it
requires a “pure worker”; update the preflight topology check that calls
exutil.GetControlPlaneTopology (variable controlPlaneTopology) to also detect
configv1.SingleReplicaTopologyMode and call g.Skip("Skipping on single-replica
(SNO) cluster") before attempting to select a pure worker. Make the same change
in the other preflight/topology-check sites that use
exutil.GetControlPlaneTopology or perform pure-worker selection (the other
occurrences referenced in the review) so the test is skipped early on
SingleReplica clusters.
- Around line 99-107: The AfterEach currently ignores all error returns from
cleanup calls (removeNodeLabel, waitForMCPMachineCount, deleteMachineConfig,
deleteMachineConfigPool) which can leave test state dirty; update AfterEach to
capture each error into a variable and assert failure instead of swallowing it
(e.g., err := removeNodeLabel(...); Expect(err).NotTo(HaveOccurred())) for each
call that uses nodeName, oc, mcClient, runcRHCOS10GuardPool, and runcGuardMCName
(and similarly for
waitForMCPMachineCount/deleteMachineConfig/deleteMachineConfigPool) so any
cleanup failure fails the test and surfaces the underlying error.

In `@test/extended/node/runc_upgrade_cases.md`:
- Around line 51-53: Add a language tag to the fenced code block containing the
test declaration g.It("blocks upgrade of RHCOS 9 to 10 when default runtime is
runc") — replace the opening triple backticks with a language-tagged fence
(e.g., ```go) so the block reads as a Go snippet and satisfies MD040.
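
One way to surface cleanup failures without stopping at the first one is to run every step and join the errors. This is a sketch under assumptions: `runCleanup` and `errStillDraining` are hypothetical names, and the closures stand in for the PR's helpers such as removeNodeLabel and waitForMCPMachineCount.

```go
package main

import (
	"errors"
	"fmt"
)

// errStillDraining is a placeholder failure used only for this illustration.
var errStillDraining = errors.New("pool still has 1 machine")

// runCleanup runs every step even if an earlier one fails and returns the
// combined error, or nil when all steps succeed.
func runCleanup(steps ...func() error) error {
	var errs []error
	for _, step := range steps {
		if err := step(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

func main() {
	err := runCleanup(
		func() error { return nil },              // stands in for removing the node label
		func() error { return errStillDraining }, // stands in for waiting for machine count 0
		func() error { return nil },              // stands in for deleting the MachineConfigPool
	)
	fmt.Println(err)
}
```

In an AfterEach, the combined result could then be asserted once, for example with `o.Expect(runCleanup(...)).To(o.Succeed())`, so later steps still run when an earlier one fails.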

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 02321dc4-fc08-4828-8271-c00e13720ac4

📥 Commits

Reviewing files that changed from the base of the PR and between c0f50ac and 0c371a9.

📒 Files selected for processing (2)

test/extended/node/runc_upgrade_cases.go
test/extended/node/runc_upgrade_cases.md

openshift-ci · 2026-06-08T13:10:51Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

openshift-ci · 2026-06-08T13:13:46Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/de265370-633b-11f1-9c47-870e6f6dcba6-0

openshift-ci · 2026-06-08T15:33:38Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/67ece2a0-634f-11f1-8aab-94564898c66c-0

openshift-ci · 2026-06-09T08:32:25Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b9a6bab0-63dd-11f1-828a-621d5ba8e722-0

openshift-ci · 2026-06-09T08:32:37Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c1104ff0-63dd-11f1-9f15-5905ea882eb2-0

bitoku · 2026-06-09T10:52:38Z

+		o.Expect(err).NotTo(o.HaveOccurred(), "need a worker node for the custom pool")
+
+		g.By("Creating custom MachineConfigPool pinned to rhel-9 and runc MachineConfig")
+		o.Expect(createRuncGuardMachineConfig(ctx, mcClient)).To(o.Succeed())


I want the test cases to align with real use cases.
A drop-in MC is not supported in OpenShift.
What we will actually see is either 1. a CRC (ContainerRuntimeConfig), or 2. an MC that was created during a past update.

I want the test case name to state explicitly which one it covers.
If you test 1, it should use a CRC with the runc config.
If you test 2, I want it to use the same MC name and say explicitly that it is that MC.

For now, I will assume this is test 1, so it should use a CRC with the runc config.

bitoku · 2026-06-09T10:55:53Z

+		g.By("Verifying ClusterVersion remains stable while pool guard is active")
+		o.Expect(verifyClusterVersionStable(ctx, oc)).To(o.Succeed())


This should fail. See openshift/machine-config-operator#5891 (comment)

bitoku · 2026-06-09T10:57:07Z

+		g.By("Verifying node remains on RHCOS 9 with runc after guard blocks rollout")
+		rhelMajor, err = nodeRHELMajorVersion(oc, nodeName)
+		o.Expect(err).NotTo(o.HaveOccurred())
+		o.Expect(rhelMajor).To(o.Equal("9"))
+		o.Expect(nodeUsesRuncRuntime(oc, nodeName)).To(o.BeTrue())


We also want to check node readiness, and confirm the node is not being rolled out by checking the MCO labels.
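
A check along these lines might look like the sketch below. It assumes the per-node rollout state is read from the `machineconfiguration.openshift.io/*` annotations that the machine-config daemon maintains on nodes; `nodeSnapshot` and `untouchedByRollout` are hypothetical names, and the real test would populate them from the Node object.

```go
package main

import "fmt"

// Annotation keys assumed to be maintained on each node by the machine-config daemon.
const (
	currentConfigKey = "machineconfiguration.openshift.io/currentConfig"
	desiredConfigKey = "machineconfiguration.openshift.io/desiredConfig"
	stateKey         = "machineconfiguration.openshift.io/state"
)

// nodeSnapshot is a hypothetical, trimmed-down view of a Node object.
type nodeSnapshot struct {
	Ready       bool
	Annotations map[string]string
}

// untouchedByRollout returns nil when the node is Ready, its current and
// desired configs both equal the baseline, and the daemon state is "Done".
func untouchedByRollout(n nodeSnapshot, baselineConfig string) error {
	if !n.Ready {
		return fmt.Errorf("node is not Ready")
	}
	current := n.Annotations[currentConfigKey]
	desired := n.Annotations[desiredConfigKey]
	if current != baselineConfig || desired != baselineConfig {
		return fmt.Errorf("config moved: current=%q desired=%q baseline=%q", current, desired, baselineConfig)
	}
	if state := n.Annotations[stateKey]; state != "Done" {
		return fmt.Errorf("machine-config daemon state is %q, want Done", state)
	}
	return nil
}

func main() {
	// The rendered config name is a placeholder, not taken from a real cluster.
	baseline := "rendered-runc-rhcos10-guard-example"
	n := nodeSnapshot{
		Ready: true,
		Annotations: map[string]string{
			currentConfigKey: baseline,
			desiredConfigKey: baseline,
			stateKey:         "Done",
		},
	}
	fmt.Println(untouchedByRollout(n, baseline))
}
```

Recording the baseline rendered config before patching the pool, then comparing against it after the guard fires, shows that no rollout was started rather than only that the OS version is unchanged.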

bitoku · 2026-06-09T10:58:56Z

+func requireOpenShift5OrNewer(ctx context.Context, oc *exutil.CLI) {
+	cv, err := oc.AdminConfigClient().ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
+	o.Expect(err).NotTo(o.HaveOccurred())
+
+	version := cv.Status.Desired.Version
+	major, err := strconv.Atoi(strings.SplitN(version, ".", 2)[0])
+	o.Expect(err).NotTo(o.HaveOccurred())
+	if major < 5 {
+		g.Skip(fmt.Sprintf("cluster version %q is below OCP 5.0; runc RHCOS10 guard applies to 5.0+", version))
+	}
+	framework.Logf("Cluster version %q satisfies OCP 5.0+ requirement", version)
+}


Is this required? I think checking OSImageStream FG is enough.
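
If the version gate is kept, the parsing step could be isolated in a pure helper that returns an error instead of indexing into a split result inside the test body. This is only a sketch; `majorVersion` is a hypothetical name and not part of the PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion parses the leading numeric component of a version string
// such as "5.0.0-0.nightly-2026-06-08-075337".
func majorVersion(version string) (int, error) {
	head, _, _ := strings.Cut(version, ".")
	major, err := strconv.Atoi(head)
	if err != nil {
		return 0, fmt.Errorf("parse major version from %q: %w", version, err)
	}
	return major, nil
}

func main() {
	major, err := majorVersion("5.0.0-0.nightly-2026-06-08-075337")
	fmt.Println(major, err)
}
```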

bitoku · 2026-06-09T12:42:16Z

+	_, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})
+	if apierrors.IsNotFound(err) {
+		g.Skip("OSImageStream API is not available; enable TechPreviewNoUpgrade / OSStreams on the cluster")
+	}
+	o.Expect(err).NotTo(o.HaveOccurred())
+
+	osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})
+	o.Expect(err).NotTo(o.HaveOccurred())


This checks the same cluster OSImageStream twice.
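
A single fetch can serve both the skip decision and the later reads. The sketch below uses stand-ins so it runs without a cluster: `errNotFound` replaces the Kubernetes NotFound check, and `osImageStream`, `getter`, and `loadOSImageStream` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a stand-in for a Kubernetes NotFound API error.
var errNotFound = errors.New("not found")

// osImageStream is a hypothetical, simplified view of the cluster OSImageStream.
type osImageStream struct {
	Default string
	Streams []string
}

// getter abstracts the single API read.
type getter func(name string) (*osImageStream, error)

// loadOSImageStream fetches the object once. It returns skip=true when the
// API is absent and a non-nil error for any other failure.
func loadOSImageStream(get getter) (osi *osImageStream, skip bool, err error) {
	osi, err = get("cluster")
	if errors.Is(err, errNotFound) {
		return nil, true, nil
	}
	if err != nil {
		return nil, false, err
	}
	return osi, false, nil
}

func main() {
	osi, skip, err := loadOSImageStream(func(string) (*osImageStream, error) {
		return &osImageStream{Default: "rhel-10", Streams: []string{"rhel-10", "rhel-9"}}, nil
	})
	fmt.Println(osi.Default, skip, err)
}
```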

openshift-ci · 2026-06-09T13:31:40Z

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8921abf0-6407-11f1-9686-c8cdaefe3b6e-0

bitoku · 2026-06-09T18:05:15Z

+
+// verifyClusterVersionUnaffectedByIsolatedPoolGuard checks that a render failure on an isolated
+// custom MCP does not degrade the cluster-wide machine-config operator or ClusterVersion.
+// The guard is pool-scoped; worker/master pools remain healthy.


Did you confirm it? If so we may want to do some additional propagation.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 8, 2026

openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 8, 2026

coderabbitai Bot requested changes Jun 8, 2026

bitoku mentioned this pull request Jun 8, 2026

OCPNODE-4443: Add runc upgradeable guard to block upgrades on RHEL 10 streams openshift/machine-config-operator#5891

Open

bitoku reviewed Jun 8, 2026

coderabbitai Bot approved these changes Jun 9, 2026

openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 9, 2026

asahay19 force-pushed the 4494 branch 2 times, most recently from 07d64fb to 2db0de0 Compare June 9, 2026 09:14

bitoku reviewed Jun 9, 2026

Testcase to test runc Upgrade case

f6fc984

asahay19 force-pushed the 4494 branch from 2db0de0 to f6fc984 Compare June 9, 2026 17:27

bitoku reviewed Jun 9, 2026
