OCPBUGS-83863: Remove rhel8 CNI binary logic, fall back to default paths #2967

Open
sdodson wants to merge 1 commit into openshift:master from sdodson:OCPBUGS-83863-remove-rhel8-fallback

Conversation

@sdodson
Member

@sdodson sdodson commented Apr 21, 2026

Summary

  • Replace hardcoded rhel8/rhel9 case statements in both cnibincopy scripts with dynamic OS version detection
  • Consolidate RHEL8_SOURCE_DIRECTORY, RHEL9_SOURCE_DIRECTORY, and DEFAULT_SOURCE_DIRECTORY env vars into a single SOURCE_DIRECTORY
  • Update Fedora CoreOS hardcoded rhelmajor from 8 to 9

multus.yaml

The script detects the RHEL major version at runtime and probes for a version-specific directory in two layouts:

  1. Standard: rhel<N> inserted before the last path component (e.g., /usr/src/multus-cni/rhel9/bin/)
  2. Flat: rhel<N> as a subdirectory (e.g., /bondcni/rhel9/)

Falls back to SOURCE_DIRECTORY with a warning when neither exists.
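
The two-layout probe described above can be sketched roughly as follows. This is not the script from multus.yaml: the function name `resolve_sourcedir`, the `log` helper body, and the assumption that SOURCE_DIRECTORY is an absolute path are all illustrative.

```shell
#!/bin/bash
# Illustrative sketch only; names below are hypothetical, not copied from multus.yaml.

log() { echo "$*" >&2; }

# resolve_sourcedir SOURCE_DIRECTORY RHELMAJOR
# Prints the directory binaries should be copied from.
# Assumes SOURCE_DIRECTORY is an absolute path.
resolve_sourcedir() {
  local source_directory="$1" rhelmajor="$2"
  local default_trimmed="${source_directory%/}"  # drop one trailing slash
  local parent="${default_trimmed%/*}"           # everything before the last component
  local leaf="${default_trimmed##*/}"            # the last component, e.g. "bin"
  local candidate

  # Layout 1 (standard): <parent>/rhel<N>/<leaf>/
  # Layout 2 (flat):     <SOURCE_DIRECTORY>/rhel<N>/
  for candidate in \
      "${parent}/rhel${rhelmajor}/${leaf}/" \
      "${default_trimmed}/rhel${rhelmajor}/"; do
    if [ -d "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done

  # Neither layout exists: warn and use the unversioned directory as given.
  log "WARNING: No version-specific directory found for rhel${rhelmajor}, using ${source_directory}"
  echo "$source_directory"
}
```

With SOURCE_DIRECTORY=/usr/src/multus-cni/bin/ and rhelmajor=9, the first candidate is /usr/src/multus-cni/rhel9/bin/. With SOURCE_DIRECTORY=/bondcni/, the second candidate is /bondcni/rhel9/.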

008-script-lib.yaml (OVN)

Tries /usr/libexec/cni/rhel${rhelmajor} first, falls back to /usr/libexec/cni/ with a warning.
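
A minimal sketch of that OVN fallback, assuming a hypothetical helper named `pick_ovn_cni_dir`. The optional base-directory argument is not part of the PR and exists only so the sketch can be tried against a scratch directory.

```shell
#!/bin/bash
# Illustrative sketch only; the real logic lives in cni-bin-copy() in 008-script-lib.yaml.

log() { echo "$*" >&2; }

# pick_ovn_cni_dir RHELMAJOR [BASE]
# BASE defaults to /usr/libexec/cni (the path named in the PR description).
pick_ovn_cni_dir() {
  local rhelmajor="$1" base="${2:-/usr/libexec/cni}"
  local sourcedir="${base}/rhel${rhelmajor}"

  if [ ! -d "$sourcedir" ]; then
    # Version-specific directory is absent: warn and use the unversioned path.
    log "WARNING: ${sourcedir} not found, falling back to ${base}/"
    sourcedir="${base}/"
  fi
  echo "$sourcedir"
}
```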

This unblocks removing rhel8 build stages from upstream images (openshift/ovn-kubernetes#3149, openshift/multus-cni#285) and is forwards-compatible with future RHEL versions without any CNO changes.

Test plan

  • Verify OVN CNI shim binary is correctly copied on RHEL 9 CoreOS nodes
  • Verify multus and ancillary CNI plugin binaries are correctly copied on RHEL 9 nodes
  • Verify graceful fallback to default directory when version-specific directory is absent
  • Verify no regression on clusters with current images (rhel9 directories still present)

Summary by CodeRabbit

  • Chores
    • Updated Fedora CoreOS to use RHEL 9 container networking configuration instead of RHEL 8.
    • Enhanced robustness of container initialization with improved fallback logic and logging for network plugin binaries.

@openshift-ci-robot openshift-ci-robot added labels Apr 21, 2026:
  • jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.)
  • jira/valid-bug (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.)
@openshift-ci-robot
Contributor

@sdodson: This pull request references Jira Issue OCPBUGS-83863, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

  • Replace hardcoded rhel8/rhel9 case statements in both cnibincopy scripts with dynamic OS version detection that tries the version-specific directory first and falls back to the default path when it doesn't exist
  • Remove all RHEL8_SOURCE_DIRECTORY env vars from multus init containers
  • Update Fedora CoreOS hardcoded rhelmajor from 8 to 9

This unblocks removing rhel8 build stages from upstream images (openshift/ovn-kubernetes#3149, openshift/multus-cni#285) and is forwards-compatible with future RHEL versions — adding RHEL 10 support only requires adding RHEL10_SOURCE_DIRECTORY env vars to the init containers.

Files changed

  • bindata/network/ovn-kubernetes/common/008-script-lib.yaml: cni-bin-copy() now tries /usr/libexec/cni/rhel${rhelmajor}, falls back to /usr/libexec/cni/
  • bindata/network/multus/multus.yaml: cnibincopy.sh dynamically looks up RHEL${rhelmajor}_SOURCE_DIRECTORY via bash indirect reference, falls back to DEFAULT_SOURCE_DIRECTORY

Test plan

  • Verify OVN CNI shim binary is correctly copied on RHEL 9 CoreOS nodes
  • Verify multus and ancillary CNI plugin binaries are correctly copied on RHEL 9 nodes
  • Verify graceful fallback to default directory when version-specific directory is absent
  • Verify no regression on clusters with current images (rhel9 directories still present)

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Walkthrough

Updated scripts and pod manifests to require a single SOURCE_DIRECTORY (removing the RHEL8/RHEL9/DEFAULT env vars), changed the Fedora CoreOS mapping to rhelmajor=9, select the RHEL-specific CNI source directory using existence checks with a fallback, and log the chosen copy source before copying.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Multus manifests: bindata/network/multus/multus.yaml | Removed RHEL8_SOURCE_DIRECTORY/RHEL9_SOURCE_DIRECTORY/DEFAULT_SOURCE_DIRECTORY env entries from initContainers; replaced with a single SOURCE_DIRECTORY for kube-multus and the various CNI plugin/binary copy initContainers. |
| OVN Kubernetes script lib & cni copy: bindata/network/ovn-kubernetes/common/008-script-lib.yaml | cni-bin-copy() (cnibincopy.sh/ovnkube-lib.sh): require only SOURCE_DIRECTORY (fatal if unset); set Fedora CoreOS rhelmajor=9; compute a versioned sourcedir (supports two directory layouts), use it if it exists, otherwise warn and fall back to SOURCE_DIRECTORY/generic path; added log of chosen sourcedir before cp. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary changes: removing RHEL8-specific CNI binary logic and implementing fallback to default paths, which aligns with the core objective of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR modifies shell scripts in Kubernetes ConfigMap YAML files, not Go test files. No Ginkgo test declarations are added or modified. |
| Test Structure And Quality | ✅ Passed | This PR modifies only Kubernetes manifest YAML files with embedded shell scripts for CNI logic, containing no Ginkgo test code changes. |
| Microshift Test Compatibility | ✅ Passed | This pull request does not add any new Ginkgo e2e tests. The changes are limited to shell script logic embedded in YAML configuration files, updating CNI binary copying logic and OS version detection for initialization purposes, not e2e tests. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests added; changes limited to shell scripts in YAML configuration files for network operators. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies only shell scripts in ConfigMaps for CNI binary copying; no Kubernetes scheduling constraints are affected. |
| Ote Binary Stdout Contract | ✅ Passed | PR modifies only YAML configuration files with embedded shell scripts for CNI operations, not OTE test binaries or Go test code that would violate stdout contract requirements. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR modifies only YAML manifest files and shell scripts, not Ginkgo e2e test code. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Review rate limit: 9/10 reviews remaining, refill in 6 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Contributor

openshift-ci Bot commented Apr 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdodson
Once this PR has been reviewed and has the lgtm label, please assign abhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
bindata/network/multus/multus.yaml (1)

24-26: Also validate that DEFAULT_SOURCE_DIRECTORY is an existing directory.

Right now only variable presence is validated. Add a -d check so failures are explicit and happen before copy-time exits.

Suggested patch
 if [ -z "$DEFAULT_SOURCE_DIRECTORY" ]; then
   log "FATAL ERROR: You must set the DEFAULT_SOURCE_DIRECTORY env variable"
   exit 1
 fi
+if [ ! -d "$DEFAULT_SOURCE_DIRECTORY" ]; then
+  log "FATAL ERROR: DEFAULT_SOURCE_DIRECTORY ($DEFAULT_SOURCE_DIRECTORY) does not exist"
+  exit 1
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindata/network/multus/multus.yaml` around lines 24 - 26, The script
currently checks only for presence of DEFAULT_SOURCE_DIRECTORY; modify the
validation around that variable (the if block referencing
DEFAULT_SOURCE_DIRECTORY and the log function) to also verify it is an existing
directory (use a -d style check), and if the check fails call log with a clear
fatal message including the variable name and then exit 1 so failures occur
immediately before any copy operations.
bindata/network/ovn-kubernetes/common/008-script-lib.yaml (1)

500-503: Prefer deriving FCOS major dynamically instead of hardcoding 9.

Line 502 hardcodes Fedora CoreOS to RHEL9, which means this path won’t pick rhel10+ directories without another code change. Consider deriving/probing dynamically to keep the new fallback logic future-proof.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml` around lines 500 -
503, The fedora) branch currently hardcodes rhelmajor=9 for FCOS (VARIANT_ID ==
"coreos"); replace that hardcoded assignment with runtime detection: inside the
fedora) block (where VARIANT_ID and rhelmajor are used) probe the host image to
derive the RHEL major version (e.g., parse /etc/os-release fields like
VERSION_ID/ID_LIKE or run an rpm macro query such as rpm -E '%{rhel}' as a
fallback) and set rhelmajor to the detected major number, falling back to a
sensible default only if detection fails.
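
One way the suggested runtime detection could look is sketched below. The function name `detect_rhelmajor`, the optional arguments, and the handling of `ID=centos` are assumptions for illustration. On Fedora, VERSION_ID is the Fedora release number rather than a RHEL major, so this sketch still uses a default for Fedora CoreOS and only makes it overridable. The rpm macro query mentioned in the comment is left out to keep the sketch free of external commands.

```shell
#!/bin/bash
# Illustrative sketch only; not the detection code from 008-script-lib.yaml.

# detect_rhelmajor [OS_RELEASE_FILE] [FCOS_DEFAULT]
# Prints the RHEL major version, or nothing if it cannot be determined.
detect_rhelmajor() {
  local os_release="${1:-/etc/os-release}"
  local fcos_default="${2:-9}"   # assumed default, mirroring the value this PR hardcodes
  (
    # Source in a subshell so ID, VERSION_ID, etc. do not leak into the caller.
    . "$os_release" || exit 1
    case "${ID:-}" in
      rhel|centos)
        echo "${VERSION_ID%%.*}"   # "9.4" -> "9"
        ;;
      fedora)
        if [ "${VARIANT_ID:-}" = "coreos" ]; then
          echo "$fcos_default"
        fi
        ;;
    esac
  )
}
```

Because the directory probe already falls back when a versioned directory is missing, an empty or unexpected result here would degrade to the default path rather than fail, provided the caller handles an empty value.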

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 899d45c2-d48a-447b-a74a-f98028b409f0

📥 Commits

Reviewing files that changed from the base of the PR and between bdbba59 and 71efd9d.

📒 Files selected for processing (2)
  • bindata/network/multus/multus.yaml
  • bindata/network/ovn-kubernetes/common/008-script-lib.yaml

@sdodson sdodson force-pushed the OCPBUGS-83863-remove-rhel8-fallback branch from 71efd9d to a4f2c56 Compare April 21, 2026 20:19

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bindata/network/multus/multus.yaml`:
- Around line 415-416: The SOURCE_DIRECTORY environment for the bond-cni-plugin
is currently set to the EL9-specific path "/bondcni/rhel9/" which causes the
resolver to construct paths like "/bondcni/rhel${rhelmajor}/rhel9"; change
SOURCE_DIRECTORY to the unversioned base path "/bondcni/" (or alternatively keep
explicit per-RHEL wiring) so the resolver derives the correct per-release
subpaths instead of always falling back to rhel9; update the env var named
SOURCE_DIRECTORY in the bond-cni-plugin container spec accordingly.
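
To see why a versioned SOURCE_DIRECTORY misleads the resolver, the candidate paths can be computed by hand. The helper below is a hypothetical sketch of the two layouts named in the PR description (standard and flat), not the resolver itself, and rhelmajor=10 is just an example value.

```shell
#!/bin/bash
# Illustrative sketch only: prints the two candidate paths without touching the filesystem.

# print_candidates SOURCE_DIRECTORY RHELMAJOR
print_candidates() {
  local trimmed="${1%/}" rhelmajor="$2"
  # Standard layout: rhel<N> inserted before the last path component.
  echo "standard=${trimmed%/*}/rhel${rhelmajor}/${trimmed##*/}/"
  # Flat layout: rhel<N> appended as a subdirectory.
  echo "flat=${trimmed}/rhel${rhelmajor}/"
}

print_candidates /bondcni/rhel9/ 10
# standard=/bondcni/rhel10/rhel9/
# flat=/bondcni/rhel9/rhel10/

print_candidates /bondcni/ 10
# standard=/rhel10/bondcni/
# flat=/bondcni/rhel10/
```

Only the unversioned base path yields a candidate (/bondcni/rhel10/) that matches the flat layout. With /bondcni/rhel9/ neither candidate is plausible, so the script would always fall back to the rhel9 directory.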

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 88d2ddfe-14cd-40c7-9bc4-3a3bd9a94435

📥 Commits

Reviewing files that changed from the base of the PR and between 71efd9d and a4f2c56.

📒 Files selected for processing (2)
  • bindata/network/multus/multus.yaml
  • bindata/network/ovn-kubernetes/common/008-script-lib.yaml

Comment thread bindata/network/multus/multus.yaml Outdated
@sdodson
Member Author

sdodson commented Apr 21, 2026

/hold
Need to test all of these together

@openshift-ci openshift-ci Bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Apr 21, 2026
@sdodson
Member Author

sdodson commented Apr 28, 2026

/testwith e2e-gcp-ovn openshift/ovn-kubernetes#3149 openshift/multus-cni#285

@openshift-ci
Contributor

openshift-ci Bot commented Apr 28, 2026

@sdodson, testwith: Error processing request. ERROR:

could not determine job runs: requested job is invalid. needs to be formatted like: <org>/<repo>/<branch>/<variant?>/<job>. instead it was: e2e-gcp-ovn

@sdodson
Member Author

sdodson commented Apr 28, 2026

/testwith openshift/cluster-network-operator/master/e2e-gcp-ovn openshift/ovn-kubernetes#3149 openshift/multus-cni#285

@sdodson
Member Author

sdodson commented Apr 28, 2026

/testwith openshift/cluster-network-operator/master/e2e-gcp-ovn openshift/ovn-kubernetes#3149 openshift/multus-cni#285 openshift/route-override-cni#66 openshift/bond-cni#113 openshift/egress-router-cni#100 openshift/containernetworking-pluigins#228 openshift/whereabouts-cni#405

@openshift-ci
Contributor

openshift-ci Bot commented Apr 28, 2026

@sdodson, testwith: Error processing request. ERROR:

could not determine job runs: couldn't get PR from GitHub: openshift/containernetworking-pluigins#228: status code 404 not one of [200], body: {"message":"Not Found","documentation_url":"https://docs.github.com/rest/pulls/pulls#get-a-pull-request","status":"404"}

@sdodson
Member Author

sdodson commented Apr 28, 2026

Replace hardcoded rhel8/rhel9 case statements with dynamic OS version
detection in both cnibincopy scripts:

- multus.yaml: Consolidate RHEL8_SOURCE_DIRECTORY, RHEL9_SOURCE_DIRECTORY,
  and DEFAULT_SOURCE_DIRECTORY into a single SOURCE_DIRECTORY. The script
  detects the RHEL major version at runtime and probes for a
  version-specific directory (both standard and flat layouts), falling
  back to SOURCE_DIRECTORY with a warning when none exists.

- 008-script-lib.yaml (OVN): Try /usr/libexec/cni/rhel${rhelmajor} first,
  fall back to /usr/libexec/cni/ with a warning.

Both scripts also update the Fedora CoreOS default rhelmajor from 8 to 9.

This unblocks removing rhel8 build stages from upstream images and is
forwards-compatible with future RHEL versions without any CNO changes.

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
@sdodson sdodson force-pushed the OCPBUGS-83863-remove-rhel8-fallback branch from a4f2c56 to f5f6667 Compare April 29, 2026 14:27

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
bindata/network/multus/multus.yaml (1)

71-73: Normalize the fallback path before reusing it.

This branch keeps SOURCE_DIRECTORY verbatim, so a future caller that passes /usr/src/plugins/bin instead of /usr/src/plugins/bin/ will make Line 82 copy the bin directory itself rather than its contents. Using the already-trimmed value here avoids making the trailing slash part of the script’s API.

♻️ Proposed fix
     if [ -z "$sourcedir" ]; then
       log "WARNING: No version-specific directory found for rhel${rhelmajor}, using ${SOURCE_DIRECTORY}"
-      sourcedir="${SOURCE_DIRECTORY}"
+      sourcedir="${default_trimmed}/"
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindata/network/multus/multus.yaml` around lines 71 - 73, When falling back
to SOURCE_DIRECTORY for sourcedir, normalize the path by removing any trailing
slash before assigning it so callers passing either "/usr/src/plugins/bin" or
"/usr/src/plugins/bin/" behave the same; update the assignment of sourcedir in
the branch that checks if [ -z "$sourcedir" ] to use the already-trimmed value
(or compute a trimmed version of SOURCE_DIRECTORY) so subsequent uses of
sourcedir (e.g., the copy logic later) operate on the directory contents rather
than the directory name.
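
The normalization being asked for amounts to trimming a trailing slash and re-adding exactly one. A minimal sketch, with `normalize_dir` as a hypothetical helper name and assuming the input carries at most one trailing slash:

```shell
#!/bin/bash
# Illustrative sketch only; the PR's script uses its own variable (default_trimmed) for this.

# normalize_dir PATH
# Prints PATH with exactly one trailing slash, assuming at most one was given.
normalize_dir() {
  local trimmed="${1%/}"   # "/usr/src/plugins/bin/" and "/usr/src/plugins/bin" both become ".../bin"
  echo "${trimmed}/"
}

normalize_dir /usr/src/plugins/bin    # prints /usr/src/plugins/bin/
normalize_dir /usr/src/plugins/bin/   # prints /usr/src/plugins/bin/
```

Assigning the normalized value to sourcedir in the fallback branch would make both spellings of SOURCE_DIRECTORY behave identically in whatever copy command follows.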
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8d7eed9b-550f-42ae-a699-05e7a2f2429b

📥 Commits

Reviewing files that changed from the base of the PR and between a4f2c56 and f5f6667.

📒 Files selected for processing (2)
  • bindata/network/multus/multus.yaml
  • bindata/network/ovn-kubernetes/common/008-script-lib.yaml

@openshift-ci
Contributor

openshift-ci Bot commented Apr 29, 2026

@sdodson: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-ovn-upgrade | f5f6667 | link | true | /test e2e-gcp-ovn-upgrade |
| ci/prow/e2e-azure-ovn-upgrade | f5f6667 | link | true | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-aws-ovn-hypershift-conformance | f5f6667 | link | true | /test e2e-aws-ovn-hypershift-conformance |
| ci/prow/e2e-aws-ovn-rhcos10-techpreview | f5f6667 | link | false | /test e2e-aws-ovn-rhcos10-techpreview |
| ci/prow/e2e-aws-ovn-upgrade | f5f6667 | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade | f5f6667 | link | false | /test 4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade |
| ci/prow/4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade | f5f6667 | link | false | /test 4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade |
| ci/prow/security | f5f6667 | link | false | /test security |
| ci/prow/e2e-aws-ovn-upgrade-ipsec | f5f6667 | link | true | /test e2e-aws-ovn-upgrade-ipsec |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.

2 participants