OCPBUGS-85381: Sanitize bracketed hostnames in kubeconfig server URLs #295

Merged: sdodson merged 1 commit into openshift:release-4.14 from bpickard22:fix-kubeconfig-brackets-4.14 on May 18, 2026

bpickard22 (Contributor) commented May 8, 2026

Summary

  • Sanitize server URLs read from kubeconfig files in GetK8sClient() to strip brackets from non-IPv6 hostnames
  • Fixes the 4.13-to-4.14 upgrade failure caused by Go 1.24.8+ (CVE-2025-47912) rejecting https://[hostname]:6443
  • IPv6 addresses in brackets are correctly preserved (detected by the presence of colons)

Root cause

During 4.13-to-4.14 upgrades, there is a race window between cnibincopy.sh copying the new multus binary and multus-daemon starting and rewriting the CNI config. In this window, CRI-O still has the old 4.13 CNI config ("type": "multus") and invokes the new standalone binary, which reads the old kubeconfig written by 4.13's entrypoint.sh. That kubeconfig unconditionally wraps KUBERNETES_SERVICE_HOST in brackets (https://[hostname]:6443), which Go 1.24.8+ net/url.Parse() now rejects for non-IPv6 addresses.

This manifests as FailedKillPod errors (500+), stalls the dns-default DaemonSet rollout, causes DNS operator degradation, and fails the upgrade. The gcp-ovn-rt-upgrade-4.14-minor blocking job has been failing for 5+ consecutive payloads.
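The parsing behavior behind this can be probed with a small program. This is only a sketch: `describeParse` and the hostname `api.example.com` are hypothetical names chosen for illustration, and the outcome for the bracketed-hostname case depends on the Go toolchain used to build it, so no result is assumed for that input.

```go
package main

import (
	"fmt"
	"net/url"
)

// describeParse reports how net/url handles a server URL on the Go toolchain
// that compiled this program. The result for a bracketed hostname depends on
// the Go version, so nothing here assumes a particular outcome for it.
func describeParse(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Sprintf("%s: rejected (%v)", raw, err)
	}
	return fmt.Sprintf("%s: accepted, hostname=%q port=%q", raw, u.Hostname(), u.Port())
}

func main() {
	for _, raw := range []string{
		"https://[api.example.com]:6443", // bracketed hostname, as written by the 4.13 kubeconfig
		"https://api.example.com:6443",   // same host without brackets
		"https://[fd00::1]:6443",         // bracketed IPv6 literal
	} {
		fmt.Println(describeParse(raw))
	}
}
```

The unbracketed hostname and the bracketed IPv6 literal should be accepted regardless of toolchain; only the first input is expected to differ between Go versions.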

Fix

Reader-side sanitization in GetK8sClient(): after loading a kubeconfig via clientcmd.BuildConfigFromFlags(), strip brackets from config.Host when the bracketed content contains no colons (hostname or IPv4, not IPv6). This is the single code path where the standalone multus binary reads the kubeconfig from disk.

This approach is preferred over the CNO init container approach (cluster-network-operator#3000) because:

  • Single code change vs. permanent DaemonSet infrastructure
  • No fragile sed regex (the init container regex incorrectly strips brackets from IPv6 addresses ending in hex digits a-f)
  • Handles any kubeconfig with brackets, not just the specific file path
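The colon-based rule described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `sanitizeBracketedHostSketch` is a hypothetical stand-in for `sanitizeBracketedHost()`, and the example hosts are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeBracketedHostSketch is a hypothetical illustration of the rule in
// the description: strip the brackets around a host only when the bracketed
// content has no colon (hostname or IPv4). Content with a colon is treated as
// an IPv6 literal and left untouched.
func sanitizeBracketedHostSketch(host string) string {
	open := strings.Index(host, "[")
	if open < 0 {
		return host // no brackets at all
	}
	closeIdx := strings.Index(host[open:], "]")
	if closeIdx < 0 {
		return host // unbalanced bracket: leave the value alone
	}
	closeIdx += open
	inner := host[open+1 : closeIdx]
	if strings.Contains(inner, ":") {
		return host // IPv6 literal: brackets are required
	}
	return host[:open] + inner + host[closeIdx+1:]
}

func main() {
	for _, h := range []string{
		"https://[api.example.com]:6443",
		"https://[10.0.0.1]:6443",
		"https://[fd00::abcd]:6443",
		"https://api.example.com:6443",
	} {
		fmt.Printf("%q -> %q\n", h, sanitizeBracketedHostSketch(h))
	}
}
```

Because the decision depends only on whether a colon is present, IPv6 addresses ending in hex digits a-f are handled the same way as those ending in decimal digits.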

Test plan

  • Unit tests for sanitizeBracketedHost() covering hostname, IPv4, IPv6 (digit-ending and hex-ending), loopback, no-brackets, and empty string
  • go build ./pkg/k8sclient/ passes
  • Verify gcp-ovn-rt-upgrade-4.14-minor (4.13-to-4.14 upgrade) passes with this change
  • Verify fresh 4.14 install succeeds (no kubeconfig to sanitize)
  • Verify IPv6 dual-stack clusters preserve bracketed IPv6 addresses

/cc @s1061123 @tsorya

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed Kubernetes client configuration to properly handle bracketed hostnames in API server URLs, preventing rejection by the Go runtime.
  • Tests

    • Added test coverage for bracketed hostname handling in various formats.

coderabbitai (Bot, Contributor) commented May 8, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 005c3610-ef3e-40ce-b906-0cc7257b13d6


Walkthrough

Added a sanitizeBracketedHost helper function to pkg/k8sclient/kubeconfig.go that removes brackets from bracketed hostnames and IPv4 addresses in Kubernetes API server URLs while preserving bracketed IPv6 literals. Integrated the sanitization into GetK8sClient to clean the config host value. Comprehensive test coverage validates behavior across hostnames, IPv4, IPv6, and edge cases.

Changes

Host Sanitization in Kube Client Configuration

Layer / File(s) | Summary
Sanitization logic and integration (pkg/k8sclient/kubeconfig.go) | Added strings import and internal sanitizeBracketedHost helper that detects bracketed hosts, extracts the hostname, and conditionally strips brackets only when no colon is present (preserving IPv6 literals). Integrated into GetK8sClient to sanitize config.Host after building from kubeconfig.
Test coverage (pkg/k8sclient/kubeconfig_test.go) | TestSanitizeBracketedHost table-driven test validates bracket removal for hostnames and IPv4 addresses, preservation of bracketed IPv6 variants, unchanged non-bracketed inputs, and empty-string handling.

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title directly and clearly describes the main change: sanitizing bracketed hostnames in kubeconfig server URLs, which is the core purpose of the changeset.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names | ✅ Passed | PR adds standard Go test with table-driven subtests, not Ginkgo tests. All test names are static and descriptive with no dynamic elements. Custom check is inapplicable.
Test Structure And Quality | ✅ Passed | Custom check requires reviewing Ginkgo tests, but PR adds only a standard Go table-driven unit test using the testing package, not Ginkgo. Check is not applicable.
Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added. Only TestSanitizeBracketedHost, a standard Go unit test using the testing package, was added. The check is not applicable.
Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added. This PR adds only standard Go unit tests using the testing package. SNO compatibility check applies only to Ginkgo-style e2e tests, not applicable here.
Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies kubeconfig utility code only (helper function + unit test). No deployment manifests, operator code, controllers, or scheduling constraints are introduced. The check is not applicable.
Ote Binary Stdout Contract | ✅ Passed | This PR modifies library code in pkg/k8sclient (multus-cni), not an OTE binary or test infrastructure. No stdout writes exist in process-level code; the check is not applicable.
Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds only a standard Go unit test (TestSanitizeBracketedHost), not Ginkgo e2e tests. Check is not applicable.


sdodson changed the title from "OCPBUGS-85253: Sanitize bracketed hostnames in kubeconfig server URLs" to "OCPBUGS-85381: Sanitize bracketed hostnames in kubeconfig server URLs" on May 11, 2026
openshift-ci-robot added the labels jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on May 11, 2026
openshift-ci-robot (Contributor) commented

@bpickard22: This pull request references Jira Issue OCPBUGS-85381, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-85381 to depend on a bug targeting a version in 4.15.0, 4.15.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Comment thread pkg/k8sclient/kubeconfig.go
sdodson added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on May 12, 2026
bpickard22 (Contributor, Author) commented

@coderabbitai review

coderabbitai (Bot, Contributor) commented May 14, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

openshift-ci-robot (Contributor) commented

@bpickard22: This pull request references Jira Issue OCPBUGS-85381, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-85381 to depend on a bug targeting a version in 4.15.0, 4.15.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Retaining the jira/valid-bug label as it was manually added.


sdodson (Member) commented May 14, 2026

/retest-required

During 4.13-to-4.14 upgrades, old kubeconfigs written by 4.13's
entrypoint.sh contain server URLs like https://[hostname]:6443. The
Go 1.24.8+ bump (CVE-2025-47912) causes net/url to reject brackets
around non-IPv6 addresses. This fails every CNI DEL call during the
upgrade window before multus-daemon rewrites the config.

Strip brackets from non-IPv6 hostnames when reading kubeconfig files
in GetK8sClient. IPv6 addresses (identified by containing colons) are
preserved.

Assisted by Claude Opus 4.6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Pickard <bpickard@redhat.com>
bpickard22 force-pushed the fix-kubeconfig-brackets-4.14 branch from b578183 to 4e91ee7 on May 14, 2026 14:34
openshift-ci (Bot) added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on May 14, 2026
tsorya (Contributor) commented May 14, 2026

/lgtm

openshift-ci (Bot) added the lgtm label (Indicates that a PR is ready to be merged.) on May 14, 2026
openshift-ci (Bot, Contributor) commented May 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bpickard22, tsorya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sdodson added the backport-risk-assessed label (Indicates a PR to a release branch has been evaluated and considered safe to accept.) on May 14, 2026
sdodson (Member) commented May 14, 2026

/payload-job periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

openshift-ci (Bot, Contributor) commented May 14, 2026

@sdodson: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/713923f0-4fae-11f1-9989-8be35ba84626-0

sdodson (Member) commented May 14, 2026

/payload-job periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

openshift-ci (Bot, Contributor) commented May 14, 2026

@sdodson: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c0a84b50-4fd1-11f1-937f-2abac8644d2b-0

sdodson (Member) commented May 17, 2026

/payload-job periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

openshift-ci (Bot, Contributor) commented May 17, 2026

@sdodson: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f1a41a10-51ff-11f1-8781-ced68785a58f-0

sdodson (Member) commented May 18, 2026

/verified by CI

sdodson merged commit e3c26de into openshift:release-4.14 on May 18, 2026 (4 of 6 checks passed)
openshift-ci-robot (Contributor) commented

@bpickard22: Jira Issue OCPBUGS-85381: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-85381 has been moved to the MODIFIED state.


openshift-ci-robot (Contributor) commented

@sdodson: Jira Issue OCPBUGS-85381 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state.


sdodson (Member) commented May 18, 2026

Insights was broken until the last payload job ran. Things now look much better: the job still complained about how long it took to rebase the OS, but it looked fine otherwise.

openshift-merge-robot (Contributor) commented

Fix included in release 4.14.0-0.nightly-2026-05-18-155714

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
