OCPBUGS-44637: node-joiner: configure proxy CA certificate before image pulls#10532

Open
rwsu wants to merge 1 commit into openshift:main from rwsu:OCPBUGS-44637

Conversation

@rwsu
Contributor

@rwsu rwsu commented May 5, 2026

When a cluster proxy is configured with a self-signed certificate, the node-joiner pod fails to pull images through the proxy because the proxy CA is not trusted by the pod's TLS stack.

setupProxyCACert now runs before the asset graph. It reads proxy/cluster to check for a trusted CA bundle, fetches the named ConfigMap from openshift-config, concatenates the cert with the system CA bundle, and sets SSL_CERT_FILE so that all subsequent Go TLS connections and oc subprocess calls trust both the proxy CA and public registry CAs.
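
The file and environment side of that description can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `writeProxyCABundle` and `demo` are hypothetical names, the bundle contents are placeholder strings rather than real PEM, and it assumes `SSL_CERT_FILE` is set before the process first loads its system root pool, since Go's standard library reads that variable when it loads the roots on Unix-like systems.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// writeProxyCABundle is a hypothetical helper, not the PR's implementation.
// It appends proxyCA to the contents of systemBundlePath (a missing file is
// tolerated), writes the result to <dir>/proxy-ca-bundle.crt, and points
// SSL_CERT_FILE at the new file.
func writeProxyCABundle(dir, systemBundlePath, proxyCA string) (string, error) {
	if proxyCA == "" {
		return "", nil // nothing to trust: leave the environment untouched
	}
	systemCerts, err := os.ReadFile(systemBundlePath)
	if err != nil && !errors.Is(err, os.ErrNotExist) {
		return "", fmt.Errorf("cannot read system CA bundle: %w", err)
	}
	combined := append([]byte{}, systemCerts...)
	if len(combined) > 0 && combined[len(combined)-1] != '\n' {
		combined = append(combined, '\n') // keep PEM blocks on separate lines
	}
	combined = append(combined, proxyCA...)

	out := filepath.Join(dir, "proxy-ca-bundle.crt")
	if err := os.WriteFile(out, combined, 0o644); err != nil {
		return "", fmt.Errorf("cannot write combined CA bundle: %w", err)
	}
	if err := os.Setenv("SSL_CERT_FILE", out); err != nil {
		return "", fmt.Errorf("cannot set SSL_CERT_FILE: %w", err)
	}
	return out, nil
}

// demo runs the helper in a temp dir with placeholder (non-PEM) contents and
// returns the text of the file that SSL_CERT_FILE ends up pointing at.
func demo() (string, error) {
	prev, had := os.LookupEnv("SSL_CERT_FILE")
	defer func() { // restore the caller's environment after the demo
		if had {
			os.Setenv("SSL_CERT_FILE", prev)
		} else {
			os.Unsetenv("SSL_CERT_FILE")
		}
	}()

	dir, err := os.MkdirTemp("", "node-joiner-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	systemPath := filepath.Join(dir, "system.pem")
	if err := os.WriteFile(systemPath, []byte("SYSTEM-CA\n"), 0o644); err != nil {
		return "", err
	}
	if _, err := writeProxyCABundle(dir, systemPath, "PROXY-CA\n"); err != nil {
		return "", err
	}
	data, err := os.ReadFile(os.Getenv("SSL_CERT_FILE"))
	return string(data), err
}

func main() {
	bundle, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", bundle)
}
```

Subprocesses inherit the variable through the environment, which is how the `oc` calls mentioned above would see the same bundle.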

Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Summary by CodeRabbit

  • New Features

    • Node-join now picks up cluster proxy trusted CA, merges it with the system CA bundle when present, and generates an SSL certificate bundle used by the process.
    • If no trusted CA is configured or the referenced CA data is missing, no bundle is created and behavior remains unchanged.
  • Tests

    • Added tests covering bundle creation, concatenation behavior (including missing trailing newline), and no-op cases when proxy/CA data is absent.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 5, 2026
@openshift-ci-robot
Contributor

@rwsu: This pull request references Jira Issue OCPBUGS-44637, which is invalid:

  • expected the bug to target either version "5.0." or "openshift-5.0.", but it targets "4.19.z" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 5, 2026
@openshift-ci openshift-ci Bot requested review from pawanpinjarkar and tthvo May 5, 2026 02:00
@openshift-ci
Contributor

openshift-ci Bot commented May 5, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign bfournie for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 570d247b-acee-4917-940c-0aa247d0a1fb

📥 Commits

Reviewing files that changed from the base of the PR and between 460c9e3 and 7568ce4.

📒 Files selected for processing (2)
  • pkg/nodejoiner/addnodes.go
  • pkg/nodejoiner/addnodes_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/nodejoiner/addnodes.go
  • pkg/nodejoiner/addnodes_test.go

Walkthrough

New logic runs during add-node command startup to read the cluster Proxy, fetch a referenced trusted CA ConfigMap from openshift-config, and—when a CA bundle is present—concatenate it with the system CA bundle into proxy-ca-bundle.crt and set SSL_CERT_FILE to that path.

Changes

Proxy CA Trust Setup

| Layer | File(s) | Summary |
| --- | --- | --- |
| Imports & Dependencies | pkg/nodejoiner/addnodes.go | Added Kubernetes/OpenShift client, REST config, error types, and logging imports to support proxy CA setup. |
| System CA Path Configuration | pkg/nodejoiner/addnodes.go | Introduced package variable systemCACertBundle for system CA bundle path (overridable in tests). |
| Command Integration | pkg/nodejoiner/addnodes.go | NewAddNodesCommand now calls setupProxyCACert(directory, kubeConfig) and logs a warning if it returns an error. |
| Config Client Builder | pkg/nodejoiner/addnodes.go | Added setupProxyCACert() to build REST config (kubeconfig or in-cluster), construct OpenShift config and Kubernetes clients, and delegate to client-based implementation. |
| Core Proxy CA Setup Logic | pkg/nodejoiner/addnodes.go | Added setupProxyCACertWithClients() which reads Proxies/cluster, fetches the referenced ConfigMap in openshift-config, reads optional system CA bundle, concatenates (ensuring newline when needed), writes proxy-ca-bundle.crt to directory, logs, and sets SSL_CERT_FILE. Missing proxy, missing configmap, or missing/empty ca-bundle.crt are treated as no-ops. |
| Tests / Fixtures | pkg/nodejoiner/addnodes_test.go | Added TestSetupProxyCACert (table-driven) covering cases for present/missing proxy, present/missing ConfigMap, absent ca-bundle.crt key, and system CA content newline behavior; uses fake clients, temp dirs, and helper builders proxyWithTrustedCA() and caConfigMap(). |

Sequence Diagram

sequenceDiagram
    actor AddNodesCommand as NewAddNodesCommand
    participant Setup as setupProxyCACert
    participant REST as REST Config Builder
    participant ConfigClient as OpenShift Config Client
    participant K8sClient as Kubernetes Client
    participant ConfigMap as ConfigMap (openshift-config)
    participant FS as Filesystem

    AddNodesCommand->>Setup: setupProxyCACert(directory, kubeConfig)
    Setup->>REST: build REST config (kubeconfig or in-cluster)
    REST-->>Setup: REST config
    Setup->>ConfigClient: create OpenShift config client
    Setup->>K8sClient: create Kubernetes client
    Setup->>ConfigClient: Get Proxies/cluster
    alt Proxy exists and TrustedCA.name set
        K8sClient->>ConfigMap: Get ConfigMap in openshift-config
        alt ConfigMap found and has ca-bundle.crt
            ConfigMap-->>Setup: ca-bundle.crt
            Setup->>FS: read system CA bundle (optional)
            FS-->>Setup: system CA content (may be empty)
            Setup->>FS: write concatenated proxy-ca-bundle.crt
            FS-->>Setup: path to proxy-ca-bundle.crt
            Setup->>Setup: set SSL_CERT_FILE to path
        else ConfigMap missing or missing key
            ConfigMap-->>Setup: NotFound or no ca-bundle.crt => no-op
        end
    else Proxy missing or TrustedCA not set
        ConfigClient-->>Setup: no-op
    end
    Setup-->>AddNodesCommand: return (nil or error)
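
The branching in the diagram above can be sketched with in-memory stand-ins for the two API lookups. Everything here is hypothetical and for illustration only: the `cluster` type, `errNotFound`, `proxyCACert`, and the ConfigMap name `user-ca-bundle` are not taken from the PR, and a real implementation would use the OpenShift config and Kubernetes clients plus the API machinery's NotFound check.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error in this sketch.
var errNotFound = errors.New("not found")

// cluster is a hypothetical in-memory stand-in for the API server.
type cluster struct {
	hasProxy      bool                         // whether proxy/cluster exists
	trustedCAName string                       // trusted CA ConfigMap name; "" if unset
	configMaps    map[string]map[string]string // ConfigMaps in openshift-config: name -> data
}

// proxyTrustedCA mimics reading the trusted CA name from proxy/cluster.
func (c cluster) proxyTrustedCA() (string, error) {
	if !c.hasProxy {
		return "", errNotFound
	}
	return c.trustedCAName, nil
}

// configMap mimics fetching a ConfigMap from openshift-config.
func (c cluster) configMap(name string) (map[string]string, error) {
	data, ok := c.configMaps[name]
	if !ok {
		return nil, errNotFound
	}
	return data, nil
}

// proxyCACert returns the proxy CA bundle, or "" for every no-op case:
// missing proxy, unset trusted CA, missing ConfigMap, missing or empty key.
func proxyCACert(c cluster) (string, error) {
	name, err := c.proxyTrustedCA()
	if errors.Is(err, errNotFound) {
		return "", nil
	}
	if err != nil {
		return "", err
	}
	if name == "" {
		return "", nil
	}
	data, err := c.configMap(name)
	if errors.Is(err, errNotFound) {
		return "", nil
	}
	if err != nil {
		return "", err
	}
	return data["ca-bundle.crt"], nil // an absent key yields ""
}

func main() {
	cases := []struct {
		name string
		c    cluster
	}{
		{"no proxy", cluster{}},
		{"trusted CA unset", cluster{hasProxy: true}},
		{"configmap missing", cluster{hasProxy: true, trustedCAName: "user-ca-bundle"}},
		{"bundle present", cluster{
			hasProxy:      true,
			trustedCAName: "user-ca-bundle",
			configMaps: map[string]map[string]string{
				"user-ca-bundle": {"ca-bundle.crt": "PROXY-CA\n"},
			},
		}},
	}
	for _, tc := range cases {
		cert, err := proxyCACert(tc.c)
		fmt.Printf("%s: cert=%q err=%v\n", tc.name, cert, err)
	}
}
```

Returning an empty string for each no-op case lets the caller skip writing a bundle and leave `SSL_CERT_FILE` unchanged, which matches the "behavior remains unchanged" wording in the summary.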

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: configuring proxy CA certificates before image pulls in the node-joiner component, directly matching the changeset's core functionality. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | Test uses standard Go t.Run(), not Ginkgo. All test names are stable static strings with no dynamic values, generated IDs, timestamps, or changing identifiers. |
| Test Structure And Quality | ✅ Passed | Custom check targets Ginkgo test patterns (It blocks, BeforeEach/AfterEach). PR uses standard Go table-driven tests with testify, not Ginkgo. Check is not applicable. |
| Microshift Test Compatibility | ✅ Passed | The test added is TestSetupProxyCACert, a standard Go unit test (func TestSetupProxyCACert(t *testing.T)), not a Ginkgo e2e test. No Ginkgo constructs used. Check only applies to Ginkgo e2e tests. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests added. The new test file contains only a standard Go unit test with fake Kubernetes clients. The check targets Ginkgo e2e tests, which are not present in this PR. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds proxy CA cert setup utility code with no deployment manifests, operators, or scheduling constraints. |
| Ote Binary Stdout Contract | ✅ Passed | The OTE check is inapplicable. node-joiner is a standalone CLI, not an OTE binary. The added code uses only logrus (redirected to stderr) with no stdout writes at process level. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR does not add Ginkgo e2e tests; the test is a standard Go unit test using testing.T, not Ginkgo patterns. No IPv4 assumptions or external connectivity requirements. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.1)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
pkg/nodejoiner/addnodes.go (1)

139-145: 💤 Low value

Consider ensuring a newline separator between the system CA and proxy CA bundles.

If the system CA bundle file doesn't end with a newline, the concatenation will produce malformed PEM where the proxy CA's -----BEGIN CERTIFICATE----- immediately follows the system CA's -----END CERTIFICATE----- on the same line. While RHEL/CoreOS bundles typically end with newlines, defensive handling would prevent subtle TLS failures.

Proposed fix
 	// Read the system CA bundle so public registries remain trusted alongside the proxy CA.
 	systemCerts, err := os.ReadFile(systemCACertBundle)
 	if err != nil && !os.IsNotExist(err) {
 		return fmt.Errorf("cannot read system CA bundle: %w", err)
 	}

-	combined := append(systemCerts, []byte(proxyCACert)...)
+	combined := systemCerts
+	if len(combined) > 0 && combined[len(combined)-1] != '\n' {
+		combined = append(combined, '\n')
+	}
+	combined = append(combined, []byte(proxyCACert)...)
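
To make the failure mode concrete, the following sketch contrasts plain concatenation with the guarded version. The names `naiveJoin`, `guardedJoin`, and `fusedBoundary` are hypothetical, and the certificate bodies are placeholders rather than valid certificates; only the PEM marker lines matter here.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	beginMarker = "-----BEGIN CERTIFICATE-----"
	endMarker   = "-----END CERTIFICATE-----"

	// Placeholder bundles: the bodies are not real certificates.
	systemPEM = beginMarker + "\nAAAA\n" + endMarker // no trailing newline
	proxyPEM  = beginMarker + "\nBBBB\n" + endMarker + "\n"
)

// naiveJoin mirrors the original append-only concatenation.
func naiveJoin(system, proxy string) string {
	return system + proxy
}

// guardedJoin adds a newline when a non-empty system bundle lacks one.
func guardedJoin(system, proxy string) string {
	if system != "" && !strings.HasSuffix(system, "\n") {
		system += "\n"
	}
	return system + proxy
}

// fusedBoundary reports whether an END marker is immediately followed by a
// BEGIN marker on the same line.
func fusedBoundary(bundle string) bool {
	return strings.Contains(bundle, endMarker+beginMarker)
}

func main() {
	fmt.Println(fusedBoundary(naiveJoin(systemPEM, proxyPEM)))   // true
	fmt.Println(fusedBoundary(guardedJoin(systemPEM, proxyPEM))) // false
}
```

An empty system bundle is left alone by the guard, so the proxy CA does not gain a leading blank line when the system file is absent.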
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/nodejoiner/addnodes.go` around lines 139 - 145, When building the
combined CA bundle after reading systemCerts and proxyCACert, ensure there is a
newline separator so PEM blocks don't run together; modify the logic around
systemCerts/combined (the os.ReadFile(systemCACertBundle) handling and the
combined := append(...) step) to detect if systemCerts ends with a newline (or
is empty) and if not append a single '\n' before concatenating proxyCACert, so
the final combined PEM always has a newline boundary between the system CA
bundle and proxy CA.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f1058115-4d0d-497a-a248-8f09a866f032

📥 Commits

Reviewing files that changed from the base of the PR and between f043a6d and 460c9e3.

📒 Files selected for processing (2)
  • pkg/nodejoiner/addnodes.go
  • pkg/nodejoiner/addnodes_test.go

…ge pulls

When a cluster proxy is configured with a self-signed certificate,
the node-joiner pod fails to pull images through the proxy because
the proxy CA is not trusted by the pod's TLS stack.

setupProxyCACert now runs before the asset graph. It reads
proxy/cluster to check for a trusted CA bundle, fetches the named
ConfigMap from openshift-config, concatenates the cert with the
system CA bundle, and sets SSL_CERT_FILE so that all subsequent
Go TLS connections and oc subprocess calls trust both the proxy CA
and public registry CAs.

Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@openshift-ci
Contributor

openshift-ci Bot commented May 5, 2026

@rwsu: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@rwsu
Contributor Author

rwsu commented May 6, 2026

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 6, 2026
@openshift-ci-robot
Contributor

@rwsu: This pull request references Jira Issue OCPBUGS-44637, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @zniu1011

@openshift-ci
Contributor

openshift-ci Bot commented May 6, 2026

@openshift-ci-robot: GitHub didn't allow me to request PR reviews from the following users: zniu1011.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

Labels

jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

2 participants