NO-JIRA: slim and harden the shared-ingress HAProxy runtime#8399
NO-JIRA: slim and harden the shared-ingress HAProxy runtime#8399tuxerrante wants to merge 7 commits intoopenshift:mainfrom
Conversation
Move the shared-ingress image to a pinned UBI micro runtime and remove the unused socat package from the locked RPM inputs. Harden the HAProxy pod security context so the shared-ingress deployment matches the reduced runtime assumptions. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
Validate the shared-ingress image with the controller's real HAProxy command, mounted config and runtime directories, and read-only root filesystem. Keep the smoke test useful on macOS and Linux hosts by using the repo's container runtime detection for podman or docker. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@tuxerrante: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
/auto-cc |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughConverted the shared-ingress image to a multi-stage Containerfile using a pinned UBI builder and a pinned UBI-micro runtime, removed Sequence Diagram(s)sequenceDiagram
actor DevShell as Developer Shell
participant ContainerRuntime as Container Runtime (podman/docker)
participant HAProxy as HAProxy (in container)
participant HostFS as Host filesystem (mounted config & socket)
DevShell->>ContainerRuntime: build image (multi-stage Containerfile)
DevShell->>HostFS: create temp config dir and runtime socket dir
DevShell->>ContainerRuntime: run container (read-only rootfs, mount config & runtime, publish port)
ContainerRuntime->>HAProxy: start haproxy with mounted config and admin socket
HAProxy->>HostFS: create admin socket in mounted runtime dir
DevShell->>HAProxy: poll /haproxy_ready endpoint
HAProxy-->>DevShell: respond 200 when ready
DevShell->>ContainerRuntime: inspect container state (expect running)
DevShell->>ContainerRuntime: stop & remove container (cleanup)
🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/label area/hypershift-operator area/documentation area/testing |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8399 +/- ##
==========================================
+ Coverage 37.19% 37.40% +0.21%
==========================================
Files 750 751 +1
Lines 91791 91821 +30
==========================================
+ Hits 34139 34348 +209
+ Misses 55002 54838 -164
+ Partials 2650 2635 -15
... and 39 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (3)
hypershift-operator/controllers/sharedingress/router.go (2)
188-210: ⚡ Quick win
config-generatorcontainer has noSecurityContextWhile this PR focuses on hardening the
private-routercontainer, theconfig-generatorsidecar is left with noSecurityContext, meaning it runs with default (potentially root) permissions and unrestricted capabilities. Consider applying at leastallowPrivilegeEscalation: falseandcapabilities.drop: ["ALL"]here as well, as a follow-up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/sharedingress/router.go` around lines 188 - 210, The config-generator sidecar created by buildConfigGeneratorContainer lacks a SecurityContext; update the returned function so it sets c.SecurityContext to a non-nil &corev1.SecurityContext with at least AllowPrivilegeEscalation=false and Capabilities.Drop containing "ALL" (use corev1.Capability("ALL") or corev1.Capability values) to mirror hardening applied to the private-router container; ensure you set the SecurityContext on the container variable c within buildConfigGeneratorContainer so it is applied when the PodSpec is built.
157-168: AddRunAsNonRootfor Kubernetes Restricted PSS complianceThe
SecurityContextcorrectly setsAllowPrivilegeEscalation=false,ReadOnlyRootFilesystem=true,Capabilities.Drop=["ALL"], andSeccompProfile=RuntimeDefault. However, the Kubernetes "Restricted" pod security standard also requiresrunAsNonRoot=true. Without it, the container can start as UID 0 if the image lacks a non-rootUSERdirective.Proposed addition
c.SecurityContext = &corev1.SecurityContext{ AllowPrivilegeEscalation: ptr.To(false), + RunAsNonRoot: ptr.To(true), ReadOnlyRootFilesystem: ptr.To(true), Capabilities: &corev1.Capabilities{ Drop: []corev1.Capability{ "ALL", }, }, SeccompProfile: &corev1.SeccompProfile{ Type: corev1.SeccompProfileTypeRuntimeDefault, }, }First, verify the HAProxy UBI-micro image defines a non-root
USER; if it does not, settingRunAsNonRoot=truecould cause pod startup failures in environments that strictly enforce pod security standards.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/sharedingress/router.go` around lines 157 - 168, SecurityContext for the router container lacks RunAsNonRoot which is required for Kubernetes "Restricted" PSS; update the SecurityContext set on c.SecurityContext in router.go to include RunAsNonRoot: ptr.To(true) (and verify the HAProxy UBI-micro image uses a non-root USER to avoid startup failures), i.e., add RunAsNonRoot to the existing SecurityContext alongside AllowPrivilegeEscalation, ReadOnlyRootFilesystem, Capabilities.Drop and SeccompProfile so pods comply with Restricted policy.shared-ingress/smoke-test.sh (1)
73-84: ⚡ Quick winDrop
--rm; the EXIT trap already handles cleanupWith
--rm, if HAProxy crashes before the readiness check passes (lines 87–95), the container is auto-removed immediately. The subsequent"${runtime}" logs "${container_name}"calls on lines 91, 99, and 106 then fail silently (swallowed by|| true), leaving no diagnostic output for CI. Thetrap cleanup EXITon line 18 already guarantees cleanup under all exit paths, making--rmboth redundant and actively harmful for debugging.♻️ Proposed fix
"${runtime}" run -d \ - --rm \ --name "${container_name}" \ --read-only \🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shared-ingress/smoke-test.sh` around lines 73 - 84, Remove the dangerous --rm flag from the container startup command so CI can inspect logs if HAProxy crashes; locate the docker run invocation that uses "${runtime}" run -d --rm --name "${container_name}" ... and delete only the --rm token, leaving the rest (publish, volumes, entrypoint, image_tag, args) intact—cleanup is already handled by the existing trap cleanup (trap cleanup EXIT), so keep that trap and the subsequent "${runtime}" logs "${container_name}" calls as-is to preserve post-failure diagnostics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@hypershift-operator/controllers/sharedingress/router.go`:
- Around line 188-210: The config-generator sidecar created by
buildConfigGeneratorContainer lacks a SecurityContext; update the returned
function so it sets c.SecurityContext to a non-nil &corev1.SecurityContext with
at least AllowPrivilegeEscalation=false and Capabilities.Drop containing "ALL"
(use corev1.Capability("ALL") or corev1.Capability values) to mirror hardening
applied to the private-router container; ensure you set the SecurityContext on
the container variable c within buildConfigGeneratorContainer so it is applied
when the PodSpec is built.
- Around line 157-168: SecurityContext for the router container lacks
RunAsNonRoot which is required for Kubernetes "Restricted" PSS; update the
SecurityContext set on c.SecurityContext in router.go to include RunAsNonRoot:
ptr.To(true) (and verify the HAProxy UBI-micro image uses a non-root USER to
avoid startup failures), i.e., add RunAsNonRoot to the existing SecurityContext
alongside AllowPrivilegeEscalation, ReadOnlyRootFilesystem, Capabilities.Drop
and SeccompProfile so pods comply with Restricted policy.
In `@shared-ingress/smoke-test.sh`:
- Around line 73-84: Remove the dangerous --rm flag from the container startup
command so CI can inspect logs if HAProxy crashes; locate the docker run
invocation that uses "${runtime}" run -d --rm --name "${container_name}" ... and
delete only the --rm token, leaving the rest (publish, volumes, entrypoint,
image_tag, args) intact—cleanup is already handled by the existing trap cleanup
(trap cleanup EXIT), so keep that trap and the subsequent "${runtime}" logs
"${container_name}" calls as-is to preserve post-failure diagnostics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: ad066b4a-d1b9-4c8b-81db-02437bec4a82
📒 Files selected for processing (8)
Makefilehypershift-operator/controllers/sharedingress/router.gohypershift-operator/controllers/sharedingress/router_test.goshared-ingress/Containerfileshared-ingress/README.mdshared-ingress/rpms.in.yamlshared-ingress/rpms.lock.yamlshared-ingress/smoke-test.sh
💤 Files with no reviewable changes (2)
- shared-ingress/rpms.in.yaml
- shared-ingress/rpms.lock.yaml
Resolve the shared-ingress RPM lockfile as an empty-root transaction so Konflux hermetic builds prefetch the full dependency closure needed by `dnf --installroot /rootfs`. Add the UBI BaseOS repository and disable weak dependencies in the RPM input so the lockfile matches the installroot build used by the Containerfile. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
Root cause for the failing shared-ingress Konflux build was the mismatch between the new empty-root I pushed a follow-up commit that:
Local verification on the updated branch:
|
Create the merged-/usr lib64 symlink before running the empty-root DNF transaction so the filesystem RPM does not fail when /rootfs/lib64 is present as a directory in buildah-based CI. Keep the installroot layout aligned with the installed UBI 10 filesystem package while preserving the smaller ubi-micro runtime image. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
I pushed a second follow-up after reproducing the remaining installroot failure locally. The key detail is that the Fresh local verification on the updated head:
|
Konflux injects local RPM repositories from the lockfile, but those generated repo definitions do not carry an importable gpgkey. Disable DNF repo GPG checks for the installroot transaction so the hermetic build can consume the checksum-verified local RPM mirror. Retain the pinned package list and the smaller ubi-micro runtime image while matching the offline repo behavior exercised in CI. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
I pushed another targeted follow-up after reproducing the Konflux-style RPM path locally. What the local hermetic repro showed:
Fresh local verification on the updated head:
|
Harden the config-generator sidecar by dropping all capabilities and disabling privilege escalation, and keep smoke-test containers around long enough to preserve logs when startup fails. Avoid changing RunAsNonRoot here because the current shared-ingress runtime image does not yet declare a non-root USER and that would risk breaking existing deployments. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
I pushed one more small follow-up for the outstanding CodeRabbit nitpicks. Included in this update:
I intentionally did not add Fresh local verification on the updated head:
|
|
/test e2e-aks |
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
|
@tuxerrante: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test Resultse2e-aks
Failed TestsTotal failed tests: 3
|
|
I now have all the evidence needed. Let me produce the final report. Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThe sole failure is Root CauseThe
This is a transient infrastructure issue — a brief network blip during pod initialization that prevented the TLS handshake from completing within the timeout window. The This failure is completely unrelated to PR #8399, which only modifies:
The shared-ingress router pods ( Recommendations
Evidence
|
jparrill
left a comment
There was a problem hiding this comment.
Dropped some comments. Thanks!
Document the hermetic RPM and router capability decisions inline so the hardening changes remain easier to review and maintain. Refactor the shared-ingress controller tests to use table-driven gomega assertions and the shared podspec container lookup helper requested in review. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Assisted-by: GPT-5.4 (via Cursor) Made-with: Cursor
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tuxerrante The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
What this PR does / why we need it:
This PR reduces the shared-ingress HAProxy image footprint and runtime attack surface without leaving the Red Hat image supply chain.
The runtime image currently starts from the full
ubi10/ubibase even though the production container only needs HAProxy and its runtime libraries. This PR switches that flow to a pinned multi-stage build that installs HAProxy into an installroot in a pinnedubi10/ubibuilder and copies only that payload into a pinnedubi10/ubi-microfinal image.It also removes the unused
socatRPM from the image inputs, hardens the HAProxy container security context in the shared-ingress deployment, and adds a portable smoke test that exercises the same startup contract the controller uses in-cluster.Problem this solves
The previous image pulled in many packages that are not needed by the shared-ingress runtime path. That inflated the image, increased scanner noise, and left avoidable packages such as Python- and Vim-related content in a critical ingress-facing image.
By trimming the image down to the HAProxy runtime payload we expect:
allowPrivilegeEscalation: false, andRuntimeDefaultseccompLocal image comparison used for this change
registry.access.redhat.com/ubi10/ubi:10.0-1753787353226035753bytes0 critical,29 high,193 medium,142 low107109829bytes0 critical,0 high,45 medium,22 lowWhich issue(s) this PR fixes:
Fixes #8398
Special notes for your reviewer:
NO-JIRAin the title because this change is linked to a GitHub issue rather than a Jira ticket.Validation
go test ./hypershift-operator/controllers/sharedingressmake test-shared-ingress-smokemake pre-commitgo test ./hypershift-operator/controllers/nodepool -run TestResolveHAProxyImageChecklist:
Summary by CodeRabbit
New Features
Security
Tests
Chores
Documentation