fix(installer): corporate-proxy hardening (auth proxy, NO_PROXY, 0.0.0.0 detect, Windows parity)#172

Draft
LukasWodka wants to merge 5 commits into develop from fix/proxy-hardening

Conversation

@LukasWodka
Contributor

🚫 STACKED ON #171 — DO NOT MERGE BEFORE #171. This branch is based on fix/installer-hardening (#171), so until #171 merges this PR's commit range includes #171's 4 commits. Review only commit 2bdc448 / the 4 files listed under "What changed". Once #171 merges, GitHub auto-reduces this diff to just the proxy changes and I'll flip it from draft → ready.

Summary

Hardens the standalone installer for corporate HTTP proxies (hospital / banking / gov tenants). A customer behind an authenticated corporate proxy hit install failures; the 0.0.0.0 kubeconfig headline was fixed for bash in #166/#167, but adverse testing (forward proxy + real k3d on Linux VMs) found three remaining gaps + a Windows parity hole. Installer-only — no chart changes.

Related

Ref tracebloc/backend#722 · stacked on #171

Type of change

  • Bug fix
  • Security / hardening

What changed (the 4 proxy files)

  • Gap A: authenticated proxies (http://user:pass@host) were silently skipped, because k3d's --env KEY=VALUE@FILTER syntax cannot carry an @ inside the value. They are now propagated via a k3d --config file, whose structured YAML env entries preserve the credentials. Verified that k3d v5.8.3 merges the --config env with the existing CLI flags. (scripts/lib/cluster.sh)
  • Gap B: NO_PROXY was propagated verbatim, so in-cluster traffic was misrouted through the proxy and k3d cluster create --wait hung. It is now auto-augmented by _augment_no_proxy (loopback + RFC1918 + .svc/.cluster.local + host.k3d.internal), both inside the cluster and host-side for the installer's own kubectl. (scripts/lib/cluster.sh)
  • Gap C: clusters created outside the installer and bound to 0.0.0.0 are now detected (serverlb HostIp) and flagged with a non-destructive recreate remedy. (scripts/lib/cluster.sh)
  • Windows parity: install-k8s.ps1::New-K3dCluster had none of the bash fixes (it still bound --api-port 0.0.0.0:6550, normalized only host.docker.internal, and propagated no proxy env). It now mirrors bash: 127.0.0.1:6550, a 0.0.0.0 → 127.0.0.1 kubeconfig rewrite, and Get-EffectiveNoProxy + Write-K3dProxyConfig (auth-safe, augmented NO_PROXY, UTF-8 without BOM). (scripts/install-k8s.ps1)
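
For reviewers who want a concrete picture of Gap A, here is a minimal sketch of generating a k3d config file that carries proxy credentials. It is not the code in scripts/lib/cluster.sh: the function names write_k3d_proxy_config_sketch and emit_env_entry, the server:* node filter, and the restrictive umask are assumptions made for illustration. The apiVersion / kind / env layout follows k3d's v1alpha5 Simple config schema.

```shell
# Emit one env entry in k3d config syntax. Empty values are skipped.
# The value is wrapped in YAML single quotes (embedded ' doubled), so an '@'
# in user:pass@host is plain data rather than a node-filter separator.
emit_env_entry() {
  local name="$1" value="$2" escaped
  [ -n "$value" ] || return 0
  escaped=$(printf '%s' "$value" | sed "s/'/''/g")
  printf "  - envVar: '%s=%s'\n" "$name" "$escaped"
  printf "    nodeFilters:\n"
  printf "      - 'server:*'\n"   # assumption: server nodes only
}

# Hypothetical writer: $1 is the output path. Does nothing when no proxy is set.
write_k3d_proxy_config_sketch() {
  local out_file="$1"
  [ -n "${HTTP_PROXY:-}${HTTPS_PROXY:-}" ] || return 0
  (
    umask 077   # the file may contain a proxy password
    {
      printf 'apiVersion: k3d.io/v1alpha5\n'
      printf 'kind: Simple\n'
      printf 'env:\n'
      emit_env_entry HTTP_PROXY  "${HTTP_PROXY:-}"
      emit_env_entry HTTPS_PROXY "${HTTPS_PROXY:-}"
      emit_env_entry NO_PROXY    "${NO_PROXY:-}"
    } > "$out_file"
  )
}
```

The generated file would then be passed alongside the existing flags, for example `k3d cluster create <name> --config "$cfg" --api-port 127.0.0.1:6550`, relying on the merge behavior the description reports for k3d v5.8.3.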

Deferred (out of scope): Gap D — TLS-inspecting / break-and-inspect proxies (corporate CA injection); the CA can only come from customer IT, so it can't be a shipped default. Tracked in backend#722.

Test plan

  • scripts/tests/cluster.bats (15, new) — _augment_no_proxy dedupe/defaults; Gap A regression (auth creds propagated via --config, not skipped); Gap B (augmented NO_PROXY in config + host export); Gap C (0.0.0.0 detect). ✅
  • Pester (install-k8s.Tests.ps1, +6) — Get-EffectiveNoProxy + Write-K3dProxyConfig (auth preserved, augmented NO_PROXY, no BOM). ✅ (38 total)
  • End-to-end on Linux VMs — real create_cluster behind an unreachable authenticated proxy + a deliberately partial NO_PROXY: auth creds baked into the k3s node, cluster up in ~12 s (no hang), kubectl reachable; a hand-made 0.0.0.0 cluster triggered the Gap C warning.
  • ⚠️ PowerShell runtime not validated on a real Windows host (none available) — logic is covered by Pester; a Windows reviewer should confirm the live New-K3dCluster create on a proxied Windows box.
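
The Gap C check exercised in the end-to-end run can be approximated as follows. This is a sketch under stated assumptions, not the installer's detection code: it parses `docker port` output instead of reading the serverlb HostIp field directly, the function names are hypothetical, and the container name pattern k3d-<cluster>-serverlb is assumed to match how k3d names its load balancer.

```shell
# Reads `docker port` style lines on stdin (e.g. "6443/tcp -> 0.0.0.0:6550")
# and succeeds if any published port is bound to the wildcard address.
has_wildcard_binding() {
  grep -Eq -- '-> 0\.0\.0\.0:[0-9]+$'
}

# Hypothetical wrapper: $1 is the k3d cluster name.
# Returns 0 when the cluster's load balancer publishes a port on 0.0.0.0.
cluster_has_wildcard_binding() {
  docker port "k3d-$1-serverlb" 2>/dev/null | has_wildcard_binding
}
```

Keeping the parsing in its own function lets a bats test feed it canned output without Docker, while the wrapper is only a warning trigger and never modifies the cluster.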

Deployment notes

None — installer-only, no chart/env changes. Proxy settings are honored at install time from the user's existing HTTP_PROXY/HTTPS_PROXY/NO_PROXY.

Checklist

  • Tests added / updated and passing locally
  • Docs updated if behavior or config changed (N/A — no config surface change; behavior documented in code comments + backend#722)
  • No secrets / credentials in the diff (test fixtures use user:pass@proxy.example.com)
  • For security-sensitive paths: appropriate reviewer requested (Windows reviewer needed for the ps1 runtime)

LukasWodka and others added 5 commits June 1, 2026 15:19
Docker: install docker-ce from the official Docker CentOS dnf repo on AlmaLinux/Rocky/Oracle, which get.docker.com rejects as unsupported. k3d: preserve PATH through sudo so the post-install lookup survives RHEL secure_path (which omits /usr/local/bin). conntrack: use the conntrack apt package on Debian/Ubuntu and conntrack-tools elsewhere.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
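
As an illustration of the conntrack and PATH points in the commit above, a minimal sketch follows. The helper names are hypothetical, and treating ID_LIKE derivatives of Debian/Ubuntu the same as Debian/Ubuntu is an assumption that goes beyond what the commit message states.

```shell
# Hypothetical helper: choose the conntrack package name.
# $1 is a space-separated list of os-release IDs, e.g. "ubuntu debian".
conntrack_package_for() {
  local id
  for id in $1; do
    case "$id" in
      debian|ubuntu) echo conntrack; return 0 ;;
    esac
  done
  echo conntrack-tools
}

# Read ID and ID_LIKE in a subshell so sourcing os-release leaks no variables.
detect_os_ids() {
  ( . /etc/os-release 2>/dev/null && printf '%s %s\n' "${ID:-}" "${ID_LIKE:-}" )
}
```

A caller could combine them as `pkg=$(conntrack_package_for "$(detect_os_ids)")`. For the secure_path issue, one common way to preserve PATH through sudo is `sudo env "PATH=$PATH" k3d version`; whether the installer uses exactly this form is not stated here.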
…ccess (#716, #717)

Credentials entered at the prompt are validated against the backend api-token-auth endpoint (the same call jobs-manager makes) with a re-prompt loop, so a wrong Client ID or password is caught immediately instead of after a full deploy. After helm apply, wait_for_client_ready polls rollout status and classifies the outcome; print_summary reports connected, starting, bad_creds, image_pull or crash, and prints the data-never-leaves message only when the client is verifiably connected. Exit code now reflects the real outcome.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
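
The outcome classification described in the commit above might be structured along these lines. This is a hedged sketch only: the function name, its three inputs, and the idea that an authentication failure is detected separately (for example from client logs) are all assumptions. The five outcome names come from the commit message, and the waiting reasons are standard Kubernetes container states.

```shell
# Hypothetical classifier.
#   $1: "true" if the rollout is ready
#   $2: container waiting reason reported by Kubernetes (may be empty)
#   $3: "true" if an auth failure was detected elsewhere (assumption)
classify_client_state_sketch() {
  local ready="$1" waiting_reason="${2:-}" auth_failed="${3:-false}"
  if [ "$ready" = "true" ]; then
    echo connected
    return 0
  fi
  if [ "$auth_failed" = "true" ]; then
    echo bad_creds
    return 0
  fi
  case "$waiting_reason" in
    ImagePullBackOff|ErrImagePull) echo image_pull ;;
    CrashLoopBackOff)              echo crash ;;
    *)                             echo starting ;;
  esac
}
```

A summary step can then branch on the single returned word and derive the exit code from it, which keeps the polling logic and the reporting logic separately testable.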
…716, #717)

Test-Credentials, Wait-ForClientReady and Get-NotReadyState mirror the bash logic in install-k8s.ps1, and Print-Summary is now state-branched. Validated with the PowerShell 7.4 parser; runtime behavior still needs a check on a Windows host.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scripts/tests/: 64 bats tests (summary, install-client-helm, setup-linux, common) + 32 Pester tests for install-k8s.ps1 — all mocked, no Docker/k3d/network needed. Changed-line coverage measured with kcov (bash 96.2%) and Pester (PowerShell 97.4%); residual lines are the real RHEL Docker-install commands + the guarded main() orchestration, exercised by the integration E2E. A TB_PESTER guard lets the suite dot-source install-k8s.ps1 without running the installer. New installer-tests job in helm-ci.yaml runs both suites on PRs (scripts/ added to path filters).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… 0.0.0.0 detect, Windows parity)

A customer running behind an authenticated corporate HTTP proxy hit install
failures. The 0.0.0.0 kubeconfig headline was fixed for the bash path in
#166/#167, but adverse testing (a forward proxy + real k3d on Linux VMs)
surfaced three remaining gaps plus a Windows parity hole:

- Gap A: authenticated proxies (http://user:pass@host) were silently SKIPPED —
  k3d's --env KEY=VALUE@FILTER can't carry an '@' in the value. Now propagated
  via a k3d --config file (structured YAML env) so credentials survive intact.
  Verified on k3d v5.8.3 (it merges the --config env with the existing CLI flags).
- Gap B: NO_PROXY was propagated verbatim. Now auto-augmented with the
  cluster-internal ranges (loopback + RFC1918 + .svc/.cluster.local +
  host.k3d.internal), both into the cluster and host-side, so in-cluster traffic
  never routes through the proxy — fixes the misroute AND the observed
  `k3d cluster create --wait` hang.
- Gap C: a cluster created outside the installer and bound to 0.0.0.0 is now
  detected (serverlb HostIp) and flagged with a non-destructive recreate remedy.
- Windows parity: install-k8s.ps1::New-K3dCluster had NONE of the bash fixes —
  it still bound --api-port 0.0.0.0:6550 (the original headline bug, still live
  on Windows), normalized only host.docker.internal in the kubeconfig, and
  propagated zero proxy env. Now mirrors bash: 127.0.0.1:6550, a 0.0.0.0->127.0.0.1
  kubeconfig rewrite, and Get-EffectiveNoProxy + Write-K3dProxyConfig (auth +
  augmented NO_PROXY, written UTF-8 without a BOM).

Tests: new scripts/tests/cluster.bats (15) + Pester for the two ps1 helpers (6),
both green. Verified end-to-end on Linux VMs: auth creds propagated into the node,
no startup hang behind an unreachable proxy, and 0.0.0.0 detection firing.

Stacked on #171 (the installer test scaffolding + final install-k8s.ps1 live there).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
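
The Gap B augmentation described in the commit above can be sketched as a small pipeline. This is not the installer's _augment_no_proxy: the function name is hypothetical, and the exact default entries are an assumed expansion of "loopback + RFC1918 + .svc/.cluster.local + host.k3d.internal".

```shell
# Assumed defaults: loopback, the three RFC1918 ranges, cluster DNS suffixes,
# and k3d's host alias.
NO_PROXY_DEFAULTS="localhost,127.0.0.1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local,host.k3d.internal"

# Hypothetical helper: $1 is the user's existing NO_PROXY (may be empty).
# Prints the user's entries first, then any missing defaults, with duplicates
# and empty entries removed and the original order preserved.
augment_no_proxy_sketch() {
  printf '%s,%s' "${1:-}" "$NO_PROXY_DEFAULTS" |
    tr ',' '\n' |
    tr -d ' \t' |
    awk 'NF && !seen[$0]++' |
    paste -sd, -
}
```

Host-side use would be along the lines of `export NO_PROXY="$(augment_no_proxy_sketch "${NO_PROXY:-}")"`, with the same value written into the cluster's proxy configuration so both sides agree on what bypasses the proxy.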
@LukasWodka
Contributor Author

👋 Heads-up — Code review queue is at 15 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
