fix(#166, #168): installer survives corporate proxy + sanitizes credential input#167
Merged
Conversation
Four bugs caused i.sh to fail behind corporate proxies (surfaced during a Charité install — P1 customer blocker): 1. k3d --api-port defaulted to 0.0.0.0:6550 — corporate proxies intercept the 0.0.0.0 connection. Pin explicitly to 127.0.0.1:6550. 2. HTTP_PROXY/HTTPS_PROXY/NO_PROXY never propagated to k3d containers, so k3s couldn't reach external registries. Propagate when set (no-op when unset — fully backward compatible). 3. _merge_kubeconfig clobbered manual sed fixes on every re-run. Apply a defensive 0.0.0.0 → 127.0.0.1 normalization after the merge, anchored to https://0.0.0.0: so CIDR ranges are untouched. 4. _wait_for_api error message blamed Docker when the real cause was the API being unreachable. Rewrite with the three actual causes. Also bump client chart to v1.4.2. Closes #166 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor
|
👋 Heads-up — Code review queue is at 11 / 8 Above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.) |
BSD sed (macOS) requires an argument to -i; the previous workaround text would error out for the audience most likely to see this message (local dev installs on macOS). Use the `sed -i.bak ... && rm ...bak` form, which works on both BSD and GNU sed — same form the script itself uses for the Bug 3 normalization. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two bugbot findings on PR #167: 1. (high) k3d's --env KEY=VALUE@NODEFILTER syntax breaks when the proxy value itself contains '@' (e.g. embedded creds: user:pass@host) — k3d cluster create fails before the cluster exists. Detect '@' in the value, skip propagation for that var, and warn the user with actionable guidance (strip creds, or configure proxy auth inside the cluster). 2. (medium) Proxy env was only injected on fresh cluster creation; re-runs against an existing cluster left registries unreachable because k3d bakes --env into containers at create time. Add _check_existing_cluster_proxy: docker-inspect the server container, compare against host proxy env, warn the user to delete+recreate when the cluster is missing proxy env. Silent no-op when Docker is down or no proxy env is set on the host. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two more bugbot findings on 8a12455: 1. (medium) The Bug 4 error message hardcoded ~/.kube/config in the (c) sed-workaround hint, but _merge_kubeconfig honors $KUBECONFIG. Users with a custom KUBECONFIG were being directed to the wrong file. Use ${KUBECONFIG:-~/.kube/config}, and if it's colon-separated (multi-file), point at the first entry. 2. (medium) _check_existing_cluster_proxy only checked HTTP/S_PROXY in the cluster env, so a partial mismatch (cluster has HTTP_PROXY but not NO_PROXY) silently passed the gate. Switch to a per-var check: for each proxy var set on the host, verify the cluster has it, and list specifically which ones are missing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…malize Two consistency bugs introduced by earlier follow-ups on 5ad0c96: 1. (medium) _check_existing_cluster_proxy flagged any host proxy var missing from the cluster as drift — including '@'-containing values that _create_new_cluster deliberately skips. The advice ("delete + recreate") wouldn't bake them either; the warning was lying. Filter '@' values out of the drift list, matching what propagation actually does. Users with embedded-creds proxies already saw the skip warning at first-create time. 2. (medium) _merge_kubeconfig's defensive normalization checked `-f "$KUBECONFIG"`, which is false for a colon-separated path list. k3d's --kubeconfig-merge-default writes into the first entry of KUBECONFIG (or ~/.kube/config if unset), so the rewrite needs to target that file. Resolve with ${kc%%:*}, same idiom we use in the _wait_for_api error hint. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Open
2 tasks
aptracebloc
previously approved these changes
Jun 1, 2026
Helm rejects values.yaml with 'control characters are not allowed' when the user's terminal has bracketed-paste enabled and they paste creds: the paste is wrapped in ESC[200~ ... ESC[201~ (ESC = 0x1B), which `read` captures verbatim and the heredoc embeds into clientId/clientPassword. Surfaced today during an EC2 install: Error: failed to parse /home/ec2-user/.tracebloc/values.yaml: error converting YAML to JSON: yaml: control characters are not allowed Fix: - New _sanitize_credential helper strips 0x00-0x1F and 0x7F from the string. UTF-8 bytes (0x80+) preserved so international passwords work. - Apply after every TB_CLIENT_ID / TB_CLIENT_PASSWORD read, before the empty check and before the YAML single-quote escape. - Also strip in _extract_yaml_value so a previously-corrupted values.yaml self-heals on the next install instead of perpetuating the bad value. - Warn the user when anything was stripped so they know we touched their input. Closes #168 Independent of #166 but bundled here at the user's request. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two regressions in c237721: 1. `warn` in _sanitize_credential wrote to STDOUT, but the function is called from `TB_CLIENT_PASSWORD=$(_sanitize_credential "$PW")` — command substitution captures stdout, so the warning message landed inside the password value and ended up in values.yaml verbatim. Surfaced when a customer's clientPassword became literally: ' ⚠ Stripped non-printable characters from input... [CrypveP-sumwac-5siqra' Fix: warn explicitly to >&2 in this caller. Keeps the global warn() behavior unchanged (other callers print to stdout intentionally). 2. The C0-strip removed the ESC byte but left the literal `[200~` / `[201~` bracketed-paste markers visible in the value. Add a shared _strip_paste_garbage helper that matches both the ESC-prefixed form and the standalone-marker form, then C0-strips the remainder. Use it from both _sanitize_credential (input path) and _extract_yaml_value (defaults-from-prior-file path) so self-heal works against either corruption shape. Verified with bash unit tests against four input shapes (full wrapper, literal-marker remnant, clean input, capture-isolation check). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Customer report: password 'rypveP-sumwac-5siqra' stored as '[CrypveP-sumwac-5siqra'. The leading '[C' is the post-C0-strip remnant of ESC[C (cursor-right arrow key) — a stray arrow key press during the password prompt. d22a381 only matched ESC[200~/ESC[201~ (bracketed paste), so generic CSI sequences still leaked through. Replace the narrow paste-marker strip with an iterative match over the full ANSI CSI shape: ESC '[' <params> <final-byte> where params ∈ [0-9;] and final ∈ [A-Za-z~]. Catches arrow keys, cursor moves, function keys, modifier-combo CSIs, and bracketed paste — anything `read -r -s` can capture from a terminal. Keep the standalone-marker self-heal narrow (only [200~/[201~); generic `[X]` shapes could plausibly be real password content. Verified against eight input shapes: bracketed-paste wrapper, cursor-right prefix, mid-string arrow keys, multi-param CSI (Ctrl-Right), orphan markers, clean ASCII (no-op), UTF-8 preservation, warn-to-stderr isolation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aptracebloc
previously approved these changes
Jun 1, 2026
The Helm Chart CI workflow (Lint, Template render, Unit tests, Schema validation) only triggered on main- and openshift-targeted PRs. The required-check ruleset on develop still expects the `Lint` context, so develop-targeted PRs land in "Expected — Waiting for status to be reported" forever and can't merge. Adding `develop` to both push and pull_request branch filters fixes this PR's merge block and closes the gap going forward — chart issues will be caught at the develop integration point rather than slipping through until the develop→main release sync. Paths filter unchanged: only PRs that touch client/**, ingestor/**, or this workflow file itself trigger CI. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 9257a74. Configure here.
aptracebloc
previously approved these changes
Jun 1, 2026
Previous round filtered '@'-containing host proxy vars out of the drift check (correctly — recreating wouldn't bake them). But if the host has ONLY @-style proxy URLs, the function went silent on every re-run. The create-time warn in _create_new_cluster doesn't fire on the existing- cluster path, so the user got zero installer signal that image pulls might fail. Partition host vars into two buckets: • drift_candidates — clean values; cluster should have these • at_skipped — @-values; recreate alone won't bake them @-skipped vars now get their own dedicated warning that fires on every re-run, with remedies that actually work (strip creds, or front the corporate proxy with a credential-less local auth-proxy). Drift check runs unchanged against the clean-value bucket. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aptracebloc
approved these changes
Jun 1, 2026
LukasWodka
added a commit
that referenced
this pull request
Jun 1, 2026
… 0.0.0.0 detect, Windows parity) A customer running behind an authenticated corporate HTTP proxy hit install failures. The 0.0.0.0 kubeconfig headline was fixed for the bash path in #166/#167, but adverse testing (a forward proxy + real k3d on Linux VMs) surfaced three remaining gaps plus a Windows parity hole: - Gap A: authenticated proxies (http://user:pass@host) were silently SKIPPED — k3d's --env KEY=VALUE@FILTER can't carry an '@' in the value. Now propagated via a k3d --config file (structured YAML env) so credentials survive intact. Verified on k3d v5.8.3 (it merges the --config env with the existing CLI flags). - Gap B: NO_PROXY was propagated verbatim. Now auto-augmented with the cluster-internal ranges (loopback + RFC1918 + .svc/.cluster.local + host.k3d.internal), both into the cluster and host-side, so in-cluster traffic never routes through the proxy — fixes the misroute AND the observed `k3d cluster create --wait` hang. - Gap C: a cluster created outside the installer and bound to 0.0.0.0 is now detected (serverlb HostIp) and flagged with a non-destructive recreate remedy. - Windows parity: install-k8s.ps1::New-K3dCluster had NONE of the bash fixes — it still bound --api-port 0.0.0.0:6550 (the original headline bug, still live on Windows), normalized only host.docker.internal in the kubeconfig, and propagated zero proxy env. Now mirrors bash: 127.0.0.1:6550, a 0.0.0.0->127.0.0.1 kubeconfig rewrite, and Get-EffectiveNoProxy + Write-K3dProxyConfig (auth + augmented NO_PROXY, written UTF-8 without a BOM). Tests: new scripts/tests/cluster.bats (15) + Pester for the two ps1 helpers (6), both green. Verified end-to-end on Linux VMs: auth creds propagated into the node, no startup hang behind an unreachable proxy, and 0.0.0.0 detection firing. Stacked on #171 (the installer test scaffolding + final install-k8s.ps1 live there). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
Two installer-side fixes bundled together (both surfaced while validating the corporate-proxy fix):
i.shfails behind corporate HTTP/HTTPS proxies (Charité install, P1 customer blocker). Four bugs inscripts/lib/cluster.sh.values.yamlwith "control characters are not allowed" when bracketed-paste-enabled terminals leak ESC sequences into clientId/clientPassword. One bug inscripts/lib/install-client-helm.sh.#166 — corporate-proxy fixes (
scripts/lib/cluster.sh)0.0.0.0— corporate proxies intercept the0.0.0.0connection and returnTemporary Redirecttokubectl. Pin explicitly to127.0.0.1:6550.HTTP_PROXY/HTTPS_PROXY/NO_PROXYon the host weren't passed into k3s. Loop over both cases; skip vars whose value contains@(k3d's filter-syntax conflict — recreate wouldn't help)._merge_kubeconfigclobbered manualsedfix on re-run — defensivehttps://0.0.0.0:→https://127.0.0.1:rewrite after the merge. Targets the file k3d actually writes to (first entry of colon-separatedKUBECONFIG, or~/.kube/config).${KUBECONFIG}path in the workaround hint and macOS-compatiblesed -i.bakform.Also adds
_check_existing_cluster_proxy:docker inspects the server container on re-run and warns per-var (includingNO_PROXYdrift) when the cluster is missing proxy env the host has.@-containing values are excluded since recreate wouldn't bake them either.#168 — credential sanitization (
scripts/lib/install-client-helm.sh)_sanitize_credentialstrips C0 controls (0x00-0x1F) + DEL (0x7F) fromTB_CLIENT_ID/TB_CLIENT_PASSWORDafter everyread. UTF-8 bytes (0x80+) preserved so international passwords work. Warns the user when anything was stripped._extract_yaml_valuealso strips on read so a previously-corruptedvalues.yamlself-heals on the next install (without it, the default-fill on re-run reproduces the bad value).Chart version
client/Chart.yamlbumped 1.4.1 → 1.4.2 per #166 acceptance criteria.Backward compatibility
0.0.0.0:6550get auto-normalized on the next run.[[ -n "${!var:-}" ]]— strict no-op when no proxy is set.https://0\.0\.0\.0:— won't touch CIDR ranges.sed -i.bakworks on BSD + GNU.Validation
bash -nclean on both touched files.shellcheck -x -s bashshows only two pre-existingSC2034warnings oncluster.sh(unusedlogfile,attempt— present before this PR).0.0.0.0:6550(CIDRs in NetworkPolicy templates unchanged).Test plan
HTTP_PROXY=http://example.invalid:8080, run installer end-to-end, confirm cluster comes up and Helm install completes.https://0.0.0.0:6550in kubeconfig, confirm rewrite.\x1b[200~, re-run installer accepting defaults, confirm clean output.Follow-ups (deferred per issues)
\e[?2004l/h(bug: installer values.yaml rejected by Helm — control chars from credential paste #168 nice-to-have).Closes #166
Closes #168
🤖 Generated with Claude Code
Note
Medium Risk
Changes affect every install path (k3d API bind, kubeconfig rewrite, proxy env) though defaults stay localhost-only when no proxy is set; credential stripping could alter unusual passwords that relied on control characters.
Overview
The installer and chart release are updated so installs work behind corporate HTTP proxies and credential prompts no longer poison
values.yaml.cluster.shbinds the k3d API to127.0.0.1:6550, forwards hostHTTP_PROXY/HTTPS_PROXY/NO_PROXY(both cases) into new clusters viak3d --env, and warns when proxy URLs contain@or when an existing cluster’s env drifts from the host. After kubeconfig merge,https://0.0.0.0:server URLs are rewritten to127.0.0.1; API wait failures now point at Docker, proxyNO_PROXY, and kubeconfig fixes.install-client-helm.shstrips bracketed-paste / CSI escapes and C0 controls from client ID/password (and when re-readingvalues.yaml) so Helm no longer rejects “control characters.”client/Chart.yamlis 1.4.1 → 1.4.2; Helm CI also runs ondevelop.Reviewed by Cursor Bugbot for commit 0eab8ed. Bugbot is set up for automated code reviews on this repo. Configure here.