
feat(prometheus): custom HTTP headers for auth-protected backends #732

Merged: nadaverell merged 5 commits into main from feature/prometheus-headers on May 20, 2026

@nadaverell nadaverell commented May 19, 2026

Summary

Resolves #683. --prometheus-url was effectively unusable against hosted or enterprise Prometheus-compatible backends (Grafana Cloud, Mimir, VictoriaMetrics Cluster, Thanos) that require header-based authentication: Bearer tokens, X-Scope-OrgID multi-tenancy headers, etc. The workaround was running a reverse proxy in front of Radar just to inject auth.

This PR adds custom headers that flow through to every Prometheus request.

What changed

Config + CLI

  • prometheusHeaders: { ... } map in ~/.radar/config.json
  • --prometheus-header Key=Value CLI flag (repeatable). CLI flags override file defaults outright rather than merging — matches kubectl semantics.
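
A minimal sketch of how a repeatable Key=Value flag with override-not-merge semantics could look. It assumes the standard library flag package; the names headerFlag and value() appear in the test notes of this PR, but the body here, newHeaderFlag, and the cliSeen field are hypothetical and not taken from the actual cmd/explorer code.

```go
package main

import (
	"flag"
	"fmt"
	"sort"
	"strings"
)

// headerFlag collects repeated --prometheus-header Key=Value flags.
// It starts from the file defaults and discards them the first time a
// CLI flag is parsed (override, not merge). Illustrative only.
type headerFlag struct {
	headers map[string]string
	cliSeen bool // latch: true once any CLI flag has been parsed
}

// newHeaderFlag is a hypothetical constructor seeded with config-file values.
func newHeaderFlag(fileDefaults map[string]string) *headerFlag {
	h := &headerFlag{headers: make(map[string]string, len(fileDefaults))}
	for k, v := range fileDefaults {
		h.headers[k] = v
	}
	return h
}

// String implements flag.Value. It lists header names only, so secret
// values never end up in usage output.
func (h *headerFlag) String() string {
	if h == nil {
		return ""
	}
	keys := make([]string, 0, len(h.headers))
	for k := range h.headers {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return strings.Join(keys, ",")
}

// Set implements flag.Value. It splits on the first '=' only, so values
// that themselves contain '=' (padded tokens, for example) stay intact.
func (h *headerFlag) Set(s string) error {
	key, value, ok := strings.Cut(s, "=")
	key = strings.TrimSpace(key)
	if !ok || key == "" {
		return fmt.Errorf("invalid header %q: want Key=Value", s)
	}
	if !h.cliSeen {
		h.headers = map[string]string{} // first CLI flag wipes file defaults
		h.cliSeen = true
	}
	h.headers[key] = value
	return nil
}

// value returns a defensive copy so callers cannot mutate the flag's state.
func (h *headerFlag) value() map[string]string {
	out := make(map[string]string, len(h.headers))
	for k, v := range h.headers {
		out[k] = v
	}
	return out
}

func main() {
	hf := newHeaderFlag(map[string]string{"X-Scope-OrgID": "from-file"})
	fs := flag.NewFlagSet("radar", flag.ContinueOnError)
	fs.Var(hf, "prometheus-header", "extra Prometheus request header as Key=Value (repeatable)")
	if err := fs.Parse([]string{"--prometheus-header", "Authorization=Bearer abc=="}); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(hf.value())
}
```

With no --prometheus-header on the command line, the file defaults pass through unchanged; a single CLI occurrence replaces them all.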

Wire-level

  • internal/prometheus/client.go — headers attached in doQuery and probe, preserved across context-switch reinit.
  • internal/traffic/caretta.go — same treatment in all 3 Prometheus request sites. Without this the topology traffic view would silently 401 against the same authed endpoint.

Helm chart

  • traffic.prometheusHeaders: {} renders into repeated --prometheus-header args on the deployment, properly quoted (matches the pattern used for auth.secret / cloud.token in the same template). For secret-bearing values, the chart README recommends external secret stores (sealed-secrets, external-secrets).

Misc

  • Diagnostics overlay shows a "Prometheus Headers: Set/None" row.
  • Settings dialog has no editor yet, but the field round-trips through PUT /api/config automatically (the dialog preserves unknown fields), so editing the config file or using the CLI is the v1 path.

Safety / correctness

  • Header byte validation via httpguts.ValidHeaderFieldName/Value at parse time — applied both to CLI flags and to headers loaded from ~/.radar/config.json. Rejects CR/LF in values (classic header-injection vector) and invalid characters in keys, instead of letting net/http either refuse opaquely or silently corrupt the request. File-loaded headers that fail validation are dropped with a clear startup log line rather than producing cryptic transport errors at first request.
  • 401/403 surfacing in both probe paths — without this, a misconfigured Bearer token shows up as "Prometheus not found" after discovery falls through every candidate. errorlog.Record(...) makes the auth failure visible in the diagnostics overlay.
  • Race fix in prometheus.Reinitialize — was reading globalClient.headers / manualURL while holding only clientMu; SetHeaders and SetURL-the-method write those under c.mu on independent code paths. Now snapshots under c.mu.RLock + copyHeaders. Verified clean under go test -race.
  • Caretta deadlock fix — original applyHeaders took c.mu.RLock(), but it's called from tryMetricsEndpointLocked which runs under c.mu.Lock() (write). Go's RWMutex isn't reentrant → would have deadlocked every Caretta user on the first metrics probe. Removed the lock entirely; CarettaSource.headers is set once inside initOnce.Do and never mutated.

Secret handling

Headers are stored in plain text in ~/.radar/config.json. This matches the file's existing trust level (kubeconfig paths, etc.) but it's called out in the docs so users with stricter requirements know to template from a secret store rather than checking values into their Helm values file.

GET /api/config redacts PrometheusHeaders from both the file and effective payloads; otherwise the values would be returned in plaintext to anyone with API access. This matters for auth-enabled deployments and embedded uses like Radar Hub. The localhost default (auth-mode=none) was already practically safe, but the endpoint was inconsistent with the diagnostics endpoint, which already masks headers as a presence bool, and that inconsistency was the bug. PUT /api/config preserves the on-disk value so a UI round-trip through the redacted GET can't silently wipe auth headers.

Tests

  • internal/config: round-trip test extended to cover the new field.
  • internal/prometheus: headers reach the wire on both doQuery and probe; no Authorization header sent when none is configured (avoids tripping picky reverse proxies).
  • internal/traffic: new caretta_test.go covering applyHeaders on queryPrometheusRaw + tryMetricsEndpointLocked — the latter acquires the write lock first so the production lock shape is reproduced (and the deadlock regression can't come back).
  • cmd/explorer: new main_test.go with table-driven coverage of headerFlag Key=Value parsing, including = in Bearer-token values, CRLF / invalid-name rejection, the kubectl-style "first CLI flag wipes file defaults" latch, and defensive-copy semantics on value().
  • All packages green under go test -race. Helm template render verified manually.

…ends

`--prometheus-url` was effectively unusable against hosted or enterprise
Prometheus-compatible backends that require header-based authentication
(Bearer tokens, X-Scope-OrgID multi-tenancy headers, etc.) — users had
to put a reverse proxy in front just to inject auth.

Add a `prometheusHeaders` map to `~/.radar/config.json` and an
equivalent repeatable `--prometheus-header Key=Value` CLI flag. Headers
are applied to all Prometheus requests, in both the metrics-API client
(`internal/prometheus`) and the traffic/Caretta query path
(`internal/traffic/caretta`) — otherwise the topology traffic view
would silently 401 against the same authed endpoint.

Helm chart exposes `traffic.prometheusHeaders` as a map that renders
into repeated `--prometheus-header` args. CLI flags override file
defaults outright (kubectl-style) rather than merging.

Resolves #683.
@nadaverell nadaverell requested a review from hisco as a code owner May 19, 2026 11:38
Comment thread internal/traffic/caretta.go
Address review findings on PR #732:

- Reject CR/LF and invalid header-token bytes in --prometheus-header at
  parse time via httpguts (was silently corrupting requests or failing
  opaquely at first-query time; CRLF in a value is the classic header-
  injection vector). Promotes golang.org/x/net from indirect to direct.
- Surface 401/403 from probe() in both prometheus/client.go and
  traffic/caretta.go via errorlog — otherwise a misconfigured Bearer
  token shows up as "Prometheus not found" after discovery falls
  through every candidate.
- Quote --prometheus-header args in the Helm deployment template;
  matches the | quote pattern used for auth.secret, oidc.clientSecret,
  cloud.token in the same file.
- Add caretta_test.go covering applyHeaders on both queryPrometheusRaw
  and tryMetricsEndpointLocked — guards the parallel implementation
  against drift from internal/prometheus.
- Add main_test.go covering headerFlag: Key=Value parsing edge cases
  (including '=' in Bearer tokens), CRLF / invalid-name rejection, and
  the kubectl-style "first CLI flag wipes file defaults" latch.
- Trim WHAT-not-WHY comments on struct fields and test docstrings.
- gofmt diagnostics.go and bootstrap.go (struct alignment after the
  new bool field).
…alURL in Reinitialize

Reinitialize held clientMu exclusively but read globalClient.headers
and globalClient.manualURL without taking the per-client mutex.
SetHeaders releases clientMu before acquiring c.mu to write headers,
opening a race window go test -race would flag. SetURL (the method)
writes manualURL under c.mu on the HTTP handler path, completely
independent of clientMu — same race.

Take globalClient.mu.RLock around the snapshot. Also copyHeaders so
the new client doesn't alias the old client's map.
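
The locking shape described in this commit can be sketched roughly as below. The identifiers clientMu, globalClient, headers, manualURL, SetHeaders, Reinitialize, and copyHeaders come from the commit message; the struct layout, the current and snapshot helpers, and every function body are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Client is a cut-down stand-in for the Prometheus client.
type Client struct {
	mu        sync.RWMutex // guards headers and manualURL
	headers   map[string]string
	manualURL string
}

var (
	clientMu     sync.Mutex // guards the globalClient pointer only
	globalClient = &Client{}
)

// copyHeaders returns an independent copy so two clients never share a map.
func copyHeaders(in map[string]string) map[string]string {
	if in == nil {
		return nil
	}
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// current returns the active client (hypothetical helper).
func current() *Client {
	clientMu.Lock()
	defer clientMu.Unlock()
	return globalClient
}

// SetHeaders releases clientMu before taking c.mu, which is why holding
// clientMu alone is not enough to read c.headers safely.
func SetHeaders(h map[string]string) {
	c := current()
	c.mu.Lock()
	defer c.mu.Unlock()
	c.headers = copyHeaders(h)
}

// snapshot reads both fields under the per-client read lock.
func (c *Client) snapshot() (map[string]string, string) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return copyHeaders(c.headers), c.manualURL
}

// Reinitialize swaps in a fresh client built from a locked snapshot.
func Reinitialize() {
	clientMu.Lock()
	defer clientMu.Unlock()
	headers, manualURL := globalClient.snapshot()
	globalClient = &Client{headers: headers, manualURL: manualURL}
}

func main() {
	SetHeaders(map[string]string{"X-Scope-OrgID": "tenant-a"})
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); Reinitialize() }()
		go func() { defer wg.Done(); SetHeaders(map[string]string{"X-Scope-OrgID": "tenant-a"}) }()
	}
	wg.Wait()
	headers, _ := current().snapshot()
	fmt.Println(headers)
}
```

Running a program like this with the race detector enabled is a quick way to confirm that the snapshot no longer reads the fields unguarded.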
…intLocked

Bugbot caught a real deadlock: tryMetricsEndpointLocked is called
under c.mu.Lock() (write) by discoverPrometheus / Connect /
tryClusterAddrLocked / discoverMetricsServiceDynamic. My applyHeaders
took c.mu.RLock() — sync.RWMutex isn't reentrant, so every Caretta
user (not just header-config users) would have wedged on the first
metrics-endpoint probe.

The lock was over-engineered defense against a mutation that doesn't
exist: CarettaSource.headers is assigned exactly once inside
manager.go's initOnce.Do and never written again (context switches
construct a fresh source). Removing the lock both fixes the deadlock
and matches the actual concurrency story.

Update the test to acquire c.mu.Lock before calling
tryMetricsEndpointLocked — the original test invoked it directly and
that's why the deadlock didn't surface.
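
To make the failure mode concrete without actually hanging, the sketch below uses sync.RWMutex.TryRLock (Go 1.18+) to show that a read lock cannot be taken while the same mutex is write-locked. The type source and the readLockAvailable helper are hypothetical; applyHeaders and tryMetricsEndpointLocked borrow their names from this PR, but their signatures and bodies here are assumed.

```go
package main

import (
	"fmt"
	"sync"
)

// source is a cut-down stand-in for CarettaSource.
type source struct {
	mu      sync.RWMutex
	headers map[string]string // assigned once at construction, never mutated
}

// readLockAvailable reports whether RLock would succeed right now.
// It exists only to demonstrate the problem without blocking.
func (s *source) readLockAvailable() bool {
	if s.mu.TryRLock() {
		s.mu.RUnlock()
		return true
	}
	return false
}

// applyHeaders copies the configured headers into dst. It takes no lock:
// headers is read-only after construction, so none is needed.
func (s *source) applyHeaders(dst map[string]string) {
	for k, v := range s.headers {
		dst[k] = v
	}
}

// tryMetricsEndpointLocked must be called with s.mu write-locked, as the
// "Locked" suffix signals. It reports whether an RLock would have blocked.
func (s *source) tryMetricsEndpointLocked() (map[string]string, bool) {
	wouldBlock := !s.readLockAvailable()
	out := map[string]string{}
	s.applyHeaders(out)
	return out, wouldBlock
}

func main() {
	s := &source{headers: map[string]string{"X-Scope-OrgID": "tenant-a"}}

	s.mu.Lock() // same lock shape as the production call path
	hdrs, wouldBlock := s.tryMetricsEndpointLocked()
	s.mu.Unlock()

	fmt.Println("headers applied:", hdrs)
	fmt.Println("RLock inside would have blocked:", wouldBlock)
}
```

A blocking RLock at the point where readLockAvailable is called would wait on the writer that is its own caller, which is the deadlock this commit removes.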

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 3adeafe.

Comment thread internal/app/bootstrap.go
Comment thread cmd/explorer/main.go
…d headers

- Strip PrometheusHeaders from GET /api/config response (file + effective);
  diagnostics already masks them as a presence bool, /api/config was the
  remaining plaintext leak. PUT preserves the on-disk value so a UI
  round-trip can't silently wipe the user's auth headers.
- Apply httpguts.ValidHeaderFieldName/Value to headers loaded from
  ~/.radar/config.json (the CLI Set path already validates). Invalid
  entries are dropped with a startup log line instead of failing at
  request time.
@nadaverell nadaverell merged commit c685e34 into main May 20, 2026
8 checks passed
@nadaverell nadaverell deleted the feature/prometheus-headers branch May 20, 2026 10:59

Development

Successfully merging this pull request may close these issues.

feat: add prometheusHeaders config option for auth-protected Prometheus backends
