Skip to content

block 2: security & auth hardening (CSP, baseline headers, config validation, secret redaction, req_id slog)#68

Merged
aksOps merged 5 commits intomainfrom
feat/block2-security
Apr 24, 2026
Merged

block 2: security & auth hardening (CSP, baseline headers, config validation, secret redaction, req_id slog)#68
aksOps merged 5 commits intomainfrom
feat/block2-security

Conversation

@aksOps
Copy link
Copy Markdown
Contributor

@aksOps aksOps commented Apr 24, 2026

Summary

Implements Block 2 of the docsiq production-polish roadmap. Five independent server-side changes that harden the HTTP boundary and config loader.

  • 2.1 CSP header (Task 3) — securityHeadersMiddleware sets a strict CSP on every non-OPTIONS response, wired as the outermost router wrapper so 401/404/500 responses also carry it.
  • 2.2 Baseline security headers (Task 4) — extends the same middleware with X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin, Permissions-Policy (disables camera/mic/geo/payment/usb), and Strict-Transport-Security gated by server.hsts_enabled (default off; operators behind TLS set DOCSIQ_SERVER_HSTS_ENABLED=true).
  • 2.3 Secret scrubbing (Task 2) — Config.Redact() + secret:"true" struct tag on every api_key field; reflective deep-copy zeroes tagged strings. Audited existing log sites — no whole-struct log sites found, so no log-call changes were required. Helper is available for future code.
  • 2.4 Config validation (Task 1) — viper.UnmarshalExact rejects unknown/stale YAML keys at startup; validateLLM enforces per-provider minimums (Azure endpoint + api_key + api_version; OpenAI api_key; Ollama base_url). "none" and empty provider are accepted as explicit opt-out.
  • 2.5 Request-ID slog threading (Task 5) — additive ContextLogger(ctx) helper returns slog.Default().With("req_id", id) when the context carries a request ID. Refactored 4 representative error-path log sites in handlers.go (writeError 5xx, upload invalid-filename, upload path-traversal, oversize upload rejection). Full sweep deferred to Block 4 observability work.

Plan deviations

  1. validateLLM accepts "none" in addition to "" (existing code treats provider: "none" as explicit LLM opt-out with a test — TestProviderNone — which the plan's validator would have broken).
  2. TestLoad_ValidatesLLMProvider/ollama_missing_base_url uses explicit base_url: "" in YAML because the default ollama.base_url = http://localhost:11434 makes "plain absence of key" a passing config.
  3. Pre-existing config_file_load_yaml subtest loaded an Azure config with no endpoint/api_key; updated it to include both so the new validator passes.
  4. Fixed pre-existing gofmt drift in internal/config/config.go (drift pre-dated Block 2; fixed because Tasks 1/2/4 were all touching this file).
  5. Playwright smoke (Task 3 Step 7) not run locally; CI Playwright job will catch any CSP-triggered regression. Vitest UI suite (54 tests) ran locally and passed.

Test plan

  • CGO_ENABLED=1 go test -tags sqlite_fts5 -timeout 300s ./... - all packages pass
  • CGO_ENABLED=1 go test -tags sqlite_fts5 -race ./internal/api/ ./internal/config/ - pass
  • go vet -tags sqlite_fts5 ./... - clean
  • gofmt -l on touched files - clean
  • cd ui && npm test -- --run - 54/54 pass
  • CI Playwright smoke on main post-merge (gates CSP against the SPA)

Follow-ups (later blocks)

  • Block 4 (observability): wider sweep refactoring remaining slog.* sites in handlers to ContextLogger(r.Context()).
  • Block 4: consider ULID vs hex for request IDs (currently 16-char hex from crypto/rand — collision-safe for air-gapped single-binary deploy; not worth changing now).

aksOps and others added 5 commits April 24, 2026 02:00
- Load now uses viper.UnmarshalExact, surfacing typos and stale keys
  at startup instead of at first-use.
- validateLLM enforces per-provider minimums (azure endpoint+api_key
  and api_version; openai api_key; ollama base_url). Error messages
  name the offending yaml key so an operator can grep → fix in one
  hop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Config.Redact() returns a deep copy with api_key fields cleared. Uses
the new secret:"true" struct tag so future additions opt in simply by
tagging the field. Audited existing log sites; none logged the full
Config struct, so no handler changes were required — helper remains
available for future code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets a strict CSP matching the air-gapped deployment posture on every
non-OPTIONS response: default-src 'self' everywhere; inline scripts
banned; WASM allowed for shiki; inline styles allowed for Tailwind/
shadcn; frame-ancestors 'none' to block clickjacking. Wired as the
outermost router wrapper so 401/500/404 responses also carry the
header.

Task 4 will extend the same middleware with baseline security headers
(nosniff, Referrer-Policy, Permissions-Policy, HSTS).

Playwright smoke not run locally; CI Playwright job will catch any
CSP-triggered regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends securityHeadersMiddleware with:
- X-Content-Type-Options: nosniff
- Referrer-Policy: strict-origin-when-cross-origin
- Permissions-Policy: disables camera/mic/geo/payment/usb
- Strict-Transport-Security: 1y max-age + includeSubDomains, gated by
  server.hsts_enabled (default off) so HTTP-only dev stays unaffected.

Operators behind a TLS terminator or direct TLS should set
DOCSIQ_SERVER_HSTS_ENABLED=true.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an additive helper next to the existing RequestIDFromContext:
ContextLogger(ctx) returns slog.Default() enriched with req_id=<id>
when the context carries one from loggingMiddleware. Refactored 4
representative error-path log sites in handlers.go (writeError 5xx
path, upload invalid-filename, upload path-traversal, oversize upload
rejection) to use it.

The rest of the codebase keeps using slog.Default() directly — a
wider sweep is tracked as tech debt for the observability sweep
(Block 4).

Also gofmt'd config.go (drift was pre-existing on main; fixed in-place
since this file was touched in Task 1/2/4).

Note: the roadmap's 2.5 bullet mentioned ULID. The existing
request_id.go generates 16-char hex from crypto/rand which is
collision-safe for the air-gapped single-binary deploy. Not worth
changing as part of this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aksOps aksOps enabled auto-merge (squash) April 24, 2026 02:13
@aksOps aksOps merged commit f485e23 into main Apr 24, 2026
11 checks passed
@aksOps aksOps deleted the feat/block2-security branch April 24, 2026 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant