
fix(security): close moodLog SSRF, central error redaction, citation-drift fixes#148

Merged
MBombeck merged 1 commit into main from
fix/v141-security-audit-findings
May 8, 2026



@MBombeck MBombeck commented May 8, 2026

Summary

Z-final.1 audit pass (security + truthfulness + performance + test theatre + i18n) turned up:

  • 1 CRITICAL (moodLog SSRF — exploitable cloud-metadata leak)
  • 2 HIGH (Telegram bot token leaking via Glitchtip; idempotency replay race)
  • 4 truthfulness drift items (ESC/ESH year, steps source mislabel, Saint-Maurice plateau)
  • 3 P0 performance items, 5 medium security, 7 low nits, 5 test-theatre minors

This PR fixes everything that's safe to land in a v1.4.1 hardening release. The deeper architectural items (idempotency transactional rewrite, encryption-key NODE_ENV fallback, refresh-token reuse-detection serialisation, moodLog HMAC lookup column, recharts dynamic split for /insights, MoodEntry index migration, glucose-history bound) are tracked for v1.4.2 in a separate followup-issues doc that will land alongside this PR.

What changes

CRITICAL — moodLog SSRF

  • src/lib/validations/moodlog.ts refines the credentials URL with isPublicUrl() so an authenticated user cannot store http://169.254.169.254/ or RFC1918 hosts.
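  • src/lib/moodlog/sync.ts re-checks isPublicUrl(baseUrl) at the fetch site, so legacy rows stored before the guard are also refused, and switches to redirect: "manual", treating any 3xx response as a failure so a public host cannot redirect the request, with the user's apiKey, to an internal target.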
  • New unit suite src/lib/validations/__tests__/moodlog.test.ts asserts rejection of 169.254.x.x, 10.x, 172.16.x.x, 192.168.x.x, 127.0.0.1, and localhost.
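The kind of check described above can be sketched as follows. This is a minimal illustration and not the PR's actual isPublicUrl(): the function name isPublicUrlSketch, the exact range list, and the decision to refuse every IPv6 literal are assumptions made for the example. It also inspects only the literal host, so a hostname that resolves to a private address would need a separate DNS-level check.

```typescript
// Hypothetical sketch of an SSRF guard; not the code shipped in this PR.
const IPV4_LITERAL = /^\d{1,3}(?:\.\d{1,3}){3}$/;

function isNonPublicIPv4(host: string): boolean {
  const parts = host.split(".").map(Number);
  const a = parts[0] ?? -1;
  const b = parts[1] ?? -1;
  return (
    a === 0 ||                           // "this network"
    a === 10 ||                          // 10.0.0.0/8 (RFC 1918)
    a === 127 ||                         // loopback
    (a === 169 && b === 254) ||          // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 (RFC 1918)
    (a === 192 && b === 168)             // 192.168.0.0/16 (RFC 1918)
  );
}

function isPublicUrlSketch(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not parseable as an absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (IPV4_LITERAL.test(host)) return !isNonPublicIPv4(host);
  // Assumption for this sketch: refuse all bracketed IPv6 literals outright.
  if (host.startsWith("[")) return false;
  return true;
}
```

A validation schema could call such a function in a refinement step on the credentials URL, and the sync code could call it again immediately before the outbound request.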

HIGH — error-message secret leakage

  • src/lib/logging/redact.ts (new) — central redactor for Bearer …, Telegram bot<digits>:<token> URLs, and query-string secrets (?secret=, ?code=, ?token=, ?api_key=).
  • WideEventBuilder.setError() now scrubs err.message and err.stack before they reach Loki.
  • reportToGlitchtip() now scrubs the same fields before they reach the incident UI (defence in depth — different code path, same risk).
  • New unit suite src/lib/logging/__tests__/redact.test.ts covers the four patterns and the deliberate over-redaction trade-off (Bearer \S+ greedily consumes — safe).
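A central redactor along these lines might look like the sketch below. The function name redactSecretsSketch, the regular expressions, and the "[REDACTED]" placeholder are assumptions for illustration; the actual patterns in redact.ts are not reproduced here.

```typescript
// Hypothetical sketch of a central redactor; patterns are illustrative only.
const REDACTED = "[REDACTED]";

function redactSecretsSketch(input: string): string {
  return input
    // "Bearer <token>": \S+ consumes everything up to the next whitespace,
    // which can swallow trailing punctuation (deliberate over-redaction).
    .replace(/\b(Bearer)\s+\S+/gi, `$1 ${REDACTED}`)
    // Telegram Bot API URLs of the form .../bot<digits>:<token>/...
    .replace(/\bbot\d+:[A-Za-z0-9_-]+/g, `bot${REDACTED}`)
    // Query-string secrets after "?" or "&".
    .replace(/([?&](?:secret|code|token|api_key)=)[^&#\s]+/gi, `$1${REDACTED}`);
}

// Applies the redactor to the two error fields that end up in logs.
function scrubErrorSketch(err: Error): { message: string; stack?: string } {
  return {
    message: redactSecretsSketch(err.message),
    stack: err.stack === undefined ? undefined : redactSecretsSketch(err.stack),
  };
}
```

Calling one such helper from both the logging path and the incident-reporting path keeps the pattern list in a single place, so a newly discovered secret format only has to be added once.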

Truthfulness drift (audit found 0 hallucinations, 4 drift)

  • messages/{en,de}.json bpClassificationTitle and avg30dEsc: ESC/ESH 2018 → ESH 2023 (user-visible on the dashboard + doctor-report PDF).
  • src/lib/analytics/classifications.ts: section header, getBpTargetsByAge() docblock, and the 65+ comment all updated to ESH 2023 with the "numerically unchanged from joint 2018" caveat.
  • src/app/api/insights/targets/route.ts:375: steps target source: "WHO" → "Saint-Maurice JAMA 2020". AI-prompt anti-pattern guard already forbade WHO; this surface was the last drift.
  • Saint-Maurice "mortality plateau 8000-12000" attribution softened to "continued dose-response benefit through ~12,000 steps/day, not a plateau" everywhere it appears (medical-citations.ts, effective-range.ts, classifications.ts, both AI prompts). The original JAMA 2020 paper shows continuing benefit (HR 0.49 at 8k vs 4k, HR 0.35 at 12k); the plateau-shaped finding belongs to Paluch 2022 Lancet Public Health, not Saint-Maurice.

Quality gates

  • pnpm typecheck: clean
  • pnpm test: 669 / 669 pass (was 658; +11 from new suites)
  • pnpm test:integration: 10 / 10 (untouched)
  • pnpm lint: 0 errors

Audit reports

The full audit findings live under ~/infra/reports/v14-{security,truthfulness,performance,test-theatre,i18n-drift}-audit.md. The followup-issues file with the items deferred to v1.4.2 will be written alongside this PR's merge.

🤖 Generated with Claude Code

…tion drift

The v1.4.1 audit pass turned up one CRITICAL, two HIGH, and four
medical-citation drift findings. This commit addresses everything
that's safe to land in a hardening release; deeper architectural
fixes (idempotency-race transactional rewrite, encryption-key
fallback gate, refresh-token reuse-detection serialisation, moodLog
secret HMAC lookup column) are tracked for v1.4.2 in
docs/ops/v141-followup-issues.md.

Security
--------

CRITICAL — moodLog SSRF
  src/lib/validations/moodlog.ts now refines the credentials URL
  with isPublicUrl() so an authenticated user cannot store
  http://169.254.169.254/ (cloud metadata) or RFC1918 hosts.
  src/lib/moodlog/sync.ts re-checks isPublicUrl(baseUrl) at the
  fetch site (so legacy rows stored before the credential guard are
  also refused) and switches to redirect: "manual" + 3xx → fail so a
  public host cannot 302 the request to an internal target with the
  user's apiKey on the redirect hop. New unit test
  src/lib/validations/__tests__/moodlog.test.ts asserts the guard
  rejects 169.254.x.x, 10.x, 172.16.x.x, 192.168.x.x, 127.0.0.1, and
  localhost.
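
  The redirect handling described above could be sketched roughly as
  follows. The helper name fetchNoRedirectSketch, the injectable
  fetchImpl parameter, and sending the apiKey as a Bearer header are
  assumptions for illustration, not the code in sync.ts.

```typescript
// Hypothetical sketch: refuse to follow redirects on a credentialed request.
type FetchLike = (url: string, init: RequestInit) => Promise<Response>;

async function fetchNoRedirectSketch(
  url: string,
  apiKey: string,
  fetchImpl: FetchLike = fetch, // injectable so the sketch can be tested offline
): Promise<Response> {
  const res = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${apiKey}` }, // header shape is assumed
    redirect: "manual", // never follow a 3xx automatically
  });
  // Server-side fetch exposes the 3xx status directly; browsers report an
  // "opaqueredirect" response instead, so both cases are treated as failure.
  const isRedirect =
    res.type === "opaqueredirect" || (res.status >= 300 && res.status < 400);
  if (isRedirect) {
    throw new Error(`moodLog sync refused redirect (HTTP ${res.status})`);
  }
  return res;
}
```

  Because the redirect is never followed, the credential header is only
  ever sent to the host that passed the public-URL check.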

HIGH — error-message secret leakage
  New src/lib/logging/redact.ts redacts Bearer tokens (incl. our
  hlk_/hlr_), Telegram bot URLs (bot<digits>:<token>), and
  query-string secrets (?secret=, ?code=, ?token=, ?api_key=). The
  redactor is called from WideEventBuilder.setError() — every error
  reaching Loki is scrubbed once, centrally — and from the
  reportToGlitchtip() call in api-handler.ts so the Glitchtip
  incident UI is also protected. New unit suite covers the four
  patterns plus over-redaction (Bearer's \S+ greedily consumes
  anything up to the next whitespace, which is the safer side of
  the trade-off).

Truthfulness — citation drift fixes (audit found 0 hallucinations,
4 drift items)
  - messages/{en,de}.json bpClassificationTitle and avg30dEsc:
    "ESC/ESH 2018" → "ESH 2023". User-visible on the dashboard and
    doctor-report PDF. The numeric thresholds are unchanged from
    2018; the 2023 update is ESH-only since ESC withdrew from
    joint authoring.
  - src/lib/analytics/classifications.ts: section header,
    getBpTargetsByAge() docblock, and inline comment updated to
    cite ESH 2023 with the same "numerically unchanged" caveat.
  - src/app/api/insights/targets/route.ts:375: source label "WHO"
    → "Saint-Maurice JAMA 2020". The AI-prompt anti-pattern guard
    explicitly forbids citing WHO for a step number; this surface
    was the last "WHO" mislabel in the codebase.
  - Saint-Maurice "mortality plateau 8000-12000" attribution
    softened to "continued dose-response benefit through ~12,000
    steps/day, not a plateau" in src/lib/medical-citations.ts,
    src/lib/analytics/effective-range.ts,
    src/lib/analytics/classifications.ts, and both AI prompts. The
    original JAMA 2020 paper shows continuing benefit (HR 0.49 at
    8k vs 4k, HR 0.35 at 12k); the plateau-shaped finding belongs
    to Paluch 2022 Lancet PH (PMID 35247352), not Saint-Maurice.

Quality gates
  pnpm typecheck         clean
  pnpm test              669 / 669 (was 658; +11 from new suites)
  pnpm test:integration  10 / 10 (untouched)
  pnpm lint              0 errors

Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>
@MBombeck MBombeck merged commit 7035388 into main May 8, 2026
6 of 7 checks passed
@MBombeck MBombeck deleted the fix/v141-security-audit-findings branch May 8, 2026 14:30
MBombeck added a commit that referenced this pull request May 8, 2026
Production at healthlog.bombeck.io has been 503-ing since the v1.4.1
deploys started landing on apps-01 (Coolify). The container boots —
Next.js prints "Ready" and the pg-boss workers run — but never
accepts HTTP on :3000, so the Docker healthcheck fails and Traefik
takes the upstream out of rotation. A manual restart, a Coolify
force-rebuild, and a docker-compose pin to the GHCR :1.4.0 multi-arch
image all failed to bring the site back up — Coolify rebuilds the
image from main HEAD on every deploy regardless of the compose
directives.

This commit resets the working tree to commit 21bd46d (v1.4.0
release). Same content that's been running for self-hosters since
yesterday's tag-and-release. The next Coolify deploy will build
from this tree and produce a healthy container.

The v1.4.1 work is NOT lost:
  - PRs #144, #145, #137, #146, #147, #148, #149, #150 remain in
    git history.
  - Their commits are still tagged (`v1.4.1`), still on the GHCR
    multi-arch image (`ghcr.io/mbombeck/healthlog:1.4.1`), still in
    the GitHub Release notes.
  - Self-hosters who have already pulled the v1.4.1 image keep it.
  - Local development continues from main HEAD with the v1.4.1
    code — the regression only surfaced under the Coolify build
    flow.

Re-applying v1.4.1 to production will need a separate cycle to
reproduce the runtime failure under the Coolify build path. That
work is tracked in docs/ops/v141-followup-issues.md (added back
when the tree is reapplied) and the deploy gating in
.github/workflows/e2e.yml will catch this class of bug going
forward.

No DB migration. No env-var change. No API contract change.
Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>