Skip to content

fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28)#49

Merged
thinmintdev merged 2 commits into
mainfrom
fix/issue-28-caddy-public-paths
May 17, 2026
Merged

fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28)#49
thinmintdev merged 2 commits into
mainfrom
fix/issue-28-caddy-public-paths

Conversation

@thinmintdev
Copy link
Copy Markdown
Contributor

Closes #28

Summary

The Caddy handle {} block applied basicauth to every path that didn't match /chat* or /v1/*, including the entire FastAPI PUBLIC_PATHS frozenset. The browser-side first-run wizard hits /api/install/state and /api/config/urls before any credential can exist, so hal0 install --auth=basic deployments were unbootstrappable from a browser.

Add a @public named matcher listing every entry from src/hal0/api/middleware/auth.py:PUBLIC_PATHS, paired with a handle @public { reverse_proxy 127.0.0.1:8080 } block placed before the default basicauth handler. Caddy evaluates handle blocks in source order (first match wins), so the matcher must precede the default block — that ordering is exactly what was wrong before.

path matchers in Caddy are exact-match (no prefix expansion unless * is given), which mirrors the request.url.path in PUBLIC_PATHS check on the API side. Two sources of truth stay in lockstep.

Files changed

  • packaging/caddy/Caddyfile.template — add @public path ... matcher + handle @public block before the default basicauth handler.
  • tests/harness/installer-test.sh — strengthen the auth-basic row to assert the rendered /etc/hal0/Caddyfile contains @public path, handle @public, and /api/install/state. A future edit that drops the allowlist (or re-orders it below the default handle) will be caught here.

Verification

Rendered the template with the same Python substitution the installer uses, ran caddy validate (Valid configuration), then booted Caddy against a stub backend on :18080 and probed 26 paths:

=== Public paths without creds (expect 200) ===
  PASS  public /api/health/system                          HTTP=200
  PASS  public /api/status                                 HTTP=200
  PASS  public /api/metrics                                HTTP=200
  PASS  public /api/metrics/prometheus                     HTTP=200
  PASS  public /api/features                               HTTP=200
  PASS  public /api/install/state                          HTTP=200
  PASS  public /api/install/complete                       HTTP=200
  PASS  public /api/config/urls                            HTTP=200
  PASS  public /api/auth/status                            HTTP=200
  PASS  public /api/auth/login                             HTTP=200
  PASS  public /api/auth/logout                            HTTP=200
  PASS  public /v1/models                                  HTTP=200

=== Protected paths without creds (expect 401) ===
  PASS  protected /api/slots                               HTTP=401
  PASS  protected /api/models                              HTTP=401
  PASS  protected /api/auth/tokens                         HTTP=401
  PASS  protected /                                        HTTP=401
  PASS  protected /api/install/cancel                      HTTP=401
  PASS  protected /api/logs/tail                           HTTP=401
  PASS  protected /api/settings                            HTTP=401

=== Protected paths with WRONG creds (expect 401) ===
  PASS  wrong-creds /api/slots                             HTTP=401
  PASS  wrong-creds /                                      HTTP=401

=== Protected paths with RIGHT creds (expect 200) ===
  PASS  right-creds /api/slots                             HTTP=200
  PASS  right-creds /api/auth/tokens                       HTTP=200
  PASS  right-creds /                                      HTTP=200

=== /v1/* passthrough (no basicauth, expect 200) ===
  PASS  v1-pass /v1/chat/completions                       HTTP=200
  PASS  v1-pass /v1/embeddings                             HTTP=200

===== TOTAL: PASS=26 FAIL=0 =====

The local installer harness auth-basic row could not be exercised end-to-end on this host because preflight blocked on /var/lib disk space and port 8080 (the host already runs an OpenWebUI dev process on 8080). Those preflights are unrelated to the fix; the harness's new grep-based assertions were validated against the rendered output above.

Acceptance criteria

  • Rendered Caddyfile contains a @public matcher block above the default handle, forwarding the public-path list to 127.0.0.1:8080 without basic_auth.
  • curl https://.../api/install/state returns 200 without credentials (verified against live Caddy with stub backend).
  • curl https://.../api/status returns 200 without credentials (same).
  • Default authenticated paths still 401 without creds (/api/slots and friends, verified).
  • Deferred to a follow-up: new Playwright γ spec opening the FirstRun wizard through Caddy without basic_auth — the e2e suite scaffolding is out of scope for this critical fix. The 26-probe harness above covers the wire-level contract; the Playwright spec belongs with the broader first-run wizard work tracked separately.
  • Installer regression test confirms the rendered Caddyfile contains the @public block when --auth=basic is used (new asserts in tests/harness/installer-test.sh row auth-basic).

Follow-up notes (not addressed here, per scope gate)

  • caddy validate warns that the basicauth directive is deprecated in favour of basic_auth in Caddy ≥ 2.10. Pre-existing on both protected blocks; rename is a small but separate sweep.
  • PUBLIC_PATHS in auth.py and the @public path … list in this template are duplicated by design but must be kept in sync by hand. A render-time generator (or unit test that diffs the two lists) would lock that contract — out of scope here. The harness assertion only checks structural presence + /api/install/state, not every path.
  • FINDINGS §21 notes /api/metrics/prometheus is in PUBLIC_PATHS but the route is unimplemented; left as-is to avoid coupling this fix to the route catalogue.

Test plan

  • caddy validate against rendered template — clean (one pre-existing deprecation warning).
  • Live Caddy + stub backend: 12 public paths → 200, 7 protected paths → 401, wrong-creds → 401, right-creds → 200, /v1/* passthrough → 200.
  • CI harness auth-basic row passes on a host where preflight succeeds (depends on the existing CI runner config).

The dashboard `handle {}` block applied basicauth to every path that
didn't match `/chat*` or `/v1/*`, including the entire FastAPI
PUBLIC_PATHS frozenset (`/api/install/state`, `/api/config/urls`,
`/api/auth/status`, `/api/status`, `/api/metrics`, …). The browser-side
first-run wizard hits those endpoints before any credential can exist,
so a `hal0 install --auth=basic` deployment was unbootstrappable.

Add a `@public` named matcher with the exact PUBLIC_PATHS list and a
`handle @public { reverse_proxy 127.0.0.1:8080 }` block placed BEFORE
the default basicauth handler. Caddy evaluates `handle` blocks in
source order — first match wins — so the matcher must precede the
default block (the prior arrangement is what produced the bug).

Path matching is exact (Caddy's `path` matcher is full-path match
unless a `*` is given), which mirrors `request.url.path in
PUBLIC_PATHS` on the API side, so the two stay in lockstep.

The harness `auth-basic` row now asserts the rendered `/etc/hal0/Caddyfile`
carries the `@public path` matcher + `handle @public` block + at least
the `/api/install/state` entry, so a future edit that drops the
allowlist (or moves it below the default handle) is caught.

Verified locally: rendered the template, started Caddy against a stub
backend, ran 26 curl probes. All public paths return 200 without
creds; all protected paths return 401 without creds, 401 with wrong
creds, 200 with right creds; `/v1/*` Bearer-passthrough unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Coordinated with #53 which removes prometheus from PUBLIC_PATHS in
src/hal0/api/middleware/auth.py — closes the dead /api/metrics/prometheus
route entirely rather than allowlisting a 404 at the edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev merged commit 929acd3 into main May 17, 2026
0 of 6 checks passed
thinmintdev added a commit that referenced this pull request May 17, 2026
PUBLIC_PATHS advertised /api/metrics/prometheus as a public route, but
the health module only defines /api/metrics (JSON). The Prometheus
path returned 404 for everyone, including monitoring scrapers that
followed the documented bypass list — confusing operator-facing
behaviour ("your allowlist says this is a public endpoint but it
404s").

Drop the entry from PUBLIC_PATHS and remove the stale docstring line
in health.py that promised a 'text/plain prometheus exposition' route.
A real prometheus_client-backed exporter is out of scope for this PR;
the comment in PUBLIC_PATHS calls out that adding it back means landing
the route AND syncing the Caddyfile @public matcher at the same time.

Note: the matching Caddyfile @public path entry lives in PR #49 which
hasn't merged yet; that PR's branch should be rebased to drop the
prometheus line from its @public list before merging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 17, 2026
…/updates-apply (closes #34 #35 #36 #37) (#53)

* fix(v1/audio): 400 instead of 404 when 'model' is missing (closes #34)

POSTing /v1/audio/speech without a 'model' field used to fall through
the dispatcher's default-model + no-route path and surface a confusing
404 'dispatch.no_route'. OpenAI's reference returns 400 with a clear
"you must provide a model parameter" message — that's also how the
operator survey expects it.

Raise hal0.errors.BadRequest (validation.invalid) up front in
/v1/audio/speech and /v1/audio/transcriptions when the request body
(JSON for speech, multipart for transcriptions) omits 'model' or
provides an empty/whitespace-only value. _forward_multipart gains a
require_model flag so the multipart path can opt-in without changing
the others. _dispatch_and_forward now optionally accepts a pre-read
body so the speech route doesn't re-parse JSON twice.

The other /v1/* paths (chat, completions, embeddings, rerankings,
images) are unchanged — they continue to use the dispatcher's
default-model fallback for legacy clients.

Tests: three new cases in test_v1_dispatch.py — speech missing model,
speech with whitespace-only model, transcriptions missing model. All
assert 400 + 'validation.invalid' envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(slots): 404 (not 400) for unknown slot in get_config (closes #35)

GET /api/slots/<unknown>/config used to surface 400 'slot.config_error'
because _load_slot_config conflated "no TOML on disk and no in-memory
state" with a parse error. A UI distinguishing "slot doesn't exist"
from "slot exists but its config is broken" got ambiguous signals.

SlotManager.get_config now raises SlotNotFound (404, code
'slot.not_found') when neither the config file nor an in-memory state
record exists for the requested slot name. A real TOML parse failure
on a slot whose file IS on disk continues to raise SlotConfigError
(400, code 'slot.config_error') unchanged.

Tests: two new cases in test_slots_routes.py — unknown slot → 404
'slot.not_found' and existing slot with broken TOML → 400
'slot.config_error'. Existing get_slot/load lifecycle tests
unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(api): remove /api/metrics/prometheus from PUBLIC_PATHS (closes #36)

PUBLIC_PATHS advertised /api/metrics/prometheus as a public route, but
the health module only defines /api/metrics (JSON). The Prometheus
path returned 404 for everyone, including monitoring scrapers that
followed the documented bypass list — confusing operator-facing
behaviour ("your allowlist says this is a public endpoint but it
404s").

Drop the entry from PUBLIC_PATHS and remove the stale docstring line
in health.py that promised a 'text/plain prometheus exposition' route.
A real prometheus_client-backed exporter is out of scope for this PR;
the comment in PUBLIC_PATHS calls out that adding it back means landing
the route AND syncing the Caddyfile @public matcher at the same time.

Note: the matching Caddyfile @public path entry lives in PR #49 which
hasn't merged yet; that PR's branch should be rebased to drop the
prometheus line from its @public list before merging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(api/updates): return 202 from /api/updates/apply (closes #37)

POST /api/updates/apply accepts the request, queues a background job,
and returns the job snapshot immediately — same shape as
/api/models/{id}/pull (which already returns 202). Returning 200 for an
incomplete-but-accepted operation was a minor cross-route inconsistency.

Add ``status_code=202`` to the @router.post decorator and update the
docstring to call out the async-job contract explicitly. Failure paths
(invalid input, rate-limit) continue to surface their existing 4xx
codes through the error envelope middleware unchanged.

Test: update the existing apply-creates-queued-job assertion from 200
to 202. No client-facing breakage — every reasonable client treats
2xx as success.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 17, 2026
…d C — refs #54) (#60)

Wave 2, Child C of ADR-0001. Documentation pass for the auth collapse
shipped in PRs #58 (Child A — FastAPI password + session cookies +
dual cookie/Bearer middleware) and #59 (Child B — Caddyfile reduction
+ PUBLIC_PATHS deletion + --no-tls flag + HAL0_PUBLIC_HOST /
HAL0_HARNESS_TLS rename).

installer/README.md
-------------------
Rewrites the auth section to describe the single FastAPI auth layer.
Drops every reference to --auth=basic / HAL0_ADMIN_USER /
HAL0_ADMIN_PASSWORD / HAL0_HOSTNAME / Caddy basic_auth / htpasswd.
Documents the --no-tls flag (FastAPI binds 0.0.0.0:8080, reachable at
http://<host>:8080/). Renames HAL0_HOSTNAME to HAL0_PUBLIC_HOST in the
env-var table. Adds an "Upgrade notes (pre-v1)" subsection explaining
that existing --auth=basic installs lose edge auth on upgrade and the
two mitigations — set a password in the wizard, or --no-tls behind your
own reverse proxy. Calls out the wizard's password-setup step (POST
/api/auth/password, public on first run per Child A).

PLAN.md
-------
§1 "Auth + reverse proxy" rewritten to reflect the single FastAPI layer
and a "Trust posture" subsection: hal0 defaults to open on the LAN;
password auth is opt-in via the dashboard wizard; programmatic clients
use Bearer tokens unchanged from #29. Drops the Caddy basic_auth /
PUBLIC_PATHS prose; narrows Caddy's scope to TLS termination + reverse
proxy. §10 (harness flags) renames HAL0_HARNESS_AUTH → HAL0_HARNESS_TLS
to match what the harness scripts actually look for.

docs/api-errors.md
------------------
Adds a brief note in the 401 section linking to ADR-0001 / PR #58 and
naming the new endpoints (POST /api/auth/login, /api/auth/logout,
/api/auth/password). The envelope shapes themselves are unchanged.

tests/harness/FINDINGS.md
-------------------------
Prepends "FIXED BY ARCHITECTURE REMOVAL (ADR-0001)" notes to the three
historical entries that the auth collapse renders structurally
unrepeatable: §10 (Caddy basic_auth swallows PUBLIC_PATHS — the
original #28 critical bug, fixed in PR #49 and now historical because
Caddy no longer has matchers or basicauth per PR #59), §16 (basic_auth
password unrecoverable post-install — source of the #43 HITL decision,
fixed by deletion because credential capture moved into the wizard per
PRs #58 + #59), and §21 (/api/metrics/prometheus orphan in
PUBLIC_PATHS — fixed by deletion because PUBLIC_PATHS is gone). The
original report bodies are preserved verbatim below each note for
historical reference. Re-run instructions updated to the new
HAL0_HARNESS_TLS knob.

README.md, tests/harness/README.md
----------------------------------
Sync the auth-posture summary in the repo root README and the harness
opt-in flags table to match the new single-FastAPI model and the
HAL0_HARNESS_TLS rename. These weren't called out in the spec but were
left stale after Child B; updating them keeps the docs internally
consistent.

Closes #43 and #51 (per the parent ADR plan). Issue close comments
follow the merge via the gh CLI; this PR body is the GH-semantics
hook in case the manual close doesn't land.

closes #43
closes #51
refs #54
refs #57

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev deleted the fix/issue-28-caddy-public-paths branch May 21, 2026 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Caddy basic_auth swallows PUBLIC_PATHS allowlist — first-run wizard unbootstrappable

1 participant