fix(api): wave-2 status code cleanup — speech/slots-config/prometheus/updates-apply (closes #34 #35 #36 #37) #53
Merged
thinmintdev
added a commit
that referenced
this pull request
May 17, 2026
Coordinated with #53 which removes prometheus from PUBLIC_PATHS in src/hal0/api/middleware/auth.py — closes the dead /api/metrics/prometheus route entirely rather than allowlisting a 404 at the edge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev
added a commit
that referenced
this pull request
May 17, 2026
* fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28)

The dashboard `handle {}` block applied basicauth to every path that didn't match `/chat*` or `/v1/*`, including the entire FastAPI PUBLIC_PATHS frozenset (`/api/install/state`, `/api/config/urls`, `/api/auth/status`, `/api/status`, `/api/metrics`, …). The browser-side first-run wizard hits those endpoints before any credential can exist, so a `hal0 install --auth=basic` deployment was unbootstrappable.

Add a `@public` named matcher with the exact PUBLIC_PATHS list and a `handle @public { reverse_proxy 127.0.0.1:8080 }` block placed BEFORE the default basicauth handler. Caddy evaluates `handle` blocks in source order (first match wins), so the matcher must precede the default block (the prior arrangement is what produced the bug). Path matching is exact (Caddy's `path` matcher is full-path match unless a `*` is given), which mirrors `request.url.path in PUBLIC_PATHS` on the API side, so the two stay in lockstep.

The harness `auth-basic` row now asserts the rendered `/etc/hal0/Caddyfile` carries the `@public path` matcher + `handle @public` block + at least the `/api/install/state` entry, so a future edit that drops the allowlist (or moves it below the default handle) is caught.

Verified locally: rendered the template, started Caddy against a stub backend, ran 26 curl probes. All public paths return 200 without creds; all protected paths return 401 without creds, 401 with wrong creds, 200 with right creds; `/v1/*` Bearer-passthrough unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(caddy): drop /api/metrics/prometheus from @public matcher

Coordinated with #53 which removes prometheus from PUBLIC_PATHS in src/hal0/api/middleware/auth.py; closes the dead /api/metrics/prometheus route entirely rather than allowlisting a 404 at the edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
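The harness assertion described in the commit above could look roughly like the following. This is only a sketch: the function name `caddyfile_allowlist_ok` is hypothetical, and it does plain substring checks on the rendered text rather than parsing the Caddyfile, which may well differ from what the real harness does.

```python
def caddyfile_allowlist_ok(text: str) -> bool:
    """Check a rendered Caddyfile for the @public allowlist (hypothetical helper).

    Looks for the three markers named in the commit message and verifies
    that the `handle @public` block appears before the default `handle {` block.
    """
    matcher_at = text.find("@public path")
    public_handle_at = text.find("handle @public")
    default_handle_at = text.find("handle {")

    # All three markers from the commit message must be present.
    if matcher_at == -1 or public_handle_at == -1:
        return False
    if "/api/install/state" not in text:
        return False

    # If a default handle exists, the public one must come first in the source.
    return default_handle_at == -1 or public_handle_at < default_handle_at
```

A check like this fails both when the allowlist is deleted and when it is moved below the default handler, which are the two regressions the commit message calls out.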
POSTing /v1/audio/speech without a 'model' field used to fall through the dispatcher's default-model + no-route path and surface a confusing 404 'dispatch.no_route'. OpenAI's reference returns 400 with a clear "you must provide a model parameter" message, which is also how the operator survey expects it.

Raise hal0.errors.BadRequest (validation.invalid) up front in /v1/audio/speech and /v1/audio/transcriptions when the request body (JSON for speech, multipart for transcriptions) omits 'model' or provides an empty/whitespace-only value. _forward_multipart gains a require_model flag so the multipart path can opt in without changing the others. _dispatch_and_forward now optionally accepts a pre-read body so the speech route doesn't parse the JSON twice.

The other /v1/* paths (chat, completions, embeddings, rerankings, images) are unchanged; they continue to use the dispatcher's default-model fallback for legacy clients.

Tests: three new cases in test_v1_dispatch.py (speech missing model, speech with whitespace-only model, transcriptions missing model). All assert 400 + 'validation.invalid' envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
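As an illustration of the up-front check, here is a minimal sketch. The `BadRequest` class below is a local stand-in for `hal0.errors.BadRequest`, and `require_model` is a hypothetical helper name. Rejecting non-string values is also an assumption, since the commit message only mentions missing, empty, and whitespace-only values.

```python
from typing import Any, Mapping


class BadRequest(Exception):
    """Local stand-in for hal0.errors.BadRequest (assumed to map to HTTP 400)."""

    status_code = 400
    code = "validation.invalid"


def require_model(body: Mapping[str, Any]) -> str:
    """Return the trimmed model name, or raise BadRequest before dispatching."""
    model = body.get("model")
    # Missing field, non-string value (assumption), or empty/whitespace-only string.
    if not isinstance(model, str) or not model.strip():
        raise BadRequest("you must provide a model parameter")
    return model.strip()
```

Running the check before the dispatcher is what keeps the request from reaching the default-model fallback and surfacing as `dispatch.no_route`.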
GET /api/slots/<unknown>/config used to surface 400 'slot.config_error' because _load_slot_config conflated "no TOML on disk and no in-memory state" with a parse error. A UI distinguishing "slot doesn't exist" from "slot exists but its config is broken" got ambiguous signals.

SlotManager.get_config now raises SlotNotFound (404, code 'slot.not_found') when neither the config file nor an in-memory state record exists for the requested slot name. A real TOML parse failure on a slot whose file IS on disk continues to raise SlotConfigError (400, code 'slot.config_error') unchanged.

Tests: two new cases in test_slots_routes.py: unknown slot → 404 'slot.not_found' and existing slot with broken TOML → 400 'slot.config_error'. Existing get_slot/load lifecycle tests unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PUBLIC_PATHS advertised /api/metrics/prometheus as a public route, but
the health module only defines /api/metrics (JSON). The Prometheus
path returned 404 for everyone, including monitoring scrapers that
followed the documented bypass list — confusing operator-facing
behaviour ("your allowlist says this is a public endpoint but it
404s").
Drop the entry from PUBLIC_PATHS and remove the stale docstring line
in health.py that promised a 'text/plain prometheus exposition' route.
A real prometheus_client-backed exporter is out of scope for this PR;
the comment in PUBLIC_PATHS calls out that adding it back means landing
the route AND syncing the Caddyfile @public matcher at the same time.
Note: the matching Caddyfile @public path entry lives in PR #49 which
hasn't merged yet; that PR's branch should be rebased to drop the
prometheus line from its @public list before merging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
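One way to catch this kind of drift in a test is to compare the allowlist against the set of routes that actually exist. The sketch below is hedged accordingly: `PUBLIC_PATHS` here holds only the entries named in this conversation (the real frozenset has more), and `unrouted_public_paths` plus its `registered_paths` argument are hypothetical names, not part of hal0.

```python
from collections.abc import Iterable

# Subset of the allowlist named in this conversation; the real frozenset has more entries.
PUBLIC_PATHS = frozenset(
    {
        "/api/install/state",
        "/api/config/urls",
        "/api/auth/status",
        "/api/status",
        "/api/metrics",
    }
)


def is_public(path: str) -> bool:
    """Exact-match check, mirroring `request.url.path in PUBLIC_PATHS`."""
    return path in PUBLIC_PATHS


def unrouted_public_paths(
    public_paths: Iterable[str], registered_paths: Iterable[str]
) -> list[str]:
    """Return allowlisted paths that have no registered route behind them."""
    return sorted(set(public_paths) - set(registered_paths))
```

With the stale entry still present, `unrouted_public_paths` would report `/api/metrics/prometheus`. After this commit, `is_public("/api/metrics/prometheus")` is false because membership is exact and `/api/metrics` does not cover sub-paths.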
POST /api/updates/apply accepts the request, queues a background job,
and returns the job snapshot immediately — same shape as
/api/models/{id}/pull (which already returns 202). Returning 200 for an
incomplete-but-accepted operation was a minor cross-route inconsistency.
Add ``status_code=202`` to the @router.post decorator and update the
docstring to call out the async-job contract explicitly. Failure paths
(invalid input, rate-limit) continue to surface their existing 4xx
codes through the error envelope middleware unchanged.
Test: update the existing apply-creates-queued-job assertion from 200
to 202. No client-facing breakage — every reasonable client treats
2xx as success.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
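The async-job contract behind the 202 can be sketched without any web framework. Everything here is illustrative: `Job`, `apply_update`, and the queue are hypothetical stand-ins, and the real route sets `status_code=202` on its `@router.post` decorator rather than returning a tuple.

```python
import uuid
from dataclasses import asdict, dataclass
from http import HTTPStatus
from queue import Queue


@dataclass
class Job:
    """Hypothetical job record; the real snapshot shape is not shown in this PR."""

    id: str
    state: str = "queued"


def apply_update(jobs: Queue) -> tuple[int, dict[str, str]]:
    """Accept the request, queue the work, and return the snapshot immediately."""
    job = Job(id=uuid.uuid4().hex)
    jobs.put(job)  # a background worker would pick this up later
    # 202 Accepted: the request was taken on, but the work is not finished yet.
    return HTTPStatus.ACCEPTED.value, asdict(job)


def is_success(status: int) -> bool:
    """How a typical client classifies the response: any 2xx counts as success."""
    return 200 <= status < 300
```

A client that only checks for a 2xx status sees no difference between the old 200 and the new 202, which is the basis for the "no client-facing breakage" claim above.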
17a3756 to
e9705f2
Summary
Wave-2 status-code cleanup riding on the typed-error foundation that landed in #47. Four targeted bug fixes, one commit each, no overlap across the four routers touched.
- `/v1/audio/speech` and `/v1/audio/transcriptions` now raise `BadRequest` (400, `validation.invalid`) up front when the request omits `model`, instead of falling through the dispatcher's default-model + no-route path and surfacing a misleading 404 `dispatch.no_route`.
- `SlotManager.get_config` now raises `SlotNotFound` (404, `slot.not_found`) when neither a TOML config nor in-memory state exists for the slot. A real TOML parse failure on an existing slot still raises `SlotConfigError` (400, `slot.config_error`) unchanged.
- `/api/metrics/prometheus` is dropped from `PUBLIC_PATHS`. The route never had an implementation, so advertising it as a public bypass produced a 404 for monitoring scrapers that followed the documented allowlist. The stale `text/plain prometheus exposition` line in `health.py`'s module docstring is also removed.
- `POST /api/updates/apply` now returns `202 Accepted` (was 200) to match `/api/models/{id}/pull` and the rest of hal0's queued-async-job endpoints.

All fixes use the typed `hal0.errors` subclasses (`BadRequest`, `NotFound`-via-`SlotNotFound`) that landed in #47. There is no separate OpenAI-shaped error translator; the same envelope middleware handles both `/v1/*` and `/api/*` paths.

Files changed per issue
- #34: `src/hal0/api/routes/v1.py`, `tests/api/test_v1_dispatch.py`
- #35: `src/hal0/slots/manager.py`, `tests/api/test_slots_routes.py`
- #36: `src/hal0/api/middleware/auth.py`, `src/hal0/api/routes/health.py`
- #37: `src/hal0/api/routes/updater.py`, `tests/api/test_updater_routes.py`

Verification
Ran against the worktree's `src/` (editable install pointed at main worktree, so PYTHONPATH overrides were used per the agent-worktree convention).

Per-issue:
- `tests/api/test_v1_dispatch.py`: 3 new cases (speech missing model, speech whitespace-only model, transcriptions missing model). All assert 400 + `validation.invalid`. Existing `dispatch.no_route` regressions still green.
- `tests/api/test_slots_routes.py`: 2 new cases (unknown slot → 404 `slot.not_found`, existing slot with broken TOML → 400 `slot.config_error`). Existing slot lifecycle + slot-manager unit tests unaffected (55 passed in `tests/slots/`).
- `tests/api/test_auth_middleware.py` + `test_smoke.py`: 33 passed; no test asserted prometheus was in the allowlist, so the removal is a no-op for the suite.
- `tests/api/test_updater_routes.py`: updated the existing `test_apply_creates_a_queued_job_returning_id` assertion from 200 → 202. 11 passed.
- `ruff check` + `ruff format --check` clean on all 8 touched files.

Caddyfile note (#36)
The task brief mentioned dropping `/api/metrics/prometheus` from the Caddyfile `@public path` matcher added in PR #49. That PR is still open and hasn't merged into `main`, so the `@public` block doesn't exist on this branch's Caddyfile yet. PR #49's branch should be rebased after this lands to drop the `/api/metrics/prometheus` line from its `@public` allowlist; otherwise the two will be out of sync again. Called out in the commit message for #36.

CI note
The Hal0ai GitHub org has a billing block on Actions — CI on this PR may not run. Verified locally per the test output above. Please admin-merge if CI doesn't kick off.
Test plan
- `ruff check` + `ruff format` clean

🤖 Generated with Claude Code