fix(api): wave-2 status code cleanup — speech/slots-config/prometheus/updates-apply (closes #34 #35 #36 #37) by thinmintdev · Pull Request #53 · Hal0ai/hal0

thinmintdev · 2026-05-16T23:01:26Z

Summary

Wave-2 status-code cleanup riding on the typed-error foundation that landed in #47. Four targeted bug fixes, one commit each, no overlap across the four routers touched.

/v1/audio/speech returns 400 (not 404) when model is omitted #34 /v1/audio/speech and /v1/audio/transcriptions now raise BadRequest (400, validation.invalid) up front when the request omits model, instead of falling through the dispatcher's default-model + no-route path and surfacing a misleading 404 dispatch.no_route.
/api/slots/{name}/config returns 404 (not 400) for unknown slot #35 SlotManager.get_config now raises SlotNotFound (404, slot.not_found) when neither a TOML config nor in-memory state exists for the slot. A real TOML parse failure on an existing slot still raises SlotConfigError (400, slot.config_error) unchanged.
Implement or remove /api/metrics/prometheus (PUBLIC_PATHS lists a 404) #36 /api/metrics/prometheus is dropped from PUBLIC_PATHS. The route never had an implementation, so advertising it as a public bypass produced a 404 for monitoring scrapers that followed the documented allowlist. The stale text/plain prometheus exposition line in health.py's module docstring is also removed.
POST /api/updates/apply returns 202 (not 200) on accepted async work #37 POST /api/updates/apply now returns 202 Accepted (was 200) to match /api/models/{id}/pull and the rest of hal0's queued-async-job endpoints.

All fixes use the typed hal0.errors subclasses (BadRequest, NotFound-via-SlotNotFound) that landed in #47 — no separate OpenAI-shaped error translator, the same envelope middleware handles both /v1/* and /api/* paths.

Files changed per issue

/v1/audio/speech returns 400 (not 404) when model is omitted #34 src/hal0/api/routes/v1.py, tests/api/test_v1_dispatch.py
/api/slots/{name}/config returns 404 (not 400) for unknown slot #35 src/hal0/slots/manager.py, tests/api/test_slots_routes.py
Implement or remove /api/metrics/prometheus (PUBLIC_PATHS lists a 404) #36 src/hal0/api/middleware/auth.py, src/hal0/api/routes/health.py
POST /api/updates/apply returns 202 (not 200) on accepted async work #37 src/hal0/api/routes/updater.py, tests/api/test_updater_routes.py

Verification

Ran against the worktree's src/ (editable install pointed at main worktree, so PYTHONPATH overrides were used per the agent-worktree convention):

PYTHONPATH=<worktree>/src .venv/bin/pytest tests/ --ignore=tests/harness
670 passed, 6 skipped in 36.64s

Per-issue:

/v1/audio/speech returns 400 (not 404) when model is omitted #34 tests/api/test_v1_dispatch.py — 3 new cases: speech missing model, speech whitespace-only model, transcriptions missing model. All assert 400 + validation.invalid. Existing dispatch.no_route regressions still green.
/api/slots/{name}/config returns 404 (not 400) for unknown slot #35 tests/api/test_slots_routes.py — 2 new cases: unknown slot → 404 slot.not_found, existing slot with broken TOML → 400 slot.config_error. Existing slot lifecycle + slot-manager unit tests unaffected (55 passed in tests/slots/).
Implement or remove /api/metrics/prometheus (PUBLIC_PATHS lists a 404) #36 tests/api/test_auth_middleware.py + test_smoke.py — 33 passed; no test asserted prometheus was in the allowlist, so the removal is a no-op for the suite.
POST /api/updates/apply returns 202 (not 200) on accepted async work #37 tests/api/test_updater_routes.py — updated the existing test_apply_creates_a_queued_job_returning_id assertion from 200 → 202. 11 passed.

ruff check + ruff format --check clean on all 8 touched files.

Caddyfile note (#36)

The task brief mentioned dropping /api/metrics/prometheus from the Caddyfile @public path matcher added in PR #49. That PR is still open and hasn't merged into main, so the @public block doesn't exist on this branch's Caddyfile yet. PR #49's branch should be rebased after this lands to drop the /api/metrics/prometheus line from its @public allowlist — otherwise the two will be out of sync again. Called out in the commit message for #36.

CI note

The Hal0ai GitHub org has a billing block on Actions — CI on this PR may not run. Verified locally per the test output above. Please admin-merge if CI doesn't kick off.

Test plan

All four issue acceptance criteria covered by unit tests
ruff check + ruff format clean
Full pytest suite (excluding integration-only harness) passes — 670 passed
Caddyfile sync with PR fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28) #49 once fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28) #49 rebases on top of this

🤖 Generated with Claude Code

Coordinated with #53 which removes prometheus from PUBLIC_PATHS in src/hal0/api/middleware/auth.py — closes the dead /api/metrics/prometheus route entirely rather than allowlisting a 404 at the edge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@public

* fix(caddy): allowlist PUBLIC_PATHS before basic_auth (closes #28) The dashboard `handle {}` block applied basicauth to every path that didn't match `/chat*` or `/v1/*`, including the entire FastAPI PUBLIC_PATHS frozenset (`/api/install/state`, `/api/config/urls`, `/api/auth/status`, `/api/status`, `/api/metrics`, …). The browser-side first-run wizard hits those endpoints before any credential can exist, so a `hal0 install --auth=basic` deployment was unbootstrappable. Add a `@public` named matcher with the exact PUBLIC_PATHS list and a `handle @public { reverse_proxy 127.0.0.1:8080 }` block placed BEFORE the default basicauth handler. Caddy evaluates `handle` blocks in source order — first match wins — so the matcher must precede the default block (the prior arrangement is what produced the bug). Path matching is exact (Caddy's `path` matcher is full-path match unless a `*` is given), which mirrors `request.url.path in PUBLIC_PATHS` on the API side, so the two stay in lockstep. The harness `auth-basic` row now asserts the rendered `/etc/hal0/Caddyfile` carries the `@public path` matcher + `handle @public` block + at least the `/api/install/state` entry, so a future edit that drops the allowlist (or moves it below the default handle) is caught. Verified locally: rendered the template, started Caddy against a stub backend, ran 26 curl probes. All public paths return 200 without creds; all protected paths return 401 without creds, 401 with wrong creds, 200 with right creds; `/v1/*` Bearer-passthrough unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(caddy): drop /api/metrics/prometheus from @public matcher Coordinated with #53 which removes prometheus from PUBLIC_PATHS in src/hal0/api/middleware/auth.py — closes the dead /api/metrics/prometheus route entirely rather than allowlisting a 404 at the edge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

POSTing /v1/audio/speech without a 'model' field used to fall through the dispatcher's default-model + no-route path and surface a confusing 404 'dispatch.no_route'. OpenAI's reference returns 400 with a clear "you must provide a model parameter" message — that's also how the operator survey expects it. Raise hal0.errors.BadRequest (validation.invalid) up front in /v1/audio/speech and /v1/audio/transcriptions when the request body (JSON for speech, multipart for transcriptions) omits 'model' or provides an empty/whitespace-only value. _forward_multipart gains a require_model flag so the multipart path can opt-in without changing the others. _dispatch_and_forward now optionally accepts a pre-read body so the speech route doesn't re-parse JSON twice. The other /v1/* paths (chat, completions, embeddings, rerankings, images) are unchanged — they continue to use the dispatcher's default-model fallback for legacy clients. Tests: three new cases in test_v1_dispatch.py — speech missing model, speech with whitespace-only model, transcriptions missing model. All assert 400 + 'validation.invalid' envelope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GET /api/slots/<unknown>/config used to surface 400 'slot.config_error' because _load_slot_config conflated "no TOML on disk and no in-memory state" with a parse error. A UI distinguishing "slot doesn't exist" from "slot exists but its config is broken" got ambiguous signals. SlotManager.get_config now raises SlotNotFound (404, code 'slot.not_found') when neither the config file nor an in-memory state record exists for the requested slot name. A real TOML parse failure on a slot whose file IS on disk continues to raise SlotConfigError (400, code 'slot.config_error') unchanged. Tests: two new cases in test_slots_routes.py — unknown slot → 404 'slot.not_found' and existing slot with broken TOML → 400 'slot.config_error'. Existing get_slot/load lifecycle tests unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@public

PUBLIC_PATHS advertised /api/metrics/prometheus as a public route, but the health module only defines /api/metrics (JSON). The Prometheus path returned 404 for everyone, including monitoring scrapers that followed the documented bypass list — confusing operator-facing behaviour ("your allowlist says this is a public endpoint but it 404s"). Drop the entry from PUBLIC_PATHS and remove the stale docstring line in health.py that promised a 'text/plain prometheus exposition' route. A real prometheus_client-backed exporter is out of scope for this PR; the comment in PUBLIC_PATHS calls out that adding it back means landing the route AND syncing the Caddyfile @public matcher at the same time. Note: the matching Caddyfile @public path entry lives in PR #49 which hasn't merged yet; that PR's branch should be rebased to drop the prometheus line from its @public list before merging. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

POST /api/updates/apply accepts the request, queues a background job, and returns the job snapshot immediately — same shape as /api/models/{id}/pull (which already returns 202). Returning 200 for an incomplete-but-accepted operation was a minor cross-route inconsistency. Add ``status_code=202`` to the @router.post decorator and update the docstring to call out the async-job contract explicitly. Failure paths (invalid input, rate-limit) continue to surface their existing 4xx codes through the error envelope middleware unchanged. Test: update the existing apply-creates-queued-job assertion from 200 to 202. No client-facing breakage — every reasonable client treats 2xx as success. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

thinmintdev and others added 4 commits May 17, 2026 17:33

thinmintdev force-pushed the fix/issue-34-35-36-37-status-codes branch from 17a3756 to e9705f2 Compare May 17, 2026 21:33

thinmintdev merged commit cec38b9 into main May 17, 2026
0 of 6 checks passed

thinmintdev deleted the fix/issue-34-35-36-37-status-codes branch May 21, 2026 20:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(api): wave-2 status code cleanup — speech/slots-config/prometheus/updates-apply (closes #34 #35 #36 #37)#53

fix(api): wave-2 status code cleanup — speech/slots-config/prometheus/updates-apply (closes #34 #35 #36 #37)#53
thinmintdev merged 4 commits into
mainfrom
fix/issue-34-35-36-37-status-codes

thinmintdev commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

thinmintdev commented May 16, 2026

Summary

Files changed per issue

Verification

Caddyfile note (#36)

CI note

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant