fix(slots): POST normalizes Lemonade-shape model + auto-assigns port (#275 bugs 1+2) #281

Merged: thinmintdev merged 2 commits into main from fix/slot-post-schema-and-port on May 23, 2026

@thinmintdev
Contributor

Closes parts of #275. Normalizes the POST /api/slots body to the canonical nested shape: a top-level string model becomes {"default": <string>}, and a missing or zero port is auto-assigned from 8081-8099 via a new _next_free_slot_port() helper. Surfaced by the real-hardware CRUD sweep, where cards created via POST had a blank model and port=0 in the dashboard.

…275 bugs 1+2)

Surfaced 2026-05-23 by the v3 dashboard CRUD sweep. Two compat hops at
the POST /api/slots boundary:

**Bug 1** — body accepts top-level `model: "name"` (Lemonade-shape, per
the slots audit + the v3 dashboard's create-slot modal) but the
serializer at slots.py:191 reads `cfg.get("model").get("default")` (the
nested [model] table that the SlotConfig pydantic model and persistent
TOML loaders both use). Result: `model_default` MISSING from /api/slots
responses for any slot created via POST. Cards rendered with no model
name despite the TOML having the model. Workaround was hand-writing
TOMLs in nested shape.

**Bug 2** — POST never auto-assigns a port. New slots persisted as
`port=0`, dashboard card chips showed port=0 instead of a useable port.

Fix: add `_normalize_create_body()` that runs before `sm.create()`:
- Top-level string `model` → nested `{"default": <string>}`.
- Missing/zero `port` → next free port in 8081-8099 via the new
  `_next_free_slot_port()` helper (walks existing slot TOMLs to find
  the lowest unclaimed port).
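
The two normalization rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the function names, the `claimed` parameter, and the choice to raise when the range is exhausted are all hypothetical, and `example-model` is a placeholder.

```python
from typing import Any, Iterable

# 8081-8099 inclusive, the range named in the PR description.
SLOT_PORT_RANGE = range(8081, 8100)


def next_free_port(claimed: Iterable[int]) -> int:
    """Return the lowest port in 8081-8099 that is not already claimed."""
    taken = set(claimed)
    for port in SLOT_PORT_RANGE:
        if port not in taken:
            return port
    # Assumption: the PR does not say what happens when the range is full.
    raise RuntimeError("no free slot port in 8081-8099")


def normalize_create_body(body: dict[str, Any], claimed: Iterable[int]) -> dict[str, Any]:
    """Return a copy of ``body`` in the canonical nested shape."""
    out = dict(body)

    # Rule 1: a top-level string model becomes the nested {"default": ...} table.
    model = out.get("model")
    if isinstance(model, str):
        out["model"] = {"default": model}

    # Rule 2: a missing or zero port is replaced by the lowest free port.
    if not out.get("port"):
        out["port"] = next_free_port(claimed)

    return out


if __name__ == "__main__":
    print(normalize_create_body({"name": "a", "model": "example-model"}, [8081, 8082]))
```

A body that already uses the nested model shape and carries a non-zero port passes through unchanged, so the normalization is safe to run on every request.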

Closes parts of #275 (bugs 1+2 of 7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 23, 2026
…havior (#285)

PR #277 changed the dispatcher: NoRouteFound now delegates to the
lemonade_proxy catch-all instead of bubbling up as 404. Six existing
tests asserted the OLD 404 contract, breaking CI for #281, #282, #283
and any future PR.

Fixes:
- tests/api/test_v1_dispatch.py: 3 `_no_route_envelope` tests renamed +
  asserts widened to accept either 404 (no proxy) or 503
  (proxy→lemonade unreachable).
- tests/api/test_v1_proxy.py: test_v1_chat_completions_still_hits_dispatcher
  renamed to ..._falls_through_to_proxy + assertion flipped (proxy now
  MUST be consulted on dispatcher no_route, not skipped).
- tests/omni_router/test_api_wiring.py: test_chat_completions_without_omni
  widened to either 404 or 503 + lemonade.unavailable code.
- tests/slots/test_manager.py::test_status_reconciles_drift: drop the
  "evicted" message assertion (the operator-facing message is reset to
  empty in the post-transition Slot rebuild path; only state==OFFLINE
  is contract per #276).
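
The widened assertions described above might look roughly like the helper below. It is a sketch only: the helper name and the way the error code is passed in are assumptions, and the real tests may read the code from a different place in the response envelope.

```python
from typing import Optional

# Either outcome is acceptable once NoRouteFound falls through to the proxy:
# 404 when no proxy is configured, 503 when the proxy cannot reach Lemonade.
ACCEPTED_STATUSES = {404, 503}


def check_no_route_response(status_code: int, error_code: Optional[str] = None) -> None:
    """Assert that a no-route response matches the post-#277 contract."""
    assert status_code in ACCEPTED_STATUSES, f"unexpected status {status_code}"
    if status_code == 503:
        # The 503 branch is paired with the lemonade.unavailable code.
        assert error_code == "lemonade.unavailable", f"unexpected code {error_code!r}"


if __name__ == "__main__":
    check_no_route_response(404)
    check_no_route_response(503, "lemonade.unavailable")
    print("both outcomes accepted")
```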

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev merged commit aa03d71 into main May 23, 2026
4 checks passed
@thinmintdev thinmintdev deleted the fix/slot-post-schema-and-port branch May 23, 2026 23:14
thinmintdev added a commit that referenced this pull request May 23, 2026
 bug 3) (#282)

* feat(cli): hal0 slot create gains --type + derives Lemonade device (#275 bug 3)

Surfaced 2026-05-23 by the v3 dashboard CRUD sweep: the CLI couldn't
create embed/rerank/transcription/tts slots because `hal0 slot create`
had no `--type` flag and used the v0.1 hardware enum ([vulkan|rocm|cpu])
instead of the Lemonade `device` enum ([gpu-vulkan|gpu-rocm|cpu|npu]).
Operators creating non-LLM slots had to bypass the CLI and POST to
/api/slots directly.

Adds:
- `SlotType` enum (`llm | embedding | reranking | transcription | tts |
  image`) — the Lemonade type vocab.
- `--type` / `-t` flag on `hal0 slot create`, defaults to `llm` for
  backward compat with the v0.1 chat-only create path.
- Derives `device` from `--hardware` (vulkan→gpu-vulkan, rocm→gpu-rocm,
  cpu→cpu) so the body POST'd matches the audit-recommended
  Lemonade-shape SlotConfig.
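
As an illustration of the additions listed above, the type vocabulary and the hardware-to-device mapping could be modeled like this. The enum values and the mapping come from the commit message; the `build_create_body` helper and the `name`, `type`, and `device` body keys are assumptions made for the sketch, not the CLI's actual code.

```python
from enum import Enum


class SlotType(str, Enum):
    """Lemonade type vocabulary as listed in the commit message."""

    LLM = "llm"
    EMBEDDING = "embedding"
    RERANKING = "reranking"
    TRANSCRIPTION = "transcription"
    TTS = "tts"
    IMAGE = "image"


# Legacy v0.1 --hardware value -> Lemonade device value.
HARDWARE_TO_DEVICE = {
    "vulkan": "gpu-vulkan",
    "rocm": "gpu-rocm",
    "cpu": "cpu",
}


def build_create_body(name: str, model: str, hardware: str,
                      slot_type: SlotType = SlotType.LLM) -> dict:
    """Build a hypothetical POST body; raises KeyError for unknown hardware."""
    return {
        "name": name,
        "model": model,
        "type": slot_type.value,
        "device": HARDWARE_TO_DEVICE[hardware],
    }


if __name__ == "__main__":
    print(build_create_body("embed-1", "example-embed", "rocm", SlotType.EMBEDDING))
```

Defaulting `slot_type` to `llm` mirrors the backward-compatible default described for the `--type` flag.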

Keeps `--provider` + `--hardware` flags (now documented as legacy v0.1
compat). Pairs with PR #281 which normalizes the POST body server-side.

Closes part of #275 (bug 3 of 7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 25, 2026
PR #281 (aa03d71) introduced `_next_free_slot_port()` importing
`hal0_etc_dir` from `hal0.config.paths` — but no such symbol exists,
breaking every POST /api/slots with a 500 ImportError.

The existing `slots_config_dir()` helper returns `etc() / "slots"`,
which is exactly the directory the new code needs. Swap the import; the
function body becomes slightly simpler.

Hot-patched + smoke-tested on hal0 LXC: POST /api/slots → 201 with
correct auto-assigned port 8086 (lowest free in 8081-8099).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 28, 2026
…rough + gut installer auth section (#390)

- docs/operate/lemonade.md (new, .md canonical): operator reference for
  the v0.2 Lemonade runtime — what it is, where state lives, the /v1/*
  proxy + dispatcher fallthrough (PRs #248/#277), slot ↔ Lemonade
  model mapping (PRs #281/#282), max_loaded_models = 8 LRU cap (PR
  #283), per-type LRU eviction per ADR-0008 (supersedes nuclear-evict
  ADR-0007), OFFLINE-on-eviction (PR #276), and the three known v0.3
  caveats (Vulkan KV gauge missing, whisper RUNPATH workaround, GPU
  cleanup unload hang).

- docs/dashboard/v3.md (new, .md canonical, new docs/dashboard/ dir):
  page-by-page tour of the v3 React dashboard shipped in
  v0.3.0-alpha.1 (PR #235). Covers the shell + Mock-badge convention,
  /dashboard (system overview after #356), /chat (real surface per
  #309/#314/#315/#351), /slots (sidebar mirror per #357 + #344 UX
  sweep), /models (#313/#319/#353), /mcp (#304/#300), /agents (Peers
  per #299), /memory (graph #297, throughput #308), Settings (no Auth
  tab post-ADR-0012), and the footer journal (Epic #322 — PRs
  #321/#328/#329/#330/#332). Mock-fallback issues linked via the
  dashboard-v3 label, not enumerated.

- installer/README.md: gut ~95 lines of stale auth prose (Caddy,
  Bearer-token mint/use/revoke, first-run OTP claim wizard,
  HAL0_AUTH_ENABLED/HAL0_AUTH_DISABLED, password recovery, basic_auth
  upgrade path, the TLS recipe). Replace with one paragraph pointing
  at docs/operate/auth.mdx for the reverse-proxy recipe and
  docs/agents/identity.md for the X-hal0-Agent identity model. Auth
  was removed in v0.3.0-alpha.1 per ADR-0012; the README hadn't
  caught up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>