feat(lemonade): extend client — fix llamacpp_args + add /internal/* endpoints (PR-3) #156
Merged
…ndpoints (PR-3)

PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the LemonadeClient skeleton shipped in #137 in line with the locked adoption plan (docs/internal/lemonade-adoption-plan-2026-05-22.md) and ADR-0008 before any later PR depends on it.

Changes:

- Fix the llamacpp_args serialization bug. The wire format is a single space-separated string; Lemonade's nlohmann::json parser raises "type must be string, but is array" on a list (spike #2 findings + lemonade-research-2026-05-22/api.md §1.3). The client now accepts str | list[str] | None:
  - None omits the key (never send JSON null, per the v1_load_schema memory + nlohmann unconditional accessor);
  - str passes through verbatim;
  - list joins on single spaces;
  - [] becomes the empty-string sentinel ("use default" via is_empty_option).
- DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 (ADR-0008 §1 + plan §3 + §12.2 lock the port).
- Add the four loopback-only /internal/* endpoints from plan §2.2:
  - shutdown(): POST /internal/shutdown (systemd ExecStop);
  - internal_config(): GET /internal/config (admin panel source of truth);
  - internal_set(values): POST /internal/set (atomic config setter for both immediate-effect and deferred-until-next-load keys);
  - internal_cleanup_cache(): POST /internal/cleanup-cache (weekly HF cache hygiene cron).
  All four route through _raise_for_status, so non-2xx surfaces as LemonadeHTTPError, not LemonadeLoadError (reserved for /v1/load's evict-all blast radius).
- Update stale ADR references in docstrings/comments. ADR-0006 → ADR-0008 (parent decision). ADR-0007's no-retry-on-/v1/load logic is still valid, but the ADR reference is stale; it is rephrased to cite ADR-0008 §3's nuclear-evict + not-found exemption. Drop "preload validation" and "preload validator"; preload was removed from main in #155.
- Stats docstring now points at plan §12.1's KV%-missing caveat so the metrics-shim author doesn't re-discover it later.

Tests: 13 new cases extending tests/lemonade/test_client.py.

- llamacpp_args matrix: None omits the key; "--threads 8" forwards verbatim; ["--parallel", "1", "--threads", "8"] joins to "--parallel 1 --threads 8"; [] → "".
- Each /internal/* endpoint verified for HTTP method, path, Bearer auth header, and request body shape (or absence for the no-body endpoints). One shared parametrised test confirms all four raise LemonadeHTTPError on 403 (not LemonadeLoadError).

Out of scope (per the PR-3 brief): preload module (removed in #155 already), forward-plane endpoints, hal0-api wiring of the idle driver, and any Pydantic/TypedDict response layer (plain dicts are fine for v0.2).

Refs:
- docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 + §3 + §11
- docs/internal/adr/0008-lemonade-adoption.md §1, §3, §4, §7
- docs/internal/lemonade-spike-2-findings-2026-05-22.md
- docs/internal/lemonade-research-2026-05-22/api.md §1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
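The llamacpp_args rules in the commit message above can be illustrated with a small helper. This is a sketch under stated assumptions, not the client's actual code: the function name `build_load_payload` and the `model_name` key are hypothetical, and only the handling of `llamacpp_args` is taken from the description.

```python
from __future__ import annotations

import json
from typing import Any


def build_load_payload(
    model_name: str,
    llamacpp_args: str | list[str] | None = None,
) -> dict[str, Any]:
    """Build a request body for a load call (hypothetical helper).

    The "model_name" key is an assumption made for illustration; only the
    llamacpp_args handling follows the rules stated in the commit message.
    """
    payload: dict[str, Any] = {"model_name": model_name}
    if llamacpp_args is None:
        # Omit the key entirely; never serialize it as JSON null.
        return payload
    if isinstance(llamacpp_args, str):
        # A string is forwarded verbatim.
        payload["llamacpp_args"] = llamacpp_args
    else:
        # A list is joined on single spaces; [] yields "" (the "use default" sentinel).
        payload["llamacpp_args"] = " ".join(llamacpp_args)
    return payload


if __name__ == "__main__":
    for args in (None, "--threads 8", ["--parallel", "1", "--threads", "8"], []):
        print(json.dumps(build_load_payload("example-model", args)))
```

Joining the list on the client side keeps the value a JSON string on the wire, which is the only shape the commit message says the server-side parser accepts.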
CI's `ruff format --check` step caught two formatting deltas the author agent missed (it ran `ruff check` but not `ruff format`). Pure formatter output; no logic changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the
`LemonadeClient` skeleton shipped in #137 in line with the locked
adoption plan + ADR-0008 before any later PR depends on it. Rebased
on top of #155 (preload removal).
Summary
- Fix `llamacpp_args` serialization (the bug from #137). The wire format is a single space-separated string: Lemonade's `nlohmann::json` parser raises `"type must be string, but is array"` on a list (spike #2 findings + research note `api.md` §1.3). Accept `str | list[str] | None`: `None` omits the key (never JSON `null`, per the `hal0_lemonade_v1_load_schema` memory), `str` passes through, `list` joins on single spaces, `[]` becomes `""` (the "use default" sentinel).
- `DEFAULT_BASE_URL`: `127.0.0.1:9100` → `127.0.0.1:13305` per ADR-0008 §1 + plan §3 / §12.2.
- Add the four loopback-only `/internal/*` endpoints from plan §2.2:
  - `shutdown()`: `POST /internal/shutdown` (systemd `ExecStop`).
  - `internal_config()`: `GET /internal/config` (admin panel).
  - `internal_set(values)`: `POST /internal/set` (atomic config setter).
  - `internal_cleanup_cache()`: `POST /internal/cleanup-cache` (weekly cron).

  All four route through `_raise_for_status`, so non-2xx surfaces as `LemonadeHTTPError`, not `LemonadeLoadError`, which stays reserved for `/v1/load`'s evict-all blast radius (ADR-0008 §3).
- Update stale ADR references: `ADR-0006` → `ADR-0008`; ADR-0007's no-retry-on-`/v1/load` logic stays, but the reference is rephrased to cite ADR-0008 §3 (nuclear-evict + not-found exemption). Drop `preload validation` / `preload validator` mentions; preload was removed on main in #155.
- Stats docstring now points at plan §12.1's KV%-missing caveat.

Out of scope (per PR-3 brief)
- `preload.py` module (removed in #155).
- Forward-plane endpoints (`/v1/chat/completions` etc.); the dispatcher already handles them unchanged.
- `hal0-api` wiring of the idle driver.

Test plan
- `tests/lemonade/test_client.py` extended with 13 new cases (75 lemonade-suite tests total, all passing).
- `llamacpp_args` matrix: `None` omits the key, `"--threads 8"` forwards verbatim, `["--parallel", "1", "--threads", "8"]` joins to `"--parallel 1 --threads 8"`, `[]` → `""`.
- Each `/internal/*` endpoint verified for HTTP method, path, Bearer auth header, and request body shape; one shared parametrised test confirms all four raise `LemonadeHTTPError` on `403` (not `LemonadeLoadError`).
- `.venv/bin/python -m pytest tests/ -q`.
- `ruff check src/hal0/lemonade tests/lemonade`.

References
- `docs/internal/lemonade-adoption-plan-2026-05-22.md` §2.2 (endpoint table), §3 (port + config baseline), §11 (PR-3 scope).
- `docs/internal/adr/0008-lemonade-adoption.md` §1 (loopback port), §3 (per-type LRU + evict-all), §4 (`--threads N`), §7 (no `extra.*`).
- `docs/internal/lemonade-spike-2-findings-2026-05-22.md` (the `--threads` root-cause chain).
- `docs/internal/lemonade-research-2026-05-22/api.md` §1 (wire-format reverse-engineering).

🤖 Generated with Claude Code
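
For illustration, the error routing described in the Summary can be sketched as follows. This is a simplified, network-free sketch: the exception class bodies, the `check_status` helper, and its signature are assumptions, and only the method/path pairs and the split between `LemonadeHTTPError` and `LemonadeLoadError` come from the description. The real client routes the `/internal/*` calls through `_raise_for_status`, whose internals are not shown in this PR text.

```python
from __future__ import annotations


class LemonadeHTTPError(Exception):
    """Generic non-2xx response (class body is an assumption)."""

    def __init__(self, path: str, status: int) -> None:
        super().__init__(f"{path} returned HTTP {status}")
        self.path = path
        self.status = status


class LemonadeLoadError(Exception):
    """Reserved for /v1/load failures (class body is an assumption)."""

    def __init__(self, status: int) -> None:
        super().__init__(f"/v1/load returned HTTP {status}")
        self.status = status


# Method and path for each /internal/* client method, as listed in the Summary.
INTERNAL_ENDPOINTS: dict[str, tuple[str, str]] = {
    "shutdown": ("POST", "/internal/shutdown"),
    "internal_config": ("GET", "/internal/config"),
    "internal_set": ("POST", "/internal/set"),
    "internal_cleanup_cache": ("POST", "/internal/cleanup-cache"),
}


def check_status(path: str, status: int) -> None:
    """Hypothetical stand-in for the client's status handling."""
    if 200 <= status < 300:
        return
    if path == "/v1/load":
        # Only the load path maps to the load-specific error.
        raise LemonadeLoadError(status)
    raise LemonadeHTTPError(path, status)


if __name__ == "__main__":
    for name, (method, path) in INTERNAL_ENDPOINTS.items():
        try:
            check_status(path, 403)
        except LemonadeHTTPError as exc:
            print(f"{name}: {method} {path} -> {type(exc).__name__}")
```

Keeping the two exception types separate lets callers treat a failed load differently from an ordinary failed admin call.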