feat(lemonade): extend client — fix llamacpp_args + add /internal/* endpoints (PR-3)#156

Merged
thinmintdev merged 2 commits into main from feat/lemonade-client-pr3 on May 23, 2026

@thinmintdev
Contributor

PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the
LemonadeClient skeleton shipped in #137 in line with the locked
adoption plan + ADR-0008 before any later PR depends on it. Rebased
on top of #155 (preload removal).

Summary

  • Fix llamacpp_args serialization (the bug from #137). The wire format is a single space-separated string; Lemonade's nlohmann::json parser raises "type must be string, but is array" on a list (spike #2 findings + research note api.md §1.3). The client now accepts str | list[str] | None: None omits the key (never JSON null, per the hal0_lemonade_v1_load_schema memory), str passes through verbatim, list joins on single spaces, and [] becomes "" (the "use default" sentinel).
  • DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 per ADR-0008 §1 + plan §3 / §12.2.
  • Add four loopback-only /internal/* endpoints from plan §2.2:
    • shutdown() → POST /internal/shutdown (systemd ExecStop).
    • internal_config() → GET /internal/config (admin panel).
    • internal_set(values) → POST /internal/set (atomic config setter).
    • internal_cleanup_cache() → POST /internal/cleanup-cache (weekly cron).
      All four route through _raise_for_status, so a non-2xx response surfaces as LemonadeHTTPError, not LemonadeLoadError, which stays reserved for /v1/load's evict-all blast radius (ADR-0008 §3).
  • Refresh stale ADR refs in docstrings/comments: ADR-0006 → ADR-0008. ADR-0007's no-retry-on-/v1/load logic stays, but its reference is rephrased to cite ADR-0008 §3 (nuclear-evict + not-found exemption). Drop the "preload validation" / "preload validator" mentions; preload was removed on main in #155. The stats docstring now points at plan §12.1's KV%-missing caveat.
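
The llamacpp_args rules in the first bullet can be sketched as follows. This is a minimal illustration and not the code merged in this PR: the helper names `serialize_llamacpp_args` and `build_load_payload`, and the `model_name` payload key, are assumptions made for the example.

```python
from __future__ import annotations


def serialize_llamacpp_args(args: str | list[str] | None) -> str | None:
    """Normalise llamacpp_args to the single-string wire format.

    Returns None when the key should be left out of the payload entirely.
    (Hypothetical helper name; the real client may structure this differently.)
    """
    if args is None:
        return None  # caller omits the key; never send JSON null
    if isinstance(args, str):
        return args  # already in wire format, pass through verbatim
    return " ".join(args)  # list joins on single spaces; [] becomes ""


def build_load_payload(
    model_name: str, llamacpp_args: str | list[str] | None = None
) -> dict:
    """Build a request body, adding llamacpp_args only when it was provided.

    The "model_name" key is an assumption for illustration only.
    """
    payload: dict = {"model_name": model_name}
    wire = serialize_llamacpp_args(llamacpp_args)
    if wire is not None:
        payload["llamacpp_args"] = wire
    return payload
```

Returning None from the serializer and skipping the key in the payload builder keeps the "omit, never null" rule in one place, so a caller cannot accidentally serialize `None` as JSON null.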

Out of scope (per PR-3 brief)

  • Preload module (already removed in #155).
  • Forward-plane endpoints.
  • hal0-api wiring of the idle driver.
  • Any Pydantic/TypedDict response layer (plain dicts are fine for v0.2).

Test plan

  • tests/lemonade/test_client.py extended with 13 new cases (75 lemonade-suite tests total, all passing).
  • llamacpp_args matrix: None omits key, "--threads 8" forwards verbatim, ["--parallel", "1", "--threads", "8"] joins to "--parallel 1 --threads 8", [] → "".
  • Each /internal/* endpoint verified for HTTP method, path, Bearer auth header, and request body shape; one shared parametrised test confirms all four raise LemonadeHTTPError on 403 (not LemonadeLoadError).
  • Full suite green: .venv/bin/python -m pytest tests/ -q.
  • Lint clean: ruff check src/hal0/lemonade tests/lemonade.

References

  • docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 (endpoint table), §3 (port + config baseline), §11 (PR-3 scope).
  • docs/internal/adr/0008-lemonade-adoption.md §1 (loopback port), §3 (per-type LRU + evict-all), §4 (--threads N), §7 (no extra.*).
  • docs/internal/lemonade-spike-2-findings-2026-05-22.md (the --threads root-cause chain).
  • docs/internal/lemonade-research-2026-05-22/api.md §1 (wire-format reverse-engineering).

🤖 Generated with Claude Code

thinmintdev and others added 2 commits May 22, 2026 21:30
…ndpoints (PR-3)

PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the
LemonadeClient skeleton shipped in #137 in line with the locked
adoption plan (docs/internal/lemonade-adoption-plan-2026-05-22.md) and
ADR-0008 before any later PR depends on it.

Changes:

- Fix the llamacpp_args serialization bug. Wire format is a single
  space-separated string; Lemonade's nlohmann::json parser raises
  "type must be string, but is array" on a list (spike #2 findings +
  lemonade-research-2026-05-22/api.md §1.3). The client now accepts
  str | list[str] | None: None omits the key (never send JSON null,
  per the v1_load_schema memory + nlohmann unconditional accessor),
  str passes through verbatim, list joins on single spaces, [] becomes
  the empty-string sentinel ("use default" via is_empty_option).
- DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 (ADR-0008 §1 +
  plan §3 + §12.2 lock the port).
- Add the four loopback-only /internal/* endpoints from plan §2.2:
  shutdown() (POST /internal/shutdown — systemd ExecStop),
  internal_config() (GET /internal/config — admin panel source of
  truth), internal_set(values) (POST /internal/set — atomic config
  setter for both immediate-effect and deferred-until-next-load keys),
  and internal_cleanup_cache() (POST /internal/cleanup-cache — weekly
  HF cache hygiene cron). All four route through _raise_for_status,
  so non-2xx surfaces as LemonadeHTTPError, not LemonadeLoadError
  (reserved for /v1/load's evict-all blast radius).
- Update stale ADR references in docstrings/comments. ADR-0006 →
  ADR-0008 (parent decision). ADR-0007's no-retry-on-/v1/load logic
  is still valid but the ADR ref is stale; rephrased to cite ADR-0008
  §3's nuclear-evict + not-found exemption. Drop "preload validation"
  and "preload validator" — preload was removed from main in #155.
- Stats docstring now points at plan §12.1's KV%-missing caveat so
  the metrics-shim author doesn't re-discover it later.
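
The four /internal/* wrappers and their shared error routing might look roughly like the sketch below. It is a simplified stand-in and not the real LemonadeClient: the class name `InternalClientSketch`, the injectable `transport` callable, the `api_key` parameter, and the `http://` scheme on the base URL are all assumptions chosen so the example runs without a network. Only the methods, paths, and the LemonadeHTTPError routing come from the description above.

```python
from __future__ import annotations

from typing import Any, Callable, Optional

# Hypothetical transport: (method, url, headers, json_body) -> (status, parsed_body)
Transport = Callable[[str, str, dict, Optional[dict]], tuple]


class LemonadeHTTPError(Exception):
    """Raised for any non-2xx response from a control-plane endpoint."""

    def __init__(self, status: int, body: Any) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status
        self.body = body


class InternalClientSketch:
    def __init__(
        self,
        transport: Transport,
        base_url: str = "http://127.0.0.1:13305",  # scheme is an assumption
        api_key: Optional[str] = None,
    ) -> None:
        self._transport = transport
        self._base_url = base_url
        self._api_key = api_key

    @staticmethod
    def _raise_for_status(status: int, body: Any) -> None:
        # Generic routing: /internal/* failures are plain HTTP errors,
        # not the load-specific error reserved for /v1/load.
        if not 200 <= status < 300:
            raise LemonadeHTTPError(status, body)

    def _request(self, method: str, path: str, body: Optional[dict] = None) -> Any:
        headers = {}
        if self._api_key is not None:
            headers["Authorization"] = f"Bearer {self._api_key}"
        status, payload = self._transport(method, self._base_url + path, headers, body)
        self._raise_for_status(status, payload)
        return payload

    def shutdown(self) -> Any:
        return self._request("POST", "/internal/shutdown")

    def internal_config(self) -> Any:
        return self._request("GET", "/internal/config")

    def internal_set(self, values: dict) -> Any:
        return self._request("POST", "/internal/set", values)

    def internal_cleanup_cache(self) -> Any:
        return self._request("POST", "/internal/cleanup-cache")
```

Injecting the transport makes it straightforward to assert on method, path, auth header, and body shape with a recording fake, which mirrors the kind of checks the test notes below describe.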

Tests: 13 new cases extending tests/lemonade/test_client.py.
- llamacpp_args matrix: None omits key; "--threads 8" forwards
  verbatim; ["--parallel", "1", "--threads", "8"] joins to
  "--parallel 1 --threads 8"; [] → "".
- Each /internal/* endpoint verified for HTTP method, path, Bearer
  auth header, and request body shape (or absence for the no-body
  endpoints). One shared parametrised test confirms all four raise
  LemonadeHTTPError on 403 (not LemonadeLoadError).

Out of scope (per the PR-3 brief): preload module (removed in #155
already), forward-plane endpoints, hal0-api wiring of the idle
driver, and any Pydantic/TypedDict response layer (plain dicts are
fine for v0.2).

Refs: docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 + §3 +
§11; docs/internal/adr/0008-lemonade-adoption.md §1, §3, §4, §7;
docs/internal/lemonade-spike-2-findings-2026-05-22.md;
docs/internal/lemonade-research-2026-05-22/api.md §1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CI's `ruff format --check` step caught two formatting deltas the
author agent missed (it ran `ruff check` but not `ruff format`).

Pure formatter output; no logic changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev merged commit be222c0 into main May 23, 2026
6 checks passed
@thinmintdev thinmintdev deleted the feat/lemonade-client-pr3 branch May 23, 2026 01:40