chore(lemonade): remove preload validation per ADR-0008 §3 #155
Merged
ADR-0008 §3 explicitly rescinds ADR-0007's preload validation work: per-type LRU + nuclear-evict's not-found exemption list reduce the original hazard below the cost of sha256+GGUF checks on every load.

Parallel session merged the ADR-0007 implementation (#144) before ADR-0008 became visible on main. Code had zero live callers (re-exported in __init__.py only).

Removed:
- src/hal0/lemonade/preload.py (466 LOC)
- tests/lemonade/test_preload.py (404 LOC)
- tests/slots/test_manager_lemonade_integration.py (271 LOC; tested the ADR-0007 preload chain end-to-end; obsolete with ADR-0008)
- PreloadError + preload_validate exports from src/hal0/lemonade/__init__.py
- Module-docstring ADR-0006/0007 references → ADR-0008.

Kept:
- src/hal0/lemonade/idle.py + tests/lemonade/test_idle.py: idle-unload driver is live in src/hal0/api/__init__.py and not rejected by ADR-0008.

Stale ADR-0006/0007 references in src/hal0/lemonade/client.py docstring + comments are left for PR-3 (client extension) to clean up alongside the typed-endpoint additions.

Tests: 1385 passed, 8 skipped, 3 xfailed (pre-existing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0d3c2b8 to 8c9c862
thinmintdev added a commit that referenced this pull request on May 23, 2026
…ndpoints (PR-3)

PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the LemonadeClient skeleton shipped in #137 in line with the locked adoption plan (docs/internal/lemonade-adoption-plan-2026-05-22.md) and ADR-0008 before any later PR depends on it.

Changes:
- Fix the llamacpp_args serialization bug. Wire format is a single space-separated string; Lemonade's nlohmann::json parser raises "type must be string, but is array" on a list (spike #2 findings + lemonade-research-2026-05-22/api.md §1.3). The client now accepts str | list[str] | None: None omits the key (never send JSON null, per the v1_load_schema memory + nlohmann unconditional accessor), str passes through verbatim, list joins on single spaces, [] becomes the empty-string sentinel ("use default" via is_empty_option).
- DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 (ADR-0008 §1 + plan §3 + §12.2 lock the port).
- Add the four loopback-only /internal/* endpoints from plan §2.2:
  - shutdown() (POST /internal/shutdown: systemd ExecStop)
  - internal_config() (GET /internal/config: admin panel source of truth)
  - internal_set(values) (POST /internal/set: atomic config setter for both immediate-effect and deferred-until-next-load keys)
  - internal_cleanup_cache() (POST /internal/cleanup-cache: weekly HF cache hygiene cron)
  All four route through _raise_for_status, so non-2xx surfaces as LemonadeHTTPError, not LemonadeLoadError (reserved for /v1/load's evict-all blast radius).
- Update stale ADR references in docstrings/comments. ADR-0006 → ADR-0008 (parent decision). ADR-0007's no-retry-on-/v1/load logic is still valid but the ADR ref is stale; rephrased to cite ADR-0008 §3's nuclear-evict + not-found exemption. Drop "preload validation" and "preload validator": preload was removed from main in #155.
- Stats docstring now points at plan §12.1's KV%-missing caveat so the metrics-shim author doesn't re-discover it later.

Tests: 13 new cases extending tests/lemonade/test_client.py.
- llamacpp_args matrix: None omits key; "--threads 8" forwards verbatim; ["--parallel", "1", "--threads", "8"] joins to "--parallel 1 --threads 8"; [] → "".
- Each /internal/* endpoint verified for HTTP method, path, Bearer auth header, and request body shape (or absence for the no-body endpoints). One shared parametrised test confirms all four raise LemonadeHTTPError on 403 (not LemonadeLoadError).

Out of scope (per the PR-3 brief): preload module (removed in #155 already), forward-plane endpoints, hal0-api wiring of the idle driver, and any Pydantic/TypedDict response layer (plain dicts are fine for v0.2).

Refs: docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 + §3 + §11; docs/internal/adr/0008-lemonade-adoption.md §1, §3, §4, §7; docs/internal/lemonade-spike-2-findings-2026-05-22.md; docs/internal/lemonade-research-2026-05-22/api.md §1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
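The llamacpp_args rules in the commit message above (None omits the key, str passes through, list joins on single spaces, [] becomes "") can be sketched as a small payload builder. This is an illustration only, not the code in client.py: the function name `build_load_payload` and the `model_name` key are assumptions made for the example.

```python
from __future__ import annotations

import json
from typing import Any


def build_load_payload(
    model_name: str,
    llamacpp_args: str | list[str] | None = None,
) -> dict[str, Any]:
    """Build a /v1/load request body (hypothetical helper, for illustration).

    The "model_name" key is an assumption; only the llamacpp_args handling
    follows the rules stated in the commit message.
    """
    payload: dict[str, Any] = {"model_name": model_name}

    if llamacpp_args is None:
        # Omit the key entirely: JSON null must never be sent.
        return payload

    if isinstance(llamacpp_args, str):
        # A string is forwarded verbatim.
        payload["llamacpp_args"] = llamacpp_args
    else:
        # A list is joined on single spaces; [] yields "" (the
        # "use default" sentinel described in the commit message).
        payload["llamacpp_args"] = " ".join(llamacpp_args)

    return payload


if __name__ == "__main__":
    for args in (None, "--threads 8", ["--parallel", "1", "--threads", "8"], []):
        print(json.dumps(build_load_payload("demo-model", args)))
```

Normalizing in one place keeps the wire value a string in every non-None case, which is what avoids the "type must be string, but is array" parser error quoted above.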
thinmintdev added a commit that referenced this pull request on May 23, 2026
…ndpoints (PR-3) (#156)

* feat(lemonade): extend client - fix llamacpp_args + add /internal/* endpoints (PR-3)

* style: apply ruff format to client.py + test_client.py

  CI's `ruff format --check` step caught two formatting deltas the author agent missed (it ran `ruff check` but not `ruff format`). Pure formatter output; no logic changes.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
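The PR-3 message routes the four /internal/* endpoints through _raise_for_status so that a non-2xx response surfaces as LemonadeHTTPError, keeping LemonadeLoadError for /v1/load. A minimal sketch of that split follows. The two exception names come from the commit message; the shared base class, the constructor signatures, and both helper functions are assumptions for illustration and may not match client.py.

```python
from __future__ import annotations


class LemonadeError(Exception):
    """Hypothetical common base class; the real hierarchy may differ."""


class LemonadeHTTPError(LemonadeError):
    """Generic non-2xx response from any endpoint."""

    def __init__(self, status: int, path: str) -> None:
        super().__init__(f"{path} returned HTTP {status}")
        self.status = status
        self.path = path


class LemonadeLoadError(LemonadeError):
    """Reserved for /v1/load failures, which callers treat differently."""


def raise_for_status(status: int, path: str) -> None:
    """Raise LemonadeHTTPError for any non-2xx status (assumed behaviour)."""
    if 200 <= status < 300:
        return
    raise LemonadeHTTPError(status, path)


def check_load_status(status: int) -> None:
    """Hypothetical /v1/load check that raises the load-specific error."""
    if not 200 <= status < 300:
        raise LemonadeLoadError(f"/v1/load returned HTTP {status}")


if __name__ == "__main__":
    try:
        raise_for_status(403, "/internal/shutdown")
    except LemonadeHTTPError as exc:
        print(type(exc).__name__, exc.status, exc.path)
```

Keeping the two classes as siblings means an `except LemonadeLoadError` handler never catches a failed /internal/* call by accident, which is the property the parametrised 403 test in the commit message checks.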
Summary
ADR-0008 §3 explicitly rescinds ADR-0007's preload validation work — per-type LRU + nuclear-evict's not-found exemption list reduce the original hazard below the cost of sha256+GGUF checks on every load.
A parallel session merged the ADR-0007 implementation (#144) before ADR-0008 became visible on `main`. The preload module had zero live callers (re-exported in `__init__.py` only), so this is pure dead-code removal.
Net delta: −1155 / +11.
What's removed
What's kept
Test plan
🤖 Generated with Claude Code