v0.2: Pre-load validation + idle-unload + load timeout (ADR-0007) #144

@thinmintdev

Description

What to build

Operational hardening per ADR-0007, mitigating the nuclear-evict-all behavior observed live in the spike.

Pre-load validation (src/hal0/lemonade/preload.py):

  • Run BEFORE every LemonadeClient.load() call
  • Check file exists at registry path
  • Check sha256 matches registry entry
  • Check size matches registry
  • Check GGUF magic bytes
  • On any failure: slot state → error, do NOT call /v1/load (so Lemonade's nuclear-evict-all is NOT triggered for the other loaded models)
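
A minimal sketch of the checks listed above, not the final preload.py. Assumptions: `RegistryEntry` and its fields are a hypothetical stand-in for the real registry record, and every `PreloadError` variant except `LOAD_TIMEOUT` is an illustrative name rather than one fixed by this issue. Cheap checks run first so a missing or truncated file is rejected before hashing.

```python
import enum
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


class PreloadError(enum.Enum):
    # Only LOAD_TIMEOUT is named in the issue; the other variants are assumed.
    FILE_MISSING = "file_missing"
    SIZE_MISMATCH = "size_mismatch"
    BAD_MAGIC = "bad_magic"
    SHA256_MISMATCH = "sha256_mismatch"
    LOAD_TIMEOUT = "load_timeout"


@dataclass(frozen=True)
class RegistryEntry:
    # Hypothetical shape of a registry record.
    path: Path
    sha256: str  # expected hex digest
    size: int    # expected size in bytes


def validate_preload(entry: RegistryEntry) -> Optional[PreloadError]:
    """Return None if the file passes every check, else the first failure (ADR-0007).

    The caller is expected to set the slot to error and skip /v1/load on failure.
    """
    if not entry.path.is_file():
        return PreloadError.FILE_MISSING
    if entry.path.stat().st_size != entry.size:
        return PreloadError.SIZE_MISMATCH
    digest = hashlib.sha256()
    with entry.path.open("rb") as f:
        if f.read(4) != GGUF_MAGIC:
            return PreloadError.BAD_MAGIC
        f.seek(0)
        # Hash in 1 MiB chunks so large model files are never read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != entry.sha256.lower():
        return PreloadError.SHA256_MISMATCH
    return None
```

Returning the first failing variant instead of raising keeps the caller's branch simple: any non-None result means the slot goes to error and /v1/load is never reached.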

Idle-unload driver (src/hal0/lemonade/idle.py):

  • Poll /v1/health.all_models_loaded[].last_use periodically (existing 30s cadence)
  • For models stale beyond 300s (existing policy), call POST /v1/unload {"model_name": "..."}
  • Preserves hal0's existing idle policy, for which Lemonade has no built-in equivalent
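
A rough sketch of the polling loop described above. Assumptions: each `all_models_loaded` entry carries a `model_name` key and a `last_use` value in epoch seconds (the real payload may differ), and `fetch_health` / `unload` are hypothetical injected coroutines standing in for GET /v1/health and POST /v1/unload, so no particular HTTP client is implied.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable

POLL_INTERVAL_S = 30.0  # existing polling cadence
MAX_IDLE_S = 300.0      # existing idle policy


def stale_models(health: dict[str, Any], now: float,
                 max_idle_s: float = MAX_IDLE_S) -> list[str]:
    """Return names of loaded models whose last use is older than max_idle_s."""
    stale = []
    for model in health.get("all_models_loaded", []):
        # Assumed entry shape: {"model_name": str, "last_use": epoch seconds}.
        if now - model["last_use"] > max_idle_s:
            stale.append(model["model_name"])
    return stale


async def idle_unload_loop(
    fetch_health: Callable[[], Awaitable[dict[str, Any]]],
    unload: Callable[[str], Awaitable[None]],
    poll_interval_s: float = POLL_INTERVAL_S,
    max_idle_s: float = MAX_IDLE_S,
) -> None:
    """Poll health forever and unload every model that has gone stale."""
    while True:
        health = await fetch_health()
        for name in stale_models(health, time.time(), max_idle_s):
            await unload(name)  # would send POST /v1/unload {"model_name": name}
        await asyncio.sleep(poll_interval_s)
```

Keeping the staleness decision in a pure function makes the 300s policy unit-testable without a running Lemonade server or real sleeps.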

Load timeout in LemonadeClient.load():

  • Wrap in asyncio.wait_for with default 120s timeout (configurable per slot)
  • On timeout: surface PreloadError.LOAD_TIMEOUT, slot → error, do NOT retry (retry would risk evict-all again)
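
The timeout wrapper might look roughly like this. Assumptions: `do_load` is a hypothetical coroutine factory standing in for the actual POST /v1/load request, and returning the error value (rather than raising) is an illustrative choice, not something the issue prescribes. `PreloadError` is reduced to the one variant this issue names.

```python
import asyncio
import enum
from typing import Awaitable, Callable, Optional

DEFAULT_LOAD_TIMEOUT_S = 120.0  # default from the issue; configurable per slot


class PreloadError(enum.Enum):
    LOAD_TIMEOUT = "load_timeout"


async def load_with_timeout(
    do_load: Callable[[], Awaitable[None]],
    timeout_s: float = DEFAULT_LOAD_TIMEOUT_S,
) -> Optional[PreloadError]:
    """Run one load attempt; return LOAD_TIMEOUT if it exceeds timeout_s (ADR-0007).

    Deliberately makes a single attempt: a retry could trigger evict-all again.
    """
    try:
        await asyncio.wait_for(do_load(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return PreloadError.LOAD_TIMEOUT
    return None
```

On timeout, `asyncio.wait_for` cancels the pending coroutine before raising, so the hung request is not left running in the background; the caller then moves the slot to error.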

Acceptance criteria

  • Corrupt GGUF (wrong sha256) → slot enters error state without /v1/load being called
  • Other loaded models stay loaded when one slot fails pre-validation (the whole point — no nuclear evict)
  • Idle test: load model, wait 5+ minutes with no activity, model unloaded via /v1/unload
  • Load timeout test: simulate hung /v1/load → surfaces error in 120s, no retry
  • Race condition documented (file deleted between validate and load — bounded by policy, accepted)
  • Unit tests for each PreloadError variant
  • Integration test confirms nuclear-evict-all is not triggered on pre-validation failure
  • Code references ADR-0007 in docstrings at relevant call sites
