v0.6.66b480

@tobocop2 tobocop2 released this 21 May 17:22
· 183 commits to main since this release

Model architecture compatibility

HuggingFace hosts thousands of GGUF repos, but the bundled llama.cpp supports only a subset of architectures. Before b480, pulling an unsupported model failed at load time, after the multi-GB download had already finished. b480 surfaces the verdict up front and refuses unsupported native pulls by default.

Every catalog entry now carries a compat value (supported, unsupported, or unknown). The supported set comes from gguf.MODEL_ARCH_NAMES, vendored with llama-cpp-python, so it tracks the runtime automatically. When the HuggingFace API doesn't expose the architecture, a 64 KB Range GET of the GGUF header runs before the full download.
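As a rough illustration of the header probe, the sketch below pulls general.architecture out of the first bytes of a GGUF file and maps it to a compat value. It is not lilbee's code: read_architecture, classify, and the supported set passed in are hypothetical names, and it assumes the GGUF v2+ layout (little-endian, 64-bit string lengths). In practice the prefix would come from a Range GET and the supported set from gguf.MODEL_ARCH_NAMES.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Byte widths of the fixed-size GGUF metadata value types.
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


class _Reader:
    """Cursor over a byte prefix; raises EOFError when the prefix runs out."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        if self.pos + n > len(self.data):
            raise EOFError("header prefix too short")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u32(self) -> int:
        return struct.unpack("<I", self.take(4))[0]

    def u64(self) -> int:
        return struct.unpack("<Q", self.take(8))[0]

    def string(self) -> str:
        return self.take(self.u64()).decode("utf-8")

    def skip_value(self, vtype: int) -> None:
        if vtype in _SCALAR_SIZES:
            self.take(_SCALAR_SIZES[vtype])
        elif vtype == _STRING:
            self.take(self.u64())
        elif vtype == _ARRAY:
            elem_type, count = self.u32(), self.u64()
            for _ in range(count):
                self.skip_value(elem_type)
        else:
            raise ValueError(f"unknown GGUF value type {vtype}")


def read_architecture(prefix: bytes):
    """Return general.architecture from a GGUF header prefix, or None."""
    r = _Reader(prefix)
    try:
        if r.take(4) != GGUF_MAGIC:
            return None
        r.u32()             # format version (unused here)
        r.u64()             # tensor count (unused here)
        kv_count = r.u64()  # number of metadata key/value pairs
        for _ in range(kv_count):
            key, vtype = r.string(), r.u32()
            if key == "general.architecture" and vtype == _STRING:
                return r.string()
            r.skip_value(vtype)
    except (EOFError, ValueError, UnicodeDecodeError):
        return None
    return None


def classify(arch, supported) -> str:
    """Map an architecture name (or None) to a compat value."""
    if arch is None:
        return "unknown"
    return "supported" if arch in supported else "unsupported"
```

Returning None on a short or malformed prefix keeps the probe from guessing: anything it cannot read falls through to unknown rather than to a false supported or unsupported verdict.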

Where it shows up

  • TUI. Compat pill on every card and list row, full sentence in the detail drawer, confirm modal before pulling an unsupported model. Secondary pills (fit, compat) now stack onto their own row so labels stop truncating.
  • CLI. lilbee pull refuses unsupported architectures with a clear message; pass --allow-unsupported to override.
  • HTTP. compat field on catalog and show responses; 409 Conflict on a refused pull, with the same allow_unsupported override.
  • MCP. Same compat fields and a structured unsupported_arch error so agents can react.

The gate lives once in ModelManager.pull. All four entry points share it.
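A minimal sketch of what a single shared gate could look like, assuming a hypothetical check_pull_allowed helper and UnsupportedArchError class (neither name is taken from lilbee). Only the unsupported_arch error code, the 409 status, and the allow_unsupported override come from the notes above; letting unknown through is an assumption.

```python
class UnsupportedArchError(Exception):
    """Raised when a pull is refused because the architecture is unsupported."""

    code = "unsupported_arch"  # structured code an MCP client could match on

    def __init__(self, model: str, arch):
        super().__init__(
            f"{model}: architecture {arch!r} is not supported by the bundled "
            "llama.cpp; pass allow_unsupported to pull anyway"
        )
        self.model = model
        self.arch = arch


def check_pull_allowed(model: str, arch, compat: str, allow_unsupported: bool = False) -> None:
    """Single gate: refuse only a known-unsupported architecture.

    Assumption: "unknown" is allowed through, since the notes only say
    unsupported pulls are refused by default.
    """
    if compat == "unsupported" and not allow_unsupported:
        raise UnsupportedArchError(model, arch)


def refusal_status(model: str, arch, compat: str, allow_unsupported: bool = False):
    """Hypothetical HTTP adapter: 409 when the gate refuses, else None."""
    try:
        check_pull_allowed(model, arch, compat, allow_unsupported)
    except UnsupportedArchError:
        return 409
    return None
```

With the decision in one function, each front end only translates the same exception into its own shape: a modal, an exit message, a status code, or a structured error.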

Security cleanup

CodeQL, OSV-Scanner, and Dependabot flagged five advisories against the b479 lock. b480 closes all of them.

  • idna 3.11 -> 3.15. Fixes CVE-2026-45409, a denial-of-service in idna.encode() for crafted long inputs. idna is transitive via httpx, anyio, requests, and yarl, so every install picks up the bumped version.
  • pyjwt 2.12.1 (CVE-2025-45768). Disputed weak-encryption finding; upstream has no fix planned because HMAC key length is the application's choice. pyjwt is transitive via mcp and lilbee never imports it. Documented as won't-fix.
  • joblib 1.5.3 (CVE-2024-34997). Disputed deserialization finding; NumpyArrayWrapper only reads joblib-produced cache files. Transitive via nltk via crawl4ai (crawler extra only). Documented as won't-fix.
  • nltk 3.9.4 (CVE-2026-0846). False positive. The OSV advisory lists 3.9.3 as the fixed version and enumerates affected versions only up to 3.9.2; the lock has been on 3.9.4 since b478. Guard suppression added so the scanner stops emitting the alert.

Bug fixes

  • serve: skip shutdown when startup never completed (#278). The server's shutdown hook ran even when the lifespan startup raised, which masked the original error with a confusing "shutdown before startup" traceback. The fix gates shutdown on a startup-completed flag so the real exception surfaces.
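The shape of that fix can be sketched with a plain async context manager. This is an illustration under assumptions, not the code from #278: Services, lifespan, and run_server are hypothetical stand-ins for the server's real startup and shutdown hooks.

```python
import asyncio
from contextlib import asynccontextmanager


class Services:
    """Hypothetical stand-in for whatever the server starts and stops."""

    def __init__(self, fail: bool = False):
        self.fail = fail
        self.shutdown_calls = 0

    async def startup(self) -> None:
        if self.fail:
            raise RuntimeError("model load failed")

    async def shutdown(self) -> None:
        self.shutdown_calls += 1


@asynccontextmanager
async def lifespan(services: Services):
    startup_completed = False
    try:
        await services.startup()
        startup_completed = True
        yield
    finally:
        # Only tear down what was actually brought up, so a startup
        # failure propagates as-is instead of being masked.
        if startup_completed:
            await services.shutdown()


async def run_server(services: Services) -> None:
    async with lifespan(services):
        pass  # requests would be served here
```

If startup raises, the flag is still False when the finally block runs, so shutdown is skipped and the caller sees the original RuntimeError.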

What's Changed

  • feat: Surface model architecture compatibility on catalog and pull by @tobocop2 in #277
  • serve: skip shutdown when startup never completed by @tobocop2 in #278

Full Changelog: v0.6.66b479...v0.6.66b480