
feat(models): scan + add-by-path + model-dir setting#313

Merged: thinmintdev merged 2 commits into main from feat/models-scan-and-add-by-path on May 25, 2026

@thinmintdev

Discovery summary (before/after)

Before. The backend already had a full scan/preview pipeline: POST /api/models/scan/preview, plus POST /api/models/scan in two modes (an empty body auto-scans [models].roots; a body of {"rows":[...]} commits a user-edited preview). It also had POST /api/models for a raw Model payload. ModelsConfig already carried roots, pull_root, auto_scan_on_start, and file_extensions. GET/PUT /api/settings already round-tripped the whole Hal0Config (deep-merge PUT, atomic TOML write). What was missing was the front door: Settings had no editor for models.roots/pull_root, the Models page had no scan or add-by-path buttons, and adding a single file required hand-crafting a full Model body (and dropped HF metadata, per memory hal0_model_store_layout).

After. New thin endpoint POST /api/models/add-from-path reuses detect() + the registry write path so a one-line POST is enough to register an already-downloaded model. New Settings → Models section edits models.roots + pull_root + auto_scan_on_start. New Models-page actions "Scan directory" (preview + bulk register) and "Add by path" (single file). Five new UI hooks (useScanPreview, useAddModelFromPath, useSettings, useSettingsUpdate, useSettingsReload).

Endpoint specs

POST /api/models/add-from-path — single-file register

Request:

{
  "path":      "/mnt/ai-models/local/qwen3-4b-test/qwen3-4b-q4_k_m.gguf",
  "id":        "user.qwen3-4b-local",       // optional — derived from filename otherwise
  "name":      "My Display Name",            // optional
  "labels":    ["chat", "tool-calling"],     // optional — falls back to detect()
  "overwrite": false                          // default false
}

Response (201): the canonical Model row (id/name/path/size_bytes/capabilities/backends/metadata/ns).

Errors:

  • 400 model.path_missing — file does not exist / not readable
  • 400 model.unsupported_format — extension not in [models].file_extensions
  • 400 model.path_relative — path wasn't absolute
  • 409 model.already_exists — duplicate id; pass overwrite=true to replace
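
The path checks behind the three 400 codes above can be sketched in Python as follows. This is a minimal illustration, not the handler from this PR: `ModelPathError`, `validate_model_path`, and the order of the checks (absolute, then extension, then existence and readability) are assumptions made for the example.

```python
import os
from pathlib import Path


class ModelPathError(Exception):
    """Hypothetical error type carrying an HTTP status and a typed code."""

    def __init__(self, status: int, code: str, detail: str = "") -> None:
        super().__init__(f"{status} {code}: {detail}")
        self.status = status
        self.code = code


def validate_model_path(raw: str, allowed_extensions: list[str]) -> Path:
    """Return the path if it passes all checks, else raise ModelPathError.

    The check order here is an assumption; the PR does not state it.
    """
    path = Path(raw)
    # Cheapest check first: no filesystem access needed.
    if not path.is_absolute():
        raise ModelPathError(400, "model.path_relative", raw)
    # Compare the suffix case-insensitively against the allow-list.
    allowed = {ext.lower() for ext in allowed_extensions}
    if path.suffix.lower() not in allowed:
        raise ModelPathError(400, "model.unsupported_format", raw)
    # The file must exist and be readable by the current process.
    if not path.is_file() or not os.access(path, os.R_OK):
        raise ModelPathError(400, "model.path_missing", raw)
    return path
```

The 409 model.already_exists case is left out because it depends on the registry rather than on the path.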

Settings field

The settings live under [models] in /etc/hal0/hal0.toml. The Settings UI surfaces three fields:

  • roots (list of absolute scan directories — first one drives the default Scan path)
  • pull_root (where HF pulls land)
  • auto_scan_on_start (boolean)

file_extensions is shown read-only ("edit via hal0 config edit") to keep the v1 surface small.

Live verification (commands actually run against hal0 LXC)

Initial state — /etc/hal0/hal0.toml had no [models] section; /api/models returned count: 0.

1. PUT /api/settings to set the model dir:

$ curl -X PUT http://10.0.1.142:8080/api/settings -H 'Content-Type: application/json' \
    -d '{"models":{"roots":["/mnt/ai-models"],"pull_root":"/mnt/ai-models","auto_scan_on_start":true,"file_extensions":[".gguf",".safetensors"]}}'
→ models.roots = ["/mnt/ai-models"], models.pull_root = "/mnt/ai-models"

Verified cat /etc/hal0/hal0.toml: [models] block now present.
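
Step 1 relies on the deep-merge PUT and atomic write described in the discovery summary. A rough Python sketch of those two ideas follows. It assumes lists are replaced wholesale rather than merged, and `deep_merge` and `atomic_write_text` are illustrative names, not the functions in the hal0 codebase.

```python
import os
import tempfile


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a new dict with patch applied on top of base.

    Nested dicts are merged key by key; any other value (including lists)
    replaces the old one. That replacement rule is an assumption.
    """
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def atomic_write_text(path: str, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this shape, a PUT body that only contains a models key leaves every other top-level section of the config untouched.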

2. Seeded two fixture GGUFs:

$ ls /mnt/ai-models/local/
embed-bge/        qwen3-4b-test/

3. POST /api/models/scan/preview against /mnt/ai-models (recursive):

$ curl -X POST http://10.0.1.142:8080/api/models/scan/preview \
    -H 'Content-Type: application/json' \
    -d '{"paths":["/mnt/ai-models"],"recursive":true}'
→ {"preview":[
    {"path":".../embed-bge/bge-small-en-v1.5.gguf","kind":"llama",
     "suggested_capabilities":["embed"],"suggested_backends":["vulkan","rocm","cuda","cpu"]},
    {"path":".../qwen3-4b-test/qwen3-4b-q4_k_m.gguf","kind":"llama",
     "suggested_capabilities":["chat"],"suggested_backends":["vulkan","rocm","cuda","cpu"]}
  ],"count":2}
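
The core of a recursive preview like the one in step 3 is a directory walk filtered by extension. The Python sketch below shows only that part; it is not the discover.py walker, `preview_scan` is a hypothetical name, and the capability and backend detection done by detect() is deliberately omitted.

```python
from pathlib import Path


def preview_scan(root: str, extensions: list[str], recursive: bool = True) -> list[str]:
    """List candidate model files under root, sorted for stable output."""
    base = Path(root)
    allowed = {ext.lower() for ext in extensions}
    # rglob descends into subdirectories; glob stays at the top level.
    candidates = base.rglob("*") if recursive else base.glob("*")
    return sorted(
        str(path)
        for path in candidates
        if path.is_file() and path.suffix.lower() in allowed
    )
```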

4. POST /api/models/add-from-path:

$ curl -X POST http://10.0.1.142:8080/api/models/add-from-path \
    -H 'Content-Type: application/json' \
    -d '{"path":"/mnt/ai-models/local/qwen3-4b-test/qwen3-4b-q4_k_m.gguf",
         "id":"user.qwen3-4b-local","labels":["chat","tool-calling"]}'
→ 201 {"id":"user.qwen3-4b-local","name":"qwen3-4b-q4_k_m",
       "path":"/mnt/ai-models/local/qwen3-4b-test/qwen3-4b-q4_k_m.gguf",
       "size_bytes":65536,"capabilities":["chat","tool-calling"],
       "backends":["vulkan","rocm","cuda","cpu"],"ns":"pulled"}

/api/models now lists the new entry with installed: true, owned_by: "local". The registry TOML at /var/lib/hal0/registry/registry.toml carries the entry verbatim (survives a reload).

5. Error paths exercised:

$ curl -X POST http://10.0.1.142:8080/api/models/add-from-path \
    -H 'Content-Type: application/json' \
    -d '{"path":"/mnt/ai-models/nope.gguf"}'
→ 400 {"error":{"code":"model.path_missing", ...}}

$ curl -X POST http://10.0.1.142:8080/api/models/add-from-path \
    -H 'Content-Type: application/json' \
    -d '{"path":"/tmp/junk.json"}'
→ 400 {"error":{"code":"model.unsupported_format",
                "details":{"allowed":[".gguf",".safetensors"], ...}}}

$ curl -X POST http://10.0.1.142:8080/api/models/add-from-path \
    -H 'Content-Type: application/json' \
    -d '{"path":"...same gguf...","id":"user.qwen3-4b-local"}'
→ 409 {"error":{"code":"model.already_exists", ...}}
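
A client can branch on the typed code in these error bodies instead of matching message text. The Python sketch below assumes only the envelope shape visible above (an error object with code and, sometimes, details); `parse_error` and the "unknown" fallback are hypothetical.

```python
import json


def parse_error(body: str) -> tuple[str, dict]:
    """Extract (code, details) from an error envelope like the ones above."""
    payload = json.loads(body)
    error = payload.get("error", {})
    # Fall back to placeholders when a field is absent (an assumption).
    return error.get("code", "unknown"), error.get("details", {})
```

For example, a caller could offer an "overwrite" retry only when the code is model.already_exists.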

6. UI smoke (Playwright headless against the live hal0-api on the LXC):

  • Models page renders new "Scan directory" + "Add by path" + "Add by HF coords" buttons in the header.
  • Models page lists the new user.qwen3-4b-local row, detail pane shows the /mnt/ai-models/local/qwen3-4b-test/qwen3-4b-q4_k_m.gguf path, ns=pulled, labels chat + tool-calling.
  • Settings → Models section shows roots /mnt/ai-models, pull_root /mnt/ai-models, auto-scan checkbox enabled, file-extensions read-only chip, "Stored at /etc/hal0/hal0.toml" hint, working Save/Reset buttons (disabled when no diff).
  • Scan modal opens with the configured /mnt/ai-models pre-filled, hits Scan, lists both fixture GGUFs with capability hints + new/registered badges (registered row has its checkbox disabled so the operator can't double-add).
  • Add-by-path modal pre-fills its placeholder from the configured scan dir.
  • Reload survives: settings + registry both round-trip from disk.

Tests

  • tests/api/test_models_add_from_path.py — 8 new tests:
    • test_add_from_path_registers_gguf
    • test_add_from_path_honours_explicit_id_and_labels
    • test_add_from_path_rejects_missing_file
    • test_add_from_path_rejects_unsupported_extension
    • test_add_from_path_rejects_relative_path
    • test_add_from_path_409_on_duplicate_id
    • test_add_from_path_overwrites_when_requested
    • test_add_from_path_rejects_bad_body
  • Existing tests/api/test_models_scan.py, test_models_preview.py, test_models_crud.py, test_settings_routes.py, tests/registry/test_discover.py continue to pass (43 passed in the relevant slice; 430 passed in the full tests/registry + tests/api run).
  • ruff check + ruff format --check both green for the changed files.

Out-of-scope deferrals noticed during the work

  • Editing [models].file_extensions from the Settings UI — surfaced read-only with a hal0 config edit hint; not commonly tweaked.
  • Multi-dir scan in one call — the scan UI accepts one path per click; operator can re-scan with a different path.
  • Batch /api/models/add-from-path endpoint — the Scan modal loops single-file POSTs instead (keeps failure mode obvious; one bad path doesn't poison the rest).
  • Filesystem watcher — explicit Scan only, no auto-discovery beyond the existing startup auto_scan_on_start.
  • Model deletion / unregister UI for the new entries — covered by the existing Delete dialog already shipping in the Models page.
  • Path-traversal hardening beyond "must be absolute + must exist" — the file has to be readable by hal0-api which already runs as root on the LXC; no additional sandbox per the v1 brief.
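
The "loop single-file POSTs" choice from the batch-endpoint deferral above can be illustrated in Python. The real loop lives in the dashboard UI; this sketch only shows the failure-isolation idea, with `add_paths_one_by_one` and the injected `post` callable as hypothetical stand-ins for the HTTP call.

```python
from typing import Callable


def add_paths_one_by_one(paths: list[str], post: Callable[[dict], dict]) -> dict:
    """Register each path separately so one failure does not stop the rest."""
    results: dict = {"added": [], "failed": []}
    for path in paths:
        try:
            # One request per file, mirroring POST /api/models/add-from-path.
            results["added"].append(post({"path": path}))
        except Exception as exc:  # keep going; record which path failed and why
            results["failed"].append((path, str(exc)))
    return results
```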

🤖 Generated with Claude Code

thinmintdev force-pushed the feat/models-scan-and-add-by-path branch 4 times, most recently from 0dcc6e4 to af6f467, on May 25, 2026 18:21
thinmintdev and others added 2 commits May 25, 2026 15:05
The dashboard now lets operators point hal0 at any model directory
(e.g. /mnt/ai-models), scan it recursively, and register individual
files by absolute path — all without dropping to `hal0 config edit`.

Backend
  • New POST /api/models/add-from-path — single-file convenience
    register. Validates path readability + the [models].file_extensions
    allow-list, runs detect() for capabilities/backends, writes through
    ModelRegistry.add(). Typed errors: model.path_missing,
    model.unsupported_format, model.path_relative, model.already_exists.
    overwrite=true replaces the existing entry in place.
  • POST /api/models/scan/preview and POST /api/models/scan already
    existed (and back the discover.py walker) — wiring through unchanged.

Settings
  • New "Models" section in Settings — surfaces [models].roots,
    [models].pull_root, [models].auto_scan_on_start via the existing
    deep-merge PUT /api/settings, persisted to /etc/hal0/hal0.toml.

UI
  • Two new model-page actions: "Scan directory" (recursive walk with
    new/registered badges + checkbox commit) and "Add by path" (single
    file with optional id/name/labels overrides).
  • New hooks: useScanPreview, useAddModelFromPath, useSettings,
    useSettingsUpdate, useSettingsReload.

Tests: 8 new tests cover the happy path, missing path, unsupported
extension, relative path, duplicate id, overwrite, and bad body shapes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev force-pushed the feat/models-scan-and-add-by-path branch from af6f467 to 490c46c on May 25, 2026 19:06
thinmintdev merged commit 8f120ca into main on May 25, 2026
4 checks passed
thinmintdev deleted the feat/models-scan-and-add-by-path branch on May 25, 2026 19:12
thinmintdev added a commit that referenced this pull request May 25, 2026
…tion (#319)

One field replaces #313's roots + pull_root as the source of truth
for where hal0 reads and writes model files. Propagates atomically
to both consumers — the pull engine and Lemonade's extra_models_dir
— and restarts hal0-lemonade.service when the latter changed.

Backward-compatible: when ``[models].store`` is unset we fall back to
the legacy ``[models].pull_root`` so PR-#313 installs keep working
without an edit. The legacy field is documented as deprecated.
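
The fallback rule can be written down in a few lines of Python. The commit names an ``effective_store()`` helper, but its signature is not shown, so the dict-based version below is an assumption for illustration only.

```python
from typing import Optional


def effective_store(models_cfg: dict) -> Optional[str]:
    """Prefer [models].store; fall back to the legacy [models].pull_root."""
    store = models_cfg.get("store")
    if store:
        return store
    # Legacy installs from PR #313 only have pull_root set.
    return models_cfg.get("pull_root")
```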

Three new routes (on two new paths) surface the workflow:
  GET  /api/settings/models/store          — current state + suggestions
  POST /api/settings/models/store          — set + dry-run/migrate apply
  POST /api/settings/models/store/migrate  — explicit migrate-then-apply

The POST handler is a dry-run by default: when the current store has
files at a different path it returns ``{status: "needs_migration", plan}``
and persists nothing. The UI renders a confirmation modal; on confirm
the migrate endpoint moves files (same-fs rename or cross-fs copy via
shutil.move), propagates to Lemonade config.json, restarts the unit,
and persists hal0.toml. Move-first / persist-last so a failed move
leaves prior state intact.
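
The move-first / persist-last ordering can be sketched in Python with shutil.move. This is a simplified illustration: `migrate_store` and the `persist` callback are hypothetical, the Lemonade propagation and service restart are omitted, and the sketch does not roll back files that were already moved before a failure.

```python
import shutil
from pathlib import Path
from typing import Callable


def migrate_store(old: Path, new: Path, persist: Callable[[Path], None]) -> list[Path]:
    """Move every entry from old to new, then persist the new location.

    If listing or any move raises, persist() is never called, so the saved
    configuration still points at the old store.
    """
    # List first so a missing source fails before anything is created.
    items = sorted(old.iterdir())
    new.mkdir(parents=True, exist_ok=True)
    moved: list[Path] = []
    for item in items:
        target = new / item.name
        # shutil.move renames on the same filesystem and copies across filesystems.
        shutil.move(str(item), str(target))
        moved.append(target)
    # Persist last: only reached when every move succeeded.
    persist(new)
    return moved
```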

Settings → Models swaps the two #313 fields for the single Storage
location field plus suggestion chips (preset paths with per-path
state probes — exists/files/free-bytes). FirstRun gains a new
"Storage" step between picker and confirm so the bundle's downloads
land in the right place.

The Lemonade admin panel's locked extra_models_dir invariant is now
derived from ``effective_store()`` so operators who set the store via
Settings → Models can still edit Lemonade config coherently.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 28, 2026
…rough + gut installer auth section (#390)

- docs/operate/lemonade.md (new, .md canonical): operator reference for
  the v0.2 Lemonade runtime — what it is, where state lives, the /v1/*
  proxy + dispatcher fallthrough (PRs #248/#277), slot ↔ Lemonade
  model mapping (PRs #281/#282), max_loaded_models = 8 LRU cap (PR
  #283), per-type LRU eviction per ADR-0008 (supersedes nuclear-evict
  ADR-0007), OFFLINE-on-eviction (PR #276), and the three known v0.3
  caveats (Vulkan KV gauge missing, whisper RUNPATH workaround, GPU
  cleanup unload hang).

- docs/dashboard/v3.md (new, .md canonical, new docs/dashboard/ dir):
  page-by-page tour of the v3 React dashboard shipped in
  v0.3.0-alpha.1 (PR #235). Covers the shell + Mock-badge convention,
  /dashboard (system overview after #356), /chat (real surface per
  #309/#314/#315/#351), /slots (sidebar mirror per #357 + #344 UX
  sweep), /models (#313/#319/#353), /mcp (#304/#300), /agents (Peers
  per #299), /memory (graph #297, throughput #308), Settings (no Auth
  tab post-ADR-0012), and the footer journal (Epic #322 — PRs
  #321/#328/#329/#330/#332). Mock-fallback issues linked via the
  dashboard-v3 label, not enumerated.

- installer/README.md: gut ~95 lines of stale auth prose (Caddy,
  Bearer-token mint/use/revoke, first-run OTP claim wizard,
  HAL0_AUTH_ENABLED/HAL0_AUTH_DISABLED, password recovery, basic_auth
  upgrade path, the TLS recipe). Replace with one paragraph pointing
  at docs/operate/auth.mdx for the reverse-proxy recipe and
  docs/agents/identity.md for the X-hal0-Agent identity model. Auth
  was removed in v0.3.0-alpha.1 per ADR-0012; the README hadn't
  caught up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>