v0.1.6 — Hugging Face model download
Highlights
LlamaParams now accepts a Hugging Face repository id directly. Pass "TheBloke/Llama-2-7B-Chat-GGUF" and the library downloads the GGUF to the official HF cache (~/.cache/huggingface/hub) before loading. Local paths still work unchanged; the Tauri plugin inherits the new behavior automatically.
use llama_crab::{Llama, LlamaParams};
let mut llama = Llama::load(
LlamaParams::new("TheBloke/Llama-2-7B-Chat-GGUF")
.with_hf_filename("llama-2-7b-chat.Q4_K_M.gguf")
.with_n_ctx(2048),
)?;
What's new
Library
- `hf-hub` cargo feature (opt-in): gates the new functionality. Mirrors the existing `mtmd` pattern.
- `HfDownloader` trait + `MockHfDownloader` (always compiled, for tests) + `RealHfDownloader` (gated, uses `hf-hub` 0.5 sync API).
- 5 new builders on `LlamaParams`: `with_hf_filename`, `with_hf_revision`, `with_hf_token`, `with_hf_cache_dir`, `with_hf_endpoint`.
- `LlamaError::ModelDownload(String)` variant for download errors.
- `HF_TOKEN` and `HF_ENDPOINT` env vars honored (read in `RealHfDownloader::new`, never logged).
- `HF_HOME` respected for cache location.
- Auto-detect heuristic: `^[A-Za-z0-9._-]+(/[A-Za-z0-9._-]+)?$` + `!Path::new(s).exists()`; falls through to local for existing paths and ambiguous local-path names (`models/`, `model/`).
- Auto-pick logic: 0 `.gguf` → error; 1 → auto-pick; >1 → error suggesting `with_hf_filename`.
- `tracing::info!` at download start/end with repo, filename, size_bytes, elapsed_ms.
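The auto-detect and auto-pick rules above can be sketched in plain Rust. This is an illustrative approximation, not the crate's actual code: the function names `matches_repo_id_pattern`, `looks_like_hf_repo_id`, `pick_gguf` and the `PickError` enum are hypothetical, the pattern is hand-rolled instead of using a regex crate, and the case-sensitive `.gguf` suffix check is an assumption.

```rust
use std::path::Path;

/// Hand-rolled equivalent of `^[A-Za-z0-9._-]+(/[A-Za-z0-9._-]+)?$`
/// (hypothetical helper; avoids a regex dependency in this sketch).
fn matches_repo_id_pattern(s: &str) -> bool {
    let segment_ok = |seg: &str| {
        !seg.is_empty()
            && seg
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-'))
    };
    let mut parts = s.split('/');
    match (parts.next(), parts.next(), parts.next()) {
        // Single segment, e.g. "gpt2".
        (Some(name), None, None) => segment_ok(name),
        // "owner/name"; a trailing slash yields an empty second segment and fails.
        (Some(owner), Some(name), None) => segment_ok(owner) && segment_ok(name),
        // More than one slash is never a repo id.
        _ => false,
    }
}

/// Treat `s` as a repo id only if it matches the pattern AND no such path
/// exists locally, so existing local files and directories always win.
fn looks_like_hf_repo_id(s: &str) -> bool {
    matches_repo_id_pattern(s) && !Path::new(s).exists()
}

/// Outcome of the auto-pick rule when no filename was given (hypothetical type).
#[derive(Debug, PartialEq)]
enum PickError {
    /// The repository lists no `.gguf` file.
    NoGguf,
    /// Several candidates; the caller should set `with_hf_filename`.
    Ambiguous(Vec<String>),
}

/// 0 `.gguf` files: error; exactly 1: pick it; more than 1: error.
/// Assumption: the suffix match is case-sensitive.
fn pick_gguf(files: &[&str]) -> Result<String, PickError> {
    let ggufs: Vec<&str> = files
        .iter()
        .copied()
        .filter(|f| f.ends_with(".gguf"))
        .collect();
    match ggufs.as_slice() {
        [] => Err(PickError::NoGguf),
        [only] => Ok((*only).to_string()),
        many => Err(PickError::Ambiguous(
            many.iter().map(|f| (*f).to_string()).collect(),
        )),
    }
}
```

Under these assumptions, `models/` fails the pattern because of its trailing slash, and `.` passes the pattern but is rejected because the path exists, so both are handled as local paths.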
Server
- `--hf-filename <NAME>` CLI flag (env `LLAMA_CRAB_HF_FILENAME`).
- `hf-hub` server feature (opt-in): `cargo install llama-crab-server --features hf-hub --force`.
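One plausible way a flag with an environment fallback resolves is sketched below. The helper `resolve_hf_filename` is hypothetical, and the precedence (explicit flag over environment variable) is an assumption based on common CLI convention; the release notes do not state it.

```rust
/// Hypothetical resolution of the HF filename for the server.
/// Assumption: an explicit `--hf-filename` value wins over the
/// `LLAMA_CRAB_HF_FILENAME` environment variable.
fn resolve_hf_filename(cli_flag: Option<&str>, env_value: Option<&str>) -> Option<String> {
    cli_flag.or(env_value).map(str::to_owned)
}
```

A caller could pass `std::env::var("LLAMA_CRAB_HF_FILENAME").ok().as_deref()` as the second argument; when both are `None`, the library's auto-pick rule would apply.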
Tauri plugin
- Always pulls in the `hf-hub` feature so end-user Tauri apps can use HF repo ids without extra build config.
Install / Upgrade
# Library
cargo add llama-crab --features hf-hub
# Server
cargo install llama-crab-server --features hf-hub --force
Test
# Skip state (LLAMA_CRAB_RUN_HF_INTEGRATION unset: the integration test skips cleanly)
cargo test -p llama-crab --features hf-hub --test hf_download
# End-to-end (downloads TinyLlama, ~636 MB, verifies cache hit, loads into Metal)
LLAMA_CRAB_RUN_HF_INTEGRATION=1 cargo test \
-p llama-crab --features hf-hub --test hf_download
Verification
| Check | Result |
|---|---|
| `cargo build -p llama-crab --no-default-features` | OK |
| `cargo build -p llama-crab --features hf-hub` | OK |
| `cargo build -p llama-crab-server --features hf-hub` | OK |
| `cargo clippy --all-targets --features hf-hub -- -D warnings` | clean |
| `cargo test --lib` (no-default-features) | 120/120 pass |
| `cargo test --lib` (hf-hub) | 120/120 pass (2 env-gated ignored) |
| `cargo test --doc` (both states) | 11/11 pass |
| `cargo test --test hf_download` (skip) | clean skip |
| CI (16 jobs, Linux + macOS) | 16/16 pass |
| Release workflow (crates + npm) | success |
PRs
- #12: feat: add Hugging Face model download from LlamaParams
- #13: chore(release): bump version to 0.1.6
- #14: chore(release): bump npm package versions to 0.1.6
Guardrails (per design review)
- No `dep:tokio` in the `hf-hub` feature (sync API only)
- No `#[from] hf_hub::Error` (would leak the gated type into the always-compiled error enum)
- No SHA256 verification (delegated to `hf-hub` etag mechanism; documented limitation)
- No async / progress callback API in v1
- No `hf:` URL prefix syntax
- No token / auth-bearing URLs at `tracing::info!` level
- Server `hf-hub` is opt-in (kept out of `default = []`)
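The `#[from]` guardrail can be illustrated with a small stand-alone sketch. Everything here is a stand-in rather than the crate's real code: `SketchError` plays the role of the always-compiled error enum, `BackendError` plays the role of the feature-gated `hf_hub` error type, and `to_download_error` is a hypothetical helper. The idea shown is that the gated error is flattened to a `String` at the boundary, so the public enum never names the gated type.

```rust
use std::fmt;

/// Stand-in for the always-compiled error enum. It carries only a `String`,
/// so it compiles identically with or without the optional feature.
#[derive(Debug)]
enum SketchError {
    ModelDownload(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::ModelDownload(msg) => write!(f, "model download failed: {msg}"),
        }
    }
}

impl std::error::Error for SketchError {}

/// Stand-in for a feature-gated backend error type (hypothetical).
#[derive(Debug)]
struct BackendError(String);

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "backend: {}", self.0)
    }
}

/// Convert any displayable backend error into the string-carrying variant.
/// Used with `map_err` inside the gated code, in place of a `From` impl
/// that would tie the enum to the gated type.
fn to_download_error(err: impl fmt::Display) -> SketchError {
    SketchError::ModelDownload(err.to_string())
}
```

Inside gated code, a fallible call would end with `.map_err(to_download_error)?`. The trade-off is that the original error's structure and source chain are lost; only its message survives.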