Release v0.1.6 — Hugging Face model download · DominguesM/llama-crab

Highlights

LlamaParams now accepts a Hugging Face repository id directly. Pass "TheBloke/Llama-2-7B-Chat-GGUF" and the library downloads the GGUF to the official HF cache (~/.cache/huggingface/hub) before loading. Local paths still work unchanged; the Tauri plugin inherits the new behavior automatically.

use llama_crab::{Llama, LlamaParams};

let mut llama = Llama::load(
    LlamaParams::new("TheBloke/Llama-2-7B-Chat-GGUF")
        .with_hf_filename("llama-2-7b-chat.Q4_K_M.gguf")
        .with_n_ctx(2048),
)?;
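
For orientation, the cache location mentioned above can be resolved roughly as follows. This is a minimal sketch, assuming the usual Hugging Face convention that the hub cache lives at `$HF_HOME/hub` and defaults to `~/.cache/huggingface/hub` on Unix-like systems. `hub_cache_dir` and `hub_cache_dir_from_env` are hypothetical helpers written for illustration; they are not part of the llama-crab or hf-hub API.

```rust
use std::path::PathBuf;

/// Resolve the hub cache directory from explicit inputs (hypothetical helper).
/// `hf_home` stands for the value of HF_HOME, `home` for the user's home directory.
fn hub_cache_dir(hf_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match hf_home.filter(|s| !s.is_empty()) {
        // HF_HOME set: the hub cache is its `hub` subdirectory.
        Some(root) => Some(PathBuf::from(root).join("hub")),
        // Otherwise fall back to ~/.cache/huggingface/hub (assumed default).
        None => home.map(|h| {
            PathBuf::from(h)
                .join(".cache")
                .join("huggingface")
                .join("hub")
        }),
    }
}

/// Same lookup, reading the process environment (assumes HOME is set, as on Unix).
#[allow(dead_code)]
fn hub_cache_dir_from_env() -> Option<PathBuf> {
    let hf_home = std::env::var("HF_HOME").ok();
    let home = std::env::var("HOME").ok();
    hub_cache_dir(hf_home.as_deref(), home.as_deref())
}
```

Keeping the environment lookup separate from the path logic makes the rule easy to test without touching process-wide state.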

What's new

Library

hf-hub cargo feature (opt-in): gates the new functionality, mirroring the existing mtmd pattern.
HfDownloader trait + MockHfDownloader (always compiled, for tests) + RealHfDownloader (gated, uses hf-hub 0.5 sync API).
5 new builders on LlamaParams: with_hf_filename, with_hf_revision, with_hf_token, with_hf_cache_dir, with_hf_endpoint.
LlamaError::ModelDownload(String) variant for download errors.
HF_TOKEN and HF_ENDPOINT env vars honored (read in RealHfDownloader::new, never logged).
HF_HOME respected for cache location.
Auto-detect heuristic: ^[A-Za-z0-9._-]+(/[A-Za-z0-9._-]+)?$ + !Path::new(s).exists() — falls through to local for existing paths and ambiguous local-path names (models/, model/).
Auto-pick logic (when no filename is given): 0 .gguf files in the repo → error; exactly 1 → auto-pick; >1 → error suggesting with_hf_filename.
tracing::info! at download start/end with repo, filename, size_bytes, elapsed_ms.
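
The auto-detect and auto-pick rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the regex is re-implemented by hand so the block needs no external crate, and the function names (`matches_repo_id_pattern`, `looks_like_hf_repo_id`, `pick_gguf`) and error messages are hypothetical.

```rust
use std::path::Path;

/// True if `seg` is a non-empty run of [A-Za-z0-9._-].
fn is_segment(seg: &str) -> bool {
    !seg.is_empty()
        && seg
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-'))
}

/// Hand-rolled equivalent of ^[A-Za-z0-9._-]+(/[A-Za-z0-9._-]+)?$
fn matches_repo_id_pattern(s: &str) -> bool {
    let mut parts = s.split('/');
    let first = parts.next().unwrap_or("");
    let second = parts.next();
    let extra = parts.next(); // a third segment means it is not `name` or `owner/name`
    is_segment(first) && second.map_or(true, |seg| is_segment(seg)) && extra.is_none()
}

/// Treat `s` as a repo id only if it matches the pattern AND no such path exists locally.
fn looks_like_hf_repo_id(s: &str) -> bool {
    matches_repo_id_pattern(s) && !Path::new(s).exists()
}

/// Choose the GGUF file from a repo listing: 0 -> error, 1 -> pick it, >1 -> error.
fn pick_gguf<'a>(files: &[&'a str]) -> Result<&'a str, String> {
    let ggufs: Vec<&str> = files
        .iter()
        .copied()
        .filter(|f| f.ends_with(".gguf"))
        .collect();
    match ggufs.as_slice() {
        [] => Err("no .gguf file found in repository".to_string()),
        [only] => Ok(*only),
        many => Err(format!(
            "{} .gguf files found; choose one with with_hf_filename",
            many.len()
        )),
    }
}
```

Note that an input such as `models/` fails the pattern on its own (the trailing slash leaves an empty second segment), while an existing local path is rejected by the `exists()` check even when it matches the pattern.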

Server

--hf-filename <NAME> CLI flag (env LLAMA_CRAB_HF_FILENAME).
hf-hub server feature (opt-in): cargo install llama-crab-server --features hf-hub --force.

Tauri plugin

Always pulls in the hf-hub feature so end-user Tauri apps can use HF repo ids without extra build config.

Install / Upgrade

# Library
cargo add llama-crab --features hf-hub

# Server
cargo install llama-crab-server --features hf-hub --force

Test

# Default: the integration test skips cleanly (env var unset, no download)
cargo test -p llama-crab --features hf-hub --test hf_download

# End-to-end (downloads TinyLlama, ~636 MB, verifies cache hit, loads into Metal)
LLAMA_CRAB_RUN_HF_INTEGRATION=1 cargo test \
  -p llama-crab --features hf-hub --test hf_download

Verification

Check	Result
`cargo build -p llama-crab --no-default-features`	OK
`cargo build -p llama-crab --features hf-hub`	OK
`cargo build -p llama-crab-server --features hf-hub`	OK
`cargo clippy --all-targets --features hf-hub -- -D warnings`	clean
`cargo test --lib` (no-default-features)	120/120 pass
`cargo test --lib` (hf-hub)	120/120 pass (2 env-gated ignored)
`cargo test --doc` (both states)	11/11 pass
`cargo test --test hf_download` (skip)	clean skip
CI (16 jobs, Linux + macOS)	16/16 pass
Release workflow (crates + npm)	success

PRs

#12: feat: add Hugging Face model download from LlamaParams
#13: chore(release): bump version to 0.1.6
#14: chore(release): bump npm package versions to 0.1.6

Guardrails (per design review)

No dep:tokio in the hf-hub feature (sync API only)
No #[from] hf_hub::Error (would leak the gated type into the always-compiled error enum)
No SHA256 verification (delegated to hf-hub etag mechanism; documented limitation)
No async / progress callback API in v1
No hf: URL prefix syntax
No token / auth-bearing URLs at tracing::info! level
Server hf-hub is opt-in (kept out of default = [])
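
The `#[from]` guardrail above can be illustrated with a small sketch. It assumes a stand-in error type in place of the real `hf_hub::Error`, and a one-variant enum in place of the full `LlamaError`; `GatedDownloadError`, `fetch`, and `download` are hypothetical names. The point is that the always-compiled enum carries only a `String`, so the gated type is converted at the call site rather than through a `From` impl.

```rust
use std::fmt;

/// Stand-in for the feature-gated download error type (hypothetical).
#[derive(Debug)]
struct GatedDownloadError(String);

impl fmt::Display for GatedDownloadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Simplified stand-in for the always-compiled error enum.
/// It stores a String, so it never names the gated type.
#[derive(Debug, PartialEq)]
enum LlamaError {
    ModelDownload(String),
}

/// Pretend download that either succeeds or fails with the gated error.
fn fetch(fail: bool) -> Result<&'static str, GatedDownloadError> {
    if fail {
        Err(GatedDownloadError("repo not found".to_string()))
    } else {
        Ok("model.gguf")
    }
}

/// Convert explicitly with map_err instead of relying on `#[from]`.
fn download(fail: bool) -> Result<&'static str, LlamaError> {
    fetch(fail).map_err(|e| LlamaError::ModelDownload(e.to_string()))
}
```

With this shape, code built without the feature still compiles against the same error enum, at the cost of losing the structured source error.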
