
feat(importers): whisper.cpp HF repos pick a quant + nest under whisper/models #9630

Merged
mudler merged 1 commit into master from feat/whisper-importer-multi-quant on May 1, 2026

Conversation

@mudler
Owner

@mudler mudler commented May 1, 2026

The WhisperImporter's Import() switch ordered LooksLikeURL ahead of the HuggingFace branch, so any https://huggingface.co/<owner>/<repo> URI (e.g. LocalAI-io/whisper-large-v3-it-yodas-only-ggml) hijacked the URL path. FilenameFromUrl returned the repo slug, the gallery entry pointed at the HTML repo page, the SHA256 was empty, and the HF file listing was effectively dead code for HTTPS imports. The HF branch only fired for huggingface://owner/repo and hf://owner/repo references.
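
For illustration, a tiny standalone Go sketch of the failure mode (not the project's actual FilenameFromUrl): taking the basename of the URL path yields a real filename for a direct file URL, but only the repo slug for a repo-page URL.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// naiveFilenameFromURL mimics the failure mode: it just takes the last path
// segment of whatever URL is passed in. For a direct file URL that is a
// downloadable filename; for a HuggingFace repo page it is only the repo slug.
func naiveFilenameFromURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return ""
	}
	return path.Base(u.Path)
}

func main() {
	// Direct file URL: basename is a real ggml file.
	fmt.Println(naiveFilenameFromURL(
		"https://huggingface.co/owner/repo/resolve/main/ggml-model-q5_0.bin"))
	// Repo page URL: basename is just the repo slug, not a file.
	fmt.Println(naiveFilenameFromURL(
		"https://huggingface.co/LocalAI-io/whisper-large-v3-it-yodas-only-ggml"))
}
```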

Gate the URL case on a "ggml-*.bin" basename signal — mirroring how the llama-cpp importer gates on ".gguf" — so direct file URLs still take the URL path while HF repo URLs fall through to the HF branch. There the file listing is actually consulted: every ggml-*.bin entry is collected and one is picked by the new preferences.quantizations preference (default q5_0; comma-separated for fallback ordering).
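
A minimal sketch of the two rules above, with hypothetical helper names (the real importer's signatures may differ): gate on a ggml-*.bin basename, then walk a comma-separated quantization preference list over the repo's ggml files.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// isDirectWhisperFileURL stands in for the new gate: the URL branch is only
// taken when the basename looks like ggml-*.bin, analogous to the llama-cpp
// importer's ".gguf" check.
func isDirectWhisperFileURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	base := path.Base(u.Path)
	return strings.HasPrefix(base, "ggml-") && strings.HasSuffix(base, ".bin")
}

// pickQuantization tries each preferred quantization in order (e.g. "q4_0,q5_0")
// against the ggml files found in the repo, falling back to the last file when
// nothing matches.
func pickQuantization(ggmlFiles []string, prefs string) string {
	if len(ggmlFiles) == 0 {
		return ""
	}
	for _, q := range strings.Split(prefs, ",") {
		q = strings.TrimSpace(q)
		for _, f := range ggmlFiles {
			if strings.Contains(f, q) {
				return f
			}
		}
	}
	return ggmlFiles[len(ggmlFiles)-1]
}

func main() {
	files := []string{"ggml-model-q4_0.bin", "ggml-model-q5_0.bin", "ggml-model-q8_0.bin"}
	fmt.Println(isDirectWhisperFileURL("https://huggingface.co/owner/repo/resolve/main/ggml-model-q5_0.bin")) // true
	fmt.Println(isDirectWhisperFileURL("https://huggingface.co/LocalAI-io/whisper-large-v3-it-yodas-only-ggml")) // false
	fmt.Println(pickQuantization(files, "q5_0"))      // ggml-model-q5_0.bin
	fmt.Println(pickQuantization(files, "q6_k,q4_0")) // ggml-model-q4_0.bin
	fmt.Println(pickQuantization(files, "q6_k"))      // fallback: ggml-model-q8_0.bin
}
```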

Pin the chosen file under whisper/models/<name>/<file> so a single repo can ship q4_0/q5_0/q8_0 side-by-side without colliding on disk, matching the llama-cpp/models/<name>/ layout. The fallback when no preference matches is the last available ggml file, mirroring llama-cpp's pickPreferredGroup behaviour.
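
Roughly the on-disk layout this produces; the directory name used for <name> below is an assumption for illustration, not necessarily the exact string the importer derives.

```go
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// Each quantization gets its own file under the per-model directory, so
	// q4_0/q5_0/q8_0 from the same repo never overwrite each other.
	modelName := "whisper-large-v3-it-yodas-only-ggml" // illustrative <name>
	for _, f := range []string{"ggml-model-q4_0.bin", "ggml-model-q5_0.bin"} {
		fmt.Println(filepath.Join("whisper", "models", modelName, f))
	}
	// On a Unix host this prints:
	// whisper/models/whisper-large-v3-it-yodas-only-ggml/ggml-model-q4_0.bin
	// whisper/models/whisper-large-v3-it-yodas-only-ggml/ggml-model-q5_0.bin
}
```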

Tests: replace the previous probe spec with positive assertions against LocalAI-io/whisper-large-v3-it-yodas-only-ggml (default → ggml-model-q5_0.bin, quantizations=q4_0 → ggml-model-q4_0.bin) plus two offline specs that build a fake hfapi.ModelDetails to cover the fallback rule and non-ggml filtering without touching the network.
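
The offline specs are sketched here as plain `go test` functions against an in-memory file list rather than the real hfapi.ModelDetails type or the project's Ginkgo suite; the helper is a stand-in for the importer's selection logic, covering the same two rules (non-ggml filtering and last-file fallback).

```go
package importer

import (
	"strings"
	"testing"
)

// filterAndPick reproduces, in miniature, the rules the offline specs cover:
// only ggml-*.bin entries are considered, and when no preferred quantization
// matches, the last remaining ggml file wins.
func filterAndPick(files []string, prefs string) string {
	var ggml []string
	for _, f := range files {
		if strings.HasPrefix(f, "ggml-") && strings.HasSuffix(f, ".bin") {
			ggml = append(ggml, f)
		}
	}
	if len(ggml) == 0 {
		return ""
	}
	for _, q := range strings.Split(prefs, ",") {
		for _, f := range ggml {
			if strings.Contains(f, strings.TrimSpace(q)) {
				return f
			}
		}
	}
	return ggml[len(ggml)-1]
}

func TestNonGGMLFilesAreIgnored(t *testing.T) {
	files := []string{"README.md", "config.json", "ggml-model-q5_0.bin"}
	if got := filterAndPick(files, "q5_0"); got != "ggml-model-q5_0.bin" {
		t.Fatalf("expected ggml-model-q5_0.bin, got %q", got)
	}
}

func TestFallbackToLastGGMLFile(t *testing.T) {
	files := []string{"ggml-model-q4_0.bin", "ggml-model-q8_0.bin"}
	if got := filterAndPick(files, "q6_k"); got != "ggml-model-q8_0.bin" {
		t.Fatalf("expected fallback to last ggml file, got %q", got)
	}
}
```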

Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit WebFetch]

feat(importers): whisper.cpp HF repos pick a quant + nest under whisper/models

The WhisperImporter's Import() switch ordered LooksLikeURL ahead of the
HuggingFace branch, so any https://huggingface.co/<owner>/<repo> URI
(e.g. LocalAI-io/whisper-large-v3-it-yodas-only-ggml) hijacked the URL
path. FilenameFromUrl returned the repo slug, the gallery entry pointed
at the HTML repo page, the SHA256 was empty, and the HF file listing
was effectively dead code for HTTPS imports. The HF branch only fired
for huggingface://owner/repo and hf://owner/repo references.

Gate the URL case on a "ggml-*.bin" basename signal — mirroring how
the llama-cpp importer gates on ".gguf" — so direct file URLs still
take the URL path while HF repo URLs fall through to the HF branch.
There the file listing is actually consulted: every ggml-*.bin entry
is collected and one is picked by the new preferences.quantizations
preference (default q5_0; comma-separated for fallback ordering).

Pin the chosen file under whisper/models/<name>/<file> so a single
repo can ship q4_0/q5_0/q8_0 side-by-side without colliding on disk,
matching the llama-cpp/models/<name>/ layout. The fallback when no
preference matches is the last available ggml file, mirroring
llama-cpp's pickPreferredGroup behaviour.

Tests: replace the previous probe spec with positive assertions
against LocalAI-io/whisper-large-v3-it-yodas-only-ggml (default →
ggml-model-q5_0.bin, quantizations=q4_0 → ggml-model-q4_0.bin) plus
two offline specs that build a fake hfapi.ModelDetails to cover the
fallback rule and non-ggml filtering without touching the network.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit WebFetch]
@mudler mudler merged commit 8452068 into master May 1, 2026
46 of 47 checks passed
@mudler mudler deleted the feat/whisper-importer-multi-quant branch May 1, 2026 10:03
@localai-bot localai-bot added the enhancement New feature or request label May 9, 2026