fix(importer): emit all shards for multi-part GGUF models #9513

Merged: mudler merged 1 commit into master from fix/importer-multipart-gguf on Apr 23, 2026

@mudler (Owner) commented Apr 23, 2026

The llama-cpp HuggingFace importer iterated over files one at a time and kept overwriting `lastGGUFFile`, so sharded repos such as `unsloth/Kimi-K2.6-GGUF` (14 `Q8_K_XL` parts) produced a gallery entry pointing only at the final shard. That entry is useless to llama.cpp's split loader, which needs shard 1 to discover the set.

Group shards up front via new helpers in `pkg/huggingface-api` (`SplitShardSuffix`, `ShardGroup`, `GroupShards`). The llama-cpp importer now picks a group (preferred quant, then last-group fallback) and emits every shard, with `Model:` pointing at shard 1. `FindPreferredModelFile` returns shard 1 of the first matching group so the gallery agent's preview stays coherent for sharded repos.
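
For readers without the diff at hand, the grouping and selection steps described above might look roughly like the sketch below. It is not the code from this PR. The lowercase names (`splitShardSuffix`, `shardGroup`, `groupShards`, `pickGroup`), their signatures, and the substring-based quant match are assumptions made for illustration. The sketch also assumes shards follow the `-00001-of-00014.gguf` suffix written by llama.cpp's gguf-split tool.

```go
package main

import (
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// shardRe matches the "<base>-00001-of-00014.gguf" naming assumed for shards.
var shardRe = regexp.MustCompile(`^(.*)-(\d{5})-of-(\d{5})\.gguf$`)

// shardGroup is a hypothetical stand-in for the PR's ShardGroup type.
type shardGroup struct {
	Base  string   // file name without the shard suffix and extension
	Files []string // member files ordered by shard index, shard 1 first
}

// splitShardSuffix returns the base name, 1-based shard index and shard
// total for a sharded file name. ok is false for non-sharded names.
func splitShardSuffix(name string) (base string, index, total int, ok bool) {
	m := shardRe.FindStringSubmatch(name)
	if m == nil {
		return "", 0, 0, false
	}
	index, _ = strconv.Atoi(m[2]) // cannot fail: the regex guarantees digits
	total, _ = strconv.Atoi(m[3])
	return m[1], index, total, true
}

// groupShards buckets .gguf files by base name. A single-file model becomes
// a group of one. Groups keep the order in which their base was first seen.
func groupShards(files []string) []shardGroup {
	type member struct {
		name  string
		index int
	}
	var order []string
	byBase := map[string][]member{}
	for _, f := range files {
		if !strings.HasSuffix(f, ".gguf") {
			continue // skip README, config files, and so on
		}
		base, idx, _, ok := splitShardSuffix(f)
		if !ok {
			base, idx = strings.TrimSuffix(f, ".gguf"), 1
		}
		if _, seen := byBase[base]; !seen {
			order = append(order, base)
		}
		byBase[base] = append(byBase[base], member{f, idx})
	}
	groups := make([]shardGroup, 0, len(order))
	for _, base := range order {
		members := byBase[base]
		sort.Slice(members, func(i, j int) bool {
			return members[i].index < members[j].index
		})
		g := shardGroup{Base: base}
		for _, m := range members {
			g.Files = append(g.Files, m.name)
		}
		groups = append(groups, g)
	}
	return groups
}

// pickGroup returns the first group whose base contains preferredQuant
// (case-insensitive), falling back to the last group when nothing matches.
func pickGroup(groups []shardGroup, preferredQuant string) (shardGroup, bool) {
	if len(groups) == 0 {
		return shardGroup{}, false
	}
	want := strings.ToLower(preferredQuant)
	if want != "" {
		for _, g := range groups {
			if strings.Contains(strings.ToLower(g.Base), want) {
				return g, true
			}
		}
	}
	return groups[len(groups)-1], true
}
```

With a selected group in hand, an importer would list every entry of `Files` and point `Model:` at `Files[0]`, which is shard 1 after sorting. Grouping before selection is what prevents the old behaviour, where the last file seen silently replaced all earlier ones.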

Adds unit coverage for the HuggingFace branch of the importer (which had none), plus shard-detection tests in the hfapi package.

Assisted-by: Claude:Opus-4.7 [Read] [Edit] [Bash]

@mudler mudler merged commit c1f923b into master Apr 23, 2026
40 of 42 checks passed
@mudler mudler deleted the fix/importer-multipart-gguf branch April 23, 2026 13:00
@localai-bot localai-bot added the bug Something isn't working label May 9, 2026

Labels: bug, needs-review
