Sync master with upstream release b9012 by jan-service-account · Pull Request #505 · janhq/llama.cpp

jan-service-account · 2026-05-04T01:05:11Z

Updates dev branch with latest release (b9012) from ggml-org/llama.cpp

…2611)

Llama-architecture q_proj/k_proj weights need an axis-0 row permutation to match GGML's RoPE convention. The BF16 path applies this in LlamaModel.modify_tensors via LlamaModel.permute, but the NVFP4 path bypasses modify_tensors and writes weights directly through ModelBase._repack_nvfp4. Without the permutation, attention heads end up scrambled at inference and the model produces gibberish.

This change overrides _repack_nvfp4 on LlamaModel and applies the same permutation to both the nibble-packed weight and the per-block scale before delegating to ModelBase._repack_nvfp4 via super(). It reuses the existing LlamaModel.permute static helper and respects the existing undo_permute flag, so subclasses (Mistral, Granite, Llama4, etc.) inherit the fix automatically.

Verified on a TinyLlama-1.1B reproducer: perplexity drops from 4419 (gibberish) to 43.9, matching the BF16-dequantized baseline (44.0). Also verified end-to-end on ALIA-40b-instruct-2601 (BSC, Llama architecture): multilingual generation in Spanish/Catalan/Basque/Galician is coherent with the fix applied.

Co-authored-by: Chema <chema@montevive.ai>
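The reason the weight and its scale must be permuted together can be illustrated with a small NumPy sketch. This is not the code from the commit: `permute_rows` is a hypothetical stand-in modeled on the reshape/swapaxes pattern of a per-head row permutation, and the toy shapes (two 4-bit values per byte, one scale per 16 input values) are assumptions made only for illustration.

```python
import numpy as np

def permute_rows(t: np.ndarray, n_head: int) -> np.ndarray:
    """Hypothetical per-head row permutation along axis 0.

    Splits each head's rows into two halves and interleaves them,
    leaving all trailing axes untouched.
    """
    rows = t.shape[0]
    return (
        t.reshape(n_head, 2, rows // n_head // 2, *t.shape[1:])
        .swapaxes(1, 2)
        .reshape(t.shape)
    )

# Toy dimensions (assumed for illustration only).
n_head, head_dim, n_in = 2, 4, 32
rows = n_head * head_dim

# Stand-ins for a nibble-packed weight (two values per byte) and
# a per-block scale (one value per 16 inputs), both indexed by output row.
packed = np.arange(rows * (n_in // 2), dtype=np.uint8).reshape(rows, n_in // 2)
scale = np.arange(rows * (n_in // 16), dtype=np.float32).reshape(rows, n_in // 16)

# The row order implied by the permutation.
row_order = permute_rows(np.arange(rows), n_head)

# Applying the same permutation to both tensors keeps every row's
# packed data paired with the scale blocks that belong to it.
packed_p = permute_rows(packed, n_head)
scale_p = permute_rows(scale, n_head)
```

Because the permutation only reorders axis 0, it commutes with any packing along the input axis, which is why it can be applied to already-quantized data. Permuting only one of the two tensors would pair each row with another row's scales.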

* [BUGFIX] Mistral format apply_scale support.
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* fix misunderstood boolean parameters

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

jmrobles and others added 2 commits May 3, 2026 18:22

jan-service-account merged commit b88a55d into dev May 4, 2026
4 checks passed

jan-service-account deleted the update-dev-from-master-2026-05-04-01-05 branch May 4, 2026 01:09

jan-service-account merged 2 commits into dev from update-dev-from-master-2026-05-04-01-05

3 participants