
Sync master with upstream release b8992 #502

Merged
jan-service-account merged 11 commits into dev from
update-dev-from-master-2026-05-01-01-09
May 2, 2026

Conversation

@jan-service-account

Updates the dev branch with the latest release (b8992) from ggml-org/llama.cpp.

petersid2022 and others added 11 commits April 30, 2026 08:18
* port ggml-org#22358 PR to examples/speculative/speculative.cpp
* use vocab_[tgt,dft] instead of ctx_[tgt,dft] when logging on draft
  model / target model vocabulary mismatch

Co-authored-by: Petros Sideris <petros.sideris@nokia.com>
* spec : fix draft model checkpoints

* cont : clean-up

* cont : gate the ngram-mod reset warning behind verbose flag
…22513)

* scripts : add wc2wt.sh - create worktree from current HEAD

Add a script to create a git worktree on a new branch from the current
HEAD. Similar to pr2wt.sh but for local development branches instead of
PRs.

Usage:
  ./scripts/wc2wt.sh gg/new-feature
  ./scripts/wc2wt.sh gg/new-feature "bash -l"

Assisted-by: llama.cpp:local pi

* cont : no need to try to delete the branch
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* bump ty to 0.0.33

* update typings
* vulkan: add get/set_tensor_2d functions

* fix backend interface comments

* Update ggml/src/ggml-metal/ggml-metal.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Update llama-mmap to work with 32-bit wasm and >2GB models

* Update to gguf.cpp style
@jan-service-account jan-service-account merged commit 5cbfb18 into dev May 2, 2026
16 of 17 checks passed
@jan-service-account jan-service-account deleted the update-dev-from-master-2026-05-01-01-09 branch May 2, 2026 01:30

10 participants