Sync master with upstream release b7108 #332

jan-service-account · 2025-11-20T00:35:18Z

Updates dev branch with latest release (b7108) from ggml-org/llama.cpp

…g#17343)

Test 'Q4_K_M' quantization on https://huggingface.co/pfnet/plamo-2-translate

The 'suffix_to_score' size is 193510. If the memory is not reserved up front, it needs 19 memory allocations, with a final capacity of 262144, to hold the values.

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
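The figures quoted in that commit message (193510 elements, 19 allocations, final capacity 262144) are what a capacity-doubling growth policy produces. The sketch below only simulates that arithmetic; `GrowthStats` and `simulate_doubling_growth` are hypothetical names, and the doubling factor is an assumption, since the growth factor of `std::vector` is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: how many allocations a growing buffer performed
// and the capacity it ended up with.
struct GrowthStats {
    std::size_t allocations;
    std::size_t final_capacity;
};

// Simulates a buffer that starts empty and doubles its capacity (0 -> 1 -> 2 -> 4 ...)
// every time it runs out of room, until it can hold n elements.
// Assumption: growth factor 2; real standard libraries may use another factor.
GrowthStats simulate_doubling_growth(std::size_t n) {
    GrowthStats stats{0, 0};
    while (stats.final_capacity < n) {
        stats.final_capacity = stats.final_capacity == 0 ? 1 : stats.final_capacity * 2;
        ++stats.allocations;
    }
    return stats;
}
```

Under that assumption, `simulate_doubling_growth(193510)` reports 19 allocations and a final capacity of 262144, matching the commit message. Calling `std::vector::reserve` (or the equivalent for the container in use) with the known element count before filling it replaces those repeated reallocations with a single allocation.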

…l-org#17357)

* chat: fix int overflow, prevent size calculation in float/double
* Update common/chat.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
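The actual change in common/chat.cpp is not shown on this page, so the following is only a general illustration of why size arithmetic is usually kept in integer types. Both helpers are hypothetical: `checked_mul` multiplies two sizes and reports overflow instead of wrapping, and `round_trip_through_double` shows that a `double` cannot represent every integer above 2^53.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: multiplies two sizes in integer arithmetic.
// Returns false (and leaves *out untouched) if the product would overflow size_t.
bool checked_mul(std::size_t a, std::size_t b, std::size_t* out) {
    if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
        return false;
    }
    *out = a * b;
    return true;
}

// Hypothetical helper: converts an integer to double and back.
// Only call it with values well below 2^64; converting an out-of-range
// double back to an integer is undefined behaviour.
std::uint64_t round_trip_through_double(std::uint64_t v) {
    return static_cast<std::uint64_t>(static_cast<double>(v));
}
```

With IEEE 754 doubles, `round_trip_through_double((1ULL << 53) + 1)` yields `1ULL << 53`: the value silently loses its lowest bit, which is one reason to avoid routing size calculations through floating point.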

…org#17308)

…p crash (ggml-org#17356)

…on (ggml-org#17332)

* Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition
* Argh.
* Making CISC happy ;)
* Integrate CONT tests
* Use loopy loop
* Skip new tests for (B)F16 for now.

…gml-org#17359)

* feat: Add "Continue" action for assistant messages
* feat: Continuation logic & prompt improvements
* chore: update webui build output
* feat: Improve logic for continuing the assistant message
* chore: update webui build output
* chore: Linting
* chore: update webui build output
* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message
* chore: update webui build output
* feat: Enable "Continue" button based on config & non-reasoning model type
* chore: update webui build output
* chore: Update packages with `npm audit fix`
* fix: Remove redundant error
* chore: update webui build output
* chore: Update `.gitignore`
* fix: Add missing change
* feat: Add auto-resizing for Edit Assistant/User Message textareas
* chore: update webui build output

* vulkan: support larger argsort

  This is an extension of the original bitonic sorting shader that puts the temporary values in global memory and when more than 1024 threads are needed it runs multiple workgroups and synchronizes through a pipelinebarrier.

  To improve the memory access pattern, a copy of the float value is kept with the index value. I've applied this same change to the original shared memory version of the shader, which is still used when ncols <= 1024.
* Reduce the number of shader variants. Use smaller workgroups when doing a single pass, for a modest perf boost
* reduce loop overhead
* run multiple cols per invocation, to reduce barrier overhead
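For readers unfamiliar with the technique named in that commit, here is a minimal sequential CPU sketch of a bitonic argsort that keeps a copy of the float value next to each index. It is not the Vulkan shader: `KeyIdx`, `key_less`, and `bitonic_argsort` are hypothetical names, ascending order is an assumption, and NaN inputs are not handled. On a GPU, each iteration of the innermost loop would be one thread, with a barrier between passes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Value stored alongside its original index, so comparisons never have to
// chase the index back into the source column.
struct KeyIdx {
    float value;
    std::uint32_t index;
};

// Strict ordering: by value, then by index so ties (and padding) are deterministic.
static bool key_less(const KeyIdx& a, const KeyIdx& b) {
    return a.value < b.value || (a.value == b.value && a.index < b.index);
}

// Returns the indices that sort `col` in ascending order.
std::vector<std::uint32_t> bitonic_argsort(const std::vector<float>& col) {
    const std::size_t n = col.size();
    std::size_t padded = 1;
    while (padded < n) padded *= 2;  // bitonic networks need a power-of-two length

    // Pad with +infinity and out-of-range indices so padding sorts to the end.
    std::vector<KeyIdx> buf(padded);
    for (std::size_t i = 0; i < padded; ++i) {
        const float v = i < n ? col[i] : std::numeric_limits<float>::infinity();
        buf[i] = KeyIdx{v, static_cast<std::uint32_t>(i)};
    }

    for (std::size_t k = 2; k <= padded; k *= 2) {      // size of the bitonic runs being merged
        for (std::size_t j = k / 2; j > 0; j /= 2) {    // compare-exchange distance
            for (std::size_t i = 0; i < padded; ++i) {  // one "thread" per element
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;             // each pair is handled once
                const bool ascending = (i & k) == 0;
                const bool out_of_order = ascending ? key_less(buf[partner], buf[i])
                                                    : key_less(buf[i], buf[partner]);
                if (out_of_order) std::swap(buf[i], buf[partner]);
            }
        }
    }

    std::vector<std::uint32_t> order(n);
    for (std::size_t i = 0; i < n; ++i) order[i] = buf[i].index;
    return order;
}
```

For example, `bitonic_argsort({3.0f, 1.0f, 2.0f})` returns the indices `1, 2, 0`.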

…OOR, TRUNC (ggml-org#17319)

* vulkan: initialize array
* vulkan: implement ADD1
* vulkan: implement ARANGE
* vulkan: implement FILL
* vulkan: implement SOFTPLUS
* vulkan: implement STEP
* vulkan: implement ROUND
* vulkan: implement CEIL
* vulkan: implement FLOOR
* vulkan: implement TRUNC
* docs: update Vulkan ops

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

haiyuewa and others added 11 commits November 18, 2025 18:58

ggml-cpu: Don't pass -mpowerpc64 when -mcpu already implies it (ggml-org#17308)

c49daff

vulkan: force full subgroups for flash attention to fix intel subgroup crash (ggml-org#17356)

980b7cd

Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition (ggml-org#17332)

6fd4f95

cuda: fix rope fusion for gemma3 (ggml-org#17378)

fd7353d

convert : use self.block_count everywhere instead of reading hparams (ggml-org#17359)

07b0e7a

vulkan: Add copy_transpose shader (ggml-org#17371)

2eba631

jan-service-account merged commit 8a822af into dev Nov 20, 2025
3 checks passed

jan-service-account deleted the update-dev-from-master-2025-11-20-00-35 branch November 20, 2025 00:36


12 participants
