@jeffbolznv (Collaborator)

Split N into chunks to fit into shared memory.
If K > 128, use a larger workgroup with enough invocations. Add perf tests matching qwen3next.

See #17751 (comment)
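
The chunking and workgroup-size choice described above might be sketched on the host side roughly as follows. This is only an illustration, not the code in this PR: `plan_chunks`, `ChunkPlan`, the 128-invocation default, the rounding to a multiple of 32, and the assumption that each column of N needs K floats of shared memory are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical plan: how many invocations to launch and how N is split.
struct ChunkPlan {
    uint32_t workgroup_size;  // invocations per workgroup
    uint32_t cols_per_chunk;  // columns of N handled per chunk
    std::vector<std::pair<uint32_t, uint32_t>> chunks;  // (first column, column count)
};

// Assumption: one column of N occupies K floats of shared memory.
inline ChunkPlan plan_chunks(uint32_t N, uint32_t K,
                             uint32_t shmem_bytes, uint32_t max_wg_size) {
    ChunkPlan plan{};

    // Keep a default workgroup for K <= 128; otherwise grow it so there is at
    // least one invocation per element of K (rounded up to a multiple of 32).
    const uint32_t default_wg = 128;  // assumed default, not taken from the PR
    const uint32_t wanted = K > default_wg ? ((K + 31) / 32) * 32 : default_wg;
    // A real implementation would have to reject or handle wanted > max_wg_size;
    // this sketch simply clamps.
    plan.workgroup_size = std::min(wanted, max_wg_size);

    // Largest number of columns whose data fits in the shared memory budget.
    const uint64_t bytes_per_col = uint64_t(K) * sizeof(float);
    const uint32_t fit = bytes_per_col ? uint32_t(shmem_bytes / bytes_per_col) : N;
    plan.cols_per_chunk = std::max<uint32_t>(1, std::min(fit, N));

    // Walk N in steps of cols_per_chunk; the last chunk may be shorter.
    for (uint32_t first = 0; first < N; first += plan.cols_per_chunk) {
        plan.chunks.emplace_back(first, std::min(plan.cols_per_chunk, N - first));
    }
    return plan;
}
```

For example, under these assumptions `plan_chunks(10, 256, 4096, 1024)` selects a 256-invocation workgroup and splits the 10 columns into chunks of 4, 4, and 2.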

@github-actions bot added the testing, Vulkan, and ggml labels on Dec 5, 2025
@0cc4m merged commit c6c5e85 into ggml-org:master on Dec 6, 2025
72 of 78 checks passed
JayZenith pushed a commit to JayZenith/llama.cpp that referenced this pull request Dec 7, 2025