vulkan: Update topk_moe fusion to handle gpt's late softmax #16656
Conversation
CC @am17an I've included the ggml_check_edges change in this PR.
I understand what this change is doing, but how do I test it? The topk_moe tests pass before and after this change. Which model architectures correspond to the three modes?
Usually I put a debug statement printing the number of nodes fused. We'll need to come up with a better way to assert that the nodes were actually fused.
I've added some logging in the latest commit that I use to verify fusion and the effects of graph_optimize. You can see the whole sequence of ops without a sync in between, which implies the fusion is working. Early softmax w/norm: qwen3
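For illustration, a minimal standalone sketch of the kind of "number of nodes fused" debug statement described above. The node/graph types and the fused flag here are hypothetical stand-ins, not ggml's actual structures or the logging added in this PR:

```cpp
// Hypothetical sketch: count and print how many graph nodes were consumed
// by fused kernels. Types and field names are illustrative only.
#include <cstdio>
#include <string>
#include <vector>

struct node {
    std::string op;     // op name, e.g. "SOFT_MAX", "ARGSORT"
    bool        fused;  // set by the backend when the node is folded into a fused kernel
};

struct graph {
    std::vector<node> nodes;
};

// Walk the graph after scheduling and report how many nodes were fused,
// so a log line or test can assert the expected count.
static int count_fused_nodes(const graph & g) {
    int n_fused = 0;
    for (const node & n : g.nodes) {
        if (n.fused) {
            n_fused++;
        }
    }
    std::fprintf(stderr, "fused %d of %zu nodes\n", n_fused, g.nodes.size());
    return n_fused;
}

int main() {
    graph g;
    g.nodes = {
        {"SOFT_MAX", true},
        {"ARGSORT",  true},
        {"VIEW",     true},
        {"GET_ROWS", false},
    };
    count_fused_nodes(g); // prints: fused 3 of 4 nodes
    return 0;
}
```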
I've rebased this and updated it to handle the clamp added in #16655.
LGTM
Are the non-Vulkan changes fine, @slaren?
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Based on #16649.