@jeffbolznv
Collaborator

Based on #16649.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner October 18, 2025 21:34
@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Oct 18, 2025
@jeffbolznv
Collaborator Author

CC @am17an I've included the ggml_check_edges change in this PR.

Collaborator

@0cc4m 0cc4m left a comment

I understand what this change is doing, but how do I test it? The topk_moe tests pass before and after this change. Which model architectures correspond to the three modes?

@am17an
Collaborator

am17an commented Oct 25, 2025

> I understand what this change is doing, but how do I test it? The topk_moe tests pass before and after this change. Which model architectures correspond to the three modes?

Usually I put in a debug statement that prints the number of nodes fused. We'll need to come up with a better way to assert that the nodes were actually fused.
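As a rough illustration of what "asserting that nodes were fused" could look like, here is a toy Python model, not ggml code. The op names, the `PATTERN` list, and the `count_fused` helper are all hypothetical and chosen only for the example; the real fusion patterns and graph structures live in the backend.

```python
def count_fused(ops, pattern):
    """Greedily match `pattern` against the op list.

    Returns (fused_groups, dispatches): how many times the pattern was
    collapsed into one dispatch, and how many dispatches remain in total.
    """
    fused = 0
    dispatches = 0
    i = 0
    n = len(pattern)
    while i < len(ops):
        if ops[i:i + n] == pattern:
            fused += 1   # the whole pattern becomes a single dispatch
            i += n
        else:
            i += 1       # unfused op: one dispatch on its own
        dispatches += 1
    return fused, dispatches


# Hypothetical pattern and graph, for illustration only.
PATTERN = ["SOFT_MAX", "ARGSORT", "GET_ROWS"]
graph = ["MUL_MAT", "SOFT_MAX", "ARGSORT", "GET_ROWS", "MUL_MAT_ID", "ADD"]

fused, dispatches = count_fused(graph, PATTERN)
assert fused == 1, "expected the routing pattern to be fused exactly once"
```

A test built on a counter like this fails when fusion silently stops happening, which a numerical comparison of outputs cannot detect because fused and unfused paths produce the same result.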

@jeffbolznv
Collaborator Author

> I understand what this change is doing, but how do I test it? The topk_moe tests pass before and after this change. Which model architectures correspond to the three modes?

I've added some logging in the latest commit that I use to verify fusion and the effects of graph_optimize. You can see the whole sequence of ops without a sync in between, which implies the fusion is working.

Early softmax w/norm: qwen3
Early softmax w/o norm: deepseek2
Late softmax: gpt-oss
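For readers unfamiliar with the three modes, here is a minimal plain-Python sketch of how I read their names. It is a reference model of the routing math only, not the Vulkan shader or the ggml graph; the function names and sample logits are made up for illustration, and the clamp mentioned later in this thread is not modeled.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(xs, k):
    # Indices of the k largest values, largest first.
    return sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)[:k]


def early_softmax(logits, k, normalize):
    # Softmax over all experts first, then pick the top k.
    probs = softmax(logits)
    ids = top_k_indices(probs, k)
    weights = [probs[i] for i in ids]
    if normalize:
        # "w/norm": rescale the selected weights so they sum to 1.
        total = sum(weights)
        weights = [w / total for w in weights]
    return ids, weights


def late_softmax(logits, k):
    # Pick the top k logits first, then softmax over just those k.
    ids = top_k_indices(logits, k)
    weights = softmax([logits[i] for i in ids])
    return ids, weights


logits = [0.5, 2.0, -1.0, 1.0]  # router logits for 4 hypothetical experts
ids_norm, w_norm = early_softmax(logits, 2, normalize=True)
ids_raw, w_raw = early_softmax(logits, 2, normalize=False)
ids_late, w_late = late_softmax(logits, 2)
```

All three variants select the same experts, since softmax preserves ordering. In exact arithmetic the early-softmax-with-norm and late-softmax weights are also equal, while the unnormalized variant yields weights that sum to less than 1 whenever some experts are left out.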

@jeffbolznv
Collaborator Author

I've rebased this and updated it to handle the clamp added in #16655.

@github-actions github-actions bot added the testing (Everything test related) label Oct 26, 2025
Collaborator

@0cc4m 0cc4m left a comment

LGTM

@0cc4m
Collaborator

0cc4m commented Oct 29, 2025

Are the non-Vulkan changes fine, @slaren?

Co-authored-by: Diego Devesa <slarengh@gmail.com>
@0cc4m 0cc4m merged commit 10fcc41 into ggml-org:master Oct 29, 2025
60 of 63 checks passed