Skip to content

Conversation

@jeffbolznv
Copy link
Collaborator

This allows some additional CPU/GPU overlap for large pp workloads. Also seems to help a bit for token gen, maybe getting rid of a small bubble between graph_compute and get_tensor.

Async set and copy functions seem to be very rarely used, so I didn't enable them because I didn't have a good way to test them.

The async commands need to be ordered against each other, so put them all on the compute queue. The non-async commands still use the transfer queue.

The fence for graph_compute/get_tensor_async is submitted and waited on in ggml_vk_synchronize.

See #17033 (comment).

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner November 10, 2025 23:53
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Nov 11, 2025
This allows some additional CPU/GPU overlap for large pp workloads. Also seems
to help a bit for token gen, maybe getting rid of a small bubble between
graph_compute and get_tensor.

Async set and copy functions seem to be very rarely used, so I didn't enable
them because I didn't have a good way to test them.

The async commands need to be ordered against each other, so put them all on
the compute queue. The non-async commands still use the transfer queue.

The fence for graph_compute/get_tensor_async is submitted and waited on in
ggml_vk_synchronize.
Copy link
Collaborator

@0cc4m 0cc4m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see any performance differences, but no issues either.

@0cc4m 0cc4m merged commit 38eaf32 into ggml-org:master Nov 15, 2025
71 checks passed
@h9j6k
Copy link

h9j6k commented Nov 16, 2025

I can't see any performance differences, but no issues either.

Commit 38eaf32 makes my intel DG1 vomit gibberish. Will you be able to have another look? Thanks.

@0cc4m
Copy link
Collaborator

0cc4m commented Nov 16, 2025

Are you sure it's not just #17106?

@h9j6k
Copy link

h9j6k commented Nov 16, 2025

I git pull'ed this morning then it went crazy on 416e7c7. So I rolled back all those vulkan related commits one by one until the one right before 38eaf32, it became ok. Then I fast forward'ed to 416e7c7 again and simply git revert 38eaf32, it was still ok. So I think 38eaf32 is the issue?

@jeffbolznv
Copy link
Collaborator Author

Please file an issue for this and share more details. What were you running? Does test-backend-ops pass?

@h9j6k
Copy link

h9j6k commented Nov 16, 2025

Please file an issue for this and share more details. What were you running? Does test-backend-ops pass?

Thanks. I filed an issue here #17302

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning Vulkan Issues specific to the Vulkan backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants