[ggml-webgpu] Handle buffer overlap / buffer aliasing for concat operator #24000

Merged
reeselevine merged 22 commits into ggml-org:master from nikhilJain17:nikhilJain17/concat-overlap
Jun 8, 2026

@nikhilJain17
Contributor

@nikhilJain17 nikhilJain17 commented Jun 2, 2026

Overview

While testing the WebGPU backend with stable-diffusion.cpp, I encountered the following error:

Device error! Reason: 2, Message: Writable storage buffer binding aliasing found between [BindGroup "concat_f32"] 
set at bind group index 0, binding index 0, and [BindGroup "concat_f32"] set at bind group index 0, binding index 1, 
with overlapping ranges (offset: 2924288, size: 1310720) and (offset: 2924288, size: 1310720) in [Buffer "tensor_buf2"].

WebGPU does not allow a writable storage binding to overlap another binding of the same buffer range within a single dispatch. #22456 worked around this by merging two overlapping src buffers into a single merged_src buffer. This diff applies the same technique to the concat shader.
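
For readers unfamiliar with the workaround, here is a minimal C++ sketch of the idea: detect when two bindings cover overlapping byte ranges of the same buffer, and compute the single range that a merged binding would need to cover. `BindingRange`, `ranges_overlap`, and `merge_ranges` are hypothetical names chosen for illustration; they are not taken from ggml-webgpu or from #22456, and the real backend may structure this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one bind group entry: which buffer, which bytes.
struct BindingRange {
    int      buffer_id;  // illustrative identifier for the underlying GPU buffer
    uint64_t offset;     // byte offset of the binding within the buffer
    uint64_t size;       // byte length of the binding
};

// True when both bindings refer to the same buffer and share at least one byte.
bool ranges_overlap(const BindingRange & a, const BindingRange & b) {
    return a.buffer_id == b.buffer_id &&
           a.offset < b.offset + b.size &&
           b.offset < a.offset + a.size;
}

// Smallest single range that covers both bindings of the same buffer.
// A shader using the merged binding would address each tensor by its own
// offset relative to the start of this range.
BindingRange merge_ranges(const BindingRange & a, const BindingRange & b) {
    assert(a.buffer_id == b.buffer_id);
    const uint64_t begin = std::min(a.offset, b.offset);
    const uint64_t end   = std::max(a.offset + a.size, b.offset + b.size);
    return { a.buffer_id, begin, end - begin };
}
```

With the two ranges from the error above (offset 2924288, size 1310720 for both bindings), `ranges_overlap` returns true and `merge_ranges` yields that same offset and size, so one binding can stand in for both.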

cc @yomaytk

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: Yes, to root cause the issue and fix a git issue.

@github-actions github-actions Bot added the ggml and WebGPU labels Jun 2, 2026
@nikhilJain17 nikhilJain17 marked this pull request as ready for review June 2, 2026 04:50
@nikhilJain17 nikhilJain17 requested a review from a team as a code owner June 2, 2026 04:50
@yomaytk
Contributor

yomaytk commented Jun 2, 2026

Hi @nikhilJain17, thanks for working on this, and it all looks great! LGTM

Comment thread ggml/src/ggml-webgpu/wgsl-shaders/concat.wgsl Outdated
@reeselevine reeselevine requested a review from CISC June 5, 2026 17:45
shader_lib_ctx.src1 = nullptr;
shader_lib_ctx.dst = dst;
shader_lib_ctx.max_wg_size = ctx->global_ctx->capabilities.limits.maxComputeInvocationsPerWorkgroup;
shader_lib_ctx.max_wg_size = ctx->global_ctx->capabilities.limits.maxComputeInvocationsPerWorkgroup;
Member

Déjà vu. :)

@reeselevine reeselevine merged commit 1705d43 into ggml-org:master Jun 8, 2026
28 of 31 checks passed

4 participants