[WebGPU] LinearAttention: increase tile_v when subgroups are available#28519

Draft
daijh wants to merge 2 commits into microsoft:main from daijh:linear-attn-dev

@daijh daijh commented May 15, 2026

Description

  • Scale tile_v by 4x when subgroups are enabled and the vectorized dimension has enough columns, which improves data reuse.
  • Remove the redundant zero-initialization of the state tile (WGSL zero-initializes variables declared without an initializer).
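
As a rough illustration of the first point, the tile selection could look like the sketch below. This is not taken from the diff: `SelectTileV`, `kSubgroupTileScale`, and the parameter names are hypothetical, and the "enough columns" condition is assumed here to mean that the scaled tile still fits within the vectorized dimension. The actual threshold in the PR may differ.

```cpp
#include <cstdint>

// Hypothetical constant: the "4x" scale factor named in the description.
constexpr uint32_t kSubgroupTileScale = 4;

// Hypothetical helper, not the PR's actual code.
// base_tile_v:     tile width (in vectorized columns) used without subgroups.
// has_subgroups:   whether the device exposes the subgroups feature.
// vectorized_cols: number of columns in the vectorized dimension.
constexpr uint32_t SelectTileV(uint32_t base_tile_v,
                               bool has_subgroups,
                               uint32_t vectorized_cols) {
  const uint32_t scaled = base_tile_v * kSubgroupTileScale;
  // Assumption: "enough columns" means the scaled tile fits in the
  // vectorized dimension; otherwise fall back to the base tile.
  if (has_subgroups && vectorized_cols >= scaled) {
    return scaled;
  }
  return base_tile_v;
}
```

Under these assumptions, a base tile of 4 with subgroups and 64 vectorized columns would select a tile of 16, while the same configuration with only 8 columns, or without subgroups, would keep the base tile of 4.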

TODO: add performance numbers.

Motivation and Context

See above.

@daijh daijh marked this pull request as draft May 15, 2026 03:44