[WebGPU] LinearAttention: increase tile_v when subgroups are available by daijh · Pull Request #28519 · microsoft/onnxruntime

daijh · 2026-05-15T03:43:56Z

Description

Scale tile_v by 4x when subgroups are enabled and the vectorized dimension has enough columns, improving data reuse.
Remove the redundant zero-initialization of the state tile (WGSL default-initializes vars to zero).
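
For illustration only, the tile selection described above could be sketched as follows. This is not the code from this PR: the helper name `ChooseTileV`, its parameters, and the exact "enough columns" test (here assumed to mean at least one full scaled tile) are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical scale factor; the PR description states a 4x increase.
constexpr uint32_t kSubgroupTileScale = 4;

// Hypothetical helper (not from the PR): returns the V-tile width to use.
// Assumption: "enough columns" means the vectorized dimension can fill
// at least one tile of the scaled width.
constexpr uint32_t ChooseTileV(uint32_t base_tile_v,
                               uint32_t vectorized_cols,
                               bool has_subgroups) {
  const uint32_t scaled = base_tile_v * kSubgroupTileScale;
  // Widen the tile only when subgroups are usable and the wider tile
  // would be fully populated; otherwise keep the base width.
  if (has_subgroups && vectorized_cols >= scaled) {
    return scaled;
  }
  return base_tile_v;
}

// Compile-time usage check under the assumptions above.
static_assert(ChooseTileV(4, 64, true) == 16, "wide tile when subgroups fit");
static_assert(ChooseTileV(4, 64, false) == 4, "base tile without subgroups");
```

A wider tile lets each loaded state value be reused across more columns, which is the data-reuse benefit the description refers to; the fallback keeps the base width for small shapes.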

TODO: add performance numbers.

Motivation and Context

See above.

- Scale tile_v by 4x when subgroup is enabled and the vectorized dimension has enough columns, improving data reuse.
- Remove redundant zero-initialization of the state tile (WGSL default-initializes private vars to zero).

daijh marked this pull request as draft May 15, 2026 03:44

Add comments

58a09c7

daijh wants to merge 2 commits into microsoft:main from daijh:linear-attn-dev
