
[webgpu]: optimize pool operators #24598


Merged (1 commit, May 6, 2025)

Conversation

@xhcao (Contributor) commented Apr 30, 2025

Description

The patch optimizes pool operators when the output size is small and the kernel size is big.

Motivation and Context

The patch addresses a case found in a user's model where pooling has a small output and a large kernel.
@xhcao (Contributor, Author) commented Apr 30, 2025

The issue comes from a user's model. If the input data shape is [1, 64, 128, 128] (NCHW), the original code uses one work group for all 64 output elements. The kernel size is 128 × 128, so each output element requires a 128 × 128 = 16384-iteration loop.
With this patch, when the output size is small, one work group is responsible for a single output element, so each invocation only loops about 128 times.
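For illustration, the workgroup-per-output strategy described above can be sketched in Python (this is a hypothetical model of the reduction, not the actual WGSL shader; the workgroup size of 128 threads is an assumption):

```python
# Hypothetical model of the shader strategy: one "workgroup" of 128
# threads handles a single output element. Each thread sums a strided
# slice of the kernel window, then a tree reduction combines the
# partial sums, mimicking a shared-memory reduction.

WORKGROUP_SIZE = 128  # assumed thread count per workgroup


def pooled_sum(window):
    """Sum `window` (a flat list of kernel-window values) the way a
    workgroup-per-output reduction would."""
    # Phase 1: thread t accumulates elements t, t+128, t+256, ...
    # For a 128*128 window, each thread loops only 128 times.
    partial = [sum(window[t::WORKGROUP_SIZE]) for t in range(WORKGROUP_SIZE)]

    # Phase 2: log2(128) = 7 tree-reduction steps over the partial sums.
    stride = WORKGROUP_SIZE // 2
    while stride > 0:
        for t in range(stride):
            partial[t] += partial[t + stride]
        stride //= 2
    return partial[0]


window = list(range(128 * 128))  # one 128x128 kernel window
assert pooled_sum(window) == sum(window)
```

Dividing the result by the window size would give the mean, which is why the same reduction serves AveragePool-style operators as well.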
@jchen10 PTAL

@guschmue added the ep:WebGPU ort-web webgpu provider label Apr 30, 2025
@guschmue (Contributor) commented May 6, 2025

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

@xhcao (Contributor, Author) commented May 6, 2025

#23614

@guschmue (Contributor) commented May 6, 2025

This CI error is unrelated to this PR; a fix is incoming.

@guschmue merged commit cdff2c1 into microsoft:main May 6, 2025
81 checks passed
@vadimkantorov commented May 9, 2025

Does this PR also introduce a shortcut when the input size is equal to the kernel size, i.e. the same code path as simply taking a mean?

@xhcao (Contributor, Author) commented May 12, 2025

> does this PR also introduce a shortcut when the input size is equal to kernel size - same code path as simply taking a mean?

Hi @vadimkantorov, the path is taken when

`bool are_small_output_big_kernel = output_size <= 128 && kernel_size >= 128;`

holds, i.e. when the kernel size is large and the output size is small.

> same code path as simply taking a mean?

Do you mean AveragePool and GlobalAveragePool? The path also applies to these operators.
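To make the condition concrete, here is a hypothetical Python helper mirroring the check quoted above (the real check lives in the C++ WebGPU pool implementation), applied to the [1, 64, 128, 128] example from earlier in the thread:

```python
# Hypothetical helper mirroring the quoted condition; the 128/128
# thresholds are taken directly from the PR's check.
def are_small_output_big_kernel(output_size: int, kernel_size: int) -> bool:
    return output_size <= 128 and kernel_size >= 128


# Input [1, 64, 128, 128] (NCHW) under a global pool:
# output has 1 * 64 * 1 * 1 = 64 elements, kernel window is 128 * 128.
assert are_small_output_big_kernel(64, 128 * 128)
```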

@vadimkantorov commented May 12, 2025

Yes, I mean global pooling, where the output size is strictly 1x1, while the input size (matching kernel_size, as in #23614) can be 100 or anything else.

I guess output_size = 1, input_size = kernel_size = 100 would not pass this check, right?

@xhcao (Contributor, Author) commented May 12, 2025

> I guess output_size = 1, input_size = kernel_size = 100 would not pass this check, right?

In #23614, the input shape is [1, 56, 80, 128] and the kernel shape is [80, 128], so the output size is 56 (≤ 128) and the kernel size is 80 * 128 = 10240 (≥ 128), which passes this check.
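The shape arithmetic in this exchange can be checked directly (a small illustrative snippet; the variable names are ours, not from the PR):

```python
# Shapes from #23614: input [1, 56, 80, 128], kernel [80, 128].
output_size = 1 * 56 * 1 * 1   # 56 output elements
kernel_size = 80 * 128         # 10240 elements per kernel window
assert output_size <= 128 and kernel_size >= 128  # passes the check

# The hypothetical global-pool case raised above: output_size = 1
# with a kernel of only 100 elements fails on the kernel-size side.
assert not (1 <= 128 and 100 >= 128)
```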
