[webgpu]: optimize pool operators #24598
Conversation
This patch optimizes the pool operators for the case where the output size is small and the kernel size is big. The issue comes from a user's model.
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
This CI error is unrelated to this PR - fix incoming.
Does this PR also introduce a shortcut when the input size is equal to the kernel size - i.e. the same code path as simply taking a mean?
Hi, @vadimkantorov. The path is taken when the output size is small and the kernel size is big. Regarding "same code path as simply taking a mean" - do you mean AveragePool and GlobalAveragePool? The path is also valid for those operators.
Yes, I mean global pooling, where output_size is strictly 1x1 while input_size (matching kernel_size, as in #23614) can be 100 or anything else. I guess output_size = 1, input_size = kernel_size = 100 would not pass this check, right?
In #23614, the input shape is [1, 56, 80, 128] and the kernel shape is [80, 128], so the output shape size is 56 (< 128) and the kernel size is 80 * 128 = 10240 (> 128), which passes this check.