Fill GlobalAveragePool and GlobalMaxPool opset gap in CUDA provider (1→22) #27733
…1→22)
Update kernel registrations for GlobalAveragePool and GlobalMaxPool in the CUDA execution provider from opset 1 (only) to versioned 1-21 plus opset 22.
Changes:
- onnxruntime/core/providers/cuda/nn/pool.cc: versioned 1-21 + opset 22 kernels
- onnxruntime/core/providers/cuda/cuda_execution_provider.cc: class decls + BuildKernelCreateInfo
- onnxruntime/core/providers/cuda/cuda_nhwc_kernels.cc: NHWC class decls + BuildKernelCreateInfo
- onnxruntime/test/providers/cpu/nn/pool_op_test.cc: add GlobalAveragePool_22_CUDA test
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
/azp run Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
@copilot, please update docs/OperatorKernels.md. You can download it from https://aiinfra.visualstudio.com/_apis/resources/Containers/34055151/ContribOperators.md?itemPath=ContribOperators.md%2FContribOperators.md
…l opset gap fill Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Updated
/azp run Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
…averagepool-opset
  test.Run(OpTester::ExpectResult::kExpectSuccess, "", {});
}

TEST(PoolTest, GlobalAveragePool_22_CUDA) {
Just wondering, should this be 21?
It is for 22, which is the new opset for the op.
Description
Extends CUDA kernel registrations for GlobalAveragePool and GlobalMaxPool from opset 1 only to the full opset 1–22 range. Follows the same pattern used for MaxPool in #27715.
- core/providers/cuda/nn/pool.cc: Split single opset-1 registrations into versioned 1–21 + opset 22 for both NCHW and NHWC variants
- core/providers/cuda/cuda_execution_provider.cc: Updated class declarations and BuildKernelCreateInfo entries (versioned 1–21, added opset 22)
- core/providers/cuda/cuda_nhwc_kernels.cc: Same for NHWC kernel registrations
- test/providers/cpu/nn/pool_op_test.cc: Added GlobalAveragePool_22_CUDA test
- docs/OperatorKernels.md: Updated GlobalAveragePool and GlobalMaxPool entries from 1+ to 22+/[1, 21] in both the ai.onnx and com.microsoft.internal.nhwc domains under CUDAExecutionProvider

No functional changes to the kernel implementations; opsets 1 through 22 are spec-compatible for these ops.
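To illustrate why splitting a single open-ended registration into a versioned 1–21 entry plus a new opset-22 entry closes the gap, here is a minimal, self-contained sketch of version-range kernel lookup. It is not ONNX Runtime code: `KernelEntry`, `Matches`, and `FindKernel` are hypothetical names, and the matching rule (an open-ended registration matches only a node whose schema version equals its start version) is an assumption based on how the gap is described in this PR.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Hypothetical stand-in for one kernel registration (not the ORT type).
struct KernelEntry {
  std::string op;
  int start;  // first opset the kernel is registered for
  int end;    // last opset, or INT_MAX for an open-ended ("N+") entry
};

// Assumed rule: an open-ended entry matches only the exact schema version
// it was written against; a bounded entry matches anything in [start, end].
bool Matches(const KernelEntry& k, const std::string& op, int since_version) {
  if (k.op != op) return false;
  if (k.end == INT_MAX) return k.start == since_version;
  return k.start <= since_version && since_version <= k.end;
}

// Returns the index of the first matching entry, or -1 if none matches.
int FindKernel(const std::vector<KernelEntry>& registry,
               const std::string& op, int since_version) {
  for (int i = 0; i < static_cast<int>(registry.size()); ++i) {
    if (Matches(registry[i], op, since_version)) return i;
  }
  return -1;
}
```

Under these assumptions, a registry holding only `{"GlobalAveragePool", 1, INT_MAX}` finds no kernel for a node at schema version 22, while a registry holding `{"GlobalAveragePool", 1, 21}` and `{"GlobalAveragePool", 22, INT_MAX}` resolves both version 1 and version 22.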
Motivation and Context
GlobalAveragePool and GlobalMaxPool were registered at opset 1 only in the CUDA provider, creating a 21-version gap to the latest ONNX opset 22. Models exported at higher opsets would fail to find a matching CUDA kernel. Identified as P1 gaps in #27729.
Limitations
BF16 support for GlobalAveragePool-22 and GlobalMaxPool-22 is not added in this PR.