tianleiwu added a commit that referenced this pull request on Mar 20, 2026.
### Description

Extends GRU CUDA kernel registration from opset 14 to opset 22, following the same pattern as other recent opset gap fills (e.g., ConvTranspose in #27710).

- **`gru.cc`**: Cap the existing opset-14 non-versioned kernel to versioned 14–21; add a new non-versioned kernel at opset 22+.
- **`cuda_execution_provider.cc`**: Update forward declarations and `BuildKernelCreateInfo` entries for versioned 14–21 and non-versioned 22+.
- **`deep_cpu_gru_op_test.cc`**: Add a CUDA-specific test for GRU at opset 22 with `linear_before_reset=1` (cuDNN requirement).
- **`docs/OperatorKernels.md`**: Update the CUDA provider GRU entry to reflect the `22+`, `[14, 21]`, and `[7, 13]` version ranges.

No functional changes to the kernel implementation: the GRU spec is unchanged between opsets 14 and 22.

### Motivation and Context

The CUDA EP registered GRU only up to opset 14, while ONNX defines GRU through opset 22. Models exported at opset ≥15 would fail to find a matching CUDA kernel and fall back to CPU. This is one of the P1 gaps tracked in #27729.

### Limitation

A BF16 version is not added for GRU-22. It can be added later if needed.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description
This PR extends the CUDA `ConvTranspose` operator registration to support ONNX opset 22. The CUDA implementation already shares the same attribute and output-shape handling used by earlier supported opsets, so this change primarily exposes the existing kernel path for opset 22 and adds regression coverage to keep that path working.

### Summary of Changes
#### CUDA Kernel Registration
- `onnxruntime/core/providers/cuda/nn/conv_transpose.cc`: split the `ConvTranspose` kernel registration into `1-10`, `11-21`, and `22` so the CUDA kernel can be selected for opset 22.
- `onnxruntime/core/providers/cuda/cuda_execution_provider.cc`: register `ConvTranspose(22)` on CUDA.
- `onnxruntime/core/providers/cuda/cuda_nhwc_kernels.cc`: register the NHWC `ConvTranspose(22)`.

#### Test Coverage
- `onnxruntime/test/providers/cpu/nn/conv_transpose_op_test.cc`: opset-22 coverage for `ConvTranspose` with `output_shape`.
- `onnxruntime/test/providers/cuda/nhwc/conv_transpose_test.cc`: NHWC test for `ConvTranspose` at opset 22.

### Testing
- `ninja` object builds:
  - `onnxruntime/core/providers/cuda/cuda_execution_provider.cc`
  - `onnxruntime/core/providers/cuda/cuda_nhwc_kernels.cc`
  - `onnxruntime/core/providers/cuda/nn/conv_transpose.cc`
  - `onnxruntime/test/providers/cpu/nn/conv_transpose_op_test.cc`
  - `onnxruntime/test/providers/cuda/nhwc/conv_transpose_test.cc`
- `git diff --check` to confirm the patch is formatting-clean.
- The `onnxruntime_test_all` gtest runtime pass was not completed locally because the build tree regenerated and expanded into a broad rebuild; runtime verification of the new CUDA and NHWC tests should still be run in CI or a focused local test build.

### Motivation and Context
Related issue: #26393
The ONNX `ConvTranspose-22` schema keeps the same core padding and output-shape semantics used by the earlier CUDA-supported opsets, but CUDA registration stopped at opset 11. That meant models using `ConvTranspose` at opset 22 could miss CUDA kernel assignment even though the implementation path was already compatible.

This PR closes that gap by updating kernel registration and test coverage without changing the underlying CUDA compute logic.
### Checklist