@tianleiwu tianleiwu commented Feb 6, 2026

Description

A user reported a build error in #27269.

This PR addresses several build issues and compilation warnings in the CUDA provider and associated contrib ops. These fixes ensure a clean build and improved compatibility with different CUDA versions (specifically CUDA 13.1) and compilers.

Changes

1. Fix ShardedMoE Compilation Error

  • Resolved a "no matching function for call to CheckInputs" error in sharded_moe.cc.
  • Updated the moe_helper::CheckInputs call to provide the required zero_points arguments (passing nullptr), aligning with the updated function signature.
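The kind of fix described above can be sketched as follows. This is only an illustration: `Tensor`, `CheckInputsSketch`, `ValidateWithoutZeroPoints`, and the parameter names are hypothetical stand-ins, not the actual `moe_helper::CheckInputs` signature. It shows how a helper that gained optional pointer parameters stops matching an older call site, and how passing `nullptr` for the inputs the caller does not have restores the match.

```cpp
#include <cassert>

// Hypothetical stand-in for a tensor type; not the onnxruntime Tensor class.
struct Tensor {
  int rank;
};

// Hypothetical helper whose signature gained two optional zero-point inputs.
// A call site that still passes only (weights, scales) no longer matches it.
inline bool CheckInputsSketch(const Tensor* weights,
                              const Tensor* scales,
                              const Tensor* fc1_zero_points,
                              const Tensor* fc2_zero_points) {
  if (weights == nullptr) return false;
  // Optional inputs may be absent; validate them only when provided.
  if (scales != nullptr && scales->rank != weights->rank) return false;
  if (fc1_zero_points != nullptr && fc1_zero_points->rank != weights->rank) return false;
  if (fc2_zero_points != nullptr && fc2_zero_points->rank != weights->rank) return false;
  return true;
}

// Updated call site: this caller has no zero points, so it passes nullptr
// explicitly for each new parameter.
inline bool ValidateWithoutZeroPoints(const Tensor* weights, const Tensor* scales) {
  assert(weights != nullptr);
  return CheckInputsSketch(weights, scales,
                           /*fc1_zero_points=*/nullptr,
                           /*fc2_zero_points=*/nullptr);
}
```

Naming the argument in a comment at the call site keeps the intent readable when several consecutive parameters are `nullptr`.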

2. Suppress CUDA 13.1 System Header Warnings

  • Added GCC/Clang diagnostic pragmas to suppress -Wunused-parameter warnings that originate in the CUDA system header cuda_fp4.h.
  • These warnings were causing build failures in environments where warnings are treated as errors.
  • Affected files:
    • onnxruntime/core/providers/cuda/cuda_common.h
    • onnxruntime/core/providers/cuda/cuda_type_conversion.h
    • onnxruntime/contrib_ops/cuda/llm/cutlass_type_conversion.h
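A minimal sketch of the general technique, assuming the pragmas wrap the include of the offending header; the exact guards used in the files above may differ. To keep the example self-contained, the function `StandInConvert` (a hypothetical name) stands in for the header contents that would normally come from `#include <cuda_fp4.h>`, and `IsPositive` is a hypothetical caller.

```cpp
#include <cassert>
#include <cstdint>

// Save the current diagnostic state and silence -Wunused-parameter only for
// the region that pulls in third-party code. Clang also honors the
// "GCC diagnostic" pragma spelling.
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
#endif

// Stand-in for the header contents (in a real build: #include <cuda_fp4.h>).
// The unused parameter would trigger -Wunused-parameter without the pragma.
inline std::uint8_t StandInConvert(float value, int rounding_mode) {
  return static_cast<std::uint8_t>(value > 0.0f ? 1 : 0);
}

// Restore the saved state so first-party code keeps the warning enabled.
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#endif

// Code after the pop is compiled with the original warning settings again.
inline std::uint8_t IsPositive(float value) {
  assert(value == value);  // reject NaN in debug builds
  return StandInConvert(value, /*rounding_mode=*/0);
}
```

Scoping the suppression with push and pop keeps `-Werror` effective for the project's own code while tolerating warnings in headers it cannot change.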

3. Resolve Sign-Comparison Warnings

  • Fixed several -Wsign-compare warnings that were being treated as errors:
    • Pad Op: Changed loop variable type to size_t in onnxruntime/core/providers/cuda/tensor/pad.cc.
    • Distributed Reshape: Added explicit casts to size_t for int64_t comparisons in onnxruntime/contrib_ops/cuda/collective/distributed_reshape.cc.
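The two patterns above can be sketched as follows. The functions `PaddedDims` and `RankMatches` are hypothetical illustrations, not the code in pad.cc or distributed_reshape.cc. The first uses a `size_t` loop index so the comparison against `size()` is unsigned on both sides; the second checks the sign of an `int64_t` before casting it to `size_t` for the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pad-style loop: the index is size_t, matching the type of vector::size(),
// so `i < dims.size()` no longer mixes signed and unsigned operands.
inline std::vector<std::int64_t> PaddedDims(const std::vector<std::int64_t>& dims,
                                            const std::vector<std::int64_t>& pads) {
  // Assumed layout: [begin_0 .. begin_{n-1}, end_0 .. end_{n-1}].
  assert(pads.size() == 2 * dims.size());
  std::vector<std::int64_t> out(dims.size());
  for (std::size_t i = 0; i < dims.size(); ++i) {
    out[i] = dims[i] + pads[i] + pads[i + dims.size()];
  }
  return out;
}

// Signed value compared with an unsigned size: rule out negatives first,
// then cast explicitly so the comparison is between two size_t values.
inline bool RankMatches(std::int64_t expected_rank,
                        const std::vector<std::int64_t>& dims) {
  if (expected_rank < 0) return false;
  return static_cast<std::size_t>(expected_rank) == dims.size();
}
```

The sign check matters because casting a negative `int64_t` to `size_t` yields a very large value, which would silence the warning while hiding a logic error.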

Verification

  • The build now completes without errors or warnings when configured with --cmake_extra_defines onnxruntime_USE_NCCL=ON.
  • Builds were tested with CUDA 12.8, 13.0, and 13.1.1.

@tianleiwu tianleiwu merged commit a3749f1 into main Feb 9, 2026
101 of 102 checks passed
@tianleiwu tianleiwu deleted the tlwu/fix_nccl_build_errors branch February 9, 2026 00:52
tianleiwu added a commit that referenced this pull request Feb 12, 2026
tianleiwu added a commit that referenced this pull request Feb 13, 2026
This cherry-picks the following commits for the 1.24.2 release:
- #27096
- #27077
- #26677
- #27238
- #27213
- #27256
- #27278
- #27275
- #27276
- #27216
- #27271
- #27299
- #27294
- #27266
- #27176
- #27126
- #27252

---------

Co-authored-by: Xiaofei Han <xiaofeihan@microsoft.com>
Co-authored-by: Jiajia Qin <jiajiaqin@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: qti-monumeen <monumeen@qti.qualcomm.com>
Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: guschmue <22941064+guschmue@users.noreply.github.com>
Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: angelser <32746004+angelser@users.noreply.github.com>
Co-authored-by: Angela Serrano Brummett <angelser@microsoft.com>
Co-authored-by: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
Co-authored-by: hariharans29 <9969784+hariharans29@users.noreply.github.com>
Co-authored-by: eserscor <erscor@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Ti-Tai Wang <titaiwang@microsoft.com>
Co-authored-by: bmehta001 <bmehta001@users.noreply.github.com>