[Build] Build fails: 'error : no operator "+=" matches these operands' with nv_bfloat16 #25162

Open
@AndreyOrb

Describe the issue

The build fails on the latest version (SHA-1: 3a47bd2).

The compiler reports that no operator "+=" matches these operand combinations:
nv_bfloat16 += nv_bfloat16
nv_bfloat16 += const nv_bfloat16
nv_bfloat16 += float

Urgency

Urgent: the build fails with CUDA 11.8.

Target platform

Windows

Build script

.\build.bat --config Debug --build_shared_lib --parallel --use_cuda --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8" --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\cuDNN\cudnn-windows-x86_64-8.9.0.131_cuda11-archive" --cuda_version 11.8 --use_tensorrt --tensorrt_home "C:\Program Files\NVIDIA GPU Computing Toolkit\TensorRT\TensorRT-10.9.0.34.Windows.win10.cuda-11.8\TensorRT-10.9.0.34" --use_tensorrt_oss_parser --cmake_generator "Visual Studio 16 2019" --compile_no_warning_as_error --cmake_path E:\3rdParties\cmake-4.0.3\build\bin\Release\cmake.exe --skip_tests --enable_cuda_line_info --use_mimalloc

The workaround of building without contrib ops (--disable_contrib_ops) does not work in my case, because the TensorRT execution provider requires contrib ops to be enabled:

CMake Error at onnxruntime_providers_tensorrt.cmake:4 (message):
  To compile TensorRT execution provider contrib ops have to be enabled to
  dump an engine using com.microsoft:EPContext node.
Call Stack (most recent call first):
  onnxruntime_providers.cmake:132 (include)
  CMakeLists.txt:1890 (include)

Error / output

E:\3rdParties\onnxruntime_v1.22.0\onnxruntime\contrib_ops\cuda\bert\skip_layer_norm_impl.cu(167): error : no operator "+=" matches these operands [E:\3rdParties\onnxruntime_v1.22.0\build\Windows\Debug\onnxruntime_providers_cuda.vcxproj]
operand types are: nv_bfloat16 += nv_bfloat16
detected during:
instantiation of "void onnxruntime::contrib::cuda::SkipLayerNormKernelSmall<T,TPB,ILP,Simplified>(T *, T *, const T *, const T *, const T *, const T *, const T *, T, int, int) [with T=nv_bfloat16, TPB=32U, ILP=4, Simplified=true]"
(240): here
instantiation of "void onnxruntime::contrib::cuda::LaunchSkipLayerNormKernel<T,Simplified>(cudaStream_t, T *, T *, const T *, const T *, const T *, const T *, const T *, float, int, int, int) [with T=nv_bfloat16, Simplified=true]"
(272): here
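
For anyone hitting the same error, here is a minimal host-only C++ sketch of one possible local workaround: route the accumulation through float when the element type has no "+=". This is an illustration, not a confirmed fix and not the onnxruntime code. It rests on an unverified assumption that the nv_bfloat16 arithmetic operators are simply not visible in this CUDA 11.8 build configuration. Bf16, FloatToBf16, Bf16ToFloat, AddAssign and SumValues are hypothetical names invented for the example; in real CUDA code the conversions would presumably be __bfloat162float and __float2bfloat16 from cuda_bf16.h.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for nv_bfloat16: 16 stored bits and, on purpose,
// no arithmetic operators, so `a += b` would not compile for this type.
struct Bf16 {
  std::uint16_t bits;
};

// Convert by keeping the upper 16 bits of the float (plain truncation,
// chosen for brevity; a real conversion would normally round).
inline Bf16 FloatToBf16(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return Bf16{static_cast<std::uint16_t>(u >> 16)};
}

// Widen back to float by placing the 16 stored bits in the upper half.
inline float Bf16ToFloat(Bf16 h) {
  std::uint32_t u = static_cast<std::uint32_t>(h.bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}

// Generic path: types that have a native += use it directly.
template <typename T>
inline void AddAssign(T& acc, const T& v) {
  acc += v;
}

// Overload for the operator-less type: add in float, then convert back.
// Being a non-template exact match, it is preferred over the template,
// so the template body with += is never instantiated for Bf16.
inline void AddAssign(Bf16& acc, const Bf16& v) {
  acc = FloatToBf16(Bf16ToFloat(acc) + Bf16ToFloat(v));
}

// Example accumulation loop written against AddAssign instead of +=.
template <typename T>
inline T SumValues(const T* data, int n, T init) {
  assert(n >= 0);
  T acc = init;
  for (int i = 0; i < n; ++i) {
    AddAssign(acc, data[i]);
  }
  return acc;
}
```

With this shape, a templated loop keeps using "+=" for float and similar types, and only the operator-less type takes the float round trip. Whether an equivalent change is appropriate at skip_layer_norm_impl.cu(167) is for the maintainers to judge.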

Visual Studio Version

VS 16 2019

GCC / Compiler Version

No response

Labels

build (build issues; typically submitted using template)