[Build] can't build CUDA (+ vino and directML) for latest v1.22 on windows #25081

Open
@SimonRelu

Describe the issue

I can't build CUDA for v1.22. The command I use is listed under "Build script" below.

I'm using CUDA 12.9 with cuDNN 9.10.

I'm also using OpenVINO 2025.1.0.18503.

Python 3.11.

It seems that some of the build arguments passed to nvcc are not valid and cause it to fail with: nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified
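
One way to narrow this down is to list the tokens in a failing nvcc command that are neither options nor option values, since nvcc treats each such token as an input file. The Python sketch below is only a diagnostic aid and not part of the ONNX Runtime build. `VALUE_OPTIONS` is an assumed, incomplete list of value-taking options copied from the command lines in the log, and `SAMPLE` is a shortened, hypothetical command with made-up paths. In the log, `default-stream-launch` appears without a leading dash right after `--expt-relaxed-constexpr`, so it may be what nvcc counts as a second input file.

```python
# Options from the logged nvcc command that consume the following token as
# their value. Assumption: this list is incomplete and only covers the log.
VALUE_OPTIONS = {
    "-ccbin", "-x", "--keep-dir", "--machine", "-cudart",
    "-Xcudafe", "-Xcompiler", "--threads", "-o",
}


def split_command(command):
    """Split on whitespace, keeping double-quoted spans (Windows paths) whole."""
    tokens, current, in_quotes = [], [], False
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\" and i + 1 < len(command) and command[i + 1] == '"':
            current.append('\\"')  # escaped quote, does not open or close a span
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch.isspace() and not in_quotes:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        tokens.append("".join(current))
    return tokens


def positional_arguments(command):
    """Return tokens that are neither options nor values of VALUE_OPTIONS."""
    positionals = []
    skip_next = False
    for token in split_command(command)[1:]:  # drop the nvcc executable
        if skip_next:
            skip_next = False
        elif token in VALUE_OPTIONS:
            skip_next = True
        elif not token.startswith("-"):
            positionals.append(token)
    return positionals


# Hypothetical, shortened command modelled on the log; paths are made up.
SAMPLE = (
    r'"C:\CUDA\bin\nvcc.exe" -x cu --compile --expt-relaxed-constexpr '
    r'default-stream-launch -std=c++17 -Xcudafe --diag_suppress=221 '
    r'-o out.obj "C:\src\unfold_impl.cu"'
)

if __name__ == "__main__":
    for token in positional_arguments(SAMPLE):
        print(token)
```

If more than one token is reported for a `--compile` invocation, the extra ones are candidates for the argument that triggers the error.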

Urgency

No response

Target platform

windows

Build script

$ProgressPreference = 'SilentlyContinue'
wget "https://storage.openvinotoolkit.org/repositories/openvino/packages/2025.1/windows/openvino_toolkit_windows_2025.1.0.18503.6fec06580ab_x86_64.zip" -o openvino.zip
Expand-Archive openvino.zip
.\openvino\openvino_toolkit_windows_2025.1.0.18503.6fec06580ab_x86_64\setupvars.bat

git clone https://github.com/Microsoft/onnxruntime.git
Set-Location onnxruntime
git checkout v1.22.0
git submodule update --init --recursive

py -m venv venv
.\venv\Scripts\activate
python -m pip install cmake==3.28.4

$env:CUDA_HOME="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9"
$env:CUDNN_HOME="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9"
$env:CudaToolkitDir="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9"
$env:CUDA_BIN_PATH="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9"
$env:OpenVINO_DIR="C:<LOCAL_PATH>\openvino\openvino_toolkit_windows_2025.1.0.18503.6fec06580ab_x86_64\runtime\cmake"

.\build.bat --config Release `
           --build_shared_lib `
           --cmake_generator="Visual Studio 17 2022" `
           --parallel `
           --use_cuda `
           --use_openvino "GPU" `
           --build_wheel `
           --enable_pybind `
           --use_dml `
           --disable_ml_ops `
           --disable_contrib_ops `
           --skip_tests `
           --skip_onnx_tests `
           --cmake_extra_defines 'onnxruntime_BUILD_UNIT_TESTS=OFF CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;86;89;120'
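
For reading the output below: each entry of CMAKE_CUDA_ARCHITECTURES appears in the logged nvcc command line as one --generate-code flag. The Python sketch below only reproduces that mapping as it is seen in the log. It is illustrative and not how CMake implements it, and the function name `generate_code_flags` is hypothetical.

```python
def generate_code_flags(architectures):
    """Map a semicolon-separated architecture list to the flags seen in the log."""
    flags = []
    for arch in architectures.split(";"):
        # One flag per architecture, embedding both PTX (compute_XX)
        # and SASS (sm_XX), matching the logged command line.
        flags.append(
            f"--generate-code=arch=compute_{arch},code=[compute_{arch},sm_{arch}]"
        )
    return flags


# The architecture list passed via --cmake_extra_defines in the build script.
ARCHITECTURES = "52;60;61;70;75;86;89;120"

if __name__ == "__main__":
    for flag in generate_code_flags(ARCHITECTURES):
        print(flag)
```

The entries 52, 60, 61 and 70 are the ones the nvcc deprecation warning in the log refers to (architectures prior to 75). That warning is separate from the fatal error.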

Error / output

  (venv) C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\b
  in\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\
  x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\U
  sers\Relu\simon\onnxbuilder\onnxruntime\build\Windows\packages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilde
  r\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\si
  mon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytorch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\b
  uild\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\
  Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\
  Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime
  \build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src"
  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntim
  e\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffe
  rs-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\sim
  on\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Wi
  ndows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cudnn
  _frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\mp11-src\include" -I"C:\Users\Relu
  \simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\i
  nclude" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --keep-dir onnxrunt.2465D215\x64\Release  -maxrregc
  ount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-compress-all --expt-relaxed-constexpr default-
  stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_compare_with_zero -Xcudafe --diag_suppress=
  expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_60,code=[compute_
  60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate
  -code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,sm_86] --generate-code=arch=compute_8
  9,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --diag_suppress=conversion_function_no
  t_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi /utf-8 /sdl /experimental:externa
  l /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/b
  uild/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4505 /wd4834
   /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\""
   -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DN
  OMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_AT
  TENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_
  LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLA
  TES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CU
  DA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MBCS -DEIGEN_HAS_
  C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO
  _SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D
  _USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1
  -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOCALE=0 -
  DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGE
  N_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC_OP
  S -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
  /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\unfold_impl.obj
  "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\tensor\unfold_impl.cu"
CUDACOMPILE : nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a fut
ure release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\
onnxruntime_providers_cuda.vcxproj]
  nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
CUDACOMPILE : nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a fut
ure release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\
onnxruntime_providers_cuda.vcxproj]
  nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified


  (venv) C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\b
  in\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\
  x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\U
  sers\Relu\simon\onnxbuilder\onnxruntime\build\Windows\packages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilde
  r\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\si
  mon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytorch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\b
  uild\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\
  Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\
  Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime
  \build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src"
  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntim
  e\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffe
  rs-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\sim
  on\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Wi
  ndows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cudnn
  _frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\mp11-src\include" -I"C:\Users\Relu
  \simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\i
  nclude" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --keep-dir onnxrunt.2465D215\x64\Release  -maxrregc
  ount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-compress-all --expt-relaxed-constexpr default-
  stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_compare_with_zero -Xcudafe --diag_suppress=
  expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_60,code=[compute_
  60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate
  -code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,sm_86] --generate-code=arch=compute_8
  9,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --diag_suppress=conversion_function_no
  t_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi /utf-8 /sdl /experimental:externa
  l /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/b
  uild/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4505 /wd4834
   /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\""
   -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DN
  OMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_AT
  TENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_
  LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLA
  TES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CU
  DA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MBCS -DEIGEN_HAS_
  C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO
  _SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D
  _USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1
  -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOCALE=0 -
  DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGE
  N_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC_OP
  S -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
  /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\greedy_search_to
  p_one.obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\transformers\greedy_search_top_one.cu"
  (venv) C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\b
  in\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\
  x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\U
  sers\Relu\simon\onnxbuilder\onnxruntime\build\Windows\packages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilde
  r\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\si
  mon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytorch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\b
  uild\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\
  Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\
  Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime
  \build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src"
  -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntim
  e\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffe
  rs-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\sim
  on\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Wi
  ndows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cudnn
  _frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\mp11-src\include" -I"C:\Users\Relu
  \simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\i
  nclude" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --keep-dir onnxrunt.2465D215\x64\Release  -maxrregc
  ount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-compress-all --expt-relaxed-constexpr default-
  stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_compare_with_zero -Xcudafe --diag_suppress=
  expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_60,code=[compute_
  60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate
  -code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,sm_86] --generate-code=arch=compute_8
  9,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --diag_suppress=conversion_function_no
  t_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi /utf-8 /sdl /experimental:externa
  l /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/b
  uild/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4505 /wd4834
   /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\""
   -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DN
  OMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_AT
  TENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_
  LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLA
  TES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CU
  DA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MBCS -DEIGEN_HAS_
  C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO
  _SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D
  _USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1
  -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOCALE=0 -
  DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGE
  N_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC_OP
  S -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
  /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\generation_cuda_
  impl.obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\transformers\generation_cuda_impl.cu"
  rnn_helpers.cc
CUDACOMPILE : nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a fut
ure release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\
onnxruntime_providers_cuda.vcxproj]
  nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
CUDACOMPILE : nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a fut
ure release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\
onnxruntime_providers_cuda.vcxproj]
  nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.9.targets(801,9): error
MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program File
s\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilde
r\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\pac
kages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onn
xbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytor
ch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxrunt
ime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_dep
s\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon
\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_d
eps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\on
nxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffers-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Window
s\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples"
-I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\cudnn_frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windo
ws\Release\_deps\mp11-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Progra
m Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --ke
ep-dir onnxrunt.2465D215\x64\Release  -maxrregcount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-c
ompress-all --expt-relaxed-constexpr default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_
compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --gen
erate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=comput
e_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,
sm_86] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --d
iag_suppress=conversion_function_not_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi
/utf-8 /sdl /experimental:external /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/
simon/onnxbuilder/onnxruntime/build/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /w
d5054 /w15038 /wd4505 /wd4834 /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=
0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLA
TFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUS
E_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_
cuda.dll\"" -DONLY_C_LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_
HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_D
LPACK -DENABLE_CUDA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MB
CS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0
\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -D
NOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATT
ENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOC
ALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -
DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC
_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
 /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\beam_search_topk.
obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\transformers\beam_search_topk.cu"" exited with code 1. [
C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.9.targets(801,9): error
MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program File
s\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilde
r\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\pac
kages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onn
xbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytor
ch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxrunt
ime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_dep
s\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon
\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_d
eps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\on
nxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffers-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Window
s\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples"
-I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\cudnn_frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windo
ws\Release\_deps\mp11-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Progra
m Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --ke
ep-dir onnxrunt.2465D215\x64\Release  -maxrregcount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-c
ompress-all --expt-relaxed-constexpr default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_
compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --gen
erate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=comput
e_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,
sm_86] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --d
iag_suppress=conversion_function_not_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi
/utf-8 /sdl /experimental:external /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/
simon/onnxbuilder/onnxruntime/build/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /w
d5054 /w15038 /wd4505 /wd4834 /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=
0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLA
TFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUS
E_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_
cuda.dll\"" -DONLY_C_LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_
HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_D
LPACK -DENABLE_CUDA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MB
CS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0
\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -D
NOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATT
ENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOC
ALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -
DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC
_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
 /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\image_scaler_impl
.obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\tensor\image_scaler_impl.cu"" exited with code 1. [C:\U
sers\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]
CUDACOMPILE : nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a fut
ure release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\
onnxruntime_providers_cuda.vcxproj]
  nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.9.targets(801,9): error
MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program File
s\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilde
r\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\pac
kages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onn
xbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytor
ch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxrunt
ime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_dep
s\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon
\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_d
eps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\on
nxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffers-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Window
s\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples"
-I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\cudnn_frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windo
ws\Release\_deps\mp11-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Progra
m Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --ke
ep-dir onnxrunt.2465D215\x64\Release  -maxrregcount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-c
ompress-all --expt-relaxed-constexpr default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_
compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --gen
erate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=comput
e_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,
sm_86] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --d
iag_suppress=conversion_function_not_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi
/utf-8 /sdl /experimental:external /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/
simon/onnxbuilder/onnxruntime/build/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /w
d5054 /w15038 /wd4505 /wd4834 /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=
0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLA
TFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUS
E_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_
cuda.dll\"" -DONLY_C_LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_
HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_D
LPACK -DENABLE_CUDA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MB
CS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0
\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -D
NOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATT
ENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOC
ALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -
DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC
_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
 /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\unfold_impl.obj "
C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\tensor\unfold_impl.cu"" exited with code 1. [C:\Users\Relu\si
mon\onnxbuilder\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.9.targets(801,9): error
MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program File
s\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilde
r\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\pac
kages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onn
xbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytor
ch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxrunt
ime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_dep
s\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon
\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_d
eps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\on
nxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffers-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Window
s\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples"
-I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\cudnn_frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windo
ws\Release\_deps\mp11-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Progra
m Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --ke
ep-dir onnxrunt.2465D215\x64\Release  -maxrregcount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-c
ompress-all --expt-relaxed-constexpr default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_
compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --gen
erate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=comput
e_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,
sm_86] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --d
iag_suppress=conversion_function_not_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi
/utf-8 /sdl /experimental:external /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/
simon/onnxbuilder/onnxruntime/build/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /w
d5054 /w15038 /wd4505 /wd4834 /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=
0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLA
TFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUS
E_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_
cuda.dll\"" -DONLY_C_LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_
HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_D
LPACK -DENABLE_CUDA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MB
CS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0
\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -D
NOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATT
ENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOC
ALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -
DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC
_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
 /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\greedy_search_top
_one.obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\transformers\greedy_search_top_one.cu"" exited with
 code 1. [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.9.targets(801,9): error
MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program File
s\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Users\Relu\simon\onnxbuilde
r\onnxruntime\build\Windows\Release\_deps\directx_headers-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\pac
kages\Microsoft.AI.DirectML.1.15.4\include -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\include\onnxruntime -IC:\Users\Relu\simon\onn
xbuilder\onnxruntime\include\onnxruntime\core\session -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\pytor
ch_cpuinfo-src\include" -IC:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release -IC:\Users\Relu\simon\onnxbuilder\onnxrunt
ime\onnxruntime -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\abseil_cpp-src" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\safeint-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_dep
s\gsl-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\date-src\include" -I"C:\Users\Relu\simon
\onnxbuilder\onnxruntime\build\Windows\Release\_deps\onnx-src" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_d
eps\onnx-build" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\protobuf-src\src" -I"C:\Users\Relu\simon\on
nxbuilder\onnxruntime\build\Windows\Release\_deps\flatbuffers-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Window
s\Release\_deps\cutlass-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\examples"
-I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\cutlass-src\tools\util\include" -I"C:\Users\Relu\simon\onnx
builder\onnxruntime\build\Windows\Release\_deps\cudnn_frontend-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windo
ws\Release\_deps\mp11-src\include" -I"C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\_deps\eigen3-src" -I"C:\Progra
m Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include"     --ke
ep-dir onnxrunt.2465D215\x64\Release  -maxrregcount=0    --machine 64 --compile -cudart shared -allow-unsupported-compiler -Xfatbin=-c
ompress-all --expt-relaxed-constexpr default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_
compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --gen
erate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=comput
e_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_86,code=[compute_86,
sm_86] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_120,code=[compute_120,sm_120] -Xcudafe --d
iag_suppress=conversion_function_not_usable --threads 1 --relocatable-device-code=true --diag-suppress=221 -Xcompiler="/EHsc -Ob2 -Zi
/utf-8 /sdl /experimental:external /external:W0 /external:IC:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake /external:IC:/Users/Relu/
simon/onnxbuilder/onnxruntime/build/Windows/Release /wd4251 /wd4201 /wd4324 /wd4800 /wd5054 /w15038 /wd4251 /wd4201 /wd4324 /wd4800 /w
d5054 /w15038 /wd4505 /wd4834 /wd4127 /Zc:__cplusplus"   -D_WINDOWS -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=
0 -D"VER_STRING=\"1.22.0\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLA
TFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUS
E_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_
cuda.dll\"" -DONLY_C_LOCALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_
HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_D
LPACK -DENABLE_CUDA_NHWC_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MB
CS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DVER_MAJOR=1 -DVER_MINOR=22 -DVER_BUILD=0 -DVER_PRIVATE=0 -D"VER_STRING=\"1.22.0
\"" -DCPUINFO_SUPPORTED_PLATFORM=1 -DORT_ENABLE_STREAM -DEIGEN_USE_THREADS -DDISABLE_CUSPARSE_DEPRECATED -DPLATFORM_WINDOWS -DNOGDI -D
NOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DONNXRUNTIME_ENABLE_MEMLEAK_CHECK -DUSE_CUDA=1 -DUSE_FLASH_ATT
ENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -DUSE_OPENVINO=1 -DUSE_DML=1 -D"FILE_NAME=\"onnxruntime_providers_cuda.dll\"" -DONLY_C_LOC
ALE=0 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -
DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DOPENVINO_CONFIG_GPU=1 -DENABLE_DLPACK -DENABLE_CUDA_NHWC
_OPS -DUSE_OVEP_NPU_MEMORY=1 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS
 /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc143.pdb" -o onnxruntime_providers_cuda.dir\Release\generation_cuda_i
mpl.obj "C:\Users\Relu\simon\onnxbuilder\onnxruntime\onnxruntime\contrib_ops\cuda\transformers\generation_cuda_impl.cu"" exited with c
ode 1. [C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]
  expand.cc
  matmul_integer.cc
  pool.cc
  DmlOperatorMatMulInteger.cpp
  DmlOperatorEinSum.cpp
  gather.cc
  eye_like.cc
  DmlOperatorLpNormalization.cpp
  layer_norm_impl.cc
  DmlOperatorMaxUnpool.cpp
  grid_sample.cc
  gather_elements.cc
  DmlOperatorMemcpy.cpp
  DmlOperatorMeanVarianceNormalization.cpp
  concat.cc
  compress.cc
  DmlOperatorNeg.cpp
  window_functions.cc
  gelu.cc
  DmlOperatorNonZero.cpp
  DmlOperatorMultiHeadAttention.cpp
  DmlOperatorPadding.cpp
  DmlOperatorMatMul.cpp
  col2im.cc
  nonzero_op.cc
  pad.cc
  DmlOperatorOneHot.cpp
  identity_op.cc
  scatter.cc
  DmlOperatorConstantOfShape.cpp
  DmlOperatorPooling.cpp
  gather_nd.cc
  DmlOperatorQLinearAdd.cpp
  isinf.cc
  DmlOperatorQLinearConcat.cpp
  DmlOperatorQLinearAveragePooling.cpp
  onehot.cc
  mean_variance_normalization.cc
  slice.cc
  DmlOperatorQLinearSigmoid.cpp
  DmlOperatorQLinearMatMul.cpp
  shape_op.cc
  DmlOperatorMatMulIntegerToFloat.cpp
  DmlOperatorMatMulNBits.cpp
  DmlOperatorQLinearConv.cpp
  squeeze.cc
  space_depth_ops.cc
  DmlOperatorResize.cpp
  size.cc
  reverse_sequence.cc
  DmlOperatorLayerNormalization.cpp
  DmlOperatorReduce.cpp
  tile.cc
  DmlOperatorQAttention.cpp
  unsqueeze.cc
  unique.cc
  DmlOperatorRoiPooling.cpp
  resize.cc
  DmlOperatorLocalResponseNormalization.cpp
  split.cc
  regex_full_match.cc
  DmlOperatorRecurrentNeuralNetwork.cpp
  trilu.cc
  DmlOperatorSize.cpp
  DmlOperatorSkipLayerNormalization.cpp
  string_normalizer.cc
  isnan.cc
  activations.cc
  DmlOperatorShape.cpp
  reshape.cc
  DmlOperatorQuickGelu.cpp
  attention_wrapper.cc
  string_split.cc
  DmlOperatorSpaceToDepth.cpp
  DmlOperatorRotaryEmbedding.cpp
  uni_dir_attn_lstm.cc
  DmlOperatorSlice.cpp
  DmlOperatorReverseSequence.cpp
  bahdanau_attention.cc
  DmlOperatorTile.cpp
  DmlOperatorTopk.cpp
  attention_base.cc
  attention_utils.cc
  DmlOperatorScatter.cpp
  DmlOperatorRoiAlign.cpp
  transpose.cc
  attention.cc
  bifurcation_detector.cc
  DmlOperatorValueScale2D.cpp
  embed_layer_norm_helper.cc
  OperatorRegistration.cpp
  decoder_masked_multihead_attention.cc
  string_concat.cc
  DmlOperatorTranspose.cpp
  where_op.cc
  TensorDesc.cpp
  scatter_nd.cc
  ReadbackHeap.cpp
  embed_layer_norm.cc
  bias_gelu.cc
  fast_gelu.cc
  upsample.cc
  DmlOperatorTrilu.cpp
  bn_mul_fusion.cc
  dml_provider_factory.cc
  cdist.cc
  bn_add_fusion.cc
  OperatorHelper.cpp
  DmlOperatorRange.cpp
  ngram_repeat_block.cc
  cpu_contrib_kernels.cc
  bias_gelu_helper.cc
  OperatorUtility.cpp
  DmlOperatorSplit.cpp
  element_wise_ops.cc
  conv_transpose_with_dynamic_pads.cc
  PooledUploadHeap.cpp
  crop.cc
  rotary_embedding.cc
  group_query_attention.cc
  dynamicslice.cc
  crop_and_resize.cc
  deep_cpu_attn_lstm.cc
  multihead_attention.cc
  expand_dims.cc
  fused_conv.cc
  longformer_attention_base.cc
  fused_matmul.cc
  grid_sample.cc
  fused_gemm.cc
  image_scaler.cc
  layer_norm.cc
  inverse.cc
  GraphTransformerHelpers.cc
  fused_activation.cc
  sparse_dense_matmul.cc
  matmul_fpq4.cc
  maxpool_with_mask.cc
  murmur_hash3.cc
  mean_variance_normalization_exp.cc
  nchwc_ops.cc
  attention_quant.cc
  dynamic_quantize_lstm.cc
  dynamic_quantize_matmul.cc
  gather_block_quantized.cc
  matmul_bnb4.cc
  matmul_integer16.cc
  matmul_nbits.cc
  matmul_nbits_impl.cc
  nhwc_max_pool.cc
  qembed_layer_norm.cc
  qlinear_activations.cc
  qlinear_binary_op.cc
  qlinear_concat.cc
  qlinear_global_average_pool.cc
  onnxruntime_providers_dml.vcxproj -> C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\Release\onnxruntime_providers
  _dml.lib
  qlinear_lookup_table.cc
  qlinear_pool.cc
  qlinear_softmax.cc
  qlinear_where.cc
  quant_gemm.cc
  sample.cc
  skip_layer_norm.cc
  sparse_attention.cc
  dynamic_time_warping.cc
  shrunken_gather.cc
  unfold.cc
  tokenizer.cc
  beam_search.cc
  beam_search_parameters.cc
  beam_search_scorer.cc
  generation_device_helper.cc
  greedy_search.cc
  greedy_search_parameters.cc
  logits_processor.cc
  sampling.cc
  sampling_parameters.cc
  sequences.cc
  subgraph_base.cc
  subgraph_gpt.cc
  subgraph_t5_decoder.cc
  subgraph_t5_encoder.cc
  subgraph_whisper_decoder.cc
  subgraph_whisper_encoder.cc
  unique.cc
  dump_tensor.cc
  word_conv_embedding.cc
  dlpack_converter.cc
  onnxruntime_providers.vcxproj -> C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\Release\onnxruntime_providers.lib
  Building Custom Rule C:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake/CMakeLists.txt
  delay_load_hook.cc
  dllmain.cc
     Creating library C:/Users/Relu/simon/onnxbuilder/onnxruntime/build/Windows/Release/Release/onnxruntime.lib and object C:/Users/Re
  lu/simon/onnxbuilder/onnxruntime/build/Windows/Release/Release/onnxruntime.exp
  onnxruntime.vcxproj -> C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\Release\onnxruntime.dll
  Building Custom Rule C:/Users/Relu/simon/onnxbuilder/onnxruntime/cmake/CMakeLists.txt
  example_plugin_ep.cc
     Creating library C:/Users/Relu/simon/onnxbuilder/onnxruntime/build/Windows/Release/Release/example_plugin_ep.lib and object C:/Us
  ers/Relu/simon/onnxbuilder/onnxruntime/build/Windows/Release/Release/example_plugin_ep.exp
  example_plugin_ep.vcxproj -> C:\Users\Relu\simon\onnxbuilder\onnxruntime\build\Windows\Release\Release\example_plugin_ep.dll
Traceback (most recent call last):
  File "C:\Users\Relu\simon\onnxbuilder\onnxruntime\\tools\ci_build\build.py", line 2551, in <module>
    sys.exit(main())
             ^^^^^^
  File "C:\Users\Relu\simon\onnxbuilder\onnxruntime\\tools\ci_build\build.py", line 2453, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "C:\Users\Relu\simon\onnxbuilder\onnxruntime\\tools\ci_build\build.py", line 1281, in build_targets
    run_subprocess(cmd_args, env=env)
  File "C:\Users\Relu\simon\onnxbuilder\onnxruntime\\tools\ci_build\build.py", line 147, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Relu\simon\onnxbuilder\onnxruntime\tools\python\util\run.py", line 50, in run
    completed_process = subprocess.run(
                        ^^^^^^^^^^^^^^^
  File "C:\Users\Relu\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['C:\\Users\\Relu\\simon\\onnxbuilder\\onnxruntime\\venv\\Scripts\\cmake.EXE', '--build', 'C:\\Users\\Relu\\simon\\onnxbuilder\\onnxruntime\\\\build\\Windows\\Release', '--config', 'Release', '--', '/maxcpucount:16', '/p:CL_MPCount=15', '/nodeReuse:False']' returned non-zero exit status 1.

Visual Studio Version

No response

GCC / Compiler Version

-- The C compiler identification is MSVC 19.44.35209.0
-- The CXX compiler identification is MSVC 19.44.35209.0

Labels

build: build issues; typically submitted using template
ep:CUDA: issues related to the CUDA execution provider
ep:DML: issues related to the DirectML execution provider
ep:OpenVINO: issues related to OpenVINO execution provider
model:transformer: issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.
platform:windows: issues related to the Windows platform
