@Xingchen1224

This pull request fixes the compile error when building opencv_photo with CUDA 12.9 on Windows.

Building NVCC (Device) object modules/photo/CMakeFiles/cuda_compile_1.dir/src/cuda/Release/cuda_compile_1_generated_nlm.cu.obj
  nlm.cu
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(202): error : calling a __host__ function("cuda::std::__4::tuple< ::cuda::std::__4::__unwrap_ref_decay<T1> ::type... >  cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > (T1 &&...)") from a __device__ function("cv::cuda::device::imgproc::Unroll<(int)1> ::op") is not allowed [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(202): error : identifier "cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > " is undefined in device code [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(221): error : calling a __host__ function("cuda::std::__4::tuple< ::cuda::std::__4::__unwrap_ref_decay<T1> ::type... >  cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > (T1 &&...)") from a __device__ function("cv::cuda::device::imgproc::Unroll<(int)2> ::op") is not allowed [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(221): error : identifier "cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > " is undefined in device code [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(240): error : calling a __host__ function("cuda::std::__4::tuple< ::cuda::std::__4::__unwrap_ref_decay<T1> ::type... >  cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > (T1 &&...)") from a __device__ function("cv::cuda::device::imgproc::Unroll<(int)3> ::op") is not allowed [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
C:\Users\xingchen\Desktop\xc-build\sources_opencv_4.12.0\modules\photo\src\cuda\nlm.cu(240): error : identifier "cuda::std::__4::make_tuple< ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  &,  ::cv::cuda::device::plus<float>  & > " is undefined in device code [C:\Users\xingchen\Desktop\xc-build\build_cuv12.9_cudnn9.10.1.4_opencv4.12.0\modules\photo\opencv_photo.vcxproj]
  
  6 errors detected in the compilation of "C:/Users/xingchen/Desktop/xc-build/sources_opencv_4.12.0/modules/photo/src/cuda/nlm.cu".
  nlm.cu
  CMake Error at cuda_compile_1_generated_nlm.cu.obj.Release.cmake:278 (message):
    Error generating file
    C:/Users/xingchen/Desktop/xc-build/build_cuv12.9_cudnn9.10.1.4_opencv4.12.0/modules/photo/CMakeFiles/cuda_compile_1.dir/src/cuda/Release/cuda_compile_1_generated_nlm.cu.obj

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@asmorkalov added the "category: gpu/cuda (contrib)" (OpenCV 4.0+: moved to opencv_contrib) and "category: photo" labels Oct 11, 2025
@Xingchen1224
Author

Hello @asmorkalov, I cannot see any log details for the two failing CI nodes.

@asmorkalov
Contributor

It's a network issue. I re-triggered the build.

@asmorkalov asmorkalov self-requested a review October 11, 2025 12:57
@asmorkalov asmorkalov self-assigned this Oct 11, 2025
@asmorkalov asmorkalov added this to the 4.13.0 milestone Oct 11, 2025
@asmorkalov
Contributor

cc @cudawarped

@asmorkalov asmorkalov merged commit a74374d into opencv:4.x Oct 11, 2025
52 of 55 checks passed
@cudawarped
Contributor

@Xingchen1224 How are you building this? I can't reproduce the error.

@Xingchen1224
Author

@cudawarped I hope the following information helps to reproduce the error.

Environment:

- Windows 11 64-bit
- MSBuild (Visual Studio 2022)
- CMake 4.1.2
- PowerShell
- CUDA 12.9 Update 1
- cuDNN 9.10.1.4 or 8.9.7.29

CMake commands:

cmake -G "Visual Studio 17 2022" `
    -DCMAKE_CXX_STANDARD=17 `
    -DCMAKE_CUDA_STANDARD=17 `
    -DCMAKE_CUDA_STANDARD_REQUIRED=ON `
    -DCUDA_NVCC_FLAGS="--std=c++17 --expt-relaxed-constexpr --extended-lambda" `
    -DBUILD_LIST="cudev,photo" `
    -DCUDA_ARCH_BIN="10.0" `
    -DCUDA_ARCH_PTX="10.0" `
    ... Other Params...
    -DOPENCV_EXTRA_MODULES_PATH="$CONTRIB_DIR/modules" `
    -DCUDA_TOOLKIT_ROOT_DIR="$CUDA_PATH" `
    -DCUDNN_INCLUDE_DIR="$CUDNN_PATH/include" `
    -DCUDNN_LIBRARY="$CUDNN_PATH/lib/x64/cudnn.lib" `
     ... Other Params...

Key CMake configuration output:

-- General configuration for OpenCV 4.12.0 =====================================
--   Version control:               unknown
-- 
--   Extra modules:
--     Location (extra):            C:/Users/xingchen/Desktop/xc-build/opencv_contrib-4.12.0/modules
--     Version control (extra):     unknown
-- 
--   Platform:
--     Timestamp:                   2025-10-10T09:48:03Z
--     Host:                        Windows 10.0.26100 AMD64
--     CMake:                       4.1.2
--     CMake generator:             Visual Studio 17 2022
--     CMake build tool:            C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/MSBuild/Current/Bin/amd64/MSBuild.exe
--     MSVC:                        1944

......

-- 
--   OpenCV modules:
--     To be built:                 core cudev imgproc photo
--     Disabled:                    highgui videoio world

......

--   NVIDIA CUDA:                   YES (ver 12.9, CUFFT CUBLAS)
--     NVIDIA GPU arch:             100
--     NVIDIA PTX archs:            100

......

--   cuDNN:                         YES (ver 9.10.1)
-- 
-- -----------------------------------------------------------------
-- 
-- Configuring done (73.7s)
-- Generating done (2.3s)
-- Build files have been written to: C:/Users/xingchen/Desktop/xc-build/build_cuv12.9_cudnn9.10.1.4_opencv4.12.0

@Xingchen1224
Author

@cudawarped I think that under this specific condition (MSBuild and nvcc):

  • make_tuple was resolved to a host-only implementation, which cannot be called from a __device__ function
  • my "workaround" fix made the compiler resolve make_tuple to a device implementation.
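
To illustrate the host/device resolution point above, here is a minimal sketch. It is not the patch from this PR or from #27522. Every name in it (HD, Plus, OpPair, make_op_pair, unroll1_op, self_check) is hypothetical and chosen only for illustration. It assumes nothing about Thrust or libcu++ internals. It only shows that a factory called from device code must itself be device-callable. The HD macro expands to `__host__ __device__` under nvcc and to nothing under a plain C++ compiler, so the same file builds either way.

```cpp
// Sketch only: HD, Plus, OpPair, make_op_pair, unroll1_op and self_check are
// hypothetical names, not OpenCV, Thrust, or libcu++ identifiers.
#include <cassert>

// Under nvcc, mark functions as callable from both host and device code.
// Under a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Stand-in for a binary functor such as cv::cuda::device::plus<float>.
template <typename T>
struct Plus
{
    HD T operator()(const T& a, const T& b) const { return a + b; }
};

// Hand-rolled two-element aggregate used here in place of a library tuple.
template <typename A, typename B>
struct OpPair
{
    A first;
    B second;
};

// Factory that is explicitly device-callable. Without the annotation it would
// be host-only under nvcc, and a call from a __device__ function would be
// rejected with "calling a __host__ function ... is not allowed".
template <typename A, typename B>
HD OpPair<A, B> make_op_pair(const A& a, const B& b)
{
    return OpPair<A, B>{a, b};
}

// Same shape as the call reported in the log for Unroll<1>::op():
// two lvalue functors are packed into one object and returned.
HD inline OpPair<Plus<float>, Plus<float>> unroll1_op()
{
    Plus<float> op;
    return make_op_pair(op, op);
}

// Host-side sanity check: both packed operators add their arguments.
inline void self_check()
{
    const OpPair<Plus<float>, Plus<float>> ops = unroll1_op();
    assert(ops.first(1.5f, 2.0f) == 3.5f);
    assert(ops.second(-1.0f, 1.0f) == 0.0f);
}
```

Calling self_check() from a host program exercises the host path. Under nvcc the same unroll1_op() could also be called from a kernel, because every function on its call path carries the HD annotation.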

@cudawarped
Contributor

@Xingchen1224 This was fixed in #27522 as the issue applied to a number of CUDA modules.

@asmorkalov asmorkalov mentioned this pull request Oct 22, 2025