DNN module fails to compile against cuDNN 9.0 #24983

Closed
4 tasks done
cudawarped opened this issue Feb 9, 2024 · 34 comments
Labels
bug category: dnn category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib

@cudawarped
Contributor

System Information

OpenCV version: 4.x (09/02/2024)
OS: Windows 11
Compiler: VS 2022
CUDA: 12.3
cuDNN: 9.0

Detailed description

Switching from cuDNN 8.9.7 to 9.0 results in the following build error

D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(122): error C3861: 'cudnnSetRNNDescriptor_v6': identifier not found

when compiling the DNN module.

Full error trace
[372/491] Building CXX object modules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\recurrent_layers.cpp.obj
FAILED: modules/dnn/CMakeFiles/opencv_dnn.dir/Release/src/layers/recurrent_layers.cpp.obj
C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1436~1.325\bin\Hostx64\x64\cl.exe  /nologo /TP -DCVAPI_EXPORTS -DCV_CUDA4DNN=1 -DCV_OCL4DNN=1 -DENABLE_PLUGINS -DHAVE_FLATBUFFERS=1 -DHAVE_PROTOBUF=1 -D_CRT_SECURE_NO_WARNINGS=1 -D_USE_MATH_DEFINES -D_VARIADIC_MAX=10 -D_WIN32_WINNT=0x0601 -D__OPENCV_BUILD=1 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DCMAKE_INTDIR=\"Release\" -ID:\build\opencv\cuda_12_3_dnn\3rdparty\ippicv\ippicv_win\icv\include -ID:\build\opencv\cuda_12_3_dnn\3rdparty\ippicv\ippicv_win\iw\include -ID:\repos\opencv\opencv\modules\dnn\src -ID:\repos\opencv\opencv\modules\dnn\include -ID:\build\opencv\cuda_12_3_dnn\modules\dnn -ID:\repos\opencv\contrib\modules\cudev\include -ID:\repos\opencv\opencv\modules\core\include -ID:\repos\opencv\opencv\modules\imgproc\include -ID:\repos\opencv\opencv\modules\dnn\misc\caffe -ID:\repos\opencv\opencv\modules\dnn\misc\tensorflow -ID:\repos\opencv\opencv\modules\dnn\misc\onnx -ID:\repos\opencv\opencv\modules\dnn\misc\tflite -ID:\repos\opencv\opencv\3rdparty\include\opencl\1.2 -ID:\repos\opencv\opencv\modules\ts\include -ID:\repos\opencv\opencv\modules\imgcodecs\include -ID:\repos\opencv\opencv\modules\videoio\include -ID:\repos\opencv\opencv\modules\highgui\include -external:ID:\build\opencv\cuda_12_3_dnn -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -external:ID:\repos\opencv\opencv\3rdparty\flatbuffers\include -external:ID:\repos\opencv\opencv\3rdparty\protobuf\src -external:W0 /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise /FS     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /wd4244 /wd4267 /wd4018 /wd4355 /wd4800 /wd4251 /wd4996 /wd4146 /wd4305 /wd4127 /wd4100 /wd4512 /wd4125 /wd4389 /wd4510 /wd4610 /wd4702 /wd4456 /wd4457 /wd4065 /wd4310 /wd4661 /wd4506 /wd4125 /wd4267 /wd4127 /wd4244 /wd4512 /wd4702 /wd4456 /wd4510 /wd4610 /wd4800 /wd4701 
/wd4703 /wd4505 /wd4458  /O2 /Ob2 /DNDEBUG  /Zi -MD /showIncludes /Fomodules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\recurrent_layers.cpp.obj /Fdlib\Release\opencv_dnn490.pdb /FS -c D:\repos\opencv\opencv\modules\dnn\src\layers\recurrent_layers.cpp
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(122): error C3861: 'cudnnSetRNNDescriptor_v6': identifier not found
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(100): note: while compiling class template member function 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<T>::RNNDescriptor(const cv::dnn::cuda4dnn::csl::cudnn::Handle &,cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<T>::RNNMode,int,int,bool,const cv::dnn::cuda4dnn::csl::cudnn::DropoutDescriptor &)'
        with
        [
            T=float
        ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../csl/tensor_ops.hpp(541): note: see the first reference to 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<T>::RNNDescriptor' in 'cv::dnn::cuda4dnn::csl::LSTM<T>::LSTM'
        with
        [
            T=float
        ]
D:\repos\opencv\opencv\modules\dnn\src\layers\../cuda4dnn/primitives/recurrent_cells.hpp(48): note: see the first reference to 'cv::dnn::cuda4dnn::csl::LSTM<T>::LSTM' in 'cv::dnn::cuda4dnn::LSTMOp<float>::LSTMOp'
        with
        [
            T=float
        ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../../op_cuda.hpp(196): note: see the first reference to 'cv::dnn::cuda4dnn::LSTMOp<float>::LSTMOp' in 'cv::dnn::make_cuda_node'
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../csl/tensor_ops.hpp(511): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<T>' being compiled
        with
        [
            T=float
        ]
D:\repos\opencv\opencv\modules\dnn\src\layers\../cuda4dnn/primitives/recurrent_cells.hpp(88): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::csl::LSTM<T>' being compiled
        with
        [
            T=float
        ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../../op_cuda.hpp(196): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::LSTMOp<float>' being compiled
D:\repos\opencv\opencv\modules\dnn\src\layers\recurrent_layers.cpp(763): note: see reference to function template instantiation 'cv::Ptr<cv::dnn::dnn4_v20231225::BackendNode> cv::dnn::make_cuda_node<cv::dnn::cuda4dnn::LSTMOp,cv::dnn::cuda4dnn::csl::Stream,cv::dnn::cuda4dnn::csl::cudnn::Handle,cv::Mat&,cv::Mat&,cv::Mat&,cv::dnn::cuda4dnn::RNNConfiguration&>(int,cv::dnn::cuda4dnn::csl::Stream &&,cv::dnn::cuda4dnn::csl::cudnn::Handle &&,cv::Mat &,cv::Mat &,cv::Mat &,cv::dnn::cuda4dnn::RNNConfiguration &)' being compiled
[393/491] Building CXX object modules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\split_layer.cpp.obj
ninja: build stopped: subcommand failed.

Steps to reproduce

cmake --build . --target opencv_dnn

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there
  • There is reproducer code and related data files (videos, images, onnx, etc)
@cudawarped cudawarped added the bug label Feb 9, 2024
@opencv-alalek opencv-alalek added category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib category: dnn labels Feb 9, 2024
@opencv-alalek opencv-alalek added this to the 4.10.0 milestone Feb 9, 2024
@pfmephisto

I have what I think is the same issue on Ubuntu 22.04.3 LTS
error: ‘cudnnSetRNNDescriptor_v6’ was not declared in this scope; did you mean ‘cudnnSetRNNDescriptor_v8’?
Tested on OpenCV 4.9.0 and 4.x dev branch.

@cudawarped
Contributor Author

> error: ‘cudnnSetRNNDescriptor_v6’ was not declared in this scope; did you mean ‘cudnnSetRNNDescriptor_v8’?

No, although I suspect it's the same issue if you're using cuDNN 9.0. You can see my error in the original issue if you expand Full error trace.

@pfmephisto

Indeed I am on cuDNN 9.0.0-1. It appears to me that it's the default version.
I followed Nvidia's guide for installing using their sources, and cuDNN 9 is the only available version.

➤ apt-cache search cudnn
nvidia-cudnn - NVIDIA CUDA Deep Neural Network library (install script)
libcudnn8 - cuDNN runtime libraries
libcudnn8-dev - cuDNN development libraries and headers
libcudnn8-samples - cuDNN samples
cudnn - NVIDIA CUDA Deep Neural Network library (cuDNN)
cudnn9 - NVIDIA CUDA Deep Neural Network library (cuDNN)
cudnn9-cuda-11-8 - NVIDIA cuDNN for CUDA 11.8
cudnn9-cuda-11 - NVIDIA cuDNN for CUDA 11
cudnn9-cuda-12-3 - NVIDIA cuDNN for CUDA 12.3
cudnn9-cuda-12 - NVIDIA cuDNN for CUDA 12
libcudnn9-cuda-11 - cuDNN runtime libraries for CUDA 11.8
libcudnn9-cuda-12 - cuDNN runtime libraries for CUDA 12.3
libcudnn9-dev-cuda-11 - cuDNN development headers and symlinks for CUDA 11.8
libcudnn9-dev-cuda-12 - cuDNN development headers and symlinks for CUDA 12.3
libcudnn9-samples - cuDNN samples
libcudnn9-static-cuda-11 - cuDNN static libraries for CUDA 11.8
libcudnn9-static-cuda-12 - cuDNN static libraries for CUDA 12.3
➤ apt-cache show cudnn
Package: cudnn
Version: 9.0.0-1
Architecture: amd64
Priority: optional
Section: multiverse/devel
Maintainer: cudatools <cudatools@nvidia.com>
Installed-Size: 7
Depends: cudnn9 (>= 9.0.0)
Filename: ./cudnn_9.0.0-1_amd64.deb
Size: 2414
MD5sum: f29f79064f8fd192b766dc4faabc2506
SHA1: 653caa201310e3bbaac2f647f0f441925aeaa0f0
SHA256: 1efa4db76754bb59b7c5c9f7a4b012127ec73f900d94624dea314e8e77303496
SHA512: 520f532cd525581720e34ad5f8925348412e68014de2c4fbe222810770dc5ab6252e6474bcc0b792b431e7f7d31a8ad0a64799a7b77f762eb5fc319e9d1a1220
Description: NVIDIA CUDA Deep Neural Network library (cuDNN)
 NVIDIA CUDA Deep Neural Network library (cuDNN)
Description-md5: 0fdfd21e8870349f974c27dcb69de946

Package: cudnn
Version: 9.0.0-1
Architecture: amd64
Priority: optional
Section: multiverse/devel
Maintainer: cudatools <cudatools@nvidia.com>
Installed-Size: 7
Depends: cudnn9 (>= 9.0.0)
Filename: ./cudnn_9.0.0-1_amd64.deb
Size: 2414
MD5sum: aa0503956dee8137cb73ffa7401dc8ca
SHA1: c34a7b559b875b2136393b0a876eea0cd3ffe4cf
SHA256: c7f1d687e9d9222298b06c7d5eccb83862a59cc2eba1441275fe9f4fb29bdc71
SHA512: f42a571497a71f17a016dde9d7e31d96ec9a7c8f2fc329d3761b85c19d2a4e3e2dbfd4ff1704c7963b750ebb2958c3dc623b8c302b8c4d8cd939cdb47fa68d96
Description: NVIDIA CUDA Deep Neural Network library (cuDNN)
 NVIDIA CUDA Deep Neural Network library (cuDNN)
Description-md5: 0fdfd21e8870349f974c27dcb69de946
➤ apt list --installed | grep cudnn

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cudnn9-cuda-12-3/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
cudnn9-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
cudnn9/unknown,now 9.0.0-1 amd64 [installed,automatic]
cudnn/unknown,now 9.0.0-1 amd64 [installed]
libcudnn9-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
libcudnn9-dev-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
libcudnn9-samples/unknown,unknown,now 9.0.0.312-1 all [installed,automatic]
libcudnn9-static-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]

@cudawarped
Contributor Author

The older versions are available under archived releases.

@pfmephisto

@cudawarped Thanks; cuDNN 8.9.7 fixed the issue for me.

@RaceIsIm

Having same issue here but installing cuDNN 8.9.7 did not fix it for me 😔

@cudawarped
Contributor Author

@RaceIsIm Can you check your CMake configuration output to make sure you are using 8.9.7 and not still using 9.0, i.e.

-- cuDNN: YES (ver 8.9.7)
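
The version can also be double-checked outside CMake by reading the macros straight from the cuDNN header. The following is a minimal C++ sketch, assuming the installed `cudnn_version.h` defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as plain integer `#define`s; `findDefine` and `cudnnVersionString` are hypothetical helper names used only for illustration, not part of OpenCV or cuDNN.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an OpenCV or cuDNN function): returns the integer
// value of a "#define NAME <int>" line in the given header text, or -1 if
// no such line is found.
int findDefine(const std::string& headerText, const std::string& name)
{
    std::istringstream lines(headerText);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream tokens(line);
        std::string directive, macro;
        int value = 0;
        // Only lines of the exact form "#define NAME <int>" are accepted.
        if (tokens >> directive >> macro >> value &&
            directive == "#define" && macro == name) {
            return value;
        }
    }
    return -1;
}

// Builds "major.minor.patch" from the text of cudnn_version.h,
// or "unknown" when any of the three macros is missing.
std::string cudnnVersionString(const std::string& headerText)
{
    const int major = findDefine(headerText, "CUDNN_MAJOR");
    const int minor = findDefine(headerText, "CUDNN_MINOR");
    const int patch = findDefine(headerText, "CUDNN_PATCHLEVEL");
    if (major < 0 || minor < 0 || patch < 0) {
        return "unknown";
    }
    return std::to_string(major) + "." + std::to_string(minor) + "." +
           std::to_string(patch);
}
```

Passing in the contents of the `cudnn_version.h` found under the `CUDNN_INCLUDE_DIR` that CMake reports should give the same version as the configure output. If the two disagree, the build is probably picking up a different copy of the header.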

@RaceIsIm

RaceIsIm commented Feb 29, 2024

Yeah, I realised that the cuDNN files are in the CUDA folder as well, so I made a spare folder with the 8.9.7 files and am now compiling. If I have further errors I will post here, but consider it fixed for now. Thanks. (Edit: after changing the dirs in the CMake GUI it all worked as intended.)

@lhlhth

lhlhth commented Mar 1, 2024

Has anybody run into LNK2019 unresolved external symbol "public: void __cdecl cv::cuda::GpuMat::upload(class cv::debug_build_guard::_InputArray const &)" when using OpenCV in their own CMake project?

@henryse

henryse commented Mar 2, 2024

FYI have hit the same issue with FROM nvidia/cuda:12.3.1-devel-ubuntu22.04,
-D CUDNN_VERSION=9.0
-D CUDA_ARCH_BIN=8.6
-D CUDA_ARCH_PTX=8.6
-D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12
-D CUDNN_INCLUDE_DIR=/usr/include
-D CUDNN_LIBRARY=/usr/lib/x86_64-linux-gnu/libcudnn.so
-D OPENCV_DNN_CUDA=ON
I'll try to switch to 8.9.7 and see what happens.

@kramamurthi

kramamurthi commented Mar 4, 2024

I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0 with generator Visual Studio 16 2019.

@jiapei100

Same issue here: #25192

@Ambarish-Ombrulla

> I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0 with generator Visual Studio 16 2019

Have you solved the issue? If you have, please mention what you did.

@kramamurthi

> > I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0 with generator Visual Studio 16 2019
>
> Have you solved the issue? If you have, please mention what you did.

I went down to cuDNN 8.9.7 and that fixed this issue for me.

@Ambarish-Ombrulla

> > > I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0 with generator Visual Studio 16 2019
> >
> > Have you solved the issue? If you have, please mention what you did.
>
> I went down to cuDNN 8.9.7 and that fixed this issue for me.

Which OpenCV version did you use, and which flags did you enable?

@cudawarped
Contributor Author

cudawarped commented Mar 12, 2024

> Which OpenCV version did you use, and which flags did you enable?

Use the latest version of OpenCV with CUDA 12.3 and cuDNN 8.9.7 (currently neither CUDA 12.4 nor cuDNN 9.0 is supported by OpenCV).

For flags see https://cudawarped.github.io/opencv-experiments/qmd/opencv_cuda_python_windows.html#building-opencv-with-cmake

@mitchmahan

mitchmahan commented Mar 18, 2024

Seeing a similar issue following the directions on http://jamesbowley.co.uk/qmd/opencv_cuda_python_windows.html#building-opencv-with-cmake.

CUDA 12.3 and CUDNN 8.9.7

The errors below all refer to "recurrent_layers.cpp.obj" ?

My build gets 90% done and then fails.

[3138/3698] Linking CXX shared library bin\Release\opencv_world490.dll
FAILED: bin/Release/opencv_world490.dll lib/Release/opencv_world490.lib
C:\WINDOWS\system32\cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_dll --intdir=modules\world\CMakeFiles\opencv_world.dir\Release --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\mt.exe --manifests -- C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1439~1.335\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opencv_world.Release.rsp /out:bin\Release\opencv_world490.dll /implib:lib\Release\opencv_world490.lib /pdb:bin\Release\opencv_world490.pdb /dll /version:4.9 /machine:x64 /INCREMENTAL:NO /NODEFAULTLIB:libc /DEBUG && cd ."
LINK: command "C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1439~1.335\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opencv_world.Release.rsp /out:bin\Release\opencv_world490.dll /implib:lib\Release\opencv_world490.lib /pdb:bin\Release\opencv_world490.pdb /dll /version:4.9 /machine:x64 /INCREMENTAL:NO /NODEFAULTLIB:libc /DEBUG /MANIFEST:EMBED,ID=2" failed (exit code 1120) with the following output:
   Creating library lib\Release\opencv_world490.lib and object lib\Release\opencv_world490.exp
LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnSetRNNDescriptor_v6 referenced in function "public: __cdecl cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float>::RNNDescriptor<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,enum cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float>::RNNMode,int,int,bool,class cv::dnn::cuda4dnn::csl::cudnn::DropoutDescriptor const &)" (??0?$RNNDescriptor@M@cudnn@csl@cuda4dnn@dnn@cv@@QEAA@AEBVHandle@12345@W4RNNMode@012345@HH_NAEBVDropoutDescriptor@12345@@Z)
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnGetRNNWorkspaceSize referenced in function "unsigned __int64 __cdecl cv::dnn::cuda4dnn::csl::cudnn::getRNNWorkspaceSize<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,class cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float> const &,int,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &)" (??$getRNNWorkspaceSize@M@cudnn@csl@cuda4dnn@dnn@cv@@YA_KAEBVHandle@01234@AEBV?$RNNDescriptor@M@01234@HAEBV?$TensorDescriptorsArray@M@01234@@Z)
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnRNNForwardInference referenced in function "void __cdecl cv::dnn::cuda4dnn::csl::cudnn::LSTMForward<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,class cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,int,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float>,class cv::dnn::cuda4dnn::csl::DevicePtr<float>,class cv::dnn::cuda4dnn::csl::WorkspaceInstance)" (??$LSTMForward@M@cudnn@csl@cuda4dnn@dnn@cv@@YAXAEBVHandle@01234@AEBV?$RNNDescriptor@M@01234@AEBV?$FilterDescriptor@M@01234@V?$DevicePtr@$$CBM@1234@AEBV?$TensorDescriptorsArray@M@01234@3AEBV?$TensorDescriptor@M@01234@353H4V?$DevicePtr@M@1234@6VWorkspaceInstance@1234@@Z)
bin\Release\opencv_world490.dll : fatal error LNK1120: 3 unresolved externals
ninja: build stopped: subcommand failed.

@cudawarped
Contributor Author

> and object lib\Release\opencv_world490.exp LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnSetRNNDescriptor_v6

The error looks to be the same. I suspect you need to clean your build directory after attempting to build with cuDNN 9.0. Check the NVIDIA section of the CMake configure output to confirm you are really using cuDNN 8.9.7, e.g.

> --   NVIDIA CUDA:                   YES (ver 12.3, CUFFT CUBLAS NVCUVID NVCUVENC)
> --     NVIDIA GPU arch:             50 52 60 61 70 75 80 86 89 90
> --     NVIDIA PTX archs:            90
> --
> --   cuDNN:                         YES (ver 8.9.7)

@ZelboK
Contributor

ZelboK commented Apr 13, 2024

Out of boredom and curiosity I tried to update the code on my own to accommodate for the breaking changes. For now it seems to build with cudnn 9, I just need to update it to be compatible with CUDA 12.4 as well.

Will push a PR upwards when I'm done. I'm guessing this will need to be for OpenCV 5? @cudawarped This would be my first contribution to openCV.

@cudawarped
Contributor Author

@ZelboK That's great. If you submit your PR on top of the 4.x branch, the 5.x branch will get manually updated in due time.

@josyulavt

Can confirm downgrading to cuDNN 8.9.7 worked for me (CUDA 11.8, OpenCV 4.9).

@johnnynunez

johnnynunez commented Apr 25, 2024

Yes, in my case it is compatible with CUDA 12.2 and cuDNN 8.9.
cuDNN 9 is the problem.

@jiapei100

Hi, @johnnynunez @josyulavt

Didn't you run into the following issue?
opencv/opencv_contrib#3728

@johnnynunez

> Hi, @johnnynunez @josyulavt
>
> Didn't you run into the following issue? opencv/opencv_contrib#3728

yes, I had this problem

@jiapei100

@johnnynunez

I just tried the solution provided in opencv/opencv_contrib#3690, i.e.:

template <int N, typename... P, typename... R, class... Op>
__device__ __forceinline__ void blockReduce(const tuple<P...>& smem,
                                            const tuple<R...>& val,
                                            uint tid,
                                            const tuple<Op...>& op)
{
    block_reduce_detail::Dispatcher<N>::reductor::template reduce<
        const tuple<P...>&,
        const tuple<R...>&,
        const tuple<Op...>&>(smem, val, tid, op);
}

But the problem persists.
Did you solve it?

@ZelboK
Contributor

ZelboK commented May 1, 2024

I have a PR up for this, but it seems like one of the builds broke. I don't have time to fix this for a while, but I am pretty sure CUDA 12.4's Thrust toolkit has some problems when it comes to tuples.

If you're getting errors like

: error: incomplete type is not allowed
      static_assert((VecTraits<DstType>::cn == tuple_size<SrcPtrTuple>::value), "" " " "VecTraits<DstType>::cn == tuple_size<SrcPtrTuple>::value");

then be aware that there was a bug in calculating the fake tuple_size. That is fixed in CCCL's main however.

@ZelboK
Contributor

ZelboK commented May 1, 2024

@miscco Could you comment please?

@jiapei100

@ZelboK Where is your PR? Let me try...

@josyulavt

> Hi, @johnnynunez @josyulavt
>
> Didn't you run into the following issue? opencv/opencv_contrib#3728

Nope. I was using a lower version of OpenCV (4.9.0) and you are using OpenCV 5.x, which could have changed a few things.

@jiapei100

@josyulavt
I actually tried both 4.9 and 5.0 ... Both have similar issues...

@josyulavt

> @josyulavt I actually tried both 4.9 and 5.0 ... Both have similar issues...

What are your CUDA and cuDNN versions?

@jiapei100

@josyulavt

cuda: 12.4
cudnn: 8.9.7 ( I was using 9.1.0, but now downgraded to 8.9.7)

@josyulavt

josyulavt commented May 1, 2024

> cuDNN 8.9.7 worked for me (CUDA 11.8, OpenCV 4.9)

From the comments here, it looks like CUDA 12.2 would work; however, I used
cuDNN 8.9.7, CUDA 11.8 and OpenCV 4.9 for my setup, which is compatible with ONNX too.
Good luck!

@johnnynunez

Is anyone working on CUDA 12.4 Update 1 and cuDNN 9.1.1?

asmorkalov pushed a commit that referenced this issue May 28, 2024
Refactor DNN module to build with cudnn 9 #25412

A lot of APIs that are currently being used in the dnn module have been removed in cudnn 9. They were deprecated in 8. 
This PR updates said code accordingly to the newer API.

Some key notes:
1) This is my first PR. I am new to openCV. 
2) `opencv_test_core` tests pass
3) On a 3080, CUDA 12.4 (should be irrelevant since I didn't build the `opencv_modules`), gcc 11.4, WSL 2. 
4) For brevity I will avoid including macro code that will allow for older versions of cudnn to build.
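
For readers wondering what such version-guard macro code could look like, here is a minimal C++ sketch. It is not taken from the PR. It assumes `CUDNN_MAJOR` is provided by cuDNN's own headers in a real build, and the fallback definition and the helper name `rnnDescriptorSetter` are hypothetical, added only so the snippet compiles without cuDNN installed. It only reports which setter a guarded build would select and does not call cuDNN.

```cpp
#include <string>

// In a real build CUDNN_MAJOR comes from cuDNN's headers (via <cudnn.h>).
// This fallback is an assumption made purely so the sketch compiles
// on a machine without cuDNN; do not copy it into production code.
#ifndef CUDNN_MAJOR
#define CUDNN_MAJOR 9
#endif

// Hypothetical helper: names the RNN descriptor setter that a
// version-guarded build would use for the cuDNN major version in effect.
inline std::string rnnDescriptorSetter()
{
#if CUDNN_MAJOR >= 9
    // cuDNN 9 no longer declares cudnnSetRNNDescriptor_v6 (the compile
    // error reported in this issue), so the v8 entry point is selected.
    return "cudnnSetRNNDescriptor_v8";
#else
    // Older cuDNN releases still declare the v6 entry point.
    return "cudnnSetRNNDescriptor_v6";
#endif
}
```

In actual code the two branches would contain the respective cuDNN calls rather than strings. Note that the v8 function takes a different parameter list from the v6 one, so the guard alone is not a drop-in fix.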

I was unable to get the tests working for `opencv_test_dnn` and `opencv_perf_dnn`. The errors I get are of the following: 
```
 OpenCV tests: Can't find required data file: dnn/onnx/conformance/node/test_reduce_prod_default_axes_keepdims_example/model.onnx in function 'findData'
" thrown in the test body.
```
So before I spend more time investigating I was hoping to get a maintainer to point me in the right direction here. I would like to run these tests and confirm things are working as intended. I may have missed some details.


### Pull Request Readiness Checklist

Relevant issue: #24983

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake