cuda4dnn: fix version check diagnostic #19222
I believe this is related, but might not be: https://forum.opencv.org/t/dnn-gpu-broken-cuda-issues-pls-help/1298
This is mostly because of the new compatibility system and versioning scheme since CUDA 11.1. I will look into it and make appropriate changes here since it's related.
References: https://docs.nvidia.com/deploy/cuda-compatibility/index.html
@YashasSamaga Friendly reminder about the patch.
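For readers unfamiliar with the versioning scheme mentioned above, here is a minimal sketch of how a CUDA version integer can be decoded. The encoding 1000 * major + 10 * minor is the one used by CUDART_VERSION and returned by cudaRuntimeGetVersion. The names CudaVersion, decodeCudaVersion and sameMajorVersion are hypothetical, and comparing only the major version is an illustrative assumption, not the check implemented in this patch.

```cpp
#include <cassert>

// Hypothetical helper type for illustration; not part of OpenCV or CUDA.
struct CudaVersion {
    int majorVer;
    int minorVer;
};

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 11010 is 11.1).
constexpr CudaVersion decodeCudaVersion(int encoded) {
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// Assumed policy for this sketch only: two versions are treated as
// compatible when their major versions match. The real compatibility
// rules are described in the NVIDIA document referenced above.
constexpr bool sameMajorVersion(int a, int b) {
    return decodeCudaVersion(a).majorVer == decodeCudaVersion(b).majorVer;
}
```

A strict equality check on the full encoded value would reject a build made against 11.1 running with an 11.2 runtime, which is the kind of mismatch a version diagnostic has to report carefully.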
  {
      backends.push_back(std::make_pair(DNN_BACKEND_CUDA, DNN_TARGET_CUDA));
-     if (cuda4dnn::doesDeviceSupportFP16())
-         backends.push_back(std::make_pair(DNN_BACKEND_CUDA, DNN_TARGET_CUDA_FP16));
+     backends.push_back(std::make_pair(DNN_BACKEND_CUDA, DNN_TARGET_CUDA_FP16));
This might be a breaking change. The issue is that BackendRegistry is a singleton object, but the available targets for CUDA depend on the currently selected device. If the user initially used a device without FP16 support and then switches to a device with FP16 support, the FP16 target won't be returned by getAvailableTargets, since the registry was initialized while the device without FP16 support was selected.
So now it always returns both targets, and there is a fallback to FP32 in initCUDABackend if FP16 isn't supported.
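To illustrate the pattern described in this comment, here is a minimal, self-contained sketch. It does not use OpenCV or CUDA: the enums, FakeDevices, availableTargets and resolveTarget are hypothetical stand-ins, and the device list is made up. It only shows the idea of keeping the once-initialized registry device-independent and deferring the FP16 check to initialization time.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the OpenCV backend/target enums.
enum Backend { BACKEND_CUDA };
enum Target { TARGET_CUDA, TARGET_CUDA_FP16 };

// Hypothetical device model: index 0 lacks FP16 support, index 1 has it.
struct FakeDevices {
    std::vector<bool> fp16{false, true};
    int current = 0;
    bool currentSupportsFP16() const { return fp16[current]; }
};

FakeDevices& devices() {
    static FakeDevices d;
    return d;
}

// The registry is a static local, so it is built exactly once.
// It therefore lists both targets and records nothing device-specific.
const std::vector<std::pair<Backend, Target>>& availableTargets() {
    static const std::vector<std::pair<Backend, Target>> registry = {
        {BACKEND_CUDA, TARGET_CUDA},
        {BACKEND_CUDA, TARGET_CUDA_FP16},
    };
    return registry;
}

// The device-dependent check runs when a network is initialized:
// an FP16 request falls back to FP32 if the current device lacks FP16.
Target resolveTarget(Target requested) {
    if (requested == TARGET_CUDA_FP16 && !devices().currentSupportsFP16())
        return TARGET_CUDA;
    return requested;
}
```

With this arrangement, switching devices after the registry has been created changes the outcome of resolveTarget but never the contents of the registry, which avoids the stale-target problem.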
Thank you!
The checkVersions() that was introduced in #17788 was partially useless, since a version mismatch (that could create problems) would throw an exception before checkVersions().

The FP16 and device compatibility checks have been removed from BackendRegistry. BackendRegistry creates a static local object. Hence, subsequent calls to get targets after the device id has changed can return the wrong set of targets. These checks are instead done in initCUDA now. The target will switch to FP32 if FP16 is not supported.

Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.