No such file or directory #35
Comments
Hello,
Hi hadikoub,

CUDA-version: 10000 (11060), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 2
CUDNN_HALF=1
OpenCV version: 3.2.0
0 : compute_capability = 860, cudnn_half = 1, GPU: NVIDIA RTX A4000
layer filters size/strd(dil) input output

After half an hour, the log file contains something like:

v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 16, class_loss = -nan, iou_loss = -nan, total_loss = -nan
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 17, class_loss = -nan, iou_loss = -nan, total_loss = -nan
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 5, class_loss = -nan, iou_loss = -nan, total_loss = -nan

It seems there is some problem with modern Nvidia GPUs.
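For anyone who wants to find where the losses first turn into NaN in a long training log, here is a minimal Python sketch. It assumes the log lines look like the ones quoted above; the function names `parse_losses` and `first_nan_line` are made up for this example and are not part of darknet or of this repository.

```python
import math
import re

# Matches "name = value" pairs such as "total_loss = -nan" or "iou_loss = 0.42".
# The field names are taken from the log excerpt quoted in this thread.
LOSS_FIELD = re.compile(
    r"(class_loss|iou_loss|total_loss)\s*=\s*"
    r"(-?nan|-?inf|-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_losses(line):
    """Return {field_name: float} for the loss fields found in one log line."""
    return {name: float(value) for name, value in LOSS_FIELD.findall(line)}


def first_nan_line(lines):
    """Return the 1-based number of the first line with a non-finite loss, or None."""
    for number, line in enumerate(lines, start=1):
        losses = parse_losses(line)
        if any(not math.isfinite(value) for value in losses.values()):
            return number
    return None
```

Running `first_nan_line(open(path))` on a saved log (where `path` is wherever you stored it) gives the line at which training started to diverge, which helps to tell "NaN from the very first iteration" apart from "NaN after some time".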
Hello Robert, Glad that the solution is now working. Please refer to:
Hi Hadi, thanks for fast response. I will look at it.
Best regards. Robert
On 4 Mar 2022 at 18:29, Hadi Koubeissy <***@***.***> wrote:
Hello Robert,
Glad that the solution is now working.
Regarding the other issue you are facing on newer GPUs such as the RTX 3080 and A4000: it is not caused by the solution itself but by Nvidia's CUDA support on newer devices.
Currently, the Nvidia RTX 30 series is supported only by CUDA 11.x. Older toolkits such as CUDA 10.0, which the solution currently uses, cannot target these GPUs.
Please refer to:
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html
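The mismatch can be read directly from the darknet start-up header quoted earlier in the thread. Below is a rough Python sketch of that check. It assumes the first number after `CUDA-version:` is the toolkit version encoded as major*1000 + minor*10 (so 10000 means 10.0) and that `compute_capability = 860` means 8.6. The table lists only a few architectures, and the names `decode_header` and `toolkit_supports` are invented for this example.

```python
import re

# Minimum CUDA toolkit release that can target each compute capability natively.
# Deliberately incomplete: only a few common architectures are listed.
MIN_TOOLKIT = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (RTX 30 series, RTX A4000)
}


def decode_header(text):
    """Extract (toolkit, capability) as (major, minor) tuples from the header text."""
    cuda = re.search(r"CUDA-version:\s*(\d+)", text)
    cap = re.search(r"compute_capability\s*=\s*(\d+)", text)
    if cuda is None or cap is None:
        raise ValueError("header does not contain the expected fields")
    version = int(cuda.group(1))    # assumed encoding: 10000 -> 10.0
    capability = int(cap.group(1))  # assumed encoding: 860 -> 8.6
    toolkit = (version // 1000, (version % 1000) // 10)
    return toolkit, (capability // 100, (capability % 100) // 10)


def toolkit_supports(toolkit, capability):
    """Return True/False, or None if the capability is not in the table."""
    minimum = MIN_TOOLKIT.get(capability)
    if minimum is None:
        return None
    return toolkit >= minimum
```

For the header posted above, `decode_header` yields toolkit 10.0 and capability 8.6, and `toolkit_supports((10, 0), (8, 6))` is `False`, which is consistent with the explanation that a CUDA 10.0 build cannot target these cards.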
Hello again @rsicak, I've created a branch for CUDA 11 support named cuda11_support: https://github.com/BMW-InnovationLab/BMW-YOLOv4-Training-Automation/tree/cuda11_support. It is still under testing, so stability is not fully guaranteed. You can take a look at it if that is convenient for you.
Hi, I have tried the new branch. The Docker image built and ran. Training worked for up to 300 iterations and then stopped with a cuDNN error.
Hi, Thank you for the suggestion. I'll try to change the Docker image to the suggested one and do some tests.
Hi, I have the same issue as others in the issue history.
I tried the suggested solution of setting DOWNLOAD_ALL=1 in the Dockerfile, but it does not work for me.
I have yolov4.weights in the right folder, under config/darknet/yolov4_default_weights/.
Any help? Thank you. Robert
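A quick way to rule out a path problem is to check for the weights file from a script. This is only a sketch: the relative path is the one mentioned in this post, while `check_weights` and the idea of running it from the repository root are assumptions for this example. The path seen inside the container may differ from the one on the host.

```python
from pathlib import Path

# Relative location of the default weights, as described in this post.
WEIGHTS_RELATIVE = Path("config/darknet/yolov4_default_weights/yolov4.weights")


def check_weights(repo_root):
    """Return None if the weights file looks usable, else a short problem description."""
    path = Path(repo_root) / WEIGHTS_RELATIVE
    if not path.is_file():
        return f"missing: {path}"
    if path.stat().st_size == 0:
        # An empty file usually points to an interrupted download.
        return f"empty: {path}"
    return None
```

Calling `check_weights(".")` from the repository root returns `None` when the file is present and non-empty. If it reports a problem on the host but the file is there, the volume mount into the container would be the next thing to look at.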