If you are running into problems with TensorFlow #173
Comments
@chogovadze, thanks for outlining the steps here. I was having the same issues described here and followed the steps to fix the TF version and CUDA version incompatibility. After finishing these steps I got an error when I tried to run superpoint (script export_detections.py):
I realized my error: I was pointing to my earlier venv in the makefile. After I removed that, I reran make install (which reinstalled superpoint). However, now I am getting the following error when I try to run the export_detections.py script:
For those who have difficulty running on GPUs that are not supported by older CUDA versions (a 3080 in my case),
Thanks. This solved my issue with loss nan, precision nan, recall 0.0000 on an RTX 3090.
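Hello, I have an RTX 3080 Ti. Did your training succeed? I would like to get in touch so we can learn from each other. Thanks.
I also ran into the loss nan problem when training magicpoint. Did you manage to solve it? You can reach me on QQ at 972048746.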
Hello everyone,
It seems that several users are reporting the same kind of obstacles with regard to training and prediction.
After some research, this appears to be a compatibility issue between old versions of tensorflow 1.x installed through pip and newer GPUs. Compiling tensorflow from source resolves the issue, but it is very time-consuming. I hope this write-up helps other users who are having trouble with their environment.
This method requires the use of conda.
1. Run `conda install tensorflow-gpu=1.12` (conda will automatically pull the correct cuda/cudnn versions).
2. Remove `tensorflow-gpu==1.12` from requirement.txt and run the makefile.
3. Set `batch_size` and `eval_batch_size` in the config files to 1.
4. Run `export TF_FORCE_GPU_ALLOW_GROWTH=true` followed by `export TMPDIR=/tmp/` in your current terminal session.
If you are still having issues, be sure that you have NOT:
- run `conda install cudnn=x.x.x=cudax.x_x`.
References from:
I have successfully worked with this repository with the following setup:
If you are still having some issues, please do not hesitate to reach out.
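For anyone who would rather script the requirement.txt and config edits described above than make them by hand, here is a minimal sketch in Python. It assumes the requirements file lists one package per line and that the config files are YAML-style text with `batch_size: N` and `eval_batch_size: N` entries. The function names (`drop_requirement`, `force_batch_size_one`, `patch_file`) are made up for illustration and are not part of the repository, and the sample strings in the demo are invented.

```python
import re
from pathlib import Path


def drop_requirement(text, package="tensorflow-gpu"):
    """Return the requirements text without any line that lists `package`."""
    # Matches the bare package name, optionally followed by a version specifier.
    pattern = re.compile(rf"[ \t]*{re.escape(package)}[ \t]*([=<>!~].*)?$")
    kept = [line for line in text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


def force_batch_size_one(text):
    """Set every `batch_size:` and `eval_batch_size:` entry in the text to 1."""
    # Only the integer value is replaced; indentation and trailing comments stay.
    pattern = re.compile(r"^([ \t]*(?:eval_)?batch_size[ \t]*:[ \t]*)\d+", re.MULTILINE)
    return pattern.sub(r"\g<1>1", text)


def patch_file(path, transform):
    """Apply a text-to-text function to a file in place."""
    p = Path(path)
    p.write_text(transform(p.read_text()))


if __name__ == "__main__":
    # Invented sample contents, only to show what the helpers do.
    sample_requirements = "numpy\ntensorflow-gpu==1.12\nopencv-python\n"
    sample_config = "model:\n    batch_size: 64\n    eval_batch_size: 50\n"
    print(drop_requirement(sample_requirements))
    print(force_batch_size_one(sample_config))
```

To use it on real files, call `patch_file` with the path to your requirements file and `drop_requirement`, then with each config file and `force_batch_size_one`. Keep a backup first, since the files are overwritten in place.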