GPU not engaged during training - RTX 3070 #1078

Closed
dlramamurthy opened this issue Jan 13, 2021 · 1 comment

dlramamurthy commented Jan 13, 2021

OS: Windows 10
Graphics card: RTX3070
CUDA: 11.1 (also tried 11.0, see below)
Python: 3.7
I'm running deeplabcutcore with TensorFlow 2.x because RTX 30-series cards do not work with TensorFlow 1.x. I'm using a conda virtual environment and have tf-nightly-gpu '2.5.0-dev20210111' installed.
I'm new to using deeplabcut with a local GPU rather than Colab, and I'm having trouble getting the GPU to engage during training.

When I test the GPU with tf.config.list_physical_devices, I get:

2021-01-12 19:56:45.273987: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2021-01-12 19:56:45.278178: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-01-12 19:56:45.279096: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-01-12 19:56:45.279522: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU.
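The warnings above name the specific library that failed to load. As a rough sketch (the helper names `missing_libraries` and `can_load` are hypothetical and not part of TensorFlow or DeepLabCut), a log like this can be scanned for those names, and each one can then be tried with `ctypes` to see whether the loader can find it at all:

```python
import ctypes
import re

# Matches TensorFlow's dso_loader warning, e.g.
#   Could not load dynamic library 'cusolver64_10.dll'; dlerror: ...
_MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in order, without duplicates."""
    seen = []
    for name in _MISSING.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def can_load(name):
    """Return True if ctypes can load the shared library with this name.

    Note: on Windows with Python 3.8+, ctypes does not search PATH by default,
    so a False result there is only a hint, not proof that the file is absent.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    log = (
        "2021-01-12 19:56:45.273987: W tensorflow/stream_executor/platform/default/"
        "dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; "
        "dlerror: cusolver64_10.dll not found\n"
        "2021-01-12 19:56:45.278178: I tensorflow/stream_executor/platform/default/"
        "dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll\n"
    )
    for lib in missing_libraries(log):
        status = "loadable" if can_load(lib) else "not found by the loader"
        print(lib, "->", status)
```

A name reported here tells you which file the installed TensorFlow build expects, which can then be compared against the DLLs shipped with the installed CUDA toolkit.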

I also tried CUDA 11.0 instead; the GPU was then engaged, but it crashed with a CUBLAS_STATUS_ALLOC_FAILED error.
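`CUBLAS_STATUS_ALLOC_FAILED` is often reported when cuBLAS cannot get the GPU memory it needs. One commonly tried mitigation, which this thread does not confirm as a fix, is to let TensorFlow grow its GPU memory allocation on demand by setting the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable. A minimal sketch (the function name `enable_gpu_memory_growth` is hypothetical):

```python
import os


def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all up front.

    The variable has to be set before TensorFlow initialises the GPU;
    setting it before `import tensorflow` is the simplest way to ensure that.
    """
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


enable_gpu_memory_growth()
# Import TensorFlow (or deeplabcutcore) only after this point.
```

Whether this helps depends on why the allocation failed; a mismatch between the driver, CUDA, and TensorFlow versions can produce the same error.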

Any suggestions on the best way to proceed? Thanks!

@MMathisLab
Member

Yes, please see this issue and its resolution here: #944 (i.e., it would be good to post there, or in deeplabcutcore, as that is the code you are using, rather than opening a new issue here). Please also note that you need to use the correct CUDA version for TF 2.5, which is not CUDA 10; please carefully check your driver, CUDA, and TF versions: https://www.tensorflow.org/install/source#gpu

I would never recommend using the nightly build of TF until you know you have an environment that works; please try a stable release:

[Screenshot: Screen Shot 2021-01-13 at 3 51 09 PM]
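To compare an environment against the tested configurations on the TensorFlow page linked above, it can help to print what the installed TensorFlow reports about itself. The following is a sketch only: the function name `tensorflow_build_summary` is hypothetical, `tf.sysconfig.get_build_info()` is assumed to be available (it was added in TF 2.3), and the exact format of the reported CUDA and cuDNN strings can differ between platforms.

```python
def tensorflow_build_summary():
    """Return TF's version, the CUDA/cuDNN versions it was built against,
    and the GPUs it can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # get_build_info() exists in TF 2.3 and later; fall back to an empty dict.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    build = dict(get_build_info()) if get_build_info else {}

    return {
        "tensorflow": tf.__version__,
        "cuda_build": build.get("cuda_version"),
        "cudnn_build": build.get("cudnn_version"),
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    summary = tensorflow_build_summary()
    if summary is None:
        print("TensorFlow is not importable in this environment")
    else:
        for key, value in summary.items():
            print(f"{key}: {value}")
```

An empty `gpus` list together with "Could not load dynamic library" warnings usually points at the CUDA or cuDNN installation rather than at DeepLabCut itself.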
