Configure script automatically selects CUDA/cuDNN path instead of waiting for user input #60760
Hi @ramizouari , TensorFlow preconfigures the paths of the CUDA and cuDNN toolkits when they are installed per the official instructions in the documentation (using Conda). If the script is able to detect the paths automatically, it will not ask the user for them; if it cannot detect them, it will prompt the user to enter the paths. Please refer to the example below.
So if the script is able to identify the path, TensorFlow is only trying to make things easier for users. However, if you want to keep the CUDA and cuDNN libraries in a particular directory, or want to use a particular version of CUDA/cuDNN, you can do so by removing CUDA/cuDNN from the standard download path; the script will then ask you to enter the CUDA path, as seen in the example above. I would like to know how you installed CUDA/cuDNN and how the path has been set. Also, please confirm whether the auto-detection is causing any particular problem in your case. Please elaborate. Thanks!
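The detect-or-prompt behaviour described above can be sketched roughly as follows. This is a minimal illustration only, not TensorFlow's actual code: the function name `find_cuda` and the candidate path list are assumptions for the sake of the example.

```python
import os

# Illustrative list of standard install locations; the real configure script
# checks more candidates (and environment variables) than shown here.
STANDARD_PATHS = ["/usr/local/cuda", "/opt/cuda"]

def find_cuda(paths=STANDARD_PATHS, ask=input):
    """Return an auto-detected CUDA path, or prompt the user if none is found."""
    for path in paths:
        if os.path.isdir(path):
            return path  # auto-detected: the user is never prompted
    # Nothing detected in the standard locations: fall back to asking the user.
    return ask("Please specify the CUDA SDK install path: ")
```

This captures why moving the toolkit out of the standard path forces the prompt: detection fails, so the script falls through to asking.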
Hi @SuryanarayanaY , I installed both cuDNN and CUDA via Nvidia's RPM packages, so they are updated through the package manager. To be more precise, for any update to version xx.y of CUDA, the package manager will:
With this, I effectively have many CUDA versions installed on the path. The path is set on login; my shell startup script contains:

```shell
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
```
I am going to slightly disagree with this. Also, the documentation itself suggests that the script should prompt upon detecting multiple CUDA versions, which is not what is happening.
Hi @ramizouari , The script behind ./configure can be found here. If you are interested, please go through the source code, analyse the behaviour, and let us know if you have any pointers on it. Thanks!
@nitins17 - Please share your pointers on this issue. CC - @learning-to-play
Issue Type
Bug
Have you reproduced the bug with TF nightly?
No
Source
source
Tensorflow Version
TF 2.10
Custom Code
No
OS Platform and Distribution
Fedora 37
Mobile device
No response
Python version
3.10
Bazel version
5.3.0
GCC/Compiler version
12.3.1
CUDA/cuDNN version
11.8,12.1/8.0
GPU model and memory
GTX 1660 Ti, 6 GB
Current Behaviour?
I have multiple CUDA versions installed, and I am trying to build TensorFlow from source with CUDA support.
The problem arises when I try to configure the build system using `./configure`, which asks for the information relevant to the build. This includes:
Now, when I select CUDA support, the script automatically selects my CUDA/cuDNN versions and does not give me the possibility to select them manually, which contradicts what the documentation suggests at https://www.tensorflow.org/install/source#gpu_support: "If your system has multiple versions of CUDA or cuDNN installed, explicitly set the version instead of relying on the default"
Now, I was able to trace the issue exactly to the `configure.py` file. In fact, I strongly suspect there is a logic error in the section that parses the user input (Line 1244 on branch r2.11):
From my understanding, the script first validates the given environment, and only asks for user input if that validation fails.
As a result, on the first iteration of the loop, validation runs against the auto-detected environment before the user has been asked for the required environment variables, so the prompt is skipped.
I was able to solve the issue by swapping the order as follows:
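The suspected logic error, and the effect of swapping the order, can be modelled with a toy loop. This is illustrative only: `run_config_loop`, `prompt`, `validate`, and the `TF_CUDA_PATHS` key are stand-ins, not the real `configure.py` names.

```python
def run_config_loop(env, prompt, validate, attempts=3, prompt_first=False):
    """Toy model of a configure loop.

    With prompt_first=False (the reported behaviour), validation runs before
    the user is asked, so auto-detected values win and the prompt is skipped.
    With prompt_first=True (the proposed swap), the user is asked first.
    """
    for _ in range(attempts):
        if prompt_first:
            prompt(env)   # swapped order: ask the user, then validate their input
        if validate(env):
            return env
        if not prompt_first:
            prompt(env)   # original order: only ask after validation has failed
    raise RuntimeError("no valid CUDA configuration found")
```

With an auto-detected path already present in `env`, the original order returns on the first `validate` call and `prompt` is never invoked; with the swapped order, the user's answer is taken into account before validation.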
Standalone code to reproduce the issue
Relevant log output
No response