TF 2.16.1 Fails to work with GPUs #63362
Comments
It does not work with python=3.12.2 either; same error. Installed TensorFlow with |
The same error on bare Ubuntu and on WSL2. 2.15 works without any problems with Python 3.11 |
I have the same problem with Ubuntu 22.04.4 with the following environment:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0 |
I'm not sure if this is the root cause, but I resolved my own issue, which also surfaced as a "Cannot dlopen some GPU libraries." error. To resolve it, I followed the tested build versions here: and I needed to downgrade my existing installations from cuDNN 9 to 8.9 and from CUDA 12.4 to 12.3. When you're on an NVIDIA download page like the one for the CUDA Toolkit, don't just download the latest version; find previous versions by clicking "Archive of Previous CUDA Releases". @JuanVargas, can you try uninstalling your existing CUDA installation and moving to a tested build configuration for TF 2.16 by downgrading to CUDA 12.3? I followed this post to uninstall my existing CUDA installation: @DiegoMont, can you try upgrading your cuDNN to 8.9 and CUDA to 12.3? |
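The downgrade advice above is essentially a lookup against the tested build configurations. A minimal sketch of that check; the 2.16 row (CUDA 12.3, cuDNN 8.9) is taken from this thread, so treat the table values as assumptions and verify against the official tested-build list:

```python
# Sketch: compare installed CUDA/cuDNN versions against the configuration a
# TensorFlow release was tested with. The 2.16 row (CUDA 12.3, cuDNN 8.9) is
# taken from this thread, not from an official compatibility source.
TESTED_BUILDS = {
    "2.16": {"cuda": "12.3", "cudnn": "8.9"},
}

def version_mismatches(tf_version, cuda_version, cudnn_version):
    """Return a list of human-readable mismatches; empty if the setup looks tested."""
    key = ".".join(tf_version.split(".")[:2])
    tested = TESTED_BUILDS.get(key)
    if tested is None:
        return [f"no tested configuration known for TF {tf_version}"]
    problems = []
    if not cuda_version.startswith(tested["cuda"]):
        problems.append(f"CUDA {cuda_version} installed, {tested['cuda']} tested")
    if not cudnn_version.startswith(tested["cudnn"]):
        problems.append(f"cuDNN {cudnn_version} installed, {tested['cudnn']} tested")
    return problems
```

With the setup reported above (CUDA 12.4, cuDNN 9), `version_mismatches("2.16.1", "12.4", "9.0")` flags both components.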
I am having the same issue. Brand-new Ubuntu 22.04 WSL2 image, blank conda environment with either Trying to list the physical devices results in:
Is this a new issue caused by the fact that no system CUDA appears to need to be installed separately in WSL2 anymore? I certainly didn't install one manually, and yet |
Hi @JuanVargas, for the GPU package you need to ensure the CUDA driver is installed, which can be verified with the nvidia-smi command. Then you need to install the TF CUDA package with pip install tensorflow[and-cuda]. I have checked in Colab and was able to detect the GPU. Please refer to the attached gist. |
Use double quotes around tensorflow[and-cuda] in the pip install command because of zsh glob expansion. |
Got it to work :) First go to https://developer.nvidia.com/rdp/cudnn-archive?source=post_page-----bfbeb77e7c89-------------------------------- then download the Local Installer for Ubuntu22.04 x86_64 (Deb), unpack it, and install libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb
|
Hi Krzysztof,
I visited the site
https://developer.nvidia.com/rdp/cudnn-archive?source=post_page-----bfbeb77e7c89--------------------------------
where I found an entry listed as "Local Installer for Ubuntu22.04 x86_64 (Deb)", which I downloaded. Unfortunately, what I got is a package named "cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb", which is not the same as the name you suggest in your message, "libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb". I assume what you meant is to get libcudnn8_8.9.7.29*amd64.deb and cuda12.2_amd64.deb separately and install both.
I have CUDA 12.4. I will not go back to trying to make TF 2.16.1 work with older versions of CUDA (12.2 or 12.3), because sooner or later the TF team will have to produce a version built against an updated version of CUDA. IMHO, rather than us wasting time going back in versions, the TF team should invest time going forward to update TF to the current CUDA version.
Thank you, Juan
…On Mon, Mar 11, 2024 at 5:30 AM Krzysztof Radzikowski wrote:

got it work :) first
https://developer.nvidia.com/rdp/cudnn-archive?source=post_page-----bfbeb77e7c89--------------------------------
then download Local Installer for Ubuntu22.04 x86_64 (Deb)
<https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.7/local_installers/12.x/cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb/>
unpack and install libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb

```
sudo dpkg -i libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb
Selecting previously unselected package libcudnn8.
(Reading database ... 47318 files and directories currently installed.)
Preparing to unpack libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb ...
Unpacking libcudnn8 (8.9.7.29-1+cuda12.2) ...
Setting up libcudnn8 (8.9.7.29-1+cuda12.2) ...
```

```
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-03-11 10:27:47.879686: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-11 10:27:47.909157: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-11 10:27:48.316717: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-11 10:27:48.664469: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```
|
It's just that TensorFlow can't see the CUDA libraries.
Install tensorflow[and-cuda] and add this to your .bashrc or conda activation script; adjust the Python version in it according to your setup.
NVIDIA_PACKAGE_DIR="$CONDA_PREFIX/lib/python3.12/site-packages/nvidia"
for dir in $NVIDIA_PACKAGE_DIR/*; do
if [ -d "$dir/lib" ]; then
export LD_LIBRARY_PATH="$dir/lib:$LD_LIBRARY_PATH"
fi
done
You won't need to install CUDA or cuDNN on the system; the CUDA libraries installed with $ pip install tensorflow[and-cuda] are enough.
|
Will try that and will let you know. Thank you for the suggestion. Juan
|
Hi Shayan Shahrokhi,
Thank you for your suggestion (adding the location of the site-packages). I hope you don't mind if I ask: I saw that Python 3.12 appears in your suggestion. Is that the version of Python you used to test TF 2.16.1 compatibility with CUDA?
Thank you, Juan
|
It's the Python version of the environment where I installed tensorflow[and-cuda].
|
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ — you can get the .deb file there directly |
Thanks @sh-shahrokhi. I thought it was path-related. I modified it slightly to make it Python-version independent if you put it in your conda environment activation (
This is not a resolution, as this post-install step should not be necessary.
I can't seem to do similar tricks to resolve the TensorRT issues when TensorRT is installed similarly into the conda environment. Any ideas? |
I don't actually use TensorRT, but I would check whether the required .so file for it is visible to TensorFlow; you may need to find the name of the required file in the TensorFlow source code. This doesn't change the fact that new TensorFlow versions should be tested by the Google team before release, or the bugs should be fixed. It seems they only care about having a working Docker image, not anything else. |
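The "check whether the required .so is visible" suggestion can be done from Python with the standard library. A sketch; the candidate names (libnvinfer is TensorRT's runtime library, and the .so.8 soname) are assumptions, so adjust them for your TensorRT version:

```python
# Sketch: ask dlopen (which honors LD_LIBRARY_PATH) whether a TensorRT runtime
# library can be loaded. A False result matches the
# "TF-TRT Warning: Could not find TensorRT" symptom seen in the logs above.
import ctypes

def tensorrt_visible(names=("libnvinfer.so.8", "libnvinfer.so")):
    """True if any candidate TensorRT runtime library name can be dlopen'd."""
    for name in names:
        try:
            ctypes.CDLL(name)
            return True
        except OSError:
            continue
    return False
```

Because this goes through dlopen, it tests the same lookup path TensorFlow uses, unlike ctypes.util.find_library, which consults the ldconfig cache and can miss directories added only via LD_LIBRARY_PATH.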
I have given up on TensorRT. I guess I won't be using it either.
Agreed. Installing TF has always been hit or miss, and it seems that in the many years since I last used TF that hasn't changed one bit. |
Well, I wasted 8 hours of my Sunday on this, setting up another PC from scratch, before reverting to the old version. Now looking to move off TensorFlow. |
In general, we used to test RC versions before release. For example, we had RC0, RC1, and RC2 for TF 2.9. This gave people and downstream teams enough time to test and report issues. It seems that 2.16.1 only had an RC0 (for 2.16.0). The release process is (was?) like this:
Overall, this process would take However, for the 2.16 release, although the branch was cut on Feb 8th, there has been only one RC. Most likely the issues can be solved by a patch release |
I am closing this (unresolved) issue because I am told by the Keras/TF team that the issue is related to TF. |
@MrOxMasTer the purpose of installing WSL 2 is to install TensorFlow inside it; as a process, it is separate from installing the CUDA Toolkit and cuDNN on Windows. Generally, installing WSL 2 allows you to run a full Linux environment within Windows, making it easier to develop and run applications that rely on Linux-based tools and libraries, such as TensorFlow version |
@sgkouzias Thank you. The solution from @niko247 worked and it is what I am using |
I also have to guess what the installation problem is: this is the most confusing setup I've ever seen. It is not clear where each command needs to be entered, or why. If all commands need to be entered in WSL2, why does your suggestion include a command that is entered only on the side where the graphics driver is installed? NVIDIA clearly states that the graphics driver does not need to be installed in WSL, yet you offer me a command that works only on the Windows side while saying that the entire installation takes place inside WSL. How am I supposed to know what to install inside WSL? From the point of view of someone using this functionality for the first time, it sounds like nonsense: I don't understand how to use it, and naturally I will try to run things in Windows, because I work in Windows. Even after installing it, how would I know that, for example, VS Code has a "WSL" extension that allows you to connect to WSL? |
@MrOxMasTer you can do every installation (mostly Python packages, via "pip install" or "sudo apt install") in WSL Ubuntu just as if you were using a PC running Ubuntu; the exception is the CUDA and cuDNN installation, which targets Windows. |
@MrOxMasTer since you work in Windows, you could simply refer to the official TensorFlow documentation on installing TensorFlow with pip for Windows WSL2 (aka Windows Subsystem for Linux) and open the provided link to the official CUDA on WSL User Guide. Notice that the CUDA on WSL User Guide clearly states: "Once a Windows NVIDIA GPU driver is installed on the system, CUDA becomes available within WSL 2. The CUDA driver installed on Windows host will be stubbed inside the WSL 2 as libcuda.so, therefore users must not install any NVIDIA GPU Linux driver within WSL 2...." Also kindly note that the current issue, "TF 2.16.1 Fails to work with GPUs", involves Linux operating systems and, potentially, additional steps to be specified in the official TensorFlow documentation in order to utilize GPUs locally. To date, the officially documented TensorFlow standard installation procedure for Linux users with GPUs does not include the additional steps required to perform deep learning experiments with TensorFlow version Hope that the next patch version of TensorFlow will fix the bug as soon as possible! |
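The quoted rule — the Windows driver is stubbed into WSL 2 as libcuda.so, and no Linux GPU driver may be installed inside WSL — can be turned into a quick sanity check. A sketch; the stub location /usr/lib/wsl/lib and the message wording are assumptions for illustration:

```python
# Sketch: classify which libcuda.so copies are present inside WSL 2. Per the
# CUDA on WSL User Guide, only the Windows-provided stub should exist; the
# stub path used here (/usr/lib/wsl/lib) is an assumption for illustration.
WSL_STUB = "/usr/lib/wsl/lib/libcuda.so"

def classify_libcuda(present_paths):
    """Given libcuda.so paths found on the system, describe the setup."""
    paths = set(present_paths)
    if paths == {WSL_STUB}:
        return "ok: only the WSL 2 stub is present"
    if WSL_STUB in paths:
        return "warning: a Linux GPU driver libcuda.so is installed alongside the WSL 2 stub"
    return "no WSL 2 stub found: not under WSL 2, or no Windows NVIDIA driver installed"
```

The "warning" case corresponds to the mistake the guide warns about: installing a Linux NVIDIA driver inside WSL 2 on top of the stub.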
Yes, that is exactly my point: you do not need to install the graphics driver in WSL, but installing TensorFlow involves not only the graphics driver but also cuDNN and the CUDA Toolkit (possibly TensorRT), and it was not clear to me where exactly those needed to be installed. Then I saw that everything needs to be installed inside WSL except the graphics driver. |
I started a not very pleasant acquaintance with TensorFlow on this version. As I understand it, the specific cause is 2.16.1, and it does not work in WSL; nothing worked for me. So the question is which version can be installed so that it works normally in WSL. Also, for the future: installing Anaconda does not help either; you can install at most version 2.10 through it. |
@MrOxMasTer I totally understand your frustration but I reassure you that TensorFlow version 2.16.1 can actually work with your CUDA-enabled GPU. You can try the following. First, create a fresh conda virtual environment in WSL and activate it, like this:
conda create --name tf python=3.11
conda activate tf
Within the fresh `tf` environment created in the previous step, run the following commands sequentially:

pip install --upgrade pip
pip install tensorflow[and-cuda]
Then set environment variables. Note: this step is required in order to utilize your GPU but is not yet included in the official TensorFlow documentation. All NVIDIA libs are installed along with TensorFlow because you ran `pip install tensorflow[and-cuda]` in the previous step!

Locate the directory for the conda environment in your terminal window by running in the terminal:

echo $CONDA_PREFIX

Enter that directory and create these subdirectories and files:

cd $CONDA_PREFIX
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh

Edit ./etc/conda/activate.d/env_vars.sh as follows:

#!/bin/sh
# Store original LD_LIBRARY_PATH
export ORIGINAL_LD_LIBRARY_PATH="${LD_LIBRARY_PATH}"
# Get the CUDNN directory
CUDNN_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn; print(nvidia.cudnn.__file__)")))
# Set LD_LIBRARY_PATH to include CUDNN directory
export LD_LIBRARY_PATH=$(find ${CUDNN_DIR}/*/lib/ -type d -printf "%p:")${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# Get the ptxas directory
PTXAS_DIR=$(dirname $(dirname $(python -c "import nvidia.cuda_nvcc; print(nvidia.cuda_nvcc.__file__)")))
# Set PATH to include the directory containing ptxas
export PATH=$(find ${PTXAS_DIR}/*/bin/ -type d -printf "%p:")${PATH:+:${PATH}}

Edit ./etc/conda/deactivate.d/env_vars.sh as follows:

#!/bin/sh
# Restore original LD_LIBRARY_PATH
export LD_LIBRARY_PATH="${ORIGINAL_LD_LIBRARY_PATH}"
# Unset environment variables
unset CUDNN_DIR
unset PTXAS_DIR Verify the GPU setup: Additionally, as I was informed the next version of TensorFlow will hopefully arrive within the next days! I hope it helps! |
Thanks, but it doesn't work for me @sgkouzias. It does at least find some files. However, despite the GPU-setup verification saying that 1 GPU is available, there is no GPU activity... everything runs on the CPU. |
@GorillaDaddy well, in order to work, your setup should meet certain technical requirements (please first check the official TensorFlow documentation). As I am not aware of your setup I could not possibly guess why your GPU is not being properly utilized (if that is indeed the case). However, here are some hints that I hope will help you: |
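One generic first check (a sketch, not taken from the hints above) is to verify that every directory on `LD_LIBRARY_PATH` actually exists; a stale or misspelled entry may cause TensorFlow to find some libraries yet silently fall back to the CPU:

```python
import os

def check_ld_library_path(value=None):
    """Return (entry, exists) pairs for each LD_LIBRARY_PATH component."""
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    return [(p, os.path.isdir(p)) for p in value.split(":") if p]

for entry, ok in check_ld_library_path():
    print(entry, "->", "exists" if ok else "MISSING")
```

Any `MISSING` line points at a path that the activation scripts (or a manual export) set incorrectly.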
Hooray, it worked, thanks! But I get an error about NUMA. Is this normal? Could it be because I did not install CUDA and cuDNN as administrator? |
@MrOxMasTer congratulations and thanks for the feedback. The error "Your kernel may have been built without NUMA support" refers to the lack of NUMA (Non-Uniform Memory Access) support in the kernel you are using. NUMA is a memory architecture used in multiprocessor systems where memory access time depends on the memory location relative to the processor. NUMA support is important for optimizing memory access on systems with multiple CPUs or GPUs: it allows the operating system to allocate memory and schedule processes in a way that reduces memory access latency.

The Windows Subsystem for Linux (aka WSL) provides a Linux-compatible kernel interface developed by Microsoft and allows you to run Linux binaries on Windows. However, WSL's kernel might lack certain features present in a full-fledged Linux kernel, including NUMA support. The lack of NUMA support might lead to suboptimal performance on systems with multiple processors or GPUs because memory allocation might not be as efficient. Consequently you can safely ignore the warning (you can read more about it in this discussion on the NVIDIA developer forums). |
I'm also facing this issue, same as the OP, since I upgraded to 2.16.1. After downgrading to 2.15.1 everything runs smoothly. TensorFlow version OS platform and distribution Python version CUDA/cuDNN version Actually I want to import the ops package from keras, but it seems it is first available in Keras 3. If I upgrade Keras I also have to upgrade TensorFlow due to incompatibilities... but after the upgrade I'm not able to use the GPU anymore. |
@rednag as I understand it you have two available options: 1) keep TensorFlow version 2.15.1, or 2) upgrade TensorFlow to version 2.16.1 and set the required environment variables manually.

For option 2 you can try the following. Create a fresh conda virtual environment in WSL and activate it, like this:
conda create --name tf python=3.11
conda activate tf
Locate the directory for the conda environment in your terminal window by running in the terminal:

echo $CONDA_PREFIX

Enter that directory and create these subdirectories and files:

cd $CONDA_PREFIX
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh

Edit ./etc/conda/activate.d/env_vars.sh as follows:

#!/bin/sh
# Store original LD_LIBRARY_PATH
export ORIGINAL_LD_LIBRARY_PATH="${LD_LIBRARY_PATH}"
# Get the CUDNN directory
CUDNN_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn; print(nvidia.cudnn.__file__)")))
# Set LD_LIBRARY_PATH to include CUDNN directory
export LD_LIBRARY_PATH=$(find ${CUDNN_DIR}/*/lib/ -type d -printf "%p:")${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# Get the ptxas directory
PTXAS_DIR=$(dirname $(dirname $(python -c "import nvidia.cuda_nvcc; print(nvidia.cuda_nvcc.__file__)")))
# Set PATH to include the directory containing ptxas
export PATH=$(find ${PTXAS_DIR}/*/bin/ -type d -printf "%p:")${PATH:+:${PATH}}

Edit ./etc/conda/deactivate.d/env_vars.sh as follows:

#!/bin/sh
# Restore original LD_LIBRARY_PATH
export LD_LIBRARY_PATH="${ORIGINAL_LD_LIBRARY_PATH}"
# Unset environment variables
unset CUDNN_DIR
unset PTXAS_DIR
I have submitted the respective pull request to update the official TensorFlow installation guide; it is currently pending review. Additionally, as I was informed, the next version of TensorFlow will hopefully arrive within the next days! I hope it helps! |
Thank you for the fast reply. At the moment I'm using the old functions from the keras.src.utils and tf packages, but I'm looking forward to the new release. |
@rednag great. Another option to consider for fast model training with Keras 3 and GPU acceleration is to use JAX as the Keras backend. |
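Switching Keras 3 to the JAX backend is a one-line environment change, as long as it happens before `keras` is imported. A minimal sketch (it assumes `jax` is installed in the environment, so the actual Keras import is left commented out):

```python
import os

# Keras 3 reads KERAS_BACKEND at import time; documented values include
# "tensorflow", "jax", and "torch".
os.environ["KERAS_BACKEND"] = "jax"

# import keras                    # must come *after* the variable is set
# print(keras.backend.backend())  # would then report "jax"
print(os.environ["KERAS_BACKEND"])
```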
Can someone care to explain why TF 2.16.1 fails to work with GPUs? I.e. what is the problem, and why is it not being addressed by the community? |
Google removed the native Windows CUDA build starting with TF 2.11. |
Everyone who cared about full support of TF is no longer on the team. See the comments above for more details and differences. |
Unfortunately that doesn't explain anything. I don't see how you can "remove" any of that, apart from breaking the build scripts; whatever is "removed" must still be present for all the other *nix builds. WSL is not that different from MSYS or MinGW, which in turn are no longer that far from VS C/C++ builds. |
Doesn't work for me :/ I even completely reinstalled WSL, but I still get an empty list when showing the available devices... Should CUDA be uninstalled on the Windows side? When I use "nvidia-smi", it says I have CUDA Version 12.5, even though I didn't install anything in WSL... Is that normal?

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01 Driver Version: 555.99 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+ |
@ben-jy frankly I have no clue. Did you check the official documentation? Does your setup meet the technical requirements? What's the Python version in WSL2? Is it compatible with TensorFlow 2.16.1? What's the name of your NVIDIA GPU? The output of the command nvidia-smi in WSL2 seems normal since your GPU driver is installed in Windows. However, you could try reinstalling everything (compatible GPU driver, afterwards WSL2, and then TensorFlow)... |
Official documentation has not been updated for TF 2.16 and still refers to CUDA 11.8, not 12.
|
@sgkouzias I checked the official documentation, but I find it not very clear and a bit contradictory: the software requirements state that CUDA and cuDNN should be installed on the machine, but the pip package should install them automatically with TensorFlow, right? Besides, this medium tutorial explains that CUDA should be installed neither on the Windows side nor on the WSL side, but via the pip package. Maybe I should try to uninstall everything CUDA-related on Windows...

I will try a clean reinstall of my GPU driver, as well as uninstalling CUDA on the Windows side. If that doesn't work, I think it is better to install CUDA and cuDNN manually, along with an older TensorFlow version. It is still a shame that the official documentation of such a large and important library is so unclear. |
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
binary
TensorFlow version
TF 2.16.1
Custom code
No
OS platform and distribution
Linux Ubuntu 22.04.4 LTS
Mobile device
No response
Python version
3.10.12
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
12.4
GPU model and memory
No response
Current behavior?
I created a python venv in which I installed TF 2.16.1 following your instructions: pip install tensorflow
When I run python, import tf, and issue tf.config.list_physical_devices('GPU')
I get an empty list [ ]
I created another python venv, installed TF 2.16.1, only this time with the instructions:
python3 -m pip install tensorflow[and-cuda]
When I run that version, import tensorflow as tf, and issue
tf.config.list_physical_devices('GPU')
I also get an empty list.
BTW, I have no problems running on my box TF 2.15.1 with GPUs. Julia also works just fine with GPUs and so does PyTorch.
Standalone code to reproduce the issue
Relevant log output
No response