tensorflow gpu #65035

Closed
soheil-asgari opened this issue Apr 4, 2024 · 11 comments
Labels
stale, stat:awaiting response, subtype: ubuntu/linux, TF 2.16, type:build/install

Comments

@soheil-asgari

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.16

Custom code

Yes

OS platform and distribution

Ubuntu

Mobile device

No response

Python version

3.11

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

NVIDIA GTX 1650

Current behavior?

I followed all of the installation steps one by one, but I still can't use the GPU, and this is really bothering me.

Standalone code to reproduce the issue

2024-04-04 10:56:25.510348: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]

Relevant log output

2024-04-04 10:56:25.510348: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
@sushreebarsa
Contributor

@soheil-asgari Could you verify that you have a compatible NVIDIA GPU by running nvidia-smi in your terminal? If there is no output, you might not have an NVIDIA GPU, or the drivers might not be installed.
Thank you!
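
For reference, TensorFlow itself also reports what it can see; a minimal Python check (a sketch, run in the same environment that produced the error above):

import tensorflow as tf

# Confirm whether this TensorFlow build was compiled with CUDA support
# and whether any GPU device is visible to it.
print("TF version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))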

@sushreebarsa sushreebarsa added the stat:awaiting response, type:build/install, subtype: ubuntu/linux, and TF 2.16 labels and removed the type:bug label on Apr 5, 2024
@Retalak

Retalak commented Apr 5, 2024

It is horrendous how hard it is to get this working on Windows.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response label on Apr 5, 2024
@alexanderbeatson

> It is horrendous how hard it is to get this working on Windows.

Hello @Retalak
Please note that TensorFlow 2.10 was the last version to support GPU natively on Windows. If you want to use TensorFlow with GPU support on Windows, you need to install TensorFlow inside WSL2.
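
For reference, a quick way to confirm from Python that the interpreter really is running inside WSL2 rather than on native Windows (a minimal sketch; it assumes the usual "microsoft" marker in the WSL kernel release string):

import platform

# WSL kernels normally report "microsoft" in the kernel release string,
# e.g. "5.15.133.1-microsoft-standard-WSL2".
release = platform.uname().release.lower()
if "microsoft" in release:
    print("Running inside WSL:", release)
else:
    print("Not running inside WSL (kernel: " + release + ")")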

@alexanderbeatson

I had a similar issue with TensorFlow GPU, but installing from this wheel solved it.

Make sure nvidia-smi runs properly and that your CUDA version is compatible with TensorFlow 2.15 (the default 2.16 version is not yet supported).

You can also search for a compatible wheel file here!
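
One way to see which CUDA/cuDNN versions an installed TensorFlow wheel expects, before hunting for a compatible wheel, is the build-info API (a minimal sketch):

import tensorflow as tf

# Print the CUDA/cuDNN versions this TensorFlow wheel was built against,
# so they can be matched to what nvidia-smi reports.
info = tf.sysconfig.get_build_info()
print("TF version:", tf.__version__)
print("CUDA build:", info.get("cuda_version"))
print("cuDNN build:", info.get("cudnn_version"))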

@zzj0402

zzj0402 commented Apr 8, 2024

Python 3.9.19 (main, Mar 21 2024, 17:11:28) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
NVIDIA-SMI 550.54.10              Driver Version: 551.61         CUDA Version: 12.4
Skipping registering GPU devices...
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 6427527073389244107
xla_global_id: -1
]
TF Version
2.16.1

On WSL2, Windows 11 23H2, RTX 4090 laptop version. I followed pip install tensorflow[and-cuda] with no luck.
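
When pip install tensorflow[and-cuda] still leaves only the CPU device, one thing worth checking is whether the NVIDIA wheels actually landed in site-packages and where their shared libraries live (a minimal sketch; the */lib/*.so* glob is an assumption based on the layout used in the .bashrc workaround below):

import glob
import os

# Locate the CUDA/cuDNN libraries installed by the tensorflow[and-cuda] extras.
try:
    import nvidia.cudnn
except ImportError:
    print("nvidia.cudnn is not importable - the [and-cuda] extras may be missing")
else:
    nvidia_dir = os.path.dirname(os.path.dirname(nvidia.cudnn.__file__))
    print("NVIDIA wheels under:", nvidia_dir)
    for lib in glob.glob(os.path.join(nvidia_dir, "*", "lib", "*.so*")):
        print(" ", lib)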

@Retalak

Retalak commented Apr 8, 2024

I eventually got it working. Part of the problem with the documentation, IMO, is that it does not clearly state what is required on the host versus the guest machine for WSL2. I have the CUDA Toolkit 12.4 installed on my Windows host; I'm not sure whether that is necessary, but I'm not messing with it now that I have it working. I did not install anything on the guest before running the above install command; it installed the needed versions of the toolkit and cuDNN (I also did a fresh install of my WSL2 guest - Ubuntu LTS 22.04.3 - to clear anything I had done previously).

Another issue is that there are a lot of error messages that are considered "normal" and may throw off new users. Here is the test command result on my WSL2 guest with it working (I think):

retalak@**-********:~$ python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-04-08 16:07:56.717457: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-08 16:07:57.579692: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-08 16:07:58.688620: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-04-08 16:07:58.852701: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-04-08 16:07:58.852827: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

I also had to add this to the end of ~/.bashrc on my WSL2 guest (replace USER_NAME, and make sure to exit the WSL2 terminal and start a new one after saving, before running the test command):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/USER_NAME/.local/bin;
export NVIDIA_DIR=$(dirname $(dirname $(python3 -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)")))
export LD_LIBRARY_PATH=$(echo ${NVIDIA_DIR}/*/lib/ | sed -r 's/\s+/:/g')${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note that I am not using miniconda, so this may differ if you are trying to do this in a miniconda environment.
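
As a small follow-up check, assuming the exports above have been applied in a fresh shell, you can confirm that the cuDNN libraries are actually reachable through LD_LIBRARY_PATH before importing TensorFlow (a sketch, not an official verification step):

import os

# List LD_LIBRARY_PATH entries that contain libcudnn*, to confirm the
# ~/.bashrc exports above took effect in the current shell.
found = False
for entry in os.environ.get("LD_LIBRARY_PATH", "").split(":"):
    if entry and os.path.isdir(entry):
        cudnn = sorted(f for f in os.listdir(entry) if f.startswith("libcudnn"))
        if cudnn:
            found = True
            print(entry, "->", ", ".join(cudnn))
if not found:
    print("No libcudnn found on LD_LIBRARY_PATH - re-check the exports above")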

@sushreebarsa
Contributor

sushreebarsa commented Apr 10, 2024

@soheil-asgari Could you please let us know whether there is any update on this issue?
Thank you!

@sushreebarsa sushreebarsa added the stat:awaiting response label on Apr 10, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label on Apr 18, 2024
@svkgn4DL

This is still an issue. The documentation does not state that the environment variables required for GPU support need to be set, as @Retalak mentioned. If I remember correctly, TensorFlow used to document this in earlier versions.

@google-ml-butler google-ml-butler bot removed the stale and stat:awaiting response labels on Apr 19, 2024
@sushreebarsa sushreebarsa added the stat:awaiting response and stale labels on Apr 22, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.
