
Tensorflow WSL GPU CUDA recognition issue RTX3090 #63948

Closed
jeanswiegers opened this issue Mar 19, 2024 · 9 comments
Assignees
Labels
TF 2.16 type:build/install Build and install issues wsl2 Windows Subsystem for Linux

Comments

@jeanswiegers

jeanswiegers commented Mar 19, 2024

tf_env.txt

Issue type

Build/Install

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.16.1

Custom code

No

OS platform and distribution

Linux Ubuntu 20.04.6

Mobile device

No response

Python version

3.11.7

Bazel version

No response

GCC/compiler version

11.2.0

CUDA/cuDNN version

8.6

GPU model and memory

NVIDIA GeForce RTX 3090

Current behavior?

I installed TensorFlow following the GPU install guide for WSL, as follows:

wsl --install
sudo apt update
sudo apt upgrade
sudo apt install python3-pip
cd Desktop
# Download the Anaconda Linux installer from
# https://www.anaconda.com/download#downloads
bash Anaconda3-2024.02-1-Linux-x86_64.sh
~/anaconda3/bin/conda init bash
~/anaconda3/bin/conda init zsh
conda create --name tf-gpu python==3.11
conda activate tf-gpu
pip install --upgrade pip
pip install tensorflow[and-cuda]

PyTorch works perfectly in WSL with CUDA enabled, but TensorFlow does not recognize my GPU.

nvidia-smi output:

Tue Mar 19 11:19:07 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.60.01              Driver Version: 551.76         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:01:00.0  On |                  N/A |
| 30%   54C    P0            121W /  350W |    1642MiB /  24576MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

nvcc --version output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))" output:

2024-03-19 11:29:26.228449: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-19 11:29:26.250017: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-19 11:29:26.585563: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-19 11:29:26.832574: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-19 11:29:26.844141: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
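The "Cannot dlopen some GPU libraries" warning means TensorFlow failed to load one or more CUDA shared libraries at runtime. A minimal sketch to check which libraries are loadable from the current environment (the sonames below are assumptions based on typical CUDA 11.x/12.x installs, not taken from this issue; adjust them to the libraries named in TensorFlow's warning):

```python
import ctypes

def can_dlopen(soname: str) -> bool:
    """Return True if the shared library can be loaded, False otherwise."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False

# Sonames below are illustrative assumptions; substitute the ones
# that TensorFlow's warning says it could not load.
for lib in ["libcudart.so.11.0", "libcudart.so.12", "libcudnn.so.8", "libcudnn.so.9"]:
    print(f"{lib}: {'OK' if can_dlopen(lib) else 'NOT FOUND'}")
```

If a library reports NOT FOUND here but exists on disk, the directory holding it is likely missing from LD_LIBRARY_PATH.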

/usr/local/cuda-11.8/extras/demo_suite/deviceQuery output:

/usr/local/cuda-11.8/extras/demo_suite/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3090"
  CUDA Driver Version / Runtime Version          12.4 / 11.8
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 24576 MBytes (25769279488 bytes)
  (82) Multiprocessors, (128) CUDA Cores/MP:     10496 CUDA Cores
  GPU Max Clock rate:                            1695 MHz (1.70 GHz)
  Memory Clock rate:                             9751 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 6291456 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.4, CUDA Runtime Version = 11.8, NumDevs = 1, Device0 = NVIDIA GeForce RTX 3090
Result = PASS

ENVs set:

export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
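Note that LD_LIBRARY_PATH above points at the CUDA 11.8 tree, while the TF 2.16 wheel links against CUDA 12.x libraries. A quick sketch (pure Python, no TensorFlow needed) to spot which CUDA versions LD_LIBRARY_PATH actually exposes:

```python
import os
import re

def cuda_versions_on_ld_path(ld_library_path: str) -> list:
    """Extract CUDA version numbers from path entries like /usr/local/cuda-11.8/lib64."""
    versions = []
    for entry in ld_library_path.split(":"):
        m = re.search(r"cuda-(\d+\.\d+)", entry)
        if m:
            versions.append(m.group(1))
    return versions

print(cuda_versions_on_ld_path(os.environ.get("LD_LIBRARY_PATH", "")))
```

With the export above, this would report only 11.8, which does not match the CUDA 12.x runtime that TensorFlow 2.16 expects.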

Standalone code to reproduce the issue

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Relevant log output

2024-03-19 11:29:26.228449: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-19 11:29:26.250017: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-19 11:29:26.585563: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-19 11:29:26.832574: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-19 11:29:26.844141: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
@google-ml-butler google-ml-butler bot added type:build/install Build and install issues type:support Support issues labels Mar 19, 2024
@jeanswiegers jeanswiegers changed the title Tensorflow WSL GPU CUDA rcognition issue RTX3090 Tensorflow WSL GPU CUDA recognition issue RTX3090 Mar 19, 2024
@sushreebarsa sushreebarsa added wsl2 Windows Subsystem for Linux TF 2.16 and removed type:support Support issues labels Mar 20, 2024
@sushreebarsa
Contributor

@jeanswiegers If TF is not recognizing the GPU, could you please verify the build compatibility by running the following in your WSL environment:

import tensorflow as tf
print(tf.test.is_built_with_cuda())

This should output True. If it's False, you might need to reinstall TensorFlow with GPU support. For more information on WSL with GPU support please refer to https://www.tensorflow.org/install/pip. The TensorFlow version needs to be compatible with your CUDA version.

Thank you!
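Note that tf.test.is_built_with_cuda() only says the wheel was built with CUDA support; it does not check whether a matching CUDA runtime is actually present. A hedged sketch of the version check this comment suggests, using a sample dict shaped like what tf.sysconfig.get_build_info() returns (the sample values are illustrative assumptions, not verified against the 2.16.1 wheel):

```python
# tf.sysconfig.get_build_info() returns a dict with keys such as
# "cuda_version" and "cudnn_version" for GPU-enabled wheels.
# The values below are illustrative assumptions for this sketch.
build_info = {"cuda_version": "12.3", "cudnn_version": "8"}

installed_cuda = "11.8"  # e.g. what `nvcc --version` reports in this issue

wheel_major = build_info["cuda_version"].split(".")[0]
installed_major = installed_cuda.split(".")[0]

if wheel_major != installed_major:
    print(f"Mismatch: wheel built against CUDA {build_info['cuda_version']}, "
          f"but CUDA {installed_cuda} is on the library path.")
```

A major-version mismatch like this (11 vs. 12) is consistent with the dlopen failures in the log above.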

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label Mar 20, 2024
@jeanswiegers
Author


Hi Sushreebarsa, thanks for your help. It does return True in my environment.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Mar 20, 2024
@sushreebarsa
Contributor

@jeanswiegers Thank you for your quick response!
Simply restarting your WSL instance (wsl --shutdown) or your entire computer can resolve environment-variable issues. Could you please try that first? If the issue continues, please run nvidia-smi within your WSL terminal to confirm your RTX 3090 is recognized by the NVIDIA drivers. If it is not, there might be an issue with the driver installation itself.
Thank you!

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label Mar 20, 2024
@jeanswiegers
Author

I have restarted the PC and Ubuntu multiple times.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.60.01              Driver Version: 551.76         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:01:00.0  On |                  N/A |
| 30%   54C    P0            121W /  350W |    1642MiB /  24576MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Mar 20, 2024
@jeanswiegers
Author

I managed to get it working by installing the latest supported CUDA version (12.3) via the Ubuntu runfile, as stated on TensorFlow's website.

Running only
pip install tensorflow[and-cuda]
does not work on its own.
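The [and-cuda] extra installs the CUDA runtime components as nvidia-* pip wheels inside the environment. A quick sketch to list which of those wheels (if any) actually landed in the active environment, which can help when diagnosing why a system-wide CUDA install was still needed:

```python
from importlib.metadata import distributions

def filter_nvidia(names):
    """Keep only distribution names that look like NVIDIA CUDA wheels."""
    return sorted(n for n in names if n and n.lower().startswith("nvidia-"))

installed = [d.metadata["Name"] for d in distributions()]
print(filter_nvidia(installed))
```

If this prints an empty list inside the tf-gpu environment, the CUDA wheels were never installed and TensorFlow can only fall back on whatever CUDA it finds on the system.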

@jeanswiegers
Author

Working


@sushreebarsa
Contributor

@jeanswiegers Glad it worked fine for you.
Thank you!

@chaudharyachint08

Almost final and automated fix below
