
Add support for CUDA 11.1 on Windows 10 for the 8.6 compute capability #44750

Closed
dobromyslova opened this issue Nov 11, 2020 · 16 comments

Labels: subtype:windows Windows Build/Installation Issues · type:build/install Build and install issues · type:feature Feature requests

@dobromyslova

System information

  • System: Windows 10

  • TensorFlow version (you are using): 2.5.0.dev20201108

  • Are you willing to contribute it (Yes/No): Yes. I can do the testing of the new build and provide any additional information.

  • Python version: 3.8.6-amd64

  • Compiler: MSVC 2019

  • cuDNN: 8.0.4.30

  • CUDA: 11.0.3_451.82

  • ptxas: from CUDA 11.1

  • NVIDIA Drivers: 456.71

Hello TensorFlow team. I recently got TensorFlow working on my RTX 3090 on a Windows 10 machine and decided to share my findings with you.

Right now I have TensorFlow 2.5.0.dev20201108 working with CUDA 11.0 plus ptxas.exe (the PTX compiler) taken from CUDA 11.1. The ptxas from 11.1 supports the 8.6 compute capability, while with 11.0 I get the error ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name', which I take to mean that 11.0 doesn't support 8.6 yet.
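
For reference, here is a quick way to confirm which compute capability TensorFlow reports for the card (just a sketch, assuming a build recent enough that tf.config.experimental.get_device_details is available):

import tensorflow as tf

# List the GPUs TensorFlow can see.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs:", gpus)

# get_device_details reports the compute capability as a (major, minor)
# tuple, e.g. (8, 6) for an RTX 3090.
for gpu in gpus:
    details = tf.config.experimental.get_device_details(gpu)
    print(gpu.name, "compute capability:", details.get("compute_capability"))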

I don't know if it's already planned, but it would be great to add support for CUDA 11.1 to the Windows build (I don't know whether people are having the same issue on Linux). My current solution is kind of hacky (using the compiler from a different version), and even though I don't see any errors right now, it could potentially cause some in the future.

Here is also a detailed write-up of my findings: https://dobromyslova.medium.com/making-work-tensorflow-with-nvidia-rtx-3090-on-windows-10-7a38e8e582bf
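
In short, the hack is making the ptxas.exe from CUDA 11.1 the one that gets used while keeping the rest of CUDA 11.0. A rough sketch of that swap (the install paths below are the default Windows locations and may differ on your machine, so adjust them and keep a backup of the original file):

# Sketch of the workaround: drop the CUDA 11.1 ptxas.exe into the CUDA 11.0
# bin directory so it becomes the compiler TensorFlow picks up. Paths are
# assumptions based on the default install locations; adjust as needed.
import shutil
from pathlib import Path

cuda_110_bin = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin")
cuda_111_bin = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin")

shutil.copy2(cuda_110_bin / "ptxas.exe", cuda_110_bin / "ptxas.exe.bak")  # keep a backup
shutil.copy2(cuda_111_bin / "ptxas.exe", cuda_110_bin / "ptxas.exe")      # use the 11.1 compiler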

Please feel free to contact me for any additional information.

@dobromyslova dobromyslova added the type:feature Feature requests label Nov 11, 2020
@Saduf2019 Saduf2019 added the comp:gpu GPU related issues label Nov 11, 2020
@Saduf2019 Saduf2019 assigned ymodak and unassigned Saduf2019 Nov 11, 2020
@ymodak ymodak added type:build/install Build and install issues and removed comp:gpu GPU related issues labels Nov 11, 2020
@ymodak ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 11, 2020
@mihaimaruseac
Collaborator

@sanjoy @pkanwar23 can you take it from here?

@ymodak ymodak removed their assignment Nov 11, 2020
@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 13, 2020
@WeiChihChern

I've been using tensorflow/tensorflow:nightly-gpu-jupyter, and even the latest nightly images (2.5.0-dev20201112) still have the issue with the RTX 3090. Any insights?

@dobromyslova
Author

dobromyslova commented Nov 13, 2020

I've been using tensorflow/tensorflow:nightly-gpu-jupyter, and even the latest nightly images (2.5.0-dev20201112) still have the issue with the RTX 3090. Any insights?

I had some performance issues (training was too slow) until I applied a temporary fix for the PTX compilation. You can read more in the article I posted in the description: https://dobromyslova.medium.com/making-work-tensorflow-with-nvidia-rtx-3090-on-windows-10-7a38e8e582bf

@WeiChihChern

@dobromyslova Thanks for the info. I just tried the images provided by Nvidia, which work with my RTX 3090, and the performance did improve. If you use Docker, definitely check them out.

@dobromyslova
Author

Thank you for the information, I will try this as well

@sanjoy
Contributor

sanjoy commented Dec 1, 2020

The ptxas from 11.1 supports the 8.6 compute capability, while with 11.0 I get the error ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name', which I take to mean that 11.0 doesn't support 8.6 yet.

Is this when building TensorFlow from source?

If so, you can try explicitly selecting compute capability 8.0 (rather than 8.6) when ./configure asks about it. TF built for CC 8.0 should still work on a 3090, though perhaps not at peak performance, and CUDA 11.0 does support 8.0.
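
That answer can also be pre-seeded instead of typed at the prompt. A rough sketch (the checkout path is a placeholder, and whether configure.py skips the prompt entirely or just uses the value as its default may vary between versions):

# Sketch: pin the CUDA compute capability to 8.0 before running TensorFlow's
# configure script. TF_CUDA_COMPUTE_CAPABILITIES is the variable the CUDA
# toolchain configuration reads; the checkout path below is a placeholder.
import os
import subprocess

env = dict(os.environ)
env["TF_NEED_CUDA"] = "1"
env["TF_CUDA_COMPUTE_CAPABILITIES"] = "8.0"  # sm_80 is supported by CUDA 11.0

subprocess.run(["python", "configure.py"], cwd=r"C:\path\to\tensorflow",
               env=env, check=True)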

@dobromyslova
Author

Is this when building TensorFlow from source?

No, I used the prebuilt TensorFlow from pip; the one that currently works for me is 2.5.0.dev20201108.

Right now I don't have any issues with this version, but I can try building it from source with compute capability 8.0; just let me know if you want me to.

@bayesrule

exactly the same issue on Linux

@dou3516

dou3516 commented Jul 19, 2021

the same issue

@sushreebarsa sushreebarsa self-assigned this Oct 21, 2021
@sushreebarsa
Contributor

@dobromyslova Could you please try TF v2.6.0 and refer to the Build from source guide? Please let us know if the issue still persists. Thank you!

@sushreebarsa sushreebarsa added stat:awaiting response Status - Awaiting response from author subtype:windows Windows Build/Installation Issues labels Oct 21, 2021
@dobromyslova
Author

@dobromyslova Could you please try TF v2.6.0 and refer to the Build from source guide? Please let us know if the issue still persists. Thank you!

Thank you! I will try it and let you know!

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Oct 27, 2021
@dobromyslova
Author

@sushreebarsa I tried a GPU build from the v2.6.0 branch:

git checkout v2.6.0

And got the following error:

WARNING: Download from http://mirror.tensorflow.org/github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
DEBUG: C:/users/workerml/_bazel_workerml/iqkt7btw/external/tf_runtime/third_party/cuda/dependencies.bzl:51:10: The following command will download NVIDIA proprietary software. By using the software you agree to comply with the terms of the license agreement that accompanies the software. If you do not agree to the terms of the license agreement, do not use the software.
INFO: Repository local_config_cuda instantiated at:
  C:/tf_build_test/tensorflow/WORKSPACE:15:14: in <toplevel>
  C:/tf_build_test/tensorflow/tensorflow/workspace2.bzl:1088:19: in workspace
  C:/tf_build_test/tensorflow/tensorflow/workspace2.bzl:90:19: in _tf_toolchains
Repository rule cuda_configure defined at:
  C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl:1448:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
   Traceback (most recent call last):
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 1401, column 38, in _cuda_autoconf_impl
                _create_local_cuda_repository(repository_ctx)
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 1239, column 56, in _create_local_cuda_repository
                host_compiler_includes + _cuda_include_path(
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 364, column 32, in _cuda_include_path
                inc_entries.append(realpath(repository_ctx, cuda_config.cuda_toolkit_path + "/include"))
        File "C:/tf_build_test/tensorflow/third_party/remote_config/common.bzl", line 290, column 19, in realpath
                return execute(repository_ctx, [bash_bin, "-c", "realpath \"%s\"" % path]).stdout.strip()
        File "C:/tf_build_test/tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
/usr/bin/bash: line 1: realpath: command not found
INFO: Found applicable config definition build:cuda in file c:\work\py\tf_build_test\tensorflow.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:cuda in file c:\work\py\tf_build_test\tensorflow.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
ERROR: @local_config_cuda//:enable_cuda :: Error loading option @local_config_cuda//:enable_cuda: Repository command failed
/usr/bin/bash: line 1: realpath: command not found

I found that this issue may be related to this problem: #52131
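
For reference, a quick way to check whether the bash that Bazel invokes actually provides realpath (a sketch; BAZEL_SH is the variable Bazel on Windows uses to locate bash):

# Sketch: check whether the bash Bazel uses (usually the MSYS2 bash that
# BAZEL_SH points at) provides the `realpath` command that
# cuda_configure.bzl calls during repository setup.
import os
import subprocess

bash = os.environ.get("BAZEL_SH", "bash")
result = subprocess.run([bash, "-c", "command -v realpath"],
                        capture_output=True, text=True)
print("bash used:", bash)
print("realpath found at:", result.stdout.strip() or "<not found>")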

Also, the CPU build works, but we want GPU anyway.
Please let me know what you think.

P.S. here is my configuration:

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): source
  • TensorFlow version: 2.6.0
  • Python version: 3.8.0
  • Installed using virtualenv? pip? conda?: pip and virtualenv
  • Bazel version (if compiling from source): 3.7.2
  • GCC/Compiler version (if compiling from source): Visual Studio 2019
  • CUDA/cuDNN version: 11.2 / 8.1.0
  • GPU model and memory: RTX3090

@sushreebarsa sushreebarsa removed their assignment Jan 17, 2022
@gadagashwini
Contributor

@dobromyslova, this issue is fixed in the latest TensorFlow version.
Build TensorFlow with Bazel 5.0.0/4.2.1 and follow the instructions mentioned here. Thanks!

@gadagashwini gadagashwini added the stat:awaiting response Status - Awaiting response from author label Feb 25, 2022
@dobromyslova
Author

Thank you, I will try and let you know.

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Feb 27, 2022
@dobromyslova
Author

Okay, I can confirm that the latest build of TensorFlow is now working without any issues on the RTX 3090 (8.6 compute capability)!

Thank you for the hard work!

Here is some profiling data from my runs:

With tf-nightly-gpu

Install TF:

pip install tf-nightly-gpu
pip install tensorflow-hub
pip install matplotlib

Results of run:

$ python validate_build.py

TensorFlow version: 2.8.0-dev20211030
Eager mode enabled: True
GPU available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CUDA version: 64_112
cuDNN version: 64_8

2022-03-28 22:06:45.962369: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 22:06:46.344298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21674 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-28 22:07:00.136319: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8101

Elapsed time: 21.45432949066162 sec

With tensorflow-gpu

Install TF:

python -m pip install -U pip

pip install tensorflow-gpu==2.8.0
pip install tensorflow-hub
pip install matplotlib

Results of run:

$ python validate_build.py

TensorFlow version: 2.8.0
Eager mode enabled: True
GPU available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CUDA version: 64_112
cuDNN version: 64_8

2022-03-28 22:17:11.289526: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 22:17:11.657375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21676 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-28 22:17:15.023230: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8101

Elapsed time: 6.308921813964844 sec
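
The validate_build.py script itself isn't included above; a minimal sketch that prints the same kind of information and times a small GPU workload (the matmul loop here is just a stand-in for whatever the real script runs) could look like this:

# Minimal sketch of a validation script producing output like the runs above.
# The matmul loop only exercises the GPU; the actual script (not shown in this
# thread) presumably runs a small tensorflow-hub model instead.
import time
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("Eager mode enabled:", tf.executing_eagerly())
print("GPU available:", tf.config.list_physical_devices("GPU"))

build = tf.sysconfig.get_build_info()
print("CUDA version:", build.get("cuda_version"))
print("cuDNN version:", build.get("cudnn_version"))

start = time.time()
with tf.device("/GPU:0"):
    x = tf.random.normal((4096, 4096))
    for _ in range(50):
        x = tf.matmul(x, x)
        x = x / tf.norm(x)  # keep values finite across iterations
_ = x.numpy()               # force execution to finish before stopping the clock
print("Elapsed time:", time.time() - start, "sec")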
