
Add support for CUDA 11.1 on Windows 10 for the 8.6 compute capability #44750

Closed
dobromyslova opened this issue Nov 11, 2020 · 16 comments

Labels: subtype:windows Windows Build/Installation Issues · type:build/install Build and install issues · type:feature Feature requests

@dobromyslova

System information

  • System: Windows 10

  • TensorFlow version (you are using): 2.5.0.dev20201108

  • Are you willing to contribute it (Yes/No): Yes. I can do the testing of the new build and provide any additional information.

  • Python version: 3.8.6-amd64

  • Compiler: MSVC 2019

  • cuDNN: 8.0.4.30

  • CUDA: 11.0.3_451.82

  • ptxas: from CUDA 11.1

  • NVIDIA Drivers: 456.71

Hello TensorFlow team. I recently got TensorFlow working on my RTX 3090 on a Windows 10 machine and decided to share my findings with you.

Right now I have TensorFlow 2.5.0.dev20201108 working with CUDA 11.0 plus ptxas.exe (the PTX compiler) taken from CUDA 11.1. The ptxas from 11.1 supports the 8.6 compute capability, while with 11.0 I get the error ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name', which I take to mean that 11.0 doesn't support 8.6 yet.
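
For reference, here is a quick way to confirm which compute capability TensorFlow reports for the card (just a sketch, assuming a build recent enough that tf.config.experimental.get_device_details is available):

import tensorflow as tf

# List the GPUs TensorFlow can see.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs:", gpus)

# get_device_details reports the compute capability as a (major, minor)
# tuple, e.g. (8, 6) for an RTX 3090.
for gpu in gpus:
    details = tf.config.experimental.get_device_details(gpu)
    print(gpu.name, "compute capability:", details.get("compute_capability"))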

I don't know if it's already planned, but it would be great to add support for CUDA 11.1 to the Windows build (I don't know whether people are having the same issue on Linux). My current solution is kind of hacky (using the compiler from a different version), and even though I don't see any errors right now, it could potentially cause some in the future.

Here is also a detailed write-up of my findings: https://dobromyslova.medium.com/making-work-tensorflow-with-nvidia-rtx-3090-on-windows-10-7a38e8e582bf
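
In short, the hack is making the ptxas.exe from CUDA 11.1 the one that gets used while keeping the rest of CUDA 11.0. A rough sketch of that swap (the install paths below are the default Windows locations and may differ on your machine, so adjust them and keep a backup of the original file):

# Sketch of the workaround: drop the CUDA 11.1 ptxas.exe into the CUDA 11.0
# bin directory so it becomes the compiler TensorFlow picks up. Paths are
# assumptions based on the default install locations; adjust as needed.
import shutil
from pathlib import Path

cuda_110_bin = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin")
cuda_111_bin = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin")

shutil.copy2(cuda_110_bin / "ptxas.exe", cuda_110_bin / "ptxas.exe.bak")  # keep a backup
shutil.copy2(cuda_111_bin / "ptxas.exe", cuda_110_bin / "ptxas.exe")      # use the 11.1 compiler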

Please feel free to contact me for any additional information.

@dobromyslova dobromyslova added the type:feature Feature requests label Nov 11, 2020
@Saduf2019 Saduf2019 added the comp:gpu GPU related issues label Nov 11, 2020
@Saduf2019 Saduf2019 assigned ymodak and unassigned Saduf2019 Nov 11, 2020
@ymodak ymodak added type:build/install Build and install issues and removed comp:gpu GPU related issues labels Nov 11, 2020
@ymodak ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 11, 2020
@mihaimaruseac
Collaborator

@sanjoy @pkanwar23 can you take it from here?

@ymodak ymodak removed their assignment Nov 11, 2020
@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 13, 2020
@WeiChihChern

I've been using tensorflow/tensorflow:nightly-gpu-jupyter, and even the latest nightly images (2.5.0-dev20201112) still have the issue with the RTX 3090. Any insights?

@dobromyslova
Author

dobromyslova commented Nov 13, 2020

I've been using tensorflow/tensorflow:nightly-gpu-jupyter, and even the latest nightly images (2.5.0-dev20201112) still have the issue with the RTX 3090. Any insights?

I had some performance issues (training was too slow) until I applied a temporary fix for the PTX compilation. You can read more in the article I posted in the description: https://dobromyslova.medium.com/making-work-tensorflow-with-nvidia-rtx-3090-on-windows-10-7a38e8e582bf

@WeiChihChern

@dobromyslova Thanks for the info. I just tried the images provided by Nvidia, which work with my RTX 3090, and the performance did improve. If you use Docker, definitely check them out.

@dobromyslova
Author

Thank you for the information, I will try this as well

@sanjoy
Contributor

sanjoy commented Dec 1, 2020

The ptxas from 11.1 supports the 8.6 compute capability, while with 11.0 I get the error ptxas fatal : Value 'sm_86' is not defined for option 'gpu-name', which I take to mean that 11.0 doesn't support 8.6 yet.

Is this when building TensorFlow from source?

If so, you can try explicitly selecting compute capability 8.0 (rather than 8.6) when ./configure asks about it. TF built for CC 8.0 should still work on a 3090, though perhaps not at peak performance, and CUDA 11.0 does support 8.0.
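
That answer can also be pre-seeded instead of typed at the prompt. A rough sketch (the checkout path is a placeholder, and whether configure.py skips the prompt entirely or just uses the value as its default may vary between versions):

# Sketch: pin the CUDA compute capability to 8.0 before running TensorFlow's
# configure script. TF_CUDA_COMPUTE_CAPABILITIES is the variable the CUDA
# toolchain configuration reads; the checkout path below is a placeholder.
import os
import subprocess

env = dict(os.environ)
env["TF_NEED_CUDA"] = "1"
env["TF_CUDA_COMPUTE_CAPABILITIES"] = "8.0"  # sm_80 is supported by CUDA 11.0

subprocess.run(["python", "configure.py"], cwd=r"C:\path\to\tensorflow",
               env=env, check=True)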

@dobromyslova
Author

Is this when building TensorFlow from source?

No, I used the prebuilt TensorFlow from pip; the one that currently works for me is 2.5.0.dev20201108.

Right now I don't have any issues with this version, but I can try building it from source with compute capability 8.0; just let me know if you want me to.

@bayesrule

exactly the same issue on Linux

@dou3516

dou3516 commented Jul 19, 2021

the same issue

@sushreebarsa sushreebarsa self-assigned this Oct 21, 2021
@sushreebarsa
Contributor

@dobromyslova Could you please try TF v2.6.0 and refer to the Build from source guide? Please let us know if the issue still persists. Thank you!

@sushreebarsa sushreebarsa added stat:awaiting response Status - Awaiting response from author subtype:windows Windows Build/Installation Issues labels Oct 21, 2021
@dobromyslova
Author

@dobromyslova Could you please try TF v2.6.0 and refer to the Build from source guide? Please let us know if the issue still persists. Thank you!

Thank you! I will try it and let you know!

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Oct 27, 2021
@dobromyslova
Author

@sushreebarsa I tried a GPU build from the v2.6.0 branch:

git checkout v2.6.0

And got the following error:

WARNING: Download from http://mirror.tensorflow.org/github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
DEBUG: C:/users/workerml/_bazel_workerml/iqkt7btw/external/tf_runtime/third_party/cuda/dependencies.bzl:51:10: The following command will download NVIDIA proprietary software. By using the software you agree to comply with the terms of the license agreement that accompanies the software. If you do not agree to the terms of the license agreement, do not use the software.
INFO: Repository local_config_cuda instantiated at:
  C:/tf_build_test/tensorflow/WORKSPACE:15:14: in <toplevel>
  C:/tf_build_test/tensorflow/tensorflow/workspace2.bzl:1088:19: in workspace
  C:/tf_build_test/tensorflow/tensorflow/workspace2.bzl:90:19: in _tf_toolchains
Repository rule cuda_configure defined at:
  C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl:1448:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
   Traceback (most recent call last):
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 1401, column 38, in _cuda_autoconf_impl
                _create_local_cuda_repository(repository_ctx)
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 1239, column 56, in _create_local_cuda_repository
                host_compiler_includes + _cuda_include_path(
        File "C:/tf_build_test/tensorflow/third_party/gpus/cuda_configure.bzl", line 364, column 32, in _cuda_include_path
                inc_entries.append(realpath(repository_ctx, cuda_config.cuda_toolkit_path + "/include"))
        File "C:/tf_build_test/tensorflow/third_party/remote_config/common.bzl", line 290, column 19, in realpath
                return execute(repository_ctx, [bash_bin, "-c", "realpath \"%s\"" % path]).stdout.strip()
        File "C:/tf_build_test/tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
/usr/bin/bash: line 1: realpath: command not found
INFO: Found applicable config definition build:cuda in file c:\work\py\tf_build_test\tensorflow.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:cuda in file c:\work\py\tf_build_test\tensorflow.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
ERROR: @local_config_cuda//:enable_cuda :: Error loading option @local_config_cuda//:enable_cuda: Repository command failed
/usr/bin/bash: line 1: realpath: command not found

I found that this issue may be related to this problem: #52131
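
For reference, a quick way to check whether the bash that Bazel invokes actually provides realpath (a sketch; BAZEL_SH is the variable Bazel on Windows uses to locate bash):

# Sketch: check whether the bash Bazel uses (usually the MSYS2 bash that
# BAZEL_SH points at) provides the `realpath` command that
# cuda_configure.bzl calls during repository setup.
import os
import subprocess

bash = os.environ.get("BAZEL_SH", "bash")
result = subprocess.run([bash, "-c", "command -v realpath"],
                        capture_output=True, text=True)
print("bash used:", bash)
print("realpath found at:", result.stdout.strip() or "<not found>")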

Also, the CPU build works, but we want GPU anyway.
Please let me know what you think.

P.S. here is my configuration:

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): source
  • TensorFlow version: 2.6.0
  • Python version: 3.8.0
  • Installed using virtualenv? pip? conda?: pip and virtualenv
  • Bazel version (if compiling from source): 3.7.2
  • GCC/Compiler version (if compiling from source): Visual Studio 2019
  • CUDA/cuDNN version: 11.2 / 8.1.0
  • GPU model and memory: RTX3090

@sushreebarsa sushreebarsa removed their assignment Jan 17, 2022
@gadagashwini
Contributor

@dobromyslova, this issue is fixed in the latest TensorFlow version.
Build TensorFlow with Bazel 5.0.0/4.2.1 and follow the instructions mentioned here. Thanks!

@gadagashwini gadagashwini added the stat:awaiting response Status - Awaiting response from author label Feb 25, 2022
@dobromyslova
Author

Thank you, I will try and let you know.

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Feb 27, 2022
@dobromyslova
Author

Okay, I can confirm that the latest build of TensorFlow is now working without any issues on the RTX 3090 (8.6 compute capability)!

Thank you for the hard work!

Here is some profiling data from my runs:

With tf-nightly-gpu

Install TF:

pip install tf-nightly-gpu
pip install tensorflow-hub
pip install matplotlib

Results of run:

$ python validate_build.py

TensorFlow version: 2.8.0-dev20211030
Eager mode enabled: True
GPU available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CUDA version: 64_112
cuDNN version: 64_8

2022-03-28 22:06:45.962369: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 22:06:46.344298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21674 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-28 22:07:00.136319: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8101

Elapsed time: 21.45432949066162 sec

With tensorflow-gpu

Install TF:

python -m pip install -U pip

pip install tensorflow-gpu==2.8.0
pip install tensorflow-hub
pip install matplotlib

Results of run:

$ python validate_build.py

TensorFlow version: 2.8.0
Eager mode enabled: True
GPU available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CUDA version: 64_112
cuDNN version: 64_8

2022-03-28 22:17:11.289526: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 22:17:11.657375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21676 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-28 22:17:15.023230: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8101

Elapsed time: 6.308921813964844 sec
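
The validate_build.py script itself isn't included above; a minimal sketch that prints the same kind of information and times a small GPU workload (the matmul loop here is just a stand-in for whatever the real script runs) could look like this:

# Minimal sketch of a validation script producing output like the runs above.
# The matmul loop only exercises the GPU; the actual script (not shown in this
# thread) presumably runs a small tensorflow-hub model instead.
import time
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("Eager mode enabled:", tf.executing_eagerly())
print("GPU available:", tf.config.list_physical_devices("GPU"))

build = tf.sysconfig.get_build_info()
print("CUDA version:", build.get("cuda_version"))
print("cuDNN version:", build.get("cudnn_version"))

start = time.time()
with tf.device("/GPU:0"):
    x = tf.random.normal((4096, 4096))
    for _ in range(50):
        x = tf.matmul(x, x)
        x = x / tf.norm(x)  # keep values finite across iterations
_ = x.numpy()               # force execution to finish before stopping the clock
print("Elapsed time:", time.time() - start, "sec")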
