
WSL2 & CUDA does not work [v20226] #6014

Closed
noofaq opened this issue Oct 1, 2020 · 117 comments
Labels: GPU

@noofaq commented Oct 1, 2020

Environment

Windows build number: 10.0.20226.0
Your Distribution version: 18.04 / 20.04
Whether the issue is on WSL 2 and/or WSL 1: WSL 2 (Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020)

Steps to reproduce

Exactly followed the instructions available at https://docs.nvidia.com/cuda/wsl-user-guide/index.html
Tested on a previously working Ubuntu WSL image (IIRC the GPU last worked on 20206, then the whole of WSL2 stopped working)
Also tested on newly created Ubuntu 18.04 and Ubuntu 20.04 images.

I have tested the CUDA-compatible NVIDIA drivers 455.41 & 460.20. I have tried removing all drivers, etc.
I have also tested using CUDA 10.2 & CUDA 11.0.

It was tested on two separate machines (one Intel + GTX1060, the other Ryzen + RTX 2080Ti).

The issue was tested directly in the OS and also in Docker containers inside it.

Example (directly in Ubuntu):

piotr@DESKTOP-FS6J3NT:/usr/local/cuda/samples/4_Finance/BlackScholes$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Turing" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **)&d_CallResult, OPT_SZ)"

Example in container:

piotr@DESKTOP-FS6J3NT:/mnt/c/Users/pppnn$ docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter python
Python 3.6.9 (default, Nov  7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-10-01 14:18:07.538627: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-10-01 14:18:07.624188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-10-01 14:18:32.359457: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-10-01 14:18:32.398949: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3200035000 Hz
2020-10-01 14:18:32.402692: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3d06b70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-01 14:18:32.402748: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-10-01 14:18:32.409370: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-10-01 14:18:32.877228: W tensorflow/compiler/xla/service/platform_util.cc:276] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2020-10-01 14:18:32.877370: I tensorflow/compiler/jit/xla_gpu_device.cc:136] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-10-01 14:18:32.879904: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:32.880192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:18:32.880277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:18:32.880340: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:18:32.959947: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:18:32.973554: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:18:33.111736: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:18:33.127902: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:18:33.128018: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:18:33.128535: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:33.129170: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:33.129403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-10-01 14:18:33.131671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/test_util.py", line 1513, in is_gpu_available
    for local_device in device_lib.list_local_devices():
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
>>>
>>>
>>>
>>>
>>> tf.config.list_physical_devices('GPU')
2020-10-01 14:18:55.610151: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.610510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:18:55.610579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:18:55.610623: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:18:55.610676: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:18:55.610719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:18:55.610762: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:18:55.610805: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:18:55.610846: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:18:55.611251: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.611765: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.611999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>>
>>>
>>>
>>> tf.test.gpu_device_name()
2020-10-01 14:20:08.762060: W tensorflow/compiler/xla/service/platform_util.cc:276] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2020-10-01 14:20:08.762222: I tensorflow/compiler/jit/xla_gpu_device.cc:136] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-10-01 14:20:08.762863: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.763201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:20:08.763263: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:20:08.763316: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:20:08.763358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:20:08.763379: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:20:08.763428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:20:08.763480: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:20:08.763533: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:20:08.763898: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.764536: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.764810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/test_util.py", line 112, in gpu_device_name
    for x in device_lib.list_local_devices():
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
>>>

Expected behavior

CUDA working inside WSL2

Actual behavior

All tests that use CUDA inside WSL Ubuntu result in various CUDA errors - mostly referring to no CUDA devices being available.
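
For anyone trying to reproduce this, a minimal probe that separates device enumeration from an actual GPU allocation may be useful, since in the logs above enumeration succeeds while the first allocation fails. This is only a sketch, assuming TensorFlow 2.x is installed inside the WSL2 distro; the file name and messages are illustrative.

# cuda_probe.py - sketch, assuming TensorFlow 2.x inside the WSL2 distro.
# Device enumeration can succeed while the first real allocation still fails
# with "all CUDA-capable devices are busy or unavailable" on affected builds.
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print("Enumerated GPUs:", gpus)

try:
    with tf.device('/GPU:0'):
        x = tf.zeros((1024, 1024))   # forces a real device allocation
    print("GPU allocation succeeded on", x.device)
except Exception as e:               # RuntimeError / InternalError on affected builds
    print("GPU allocation failed:", e)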

@benhillis added the GPU label Oct 1, 2020
@kunc commented Oct 1, 2020

I am having the same issue. Everything was working flawlessly this morning, but then I updated to 20226.1000 from 20221.1000 and it does not work anymore (tried reinstalling NVIDIA drivers, etc.); the error is that all CUDA devices are busy or unavailable.

Edit:
After going back to version 20221 everything works again, which confirms that the new version caused the problem.

@benhillis (Member) commented Oct 1, 2020

Can you share the contents of c:\Windows\System32\lxss\lib?

@dfreelan commented Oct 1, 2020

Having same issue. Here's my C:\WINDOWS\System32\lxss\lib.

09/17/2020 01:24 PM 124,664 libcuda.so
09/17/2020 01:24 PM 124,664 libcuda.so.1
09/17/2020 01:24 PM 124,664 libcuda.so.1.1
09/17/2020 01:24 PM 40,980,456 libnvwgf2umx.so

@xldsdr124 commented Oct 1, 2020

Oh, too bad, I also encountered this problem. I was so happy when WSL worked again in the 20226 version, but CUDA couldn't work; I was left out in the cold. I tried the following solutions, but none of them worked for me.

  1. Reinstall the graphics card driver 460.20.

  2. Recompile the CUDA-dependent environment libraries.

  3. Uninstall WSL2 and the kernel package and reinstall them.

@benhillis (Member) commented Oct 1, 2020

Interesting, you seem to be missing the libdxcore libraries.

@dfreelan commented Oct 1, 2020

I reverted my windows back to the previous version, then reinstalled the 20226 build, and now it looks like this:

09/17/2020 01:24 PM 124,664 libcuda.so
09/17/2020 01:24 PM 124,664 libcuda.so.1
09/17/2020 01:24 PM 124,664 libcuda.so.1.1
09/26/2020 03:32 PM 832,936 libd3d12.so
09/26/2020 03:32 PM 5,115,392 libd3d12core.so
09/26/2020 03:32 PM 25,074,040 libdirectml.so
09/26/2020 03:32 PM 878,768 libdxcore.so
09/17/2020 01:24 PM 40,980,456 libnvwgf2umx.so

@adamfarquhar commented Oct 1, 2020

I am having the same problem: Windows 10 build 20226 and NVIDIA driver 460.20. It is great to see that it is not just my install. I hope that this can be fixed soon.

And now I can also confirm that it will work if you roll back to the previous build, 20221. You can download the (old) ISO file from Microsoft and re-install without losing any data.

@jin8495 commented Oct 1, 2020

Same problem here, Nvidia driver 460.20 and build 20226.

@xldsdr124 commented Oct 2, 2020

Can you share the contents of c:\Windows\System32\lxss\lib?

lib_list

@geneing commented Oct 2, 2020

I have the same problem: NVIDIA driver 460.15, build 20226. It worked with the previous Insider build.

@noofaq (Author) commented Oct 2, 2020

Can you share the contents of c:\Windows\System32\lxss\lib?

[image]

Looked into the previous Windows version's folder too:
[image]

@mitch-at-orika commented Oct 2, 2020

Same problem, NVIDIA driver 460.20 and build 20226. My contents in lxss\lib are:
[image]

@aticie commented Oct 2, 2020

I have the same problem in 20226. My build also contains the same 8 files in lxss\lib, but I still get cudaErrorDevicesUnavailable.

Is there a way to roll back to 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

@kunc commented Oct 2, 2020

I have the same problem in 20226. My build also contains same 8 files in lxss\lib. But I get cudaErrorDevicesUnavailable.

Is there a way to roll back 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

It worked for me. Are you sure you went to 20226 from 20221? I think it might store only the last version as a backup - the option is no longer available for me now that I have reset from 20226 to 20221.

@adamfarquhar commented Oct 2, 2020

I have the same problem in 20226. My build also contains same 8 files in lxss\lib. But I get cudaErrorDevicesUnavailable.

Is there a way to roll back 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

Yes, you can install 20221 from https://www.microsoft.com/en-us/software-download/windowsinsiderpreviewadvanced

@kivancguckiran commented Oct 2, 2020

It seems that it is not possible to downgrade Windows without losing apps and files, which is not an option for me under these circumstances. Does anyone know another solution for this? Or do we wait for Microsoft to fix the problem?

I too have version 20226.

@PRIMA-LAB-IPU commented Oct 2, 2020

Same here.
$ nvidia-smi.exe
Fri Oct 2 23:54:29 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.15 Driver Version: 460.15 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 207... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 45C P5 12W / N/A | 176MiB / 8192MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1752 C+G Insufficient Permissions N/A |
| 0 N/A N/A 2424 C+G ...b3d8bbwe\WinStore.App.exe N/A |
| 0 N/A N/A 3500 C+G ...y\ShellExperienceHost.exe N/A |
| 0 N/A N/A 5536 C+G ...m Files\VcXsrv\vcxsrv.exe N/A |
| 0 N/A N/A 8288 C+G ...batNotificationClient.exe N/A |
| 0 N/A N/A 10104 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 11152 C+G ...qxf38zg5c\Skype\Skype.exe N/A |
| 0 N/A N/A 11512 C+G ...artMenuExperienceHost.exe N/A |
| 0 N/A N/A 11548 C+G ...ekyb3d8bbwe\YourPhone.exe N/A |
| 0 N/A N/A 11832 C+G ...3m\Quick Eye\QuickEye.exe N/A |
| 0 N/A N/A 11996 C+G ...8wekyb3d8bbwe\Cortana.exe N/A |
| 0 N/A N/A 12856 C+G ...5n1h2txyewy\SearchApp.exe N/A |
| 0 N/A N/A 13608 C+G ...2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 14484 C+G ...re1.8.0_261\bin\javaw.exe N/A |
| 0 N/A N/A 15152 C+G ...qxf38zg5c\Skype\Skype.exe N/A |
| 0 N/A N/A 15620 C+G ...he8kybcnzg4\app\Slack.exe N/A |
| 0 N/A N/A 16728 C+G ...ropbox\Client\Dropbox.exe N/A |
| 0 N/A N/A 18824 C+G Insufficient Permissions N/A |
| 0 N/A N/A 19316 C+G ...arp.BrowserSubprocess.exe N/A |
| 0 N/A N/A 22372 C+G ...obeNotificationClient.exe N/A |
+-----------------------------------------------------------------------------+

@ChengyuSheu commented Oct 3, 2020

Thanks, @adamfarquhar. Rolling back to version 20201 resolved this issue. Even though some settings were removed, my files stayed.

@lminer commented Oct 3, 2020

Same problem.

  • WSL2 Ubuntu 20.04
  • driver version 460.20
  • Razer blade advanced 4K 2019
  • RTX 2080 Max Q
  • Windows insider 20226

Rolling back to the previous version fixes it. For people who want to do it without reinstalling, go to Recovery > Restore previous version of Windows.

@aisensiy commented Oct 4, 2020

I had the "remote procedure call failed" error in the last version, and I have this issue after upgrading. So... when I roll back, does that mean I will get the "remote procedure call failed" error back? 😿

@sirisian commented Oct 4, 2020

@kivancguckiran I just joined the Insider build, so I'm in the same boat. It would probably take like 4 hours, but you could probably revert Windows to the previous (non-Insider) version and then maybe go specifically to 20221. I'm not going to try it and will just wait, though.

@strarsis commented Oct 4, 2020

+1, same issue here.

  • Windows 10 Version 2004 (Build 19041.546)
  • NVIDIA Driver 460.20 (GameReady, from the NVIDIA CUDA on WSL driver page)
  • WSL 2
  • Ubuntu LTS 20.x
  • Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020

The kernel, driver and other versions are above the required minimum, so CUDA in WSL 2 should work.
However, when running the NVIDIA samples built with make, they always fail to run:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
cat /usr/local/cuda/version.txt
CUDA Version 11.0.228
@bbongcol commented Oct 5, 2020

I have the same problem in 20226.

  • WSL2 Ubuntu 18.04
  • Kernel Version 4.19.128
  • driver version 460.20
  • RTX 2060
  • Windows insider 20226
    https://aka.ms/AA9utty

The CUDA deviceQuery sample is OK.

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2060"
CUDA Driver Version / Runtime Version 11.2 / 10.0
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 6144 MBytes (6442450944 bytes)
(30) Multiprocessors, ( 64) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1200 MHz (1.20 GHz)
Memory Clock rate: 7001 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 3145728 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

But the CUDA utility (BlackScholes) does not work:

[./BlackScholes] - Starting...
GPU Device 0: "GeForce RTX 2060" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **)&d_CallResult, OPT_SZ)"

Below is the strace log.
BlackScholes_cuda_error_log.zip
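
Since deviceQuery only enumerates the device while BlackScholes fails on its first cudaMalloc, probing the CUDA driver API step by step can show where things break. This is only a sketch using ctypes against the libcuda.so that WSL exposes; the 1 MB allocation size and the printed messages are illustrative, and exact error codes on affected builds may differ.

# libcuda_probe.py - sketch: walk the CUDA driver API via ctypes and report
# which step fails (init, device query, context creation, or allocation).
import ctypes

cuda = ctypes.CDLL("libcuda.so.1")

def check(name, status):
    # CUDA driver API calls return 0 (CUDA_SUCCESS) on success.
    print(f"{name}: {'OK' if status == 0 else f'error {status}'}")
    return status

check("cuInit", cuda.cuInit(0))

count = ctypes.c_int()
check("cuDeviceGetCount", cuda.cuDeviceGetCount(ctypes.byref(count)))
print("device count:", count.value)

dev = ctypes.c_int()
check("cuDeviceGet", cuda.cuDeviceGet(ctypes.byref(dev), 0))

ctx = ctypes.c_void_p()
check("cuCtxCreate_v2", cuda.cuCtxCreate_v2(ctypes.byref(ctx), 0, dev))

ptr = ctypes.c_ulonglong()
check("cuMemAlloc_v2", cuda.cuMemAlloc_v2(ctypes.byref(ptr), ctypes.c_size_t(1 << 20)))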

@liamhan0905 commented Oct 5, 2020

I also have tensorflow-gpu on WSL2. But I'm getting the error message as shown below.

RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable

Following this link resolved the issue for me! It seems like my issue was also the Windows 10 Insider Previews... smh. Simply following the "Roll Back Soon After Enabling Insider Previews" section solved it for me (current version: 10.0.20221 Build 20221), and now I can train my model again using tensorflow-gpu. Thank you everyone for the help!

@onomatopellan commented Oct 5, 2020

  • Windows 10 Version 2004 (Build 19041.546)

@strarsis In your case you need to use a Windows Insider build from the Dev Channel (build >=20150). CUDA in WSL2 won't work in build 19041.

@strarsis commented Oct 5, 2020

@onomatopellan: How long do I have to wait to get this support in stable Windows 10?

@onomatopellan commented Oct 5, 2020

@strarsis This is expected for 21H1 aka April 2021.

@strarsis commented Oct 5, 2020

@onomatopellan: To use this now, do I have to register for Windows Insider and download the ISO, or can I use the Windows updater?
Are there any downsides to using a Windows Insider version, like performance or stability?

@Meeka33 commented Oct 5, 2020

This stopped working for me as well: winver 2004, build 20226, with CUDA. It was previously working until yesterday on previous builds. When will this be fixed? Too many recurring bugs; I'm ready to dump Windows.

@M-G-ARC commented Oct 14, 2020

I'm also having this problem on 20231, WSL Ubuntu 20.04. In case anyone wants to save time by not trying this, as I just did: I installed Ubuntu 18.04 as a separate distribution, installed CUDA, rebuilt the examples from source, and attempted to run BlackScholes. It doesn't make a difference - same error.

Since I just moved to the Dev Channel today for CUDA support, I don't have the luxury of rolling back. I have also been dealing with a problem with CUDA on my Linux install, so I was hoping this would be worth the effort (guess not). Hope this is fixed soon... it seems installing CUDA is always an absolute crap experience.

@wanfuse123 commented Oct 14, 2020

I have the same bleeping problem running CUDA:

docker run --runtime=nvidia --rm -ti -v "${PWD}:/app" nricklin/ubuntu-gpu-test
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.19.128-microsoft-standard/modules.dep.bin'
test.cu(29) : cudaSafeCall() Runtime API error : no CUDA-capable device is detected.

@wanfuse123 commented Oct 14, 2020

I wonder if the module problem is causing the CUDA-capable error.

From what I understand you can't run the headers in the container; does MS provide these headers in some other fashion? It seems to have been broken for me for at least a week (I did not notice).

Can't roll back to 20221.1000 to test, unfortunately; hope this is fixed soon...

@blackliner commented Oct 14, 2020

20231.1005 is available now, will it fix this problem? https://blogs.windows.com/windows-insider/2020/10/07/announcing-windows-10-insider-preview-build-20231/

According to the page, it doesn't seem to fix anything.

@tjrkddnr commented Oct 14, 2020

231.1005 is available now, will it fix this problem? https://blogs.windows.com/windows-insider/2020/10/07/announcing-windows-10-insider-preview-build-20231/

According to the page seems to not fix anything

No, 20231.1005 still doesn't work

@OkuyanBoga commented Oct 14, 2020

They mentioned this problem in the "CUDA on WSL" guide by NVIDIA.
Currently the only suggested solution is reverting back to the 20221 build. Is there any safe and easy way, or at least a guide, to do it?

@jamespacileo commented Oct 14, 2020

Ok, so what is the safest way to revert to build 20221?

It's not available via the official methods: reverting to the previous build or through advanced options in the insiders panel.

Do we need to install through a third-party ISO? If so, which ones are safe?

Also, have we had any official comment from a WSL maintainer?

Thanks 👍

@theothings commented Oct 14, 2020

Is there any workaround since the build 20221 image is no longer available here?

Or any ideas when this will likely be fixed? 👍

@Agrover112 commented Oct 14, 2020

You can find the ISO for 20221 here, check in the comments @theothings @jamespacileo:

https://forums.developer.nvidia.com/t/code-46-error-device-unreachable/156739

In the comments there you will find an (unofficial) link, because 20221 was unavailable in the Windows Insider downloads menu (it shows a later version for download)!

@tjrkddnr commented Oct 14, 2020

Is there any workaround since the build 20221 image is no longer available here?

Or any ideas when this will likely be fixed? 👍

You can download and create ISOs of previous Windows builds, including 20221, at https://uupdump.ml

@wanfuse123 commented Oct 14, 2020

Won't installing 20221 via boot-and-reinstall destroy installations of Ubuntu on WSL and all the configurations of NVIDIA and Docker? I know you can reinstall and preserve applications, but are these included?

@Agrover112 commented Oct 14, 2020

@wanfuse123 Maybe

@tadam98 commented Oct 14, 2020

@onomatopellan commented Oct 14, 2020

Build 20236 blog post announcement:

We fixed a regression that was breaking NVIDIA CUDA vGPU acceleration in the Windows Subsystem for Linux. Please see this GitHub thread for full details.

@FSchoettl commented Oct 14, 2020

Build 20236 fixed it for me 👍

@askourtis commented Oct 14, 2020

Can confirm that it works on 20236

@Agrover112 commented Oct 14, 2020

Wait, I received the new update too; gotta check it out.

@basarane commented Oct 14, 2020

Cuda on WSL2 works perfectly on build 20236. The problem seems to be resolved.

@tadam98 commented Oct 14, 2020

Works for me too:

$ python

import tensorflow as tf
tf.test.is_gpu_available()
...
2020-10-15 00:53:25.186406: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
True

$ docker stats
$ docker stop # dockers reported by stats
$ rm ~/.docker/config.json
$ sudo service docker stop
Docker already stopped - file /var/run/docker-ssd.pid not found.
$ sudo service docker start

 * Starting Docker: docker
$ sudo mkdir /sys/fs/cgroup/systemd
$ sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ docker run hello-world
Hello from Docker!
etc...
$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5

Compute 7.5 CUDA device: [GeForce RTX 2080 Ti]
69632 bodies, total time for 10 iterations: 117.762 ms
= 411.730 billion interactions per second
= 8234.591 single-precision GFLOP/s at 20 flops per interaction

@cktlco commented Oct 14, 2020

I confirm the same issue exists in 20231.1000

I confirm the issue is resolved in 20236.1000. Thanks to all who contributed momentum toward this.

@wanfuse123 commented Oct 15, 2020

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark -compare -fp64 works, so it seems, with output like so:

Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Double precision floating point simulation
1 Devices used for simulation
GPU Device 0: "GeForce GTX 1050 Ti" with compute capability 6.1

Compute 6.1 CUDA device: [GeForce GTX 1050 Ti]
4096 bodies, total time for 10 iterations: 115.717 ms
= 1.450 billion interactions per second
= 43.495 double-precision GFLOP/s at 30 flops per interaction

while this container still fails with the error:

docker run --runtime=nvidia --rm -ti -v "${PWD}:/app" nricklin/ubuntu-gpu-test
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.19.128-microsoft-standard/modules.dep.bin'
test.cu(29) : cudaSafeCall() Runtime API error : no CUDA-capable device is detected.

This is with the latest 20231.1000 installed

Lastly, the container mentioned below, which seems like a very good test, shows errors, but I think a kernel compile option that I am unaware of will take care of it. I will investigate further.

I am posting this second run line for others to use for testing the video card, since nvidia-smi is still not working and everything seems to suggest using nvidia-smi in the containers for examples. (Can't wait for the nvidia-smi thing to be fixed.)

Anyway, here is the command:

docker run --runtime=nvidia --rm -ti -v "${PWD}:/app" tensorflow/tensorflow:1.13.2-gpu-py3-jupyter python /app/benchmark.py gpu 10000
(Note: I am chaining the NVIDIA runtime container to a second container.) I don't see very many mentions of how to do this without docker-compose. (Helpful?)

it produces output like

Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2020-10-15 02:16:20.656683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-10-15 02:16:20.658946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-15 02:16:20.659002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-10-15 02:16:20.659019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-10-15 02:16:20.659493: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.

As you can see, NUMA support seems to be missing. This might not be a kernel compile problem but rather a TensorFlow version problem like the one mentioned here:

https://stackoverflow.com/questions/55511186/could-not-identify-numa-node-of-platform-gpu (at the very bottom)

I am still investigating. Hope these test containers help someone else.
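
The NUMA warning above comes from TensorFlow trying to read a per-GPU file under sysfs; a quick check of those files shows whether the WSL kernel exposes them at all. This is only a sketch (the path pattern is taken from the log lines above; nothing else is assumed):

# numa_check.py - sketch: list the numa_node entries TensorFlow tries to read.
# On the 4.19.128-microsoft-standard kernel these files may simply be absent,
# which is what triggers the "kernel may have been built without NUMA support" warning.
import glob

paths = glob.glob('/sys/bus/pci/devices/*/numa_node')
if not paths:
    print('No numa_node entries found under /sys/bus/pci/devices/')
for path in paths:
    try:
        with open(path) as f:
            print(path, '->', f.read().strip())
    except OSError as e:
        print(path, '-> unreadable:', e)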

@markdjthomas commented Oct 15, 2020

I'm still running into the "all CUDA-capable devices are busy or unavailable" error after updating to 20236, with driver 460.15, on Ubuntu 20.04.

torch.cuda.is_available() returns True but I can’t move any variables to the device.
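
That distinction matters: torch.cuda.is_available() only checks that a driver and device are visible, while moving a tensor forces actual context creation and allocation, which is what fails on the affected builds. A minimal sketch, assuming a CUDA-enabled PyTorch build (file name and messages are illustrative):

# torch_probe.py - sketch, assuming PyTorch built with CUDA support.
import torch

print("is_available:", torch.cuda.is_available())   # may report True even when broken

try:
    t = torch.ones(1024, 1024).to('cuda')            # forces context creation + allocation
    print("Moved tensor to", t.device)
except RuntimeError as e:                            # e.g. "all CUDA-capable devices are busy or unavailable"
    print("Failed to move tensor to the GPU:", e)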

@tadam98 commented Oct 15, 2020

docker-compose does not support GPU docker containers. In order to use GPU containers with docker-compose:

  1. uninstall docker-compose
  2. pip install git+https://github.com/yoanisgil/compose.git@device-requests
  3. Add lines to your docker-compose file:
version: "3.7"
...
    device_requests:
      - capabilities:
        - "gpu"

You may get the error "docker.errors.InvalidVersion: device_requests param is not supported in API versions < 1.40", which is fixed by setting the environment variable COMPOSE_API_VERSION=1.40.

With this modified docker-compose, the gpu parameter will be passed to the docker container.

It is not a Microsoft problem but a docker-compose problem.
With the modified docker-compose it works well.
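
As a side note, the same device_requests mechanism can also be exercised from the Docker SDK for Python, which may be handy for quick GPU tests without touching docker-compose at all. This is a sketch, not part of the workaround above; it assumes the docker Python package (>= 4.3) is installed and reuses the sample image already mentioned in this thread:

# gpu_container.py - sketch using the Docker SDK for Python (docker >= 4.3),
# passing a GPU device request the same way `docker run --gpus all` does.
import docker
from docker.types import DeviceRequest

client = docker.from_env()
logs = client.containers.run(
    "nvcr.io/nvidia/k8s/cuda-sample:nbody",                        # image used earlier in this thread
    "nbody -gpu -benchmark",
    device_requests=[DeviceRequest(count=-1, capabilities=[["gpu"]])],
    remove=True,
)
print(logs.decode())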

@keenranger commented Oct 15, 2020

[quoting @wanfuse123's comment above in full]

I had the same trouble until 20231. CUDA might not work with build 20231; try it with build 20236.

CUDA works - in my case, it works well after the update.

@noofaq (Author) commented Oct 15, 2020

It is also fixed for me in 20236, so I am closing the issue. Thanks to all the commenters and also to the WSL developers for fixing the issue.

@manishkm commented Oct 16, 2020

Yes, CUDA on WSL is now working in 20236. I am creating a restore point in case this issue arises again in the future.

@hanxiaotian commented Oct 23, 2020

I'm using 20241, but CUDA is still not working in my WSL. Previously, when I used 20176, it worked.

import torch
torch.cuda.current_device()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 377, in current_device
_lazy_init()
File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
_check_driver()
File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

@borgarpa commented Oct 23, 2020

[quoting @hanxiaotian's comment above in full]

I think they broke WSL-CUDA again... It did not work for me either. I tried reinstalling WSL, CUDA, and the NVIDIA driver, but WSL just failed to identify both the NVIDIA driver and CUDA.
