cuda can't init [802] Call to cuInit results in UNKNOWN_CUDA_ERROR #7187

Closed
2W-12 opened this issue Jul 6, 2021 · 9 comments
Labels
CUDA (CUDA related issue/PR), question (Notes an issue as a question)

Comments

@2W-12

2W-12 commented Jul 6, 2021

Hello,
I'm running on an Azure instance: ND96asr_v4
Ubuntu: 20.04 LTS
kernel: 5.8.0-1036-azure
Numba: 0.53.1

>>> import numba
>>> from numba import cuda
>>> numba.__version__
'0.53.1'
>>> cuda.detect()
Traceback (most recent call last):
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 237, in initialize
    self.cuInit(0)
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 300, in safe_cuda_api_call
    self._check_error(fname, retcode)
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 335, in _check_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [802] Call to cuInit results in UNKNOWN_CUDA_ERROR

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/api.py", line 485, in detect
    print('Found %d CUDA devices' % len(devlist))
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 49, in __len__
    return len(self.lst)
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 26, in __getattr__
    numdev = driver.get_device_count()
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 346, in get_device_count
    self.cuDeviceGetCount(byref(count))
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 280, in __getattr__
    self.initialize()
  File "/home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 240, in initialize
    raise CudaSupportError("Error at driver init: \n%s:" % e)
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init: 
[802] Call to cuInit results in UNKNOWN_CUDA_ERROR:

There are no errors in dmesg, and other apps work just fine.

Tue Jul  6 19:19:10 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.42.01    Driver Version: 470.42.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000001:00:00.0 Off |                    0 |
| N/A   37C    P0    46W / 400W |    113MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  On   | 00000002:00:00.0 Off |                    0 |
| N/A   37C    P0    46W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM...  On   | 00000003:00:00.0 Off |                    0 |
| N/A   36C    P0    43W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM...  On   | 00000004:00:00.0 Off |                    0 |
| N/A   38C    P0    47W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM...  On   | 0000000B:00:00.0 Off |                    0 |
| N/A   36C    P0    45W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM...  On   | 0000000C:00:00.0 Off |                    0 |
| N/A   35C    P0    44W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM...  On   | 0000000D:00:00.0 Off |                    0 |
| N/A   36C    P0    44W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM...  On   | 0000000E:00:00.0 Off |                    0 |
| N/A   37C    P0    44W / 400W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                 95MiB |
|    0   N/A  N/A      2693      G   /usr/bin/gnome-shell               16MiB |
|    1   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    2   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    3   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    4   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    5   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    6   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
|    7   N/A  N/A      2029      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

The exact same disk has been used with a V100, and everything works smoothly there.
Any ideas what it could be?

@gmarkall
Member

gmarkall commented Jul 6, 2021

I wonder what CUDA libs it's picking up - can you give the output of python -c "from numba import cuda; cuda.cudadrv.libs.test()" please?

gmarkall added the CUDA and question labels on Jul 6, 2021
@2W-12
Author

2W-12 commented Jul 6, 2021

> I wonder what CUDA libs it's picking up - can you give the output of python -c "from numba import cuda; cuda.cudadrv.libs.test()" please?

>>> import numba
>>> from numba import cuda
>>> cuda.cudadrv.libs.test()
Finding cublas from System
	located at /usr/local/cuda/lib64/libcublas.so.11.5.2.43
	trying to open library...	ok
Finding cusparse from System
	located at /usr/local/cuda/lib64/libcusparse.so.11.6.0.43
	trying to open library...	ok
Finding cufft from System
	located at /usr/local/cuda/lib64/libcufft.so.10.5.0.43
	trying to open library...	ok
Finding curand from System
	located at /usr/local/cuda/lib64/libcurand.so.10.2.5.43
	trying to open library...	ok
Finding nvvm from System
	located at /usr/local/cuda/nvvm/lib64/libnvvm.so.4.0.0
	trying to open library...	ok
Finding cudart from System
	located at /usr/local/cuda/lib64/libcudart.so.11.4.43
	trying to open library...	ok
Finding cudadevrt from System
	located at /usr/local/cuda/lib64/libcudadevrt.a
Finding libdevice from System
	searching for compute_20...	ok
	searching for compute_30...	ok
	searching for compute_35...	ok
	searching for compute_50...	ok
True

😁😁😁

@gmarkall
Member

gmarkall commented Jul 6, 2021

Thanks - sorry, I totally forgot that this doesn't output the location of the libcuda.so that it's using. Can you check what libcuda.so another working app is using, then check what libcuda.so Numba is picking up?

If you're not sure how to do this, one way to do it would be to start the app under GDB, wait until it's started using CUDA, then break in it and give the command info shared. Similarly you could start a Python session under GDB, run cuda.detect(), then break and do info shared to find out which libcuda.so Numba is using.

Could you try this and paste the output of info shared for both processes please?
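As an alternative to GDB on Linux, the same information can be read from /proc/<pid>/maps, which lists every file mapped into a process. The following is only a minimal sketch, not part of Numba: the helper names `loaded_libraries` and `libraries_in_this_process` are hypothetical, and the default fragment "libcuda.so" is chosen so that libcudart is not matched.

```python
import os


def loaded_libraries(maps_text, name_fragment="libcuda.so"):
    """Return sorted unique mapped file paths whose basename contains name_fragment.

    maps_text is the content of /proc/<pid>/maps. Each line has the form
    "address perms offset dev inode pathname"; anonymous mappings have no pathname.
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if name_fragment in os.path.basename(path):
            paths.add(path)
    return sorted(paths)


def libraries_in_this_process(name_fragment="libcuda.so"):
    """Hypothetical convenience wrapper: inspect the current Python process."""
    with open("/proc/self/maps") as f:
        return loaded_libraries(f.read(), name_fragment)
```

In a Python session, one would first trigger driver loading (for example by running cuda.detect(), even if it raises) and then call libraries_in_this_process(). For another running application, the same parser can be fed the content of /proc/<pid>/maps for that process.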

@gmarkall
Member

gmarkall commented Jul 6, 2021

(Created #7188 to remind me to do something about the location of libcuda.so not being shown)

@2W-12
Author

2W-12 commented Jul 6, 2021

> Thanks - sorry, I totally forgot that this doesn't output the location of the libcuda.so that it's using. Can you check what libcuda.so another working app is using, then check what libcuda.so Numba is picking up?
>
> If you're not sure how to do this, one way to do it would be to start the app under GDB, wait until it's started using CUDA, then break in it and give the command info shared. Similarly you could start a Python session under GDB, run cuda.detect(), then break and do info shared to find out which libcuda.so Numba is using.
>
> Could you try this and paste the output of info shared for both processes please?

Sure.
There you go:

(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x00007ffff7fd0100  0x00007ffff7ff2674  Yes (*)     /lib64/ld-linux-x86-64.so.2
0x00007ffff7dec630  0x00007ffff7f6120d  Yes         /lib/x86_64-linux-gnu/libc.so.6
0x00007ffff7dabae0  0x00007ffff7dbb4d5  Yes         /lib/x86_64-linux-gnu/libpthread.so.0
0x00007ffff7d9f220  0x00007ffff7da0179  Yes         /lib/x86_64-linux-gnu/libdl.so.2
0x00007ffff7d9a3e0  0x00007ffff7d9ad90  Yes         /lib/x86_64-linux-gnu/libutil.so.1
0x00007ffff7c593c0  0x00007ffff7cfff18  Yes         /lib/x86_64-linux-gnu/libm.so.6
0x00007ffff7c20230  0x00007ffff7c3b4b7  Yes (*)     /lib/x86_64-linux-gnu/libexpat.so.1
0x00007ffff7c00280  0x00007ffff7c10e2b  Yes (*)     /lib/x86_64-linux-gnu/libz.so.1
0x00007ffff74b4f80  0x00007ffff74ebb0c  Yes (*)     /usr/lib/python3/dist-packages/_yaml.cpython-38-x86_64-linux-gnu.so
0x00007ffff748e240  0x00007ffff74a71b5  Yes (*)     /lib/x86_64-linux-gnu/libyaml-0.so.2
0x00007ffff7788460  0x00007ffff7796c4d  Yes (*)     /usr/lib/python3.8/lib-dynload/_ctypes.cpython-38-x86_64-linux-gnu.so
0x00007ffff7472230  0x00007ffff7477a46  Yes (*)     /lib/x86_64-linux-gnu/libffi.so.7
0x00007ffff7fbe580  0x00007ffff7fbfd59  Yes (*)     /usr/lib/python3.8/lib-dynload/_bz2.cpython-38-x86_64-linux-gnu.so
0x00007ffff739f240  0x00007ffff73abec6  Yes (*)     /lib/x86_64-linux-gnu/libbz2.so.1.0
0x00007ffff74838a0  0x00007ffff74861f9  Yes (*)     /usr/lib/python3.8/lib-dynload/_lzma.cpython-38-x86_64-linux-gnu.so
0x00007ffff73673c0  0x00007ffff737e3a6  Yes (*)     /lib/x86_64-linux-gnu/liblzma.so.5
0x00007ffff7399160  0x00007ffff73994b5  Yes (*)     /usr/lib/python3.8/lib-dynload/_opcode.cpython-38-x86_64-linux-gnu.so
0x00007ffff346a700  0x00007ffff5232ce0  Yes         /home/xxz/.local/lib/python3.8/site-packages/llvmlite/binding/libllvmlite.so
0x00007ffff7390720  0x00007ffff7393d70  Yes         /lib/x86_64-linux-gnu/librt.so.1
0x00007ffff3030160  0x00007ffff3118452  Yes (*)     /lib/x86_64-linux-gnu/libstdc++.so.6
0x00007ffff2f7a5e0  0x00007ffff2f8b045  Yes (*)     /lib/x86_64-linux-gnu/libgcc_s.so.1
0x00007ffff2e1a440  0x00007ffff2e2233b  Yes (*)     /usr/lib/python3.8/lib-dynload/_ssl.cpython-38-x86_64-linux-gnu.so
0x00007ffff2d95770  0x00007ffff2de0baa  Yes (*)     /lib/x86_64-linux-gnu/libssl.so.1.1
0x00007ffff2b19000  0x00007ffff2cb2800  Yes (*)     /lib/x86_64-linux-gnu/libcrypto.so.1.1
0x00007ffff757a0c0  0x00007ffff757a2af  Yes (*)     /usr/lib/python3.8/lib-dynload/_contextvars.cpython-38-x86_64-linux-gnu.so
0x00007ffff2a12a00  0x00007ffff2a172e9  Yes (*)     /usr/lib/python3.8/lib-dynload/_asyncio.cpython-38-x86_64-linux-gnu.so
0x00007ffff317e440  0x00007ffff317f9c8  Yes (*)     /usr/lib/python3.8/lib-dynload/_lsprof.cpython-38-x86_64-linux-gnu.so
0x00007ffff287c6a0  0x00007ffff28898ad  Yes (*)     /usr/lib/python3.8/lib-dynload/_json.cpython-38-x86_64-linux-gnu.so
0x00007ffff227c5f0  0x00007ffff25916c8  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so
0x00007ffff040b000  0x00007ffff1dd1db4  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libopenblasp-r0-5bebc122.3.13.dev.so
0x00007fffefefe850  0x00007ffff00bc378  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libgfortran-2e0d59d6.so.5.0.0
0x00007fffefca53b0  0x00007fffefccc13c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libquadmath-2d0c479f.so.0.0.0
0x00007fffefa8e120  0x00007fffefa993a8  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libz-eb09ad1d.so.1.2.3
0x00007fff51eafba0  0x00007fff51ec4adc  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/core/_multiarray_tests.cpython-38-x86_64-linux-gnu.so
0x00007fff51c22110  0x00007fff51c23308  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/linalg/lapack_lite.cpython-38-x86_64-linux-gnu.so
0x00007fff519f8a60  0x00007fff51a15908  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/linalg/_umath_linalg.cpython-38-x86_64-linux-gnu.so
0x00007fff5171ac40  0x00007fff5172d9ec  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/fft/_pocketfft_internal.cpython-38-x86_64-linux-gnu.so
0x00007fff5143f500  0x00007fff5148449c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/mtrand.cpython-38-x86_64-linux-gnu.so
0x00007fff5120b360  0x00007fff51223050  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/bit_generator.cpython-38-x86_64-linux-gnu.so
0x00007fff50fcde70  0x00007fff50ffd450  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_common.cpython-38-x86_64-linux-gnu.so
0x00007fff50f81960  0x00007fff50f84e2b  Yes (*)     /usr/lib/python3.8/lib-dynload/_hashlib.cpython-38-x86_64-linux-gnu.so
0x00007fff50d30340  0x00007fff50d715bc  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_bounded_integers.cpython-38-x86_64-linux-gnu.so
0x00007fff50b13ae0  0x00007fff50b21a9c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_mt19937.cpython-38-x86_64-linux-gnu.so
0x00007fff508fe560  0x00007fff5090a60c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_philox.cpython-38-x86_64-linux-gnu.so
0x00007fff506e5730  0x00007fff506f413c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_pcg64.cpython-38-x86_64-linux-gnu.so
0x00007fff504d7f70  0x00007fff504dea88  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_sfc64.cpython-38-x86_64-linux-gnu.so
0x00007fff50216780  0x00007fff5027b57c  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/numpy/random/_generator.cpython-38-x86_64-linux-gnu.so
0x00007ffff3177270  0x00007ffff3178302  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/core/typeconv/_typeconv.cpython-38-x86_64-linux-gnu.so
0x00007ffff71e0100  0x00007ffff71e0285  Yes (*)     /usr/lib/python3.8/lib-dynload/_uuid.cpython-38-x86_64-linux-gnu.so
0x00007ffff71c8580  0x00007ffff71cbf71  Yes (*)     /lib/x86_64-linux-gnu/libuuid.so.1
0x00007fff4fe45e60  0x00007fff4fe5b04c  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/_helperlib.cpython-38-x86_64-linux-gnu.so
0x00007ffff71db220  0x00007ffff71dbb24  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/_dynfunc.cpython-38-x86_64-linux-gnu.so
0x00007ffff717b5e0  0x00007ffff7181d72  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/_dispatcher.cpython-38-x86_64-linux-gnu.so
0x00007ffff71d60c0  0x00007ffff71d62ea  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/_devicearray.cpython-38-x86_64-linux-gnu.so
0x00007ffff7171230  0x00007ffff7173458  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/core/runtime/_nrt_python.cpython-38-x86_64-linux-gnu.so
0x00007ffff716a260  0x00007ffff716bc3c  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/np/ufunc/_internal.cpython-38-x86_64-linux-gnu.so
0x00007ffff71d1110  0x00007ffff71d1435  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/experimental/jitclass/_box.cpython-38-x86_64-linux-gnu.so
0x00007fff4f5f3740  0x00007fff4f5fb6c8  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/scipy/_lib/_ccallback_c.cpython-38-x86_64-linux-gnu.so
0x00007ffff7164190  0x00007ffff7164a09  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/mviewbuf.cpython-38-x86_64-linux-gnu.so
0x00007ffff715f090  0x00007ffff715f211  Yes         /home/xxz/.local/lib/python3.8/site-packages/numba/cuda/cudadrv/_extras.cpython-38-x86_64-linux-gnu.so
0x00007fff4dcda9f0  0x00007fff4e0e7d74  Yes (*)     /lib/x86_64-linux-gnu/libcuda.so
0x00007fff4db036a0  0x00007fff4db20d20  Yes (*)     /usr/lib/python3/dist-packages/apt_pkg.cpython-38-x86_64-linux-gnu.so
0x00007fff4d953140  0x00007fff4da978e3  Yes (*)     /lib/x86_64-linux-gnu/libapt-pkg.so.6.0
0x00007fff4d8f6720  0x00007fff4d90511c  Yes         /lib/x86_64-linux-gnu/libresolv.so.2
0x00007fff4d8d3120  0x00007fff4d8ec8fb  Yes (*)     /lib/x86_64-linux-gnu/liblz4.so.1
0x00007fff4d82c240  0x00007fff4d8bdd0a  Yes (*)     /lib/x86_64-linux-gnu/libzstd.so.1
0x00007fff4d801000  0x00007fff4d81b540  Yes (*)     /lib/x86_64-linux-gnu/libudev.so.1
0x00007fff4d75dbc0  0x00007fff4d7d0780  Yes (*)     /lib/x86_64-linux-gnu/libsystemd.so.0
0x00007fff4d63a580  0x00007fff4d7079dc  Yes (*)     /lib/x86_64-linux-gnu/libgcrypt.so.20
0x00007fff4d60fc60  0x00007fff4d621a92  Yes (*)     /lib/x86_64-linux-gnu/libgpg-error.so.0
0x00007ffff715a3e0  0x00007ffff715aee8  Yes (*)     /usr/lib/python3.8/lib-dynload/_queue.cpython-38-x86_64-linux-gnu.so
0x00007ffff71541e0  0x00007ffff715470f  Yes (*)     /usr/lib/python3/dist-packages/cryptography/hazmat/bindings/_constant_time.abi3.so
0x00007fff4cfdab70  0x00007fff4cff4f78  Yes         /home/xxz/.local/lib/python3.8/site-packages/_cffi_backend.cpython-38-x86_64-linux-gnu.so
0x00007fff4cdca8b0  0x00007fff4cdd0048  Yes (*)     /home/xxz/.local/lib/python3.8/site-packages/cffi.libs/libffi-806b1a9d.so.6.0.4
0x00007fff4ccdae40  0x00007fff4cd50e0f  Yes (*)     /usr/lib/python3/dist-packages/cryptography/hazmat/bindings/_openssl.abi3.so
0x00007fff4cb06780  0x00007fff4cb19f17  Yes (*)     /usr/lib/python3.8/lib-dynload/_decimal.cpython-38-x86_64-linux-gnu.so
0x00007fff4caca2e0  0x00007fff4caf2658  Yes (*)     /lib/x86_64-linux-gnu/libmpdec.so.2
0x00007fff4f366820  0x00007fff4f36be49  Yes (*)     /usr/lib/python3/dist-packages/simplejson/_speedups.cpython-38-x86_64-linux-gnu.so
                                        No          linux-vdso.so.1

@2W-12
Author

2W-12 commented Jul 6, 2021

Changing from A100 to V100 on the same system:

>>> cuda.detect()
Found 4 CUDA devices
id 0    b'Tesla V100-PCIE-16GB'                              [SUPPORTED]
                      compute capability: 7.0
                           pci device id: 0
                              pci bus id: 0
id 1    b'Tesla V100-PCIE-16GB'                              [SUPPORTED]
                      compute capability: 7.0
                           pci device id: 0
                              pci bus id: 0
id 2    b'Tesla V100-PCIE-16GB'                              [SUPPORTED]
                      compute capability: 7.0
                           pci device id: 0
                              pci bus id: 0
id 3    b'Tesla V100-PCIE-16GB'                              [SUPPORTED]
                      compute capability: 7.0
                           pci device id: 0
                              pci bus id: 0
Summary:
	4/4 devices are supported
True

It's a mystery...

@gmarkall
Member

gmarkall commented Jul 6, 2021

Thanks - I see the output for one application (Python / Numba?) but not for the other (the working application, I think) - can you give the output for the other as well please?

@2W-12
Author

2W-12 commented Jul 6, 2021

Please close the issue.
It isn't Numba-related.
For anyone who experiences the same problem: with an A100/V100 or any other NVLink/datacenter-edition GPU, you must install and run nv-fabricmanager for the initial GPU init.
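For context, driver error code 802 is named CUDA_ERROR_SYSTEM_NOT_READY in the CUDA driver API, which the CUDA documentation describes as the system not being ready to start CUDA work until required driver daemons are running; Numba 0.53.1 printed it as UNKNOWN_CUDA_ERROR above. One way to confirm that the failure sits below Numba is to call cuInit directly through ctypes. This is a rough sketch under assumptions: the helper name `probe_cuinit` is hypothetical, and the default library name "libcuda.so.1" is an assumption about where the driver library is installed.

```python
import ctypes


def probe_cuinit(libname="libcuda.so.1"):
    """Call cuInit(0) directly and return (code, name), or None if the library cannot be loaded.

    libname is an assumed default; pass the path of the libcuda.so you want to test.
    """
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None  # driver library not found on this machine

    code = lib.cuInit(0)  # returns a CUresult; 0 means CUDA_SUCCESS

    # cuGetErrorName fills in a pointer to a static string for known codes.
    name = ctypes.c_char_p()
    lib.cuGetErrorName(code, ctypes.byref(name))
    return code, (name.value or b"unrecognized").decode()
```

If probe_cuinit() reports a non-zero code outside of Numba, the problem is in the driver or system setup rather than in Numba, which matches the conclusion reached in this thread.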

@gmarkall
Member

gmarkall commented Jul 7, 2021

Glad you solved it - many thanks for following up with an explanation of how to resolve the issue, too!

gmarkall closed this as completed on Jul 7, 2021