CUDA: Don't make a runtime call on import #6147

gmarkall · 2020-08-18T20:08:22Z

Calling `runtime.get_version()` on import initializes the CUDA driver, even though it creates no context. This is a problem for libraries that import Numba and then set `CUDA_VISIBLE_DEVICES`, because the import-time call has already initialized the driver and fixed the set of visible devices. The Dask Local CUDA Cluster is one affected library, but not the only one.
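To illustrate the ordering problem, here is a minimal sketch that uses a made-up `FakeDriver` class. It is not Numba's or CUDA's API, and the four-GPU machine is an assumption; it only models a driver that reads `CUDA_VISIBLE_DEVICES` once, on first initialization.

```python
import os


class FakeDriver:
    """Hypothetical stand-in for a GPU driver; not Numba's or CUDA's API."""

    ALL_DEVICES = ["0", "1", "2", "3"]  # pretend the machine has four GPUs

    def __init__(self):
        self._visible = None  # nothing is fixed until the first init()

    def init(self):
        # Read the environment only on the first call, then reuse the result.
        if self._visible is None:
            raw = os.environ.get("CUDA_VISIBLE_DEVICES")
            if raw is None:
                self._visible = list(self.ALL_DEVICES)
            else:
                self._visible = [d for d in raw.split(",") if d]
        return self._visible


def visible_devices(init_before_env_is_set):
    """Return the devices seen when the variable is set after 'import'."""
    saved = os.environ.pop("CUDA_VISIBLE_DEVICES", None)
    try:
        driver = FakeDriver()
        if init_before_env_is_set:
            driver.init()  # models a runtime call made at import time
        os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"  # set by the library later
        return driver.init()
    finally:
        # Restore the caller's environment.
        os.environ.pop("CUDA_VISIBLE_DEVICES", None)
        if saved is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = saved
```

In this model, `visible_devices(True)` still reports all four devices, while `visible_devices(False)` reports only devices 2 and 3.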

This PR fixes the problem by deferring the call to `runtime.get_version()` until the list of supported compute capabilities (CCs) is actually needed.

cc @kkraus14 @pentschev @quasiben

Fixes #6149.
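The deferral can be sketched roughly as follows. `FakeRuntime`, the version number, the CC tuples and the use of `functools.lru_cache` are all assumptions made for illustration; only the name `get_supported_ccs` comes from this PR, and the real implementation in `nvvm.py` may differ.

```python
import functools


class FakeRuntime:
    """Hypothetical stand-in for a CUDA runtime wrapper; counts its calls."""

    def __init__(self):
        self.calls = 0

    def get_version(self):
        self.calls += 1
        return (10, 2)  # made-up (major, minor) version


runtime = FakeRuntime()


@functools.lru_cache(maxsize=None)
def get_supported_ccs():
    """Query the runtime on first use instead of at import time."""
    version = runtime.get_version()
    # Illustrative values only, not the real per-toolkit tables.
    if version >= (11, 0):
        return ((3, 5), (5, 0), (6, 0), (7, 0), (8, 0))
    return ((3, 0), (3, 5), (5, 0), (6, 0), (7, 0))
```

Defining the function makes no runtime call; the first `get_supported_ccs()` call makes exactly one, and later calls reuse the cached result.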

kkraus14 · 2020-08-18T20:17:43Z

FYI: This breaks more than just Dask. It affects Python multiprocessing in general, because even spawning a process via multiprocessing re-runs the parent process's imports unless they are placed under the `if __name__ == "__main__"` guard.
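A small sketch of that behaviour, without actually spawning a process: to my understanding, the spawn start method re-runs the parent's main script under the module name `__mp_main__`, so this example imitates that with `runpy`. The script contents and the helper `run_script_as` are made up for illustration.

```python
import os
import runpy
import tempfile

# A tiny "parent script": one top-level statement and one guarded statement.
SCRIPT = '''\
events = ["top-level code ran"]
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_script_as(run_name):
    """Execute SCRIPT under the given module name and return its events."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "parent_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # run_path returns the globals of the executed script.
        return runpy.run_path(path, run_name=run_name)["events"]
```

Running the script as `__main__` executes both statements, while running it as `__mp_main__` executes only the top-level one. Any import at top level, such as `from numba import cuda`, would therefore run again in the child.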

gmarkall · 2020-08-19T10:08:11Z

For some extra info - prior to the commit in this PR:

```
$ NUMBA_CUDA_LOG_LEVEL=DEBUG python -c "from numba import cuda; print('Done')"
== CUDA [199] DEBUG -- call runtime api: cudaRuntimeGetVersion
Done
```

With this PR / commit:

```
$ NUMBA_CUDA_LOG_LEVEL=DEBUG python -c "from numba import cuda; print('Done')"
Done
```

gmarkall · 2020-08-19T11:30:42Z

Test added - the test requires multiple GPUs. For me on a single-GPU system (one RTX 8000), I get:

```
$ python -m numba.runtests numba.cuda.tests.cudadrv.test_runtime -v
test_get_version (numba.cuda.tests.cudadrv.test_runtime.TestRuntime) ... ok
test_visible_devices_set_after_import (numba.cuda.tests.cudadrv.test_runtime.TestRuntime)
    ... skipped 'This test requires multiple GPUs'

----------------------------------------------------------------------
Ran 2 tests in 0.019s

OK (skipped=1)
```

With an 8-GPU system (V100s):

```
$ python -m numba.runtests numba.cuda.tests.cudadrv.test_runtime -v
test_get_version (numba.cuda.tests.cudadrv.test_runtime.TestRuntime) ... ok
test_visible_devices_set_after_import (numba.cuda.tests.cudadrv.test_runtime.TestRuntime) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.618s

OK
```

and with CUDA_VISIBLE_DEVICES already set:

```
$ export CUDA_VISIBLE_DEVICES=6,7
$ python -m numba.runtests numba.cuda.tests.cudadrv.test_runtime -v
test_get_version (numba.cuda.tests.cudadrv.test_runtime.TestRuntime) ... ok
test_visible_devices_set_after_import (numba.cuda.tests.cudadrv.test_runtime.TestRuntime)
    ... skipped 'Cannot test when CUDA_VISIBLE_DEVICES already set'

----------------------------------------------------------------------
Ran 2 tests in 0.051s

OK (skipped=1)
```
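The three outcomes above are consistent with a skip decision along these lines. The helper `skip_reason` is hypothetical and the order of the two checks is an assumption; only the two skip messages are taken from the test output.

```python
def skip_reason(gpu_count, environ):
    """Return the skip message for the test, or None if it should run.

    gpu_count: number of GPUs on the machine.
    environ: mapping of environment variables (for example os.environ).
    """
    # Assumed order: the environment check comes first.
    if "CUDA_VISIBLE_DEVICES" in environ:
        return "Cannot test when CUDA_VISIBLE_DEVICES already set"
    if gpu_count < 2:
        return "This test requires multiple GPUs"
    return None
```

With one GPU and no variable set the helper returns the multiple-GPU message, with eight GPUs it returns `None`, and with `CUDA_VISIBLE_DEVICES=6,7` it returns the already-set message.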

pentschev

LGTM, thanks @gmarkall !

stuartarchibald · 2020-08-20T10:38:29Z

Buildfarm ID: numba_smoketest_cuda_94 for a8e0877

stuartarchibald · 2020-08-20T11:04:30Z

Testsuite is failing on the farm universally with:

```
[2020-08-20 06:00:35,606]  INFO - ======================================================================
[2020-08-20 06:00:35,606]  INFO - FAIL: test_visible_devices_set_after_import (numba.cuda.tests.cudadrv.test_runtime.TestRuntime)
[2020-08-20 06:00:35,606]  INFO - ----------------------------------------------------------------------
[2020-08-20 06:00:35,606]  INFO - Traceback (most recent call last):
[2020-08-20 06:00:35,606]  INFO -   File "<path>\testenv_b522e3de-5976-43cf-ae68-63c7d7c1f2a3\lib\site-packages\numba\cuda\tests\cudadrv\test_runtime.py", line 51, in test_visible_devices_set_after_import
[2020-08-20 06:00:35,606]  INFO -     p.start()
[2020-08-20 06:00:35,606]  INFO -   File "<path>\envs\testenv_b522e3de-5976-43cf-ae68-63c7d7c1f2a3\lib\multiprocessing\process.py", line 118, in start
[2020-08-20 06:00:35,606]  INFO -     assert not _current_process._config.get('daemon'), \
[2020-08-20 06:00:35,606]  INFO - AssertionError: daemonic processes are not allowed to have children
[2020-08-20 06:00:35,606]  INFO - 
[2020-08-20 06:00:35,606]  INFO - ----------------------------------------------------------------------
[2020-08-20 06:00:35,606]  INFO - Ran 665 tests in 144.755s
[2020-08-20 06:00:35,606]  INFO - 
[2020-08-20 06:00:35,606]  INFO - FAILED (failures=1, skipped=18)
```

This commit fixes the error:

```
[2020-08-20 06:00:35,606]  INFO - FAIL: test_visible_devices_set_after_import (numba.cuda.tests.cudadrv.test_runtime.TestRuntime)
[2020-08-20 06:00:35,606]  INFO - ----------------------------------------------------------------------
[2020-08-20 06:00:35,606]  INFO - Traceback (most recent call last):
[2020-08-20 06:00:35,606]  INFO -   File "<path>\testenv_b522e3de-5976-43cf-ae68-63c7d7c1f2a3\lib\site-packages\numba\cuda\tests\cudadrv\test_runtime.py", line 51, in test_visible_devices_set_after_import
[2020-08-20 06:00:35,606]  INFO -     p.start()
[2020-08-20 06:00:35,606]  INFO -   File "<path>\envs\testenv_b522e3de-5976-43cf-ae68-63c7d7c1f2a3\lib\multiprocessing\process.py", line 118, in start
[2020-08-20 06:00:35,606]  INFO -     assert not _current_process._config.get('daemon'), \
[2020-08-20 06:00:35,606]  INFO - AssertionError: daemonic processes are not allowed to have children
```

by putting `test_visible_devices_set_after_import` in its own test class with a `SerialMixin`.
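For context, the assertion in the traceback comes from `multiprocessing` itself: a process marked as daemonic may not start children. The sketch below is a hypothetical guard that checks the same condition. It is not what the commit does (the commit uses `SerialMixin`); it only shows how the condition can be inspected.

```python
import multiprocessing


def can_start_children():
    """True if the current process is allowed to start child processes."""
    # Process.start() refuses to run inside a daemonic process, so a test
    # that starts a subprocess cannot run inside a daemonic worker.
    return not multiprocessing.current_process().daemon
```

In an ordinary main process this returns `True`; inside a daemonic worker it returns `False`, which matches the failure seen on the build farm.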

stuartarchibald

Thanks for fixing this!

stuartarchibald · 2020-08-21T16:27:15Z

Buildfarm ID: numba_smoketest_cuda_95 for 9988d56

PR numba#6147 broke the `test_cuda_submodules` test because the `test_nvvm_driver` test could no longer be imported: the change from `SUPPORTED_CC` to `get_supported_ccs` in `nvvm.py` was not reflected in the CUDA simulator. This commit fixes the issue by changing `SUPPORTED_CC` to `get_supported_ccs` in the simulator.
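The mismatch can be pictured with two stand-in namespaces. The objects `real_nvvm`, `stale_sim` and `fixed_sim` and the helper `missing_names` are hypothetical; they are not the actual Numba modules, only a sketch of an API parity check between a real module and its simulator counterpart.

```python
from types import SimpleNamespace


def missing_names(reference, candidate):
    """Names defined on reference that candidate does not define."""
    return sorted(set(vars(reference)) - set(vars(candidate)))


# Stand-ins for the real module and two versions of the simulator stub.
real_nvvm = SimpleNamespace(get_supported_ccs=lambda: ((7, 0),))
stale_sim = SimpleNamespace(SUPPORTED_CC=((7, 0),))
fixed_sim = SimpleNamespace(get_supported_ccs=lambda: ((7, 0),))
```

Here `missing_names(real_nvvm, stale_sim)` reports `get_supported_ccs`, which is the name an importing test would fail on, and `missing_names(real_nvvm, fixed_sim)` reports nothing.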

gmarkall changed the title ~~[WIPDon't make a runtime call on import~~ [WIP] Don't make a runtime call on import Aug 18, 2020

gmarkall added the 2 - In Progress label Aug 18, 2020

gmarkall added this to the Numba 0.51.1 milestone Aug 18, 2020

gmarkall added the CUDA CUDA related issue/PR label Aug 18, 2020

gmarkall changed the title ~~[WIP] Don't make a runtime call on import~~ CUDA: Don't make a runtime call on import Aug 18, 2020

jakirkham approved these changes Aug 18, 2020

jakirkham mentioned this pull request Aug 18, 2020

ENH Update numba versioning to pull 0.51 rapidsai/integration#109

Merged

gmarkall added 3 - Ready for Review and removed 2 - In Progress labels Aug 19, 2020

stuartarchibald linked an issue Aug 19, 2020 that may be closed by this pull request

CUDA: GPU list frozen at import time with 0.51 #6149

Closed

Add test for Issue numba#6149

312d3ef

Fix flake8

a8e0877

stuartarchibald added the Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm label Aug 19, 2020

pentschev approved these changes Aug 19, 2020

jakirkham mentioned this pull request Aug 19, 2020

[BUG] HuggingFace/Pytorch with dask-cuda- worker does not free memory rapidsai/dask-cuda#383

Closed

quasiben mentioned this pull request Aug 19, 2020

Pin Numba version to exclude 0.51.0 rapidsai/dask-cuda#385

Merged

stuartarchibald reviewed Aug 20, 2020

numba/cuda/tests/cudadrv/test_runtime.py

stuartarchibald reviewed Aug 20, 2020

numba/cuda/cudadrv/nvvm.py

stuartarchibald reviewed Aug 20, 2020

numba/cuda/tests/cudadrv/test_runtime.py

stuartarchibald added 4 - Waiting on CI Review etc done, waiting for CI to finish and removed 3 - Ready for Review labels Aug 20, 2020

stuartarchibald approved these changes Aug 20, 2020

sklam approved these changes Aug 21, 2020

sklam merged commit 79b7b3a into numba:master Aug 21, 2020

jakirkham mentioned this pull request Aug 24, 2020

Perform check for unique GPU assignments in LocalCUDACluster rapidsai/dask-cuda#384

Closed

stuartarchibald mentioned this pull request Aug 25, 2020

#6147 broke CUDASIM #6167

Closed

gmarkall mentioned this pull request Aug 25, 2020

Fix Issue #6167: Failure in test_cuda_submodules #6168

Merged

stuartarchibald mentioned this pull request Aug 26, 2020

Release 0.51.1 checklist #6172

Closed

21 tasks

stuartarchibald pushed a commit to stuartarchibald/numba that referenced this pull request Aug 26, 2020

Merge pull request numba#6147 from gmarkall/grm-no-runtime-on-import

4416a6c

CUDA: Don't make a runtime call on import

gmarkall mentioned this pull request Sep 18, 2020

Fix race in reduction kernels on Volta, require CUDA 9, add syncwarp with default mask #6127

Merged
