New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support NVIDIA's CUDA Python bindings #7461
Conversation
This provides enough to successfully run `cuda.detect()` and for `cuda.is_available()` to return `True`, whilst maintaining correctness of the inbuilt ctypes bindings.
All tests pass with ctypes bindings. With CUDA Python bindings: ``` Ran 1201 tests in 30.800s FAILED (failures=5, errors=900, skipped=10, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 31.862s FAILED (failures=49, errors=863, skipped=10, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 32.145s FAILED (failures=6, errors=892, skipped=10, expected failures=1) ```
No functional change (intended).
CUDA Python test results are now: ``` Ran 1201 tests in 35.580s FAILED (failures=6, errors=892, skipped=10, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 36.668s FAILED (failures=6, errors=1580, skipped=10, expected failures=1) ```
CUDA Python test results are presently: ``` Ran 1201 tests in 39.895s FAILED (failures=30, errors=1500, skipped=10, expected failures=1) ```
Unfortunately the testsuite segfaults before completion at present, so there are no metrics available.
Test results with CUDA Python: ``` Ran 1174 tests in 60.225s FAILED (failures=30, errors=591, skipped=10, expected failures=1) ```
Test results with CUDA Python are now: ``` Ran 1201 tests in 47.054s FAILED (failures=31, errors=248, skipped=41, expected failures=1) ```
It should not be the responsibility of launch_kernel to get the ctypes pointer from a higher-level abstraction object such as a device array or record.
CUDA Python test results are now: ``` Ran 1201 tests in 72.650s FAILED (failures=32, errors=224, skipped=50, expected failures=1) ```
These caused some problems for the ctypes bindings.
CUDA Python test results are now: ``` Ran 1201 tests in 68.140s FAILED (failures=15, errors=217, skipped=41, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 54.820s FAILED (failures=13, errors=146, skipped=41, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 65.175s FAILED (failures=15, errors=139, skipped=41, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 55.840s FAILED (failures=17, errors=105, skipped=41, expected failures=1) ```
CUDA Python results are now: ``` Ran 1201 tests in 74.768s FAILED (failures=18, errors=99, skipped=41, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 81.919s FAILED (failures=15, errors=49, skipped=41, expected failures=1) ```
CUDA Python results are now: ``` Ran 1201 tests in 73.974s FAILED (failures=15, errors=39, skipped=41, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 73.229s FAILED (failures=10, errors=38, skipped=41, expected failures=1) ```
Tests now unskipped. CUDA Python tests are now: ``` Ran 1201 tests in 78.288s FAILED (failures=10, errors=38, skipped=13, expected failures=1) ```
CUDA Python test results now look like: ``` Ran 1201 tests in 74.361s FAILED (failures=10, errors=38, skipped=10, expected failures=1) ```
Tests results with CUDA Python are now: ``` Ran 1201 tests in 74.033s FAILED (failures=7, errors=34, skipped=10, expected failures=1) ```
Test results with CUDA Python are now: ``` Ran 1201 tests in 73.647s FAILED (failures=7, errors=32, skipped=10, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 75.513s FAILED (failures=7, errors=25, skipped=10, expected failures=1) ```
CUDA Python test results are now: ``` Ran 1201 tests in 74.231s FAILED (failures=3, errors=25, skipped=10, expected failures=2) ```
CUDA Python test results are now: ``` Ran 1201 tests in 77.240s FAILED (failures=1, errors=22, skipped=10, expected failures=2) (numbanp120) ```
CUDA Python test results are now: ``` Ran 1201 tests in 73.284s FAILED (failures=1, errors=14, skipped=10, expected failures=2) ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, couple more minor things to look at otherwise looks good.
gpuci run tests |
@stuartarchibald Thanks for the feedback - I've turned the hard error into a warning now. I think this should be ready for another look. |
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for all your efforts on this @gmarkall, really glad to see this working and to get the performance boost this provides to CUDA users. This is waiting on a final gpuci
run and also the buildfarm for confirmation. Thanks again.
|
gpuci run tests |
@esc Did the buildfarm run succeed? |
This PR adds support for using NVIDIA's CUDA Python bindings to make CUDA Driver API calls. The ecosystem around these bindings presently at an early stage, with CuPy recently gaining support for them. Eventually the common use of the NVIDIA bindings by Python libraries in the CUDA ecosystem will facilitate better interoperability for primitives other than just arrays (which are presently dealt with nicely by the CUDA Array Interface), and eliminate the need for converting types between libraries (see e.g. #4797).
An overview of differences between the NVIDIA bindings and Numba's ctypes bindings:
cuMemGetInfo
which returns the free and total memory). The way in which values are returned is made consistent with the ctypes bindings by some adaptation in_check_cuda_python_error
.CUDA_USE_CUDA_PYTHON
.Module
,Function
andLinker
are now ABCs with a separate implementations for each binding, as they are mostly specific to the binding in use.Some remarks:
CCs for visiblity: @quasiben @leofang @stuartarchibald @mmccarty @shwina