Restore CUDA 9.x compatibility #304

Closed

eyalroz opened this issue Mar 24, 2022 · 0 comments

eyalroz commented Mar 24, 2022

We're currently not compatible with CUDA 9.x; let's fix that.

eyalroz added the task label Mar 24, 2022
eyalroz self-assigned this Mar 24, 2022
eyalroz added a commit that referenced this issue Mar 24, 2022
* Corrected a wrong version check in `nvtx.hpp` (needed 10000, was using 1000)
* Dropped the `unregistered_` memory type from the `memory::type_t` enum (actually, I've forgotten why we had it there in the first place)
* Defined some `kernel_t` and `context_t` methods conditionally, since they're not supported in CUDA 9.2 and I'd rather they not fail at runtime (see the version-guard sketch below)
* Made a mode of operation in the "p2p bandwidth latency test" example program unavailable when the CUDA version is under 10.0; it makes use of a CUDA 10 API call and was not implemented in the CUDA 9.x version of the example program
* In CUDA 9.2 NVRTC, you can't get the address of a `__constant__`, only of a kernel, so we disable the tests involving `__constant__` symbols

Caveats:

* Some tests still fail; it remains to determine why.
* These changes target 9.2 compatibility, not 9.0.
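
For illustration only, a minimal sketch of the kind of version guard referred to above (not the library's actual code; the guarded function name is hypothetical). The `CUDA_VERSION` macro in `cuda.h` is encoded as major * 1000 + minor * 10, so CUDA 10.0 is 10000 and CUDA 9.2 is 9020:

```cpp
// Sketch only: declare CUDA-10-backed functionality conditionally, so that it
// is simply absent (rather than failing at runtime) when building against 9.2.
#include <cuda.h>

#if CUDA_VERSION >= 10000
// Hypothetical method relying on an API call introduced in CUDA 10.0
void frobnicate_using_a_cuda_10_call();
#endif
```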
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Now (apparently) fully compatible with CUDA 9.2.
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum (sketched below).
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
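
For illustration only - a minimal sketch of the pattern described in this commit message, not the library's actual definitions: a named (scoped) error enum whose enumerators reuse the CUDA driver API's `CUresult` values, so that a raw driver status code converts with a single cast. The `sketch` namespace and the identifiers in it are hypothetical.

```cpp
#include <cuda.h>

namespace sketch {

// Enumerator values coincide with CUDA driver API error codes (CUresult),
// so no translation table is needed when converting from a driver status.
enum class named_error : unsigned {
    success         = CUDA_SUCCESS,
    invalid_value   = CUDA_ERROR_INVALID_VALUE,
    out_of_memory   = CUDA_ERROR_OUT_OF_MEMORY,
    not_initialized = CUDA_ERROR_NOT_INITIALIZED,
    // ... the remaining driver and runtime API error values would follow ...
};

inline named_error from_driver_status(CUresult result) noexcept
{
    return static_cast<named_error>(result);
}

} // namespace sketch
```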
eyalroz added a commit that referenced this issue Apr 4, 2022
eyalroz added a commit that referenced this issue Apr 16, 2022
eyalroz closed this as completed in 26d7c0c May 9, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Aug 6, 2022
…_grid_params_for_max_occupancy` in CUDA version 10.0 - which does not yet support it.
eyalroz added a commit that referenced this issue Aug 6, 2022
eyalroz added a commit that referenced this issue Feb 23, 2023
* No longer allocating heap memory on enqueue and releasing it during launch - only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the method for enqueuing no-argument callables from the one for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (as CUDA has moved away from this convention, and we can't make it happen without the heap allocation scheme we had before); see the sketch below.
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.
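
For illustration only - a minimal sketch of the underlying runtime API this commit builds on (`cudaLaunchHostFunc`, introduced in CUDA 10.0), not the library's own interface; the function and variable names are hypothetical. The enqueued callable receives only the user-supplied pointer - no stream and no status argument - which is why a wrapper that merely forwards the user's pointer needs no heap allocation of its own:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical host callable: unlike the older cudaStreamAddCallback
// convention, it receives only the user-provided pointer.
void CUDART_CB print_payload(void* user_data)
{
    std::printf("enqueued value: %d\n", *static_cast<int*>(user_data));
}

int main()
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    static int payload = 42;  // must outlive the asynchronous execution
    cudaLaunchHostFunc(stream, print_payload, &payload);

    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
}
```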
eyalroz added a commit that referenced this issue Feb 23, 2023
eyalroz added a commit that referenced this issue Mar 9, 2023
eyalroz added a commit that referenced this issue Mar 13, 2023