Restore CUDA 9.x compatibility #304

Closed

eyalroz opened this issue Mar 24, 2022 · 0 comments

eyalroz commented Mar 24, 2022

We're currently not compatible with CUDA 9.x; let's fix that.

eyalroz added the task label Mar 24, 2022
eyalroz self-assigned this Mar 24, 2022
eyalroz added a commit that referenced this issue Mar 24, 2022
* Corrected a wrong version check in `nvtx.hpp` (needed 10000, was using 1000)
* Dropped the `unregistered_` memory type from the `memory::type_t` enum (actually, I've forgotten why we had it there in the first place)
* Defined some `kernel_t` and `context_t` methods conditionally, since they're not supported in CUDA 9.2 and I'd rather they not fail at runtime (see the version-guard sketch below)
* Made a mode of operation in the "p2p bandwidth latency test" example program unavailable when the CUDA version is under 10.0; it makes use of a CUDA 10 API call and was not implemented in the CUDA 9.x version of the example program
* In CUDA 9.2 NVRTC, you can't get the address of a `__constant__`, only of a kernel, so we disable the tests involving `__constant__` symbols

Caveats:

* Some tests still fail; it remains to determine why.
* These changes target 9.2 compatibility, not 9.0.
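
For illustration only, a minimal sketch of the kind of version guard referred to above (not the library's actual code; the guarded function name is hypothetical). The `CUDA_VERSION` macro in `cuda.h` is encoded as major * 1000 + minor * 10, so CUDA 10.0 is 10000 and CUDA 9.2 is 9020:

```cpp
// Sketch only: declare CUDA-10-backed functionality conditionally, so that it
// is simply absent (rather than failing at runtime) when building against 9.2.
#include <cuda.h>

#if CUDA_VERSION >= 10000
// Hypothetical method relying on an API call introduced in CUDA 10.0
void frobnicate_using_a_cuda_10_call();
#endif
```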
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Now (apparently) fully compatible with CUDA 9.2.
eyalroz added a commit that referenced this issue Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum (sketched below).
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
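
For illustration only - a minimal sketch of the pattern described in this commit message, not the library's actual definitions: a named (scoped) error enum whose enumerators reuse the CUDA driver API's `CUresult` values, so that a raw driver status code converts with a single cast. The `sketch` namespace and the identifiers in it are hypothetical.

```cpp
#include <cuda.h>

namespace sketch {

// Enumerator values coincide with CUDA driver API error codes (CUresult),
// so no translation table is needed when converting from a driver status.
enum class named_error : unsigned {
    success         = CUDA_SUCCESS,
    invalid_value   = CUDA_ERROR_INVALID_VALUE,
    out_of_memory   = CUDA_ERROR_OUT_OF_MEMORY,
    not_initialized = CUDA_ERROR_NOT_INITIALIZED,
    // ... the remaining driver and runtime API error values would follow ...
};

inline named_error from_driver_status(CUresult result) noexcept
{
    return static_cast<named_error>(result);
}

} // namespace sketch
```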
eyalroz added a commit that referenced this issue Apr 4, 2022
eyalroz added a commit that referenced this issue Apr 16, 2022
eyalroz closed this as completed in 26d7c0c May 9, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022
eyalroz added a commit that referenced this issue Aug 6, 2022
…_grid_params_for_max_occupancy` in CUDA version 10.0 - which does not yet support it.
eyalroz added a commit that referenced this issue Aug 6, 2022
eyalroz added a commit that referenced this issue Feb 23, 2023
* No longer allocating heap memory on enqueue and releasing it during launch - only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the method for enqueuing no-argument callables from the one for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (as CUDA has moved away from this convention, and we can't make it happen without the heap allocation scheme we had before); see the sketch below.
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.
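
For illustration only - a minimal sketch of the underlying runtime API this commit builds on (`cudaLaunchHostFunc`, introduced in CUDA 10.0), not the library's own interface; the function and variable names are hypothetical. The enqueued callable receives only the user-supplied pointer - no stream and no status argument - which is why a wrapper that merely forwards the user's pointer needs no heap allocation of its own:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical host callable: unlike the older cudaStreamAddCallback
// convention, it receives only the user-provided pointer.
void CUDART_CB print_payload(void* user_data)
{
    std::printf("enqueued value: %d\n", *static_cast<int*>(user_data));
}

int main()
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    static int payload = 42;  // must outlive the asynchronous execution
    cudaLaunchHostFunc(stream, print_payload, &payload);

    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
}
```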
eyalroz added a commit that referenced this issue Feb 23, 2023
eyalroz added a commit that referenced this issue Mar 9, 2023
eyalroz added a commit that referenced this issue Mar 13, 2023