Properly define a grid-and-block dims structure #284
eyalroz added a commit that referenced this issue on Dec 11, 2021

* For #284: Introduced a grid-and-block-dimensions structure, `grid::complete_dimensions_t`. Using it when returning both grid and block dimensions instead of an `std::pair`; it has equals
* For #285: Changed the construction pattern for `kernel_t`:
    * Dropped the templated, wrapping, direct constructor.
    * Added `kernel::detail_::wrap()` taking a device ID and an arbitrary (function) pointer, and a `kernel::wrap()` taking a device ID and type-erased `const void*` pointer.
    * Made the lower-level `wrap()` a friend of the `kernel_t` class.
* Now using the default destructor for `kernel_t`'s (has nothing to do with the construction changes).
* Spacing tweaks.
* Comment typo fixes.
* Added not-equal operators for launch configurations
* Added some comments to some `#endif`'s, reminding the reader of the condition used in the `#if` of `#ifdef`.
* Made some narrowing casts explicit, to clarify their intentionality to static analysis tool.
* Added two aliases to the sync/async boolean enum in `cuda::stream`
* A bit of comment rephrasing

Example program changes:

* Adapted examples for the use of `grid::complete_dimensions_t`.
* Now creating wrapped kernels using `cuda::kernel::wrap()` rather than by direct construction.
* Spacing tweaks.
* Changes to the `cudaChooseDevice()` function in `helper_cuda.h`; mainly:
    * Now returning a `cuda::device_t`
    * No longer making the returned device current. In particular, that means that `simpleStreams.cu` may now be using a device that's not the current one.
eyalroz added a commit that referenced this issue on Dec 13, 2021 (commit message identical to the one above)
eyalroz added a commit that referenced this issue on Jan 14, 2022 (commit message identical to the one above)
The library already has to contend, perhaps somewhat against my earlier assumptions, with a structure representing both grid dimensions (in blocks) and block dimensions (in threads), which is not yet a complete launch configuration. The `cudaOccupancyMaxPotentialBlockSizeWithFlags()` function returns this (albeit for the 1-dimensional case).

We currently use an `std::pair<cuda::grid::dimensions_t, cuda::grid::block_dimensions_t>`. This choice works, but there's no good reason not to have a simple dimensions struct with `grid` and `block` members. Let's have that.
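To make the proposal concrete, here is a minimal sketch of what such a struct could look like. It is not the library's actual code: the `sketch` namespace, the simplified `dimensions_t` stand-in (used here for both grid and block dimensions) and the `make_linear()` helper are all assumptions made for illustration. Only the `grid` and `block` member names, and the name `complete_dimensions_t` from the referencing commit, come from this thread.

```cpp
#include <cstdint>

namespace sketch {  // hypothetical namespace, not part of the library

// Hypothetical stand-in for a 3-D extent; the library's real
// grid::dimensions_t and grid::block_dimensions_t are separate, richer types.
struct dimensions_t {
    std::uint32_t x, y, z;

    // Unspecified axes default to 1, so a single value describes a 1-D extent.
    constexpr dimensions_t(std::uint32_t x_ = 1, std::uint32_t y_ = 1, std::uint32_t z_ = 1)
        : x(x_), y(y_), z(z_) {}
};

constexpr bool operator==(const dimensions_t& lhs, const dimensions_t& rhs) {
    return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z;
}
constexpr bool operator!=(const dimensions_t& lhs, const dimensions_t& rhs) {
    return !(lhs == rhs);
}

// The proposed structure: named members instead of std::pair's first/second.
struct complete_dimensions_t {
    dimensions_t grid;   // grid dimensions, in blocks
    dimensions_t block;  // block dimensions, in threads
};

constexpr bool operator==(const complete_dimensions_t& lhs, const complete_dimensions_t& rhs) {
    return lhs.grid == rhs.grid && lhs.block == rhs.block;
}
constexpr bool operator!=(const complete_dimensions_t& lhs, const complete_dimensions_t& rhs) {
    return !(lhs == rhs);
}

// Hypothetical helper: wraps the two scalars a 1-dimensional occupancy query
// yields (a grid size and a block size) into the combined structure.
constexpr complete_dimensions_t make_linear(std::uint32_t grid_size, std::uint32_t block_size) {
    return complete_dimensions_t{ dimensions_t{grid_size}, dimensions_t{block_size} };
}

// Compile-time sanity checks on the comparison operators.
static_assert(make_linear(80, 256) == make_linear(80, 256), "equal dimensions compare equal");
static_assert(make_linear(80, 256) != make_linear(80, 128), "different block sizes compare unequal");

}  // namespace sketch
```

With a struct like this, call sites read `dims.grid` and `dims.block` rather than `dims.first` and `dims.second`, which is the main readability gain over the `std::pair`.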