Properly define a grid-and-block dims structure #284

eyalroz · 2021-12-10T22:23:35Z

The library already, perhaps somewhat against my earlier assumptions, has to contend with a structure representing both grid dimensions in blocks and block dimensions in threads, which is not yet a complete launch configuration. The `cudaOccupancyMaxPotentialBlockSize()` function returns this (albeit for the 1-dimensional case).

We currently use an `std::pair<cuda::grid::dimensions_t, cuda::grid::block_dimensions_t>`. This choice works, but there's no good reason not to have a simple dimensions struct with grid and block members. Let's have that.
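To illustrate the idea, here is a minimal sketch of what such a struct might look like. It is not the library's actual definition: `dims3` and the two aliases are simplified stand-ins for the real `cuda::grid::dimensions_t` and `cuda::grid::block_dimensions_t`, the `sketch` namespace and the `total_threads()` helper are hypothetical, and only the name `complete_dimensions_t` comes from the commit referenced in this issue.

```cpp
#include <cstddef>

namespace sketch {

// Simplified stand-in for a 3-D extent (an assumption for illustration only;
// the library's real dimension types are richer than this).
struct dims3 {
    unsigned x, y, z;

    // Number of elements covered by this extent.
    constexpr std::size_t volume() const {
        return static_cast<std::size_t>(x) * y * z;
    }
};

constexpr bool operator==(const dims3& a, const dims3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

using dimensions_t       = dims3;  // grid dimensions, in blocks (stand-in)
using block_dimensions_t = dims3;  // block dimensions, in threads (stand-in)

// Named members replace the pair's anonymous .first / .second.
struct complete_dimensions_t {
    dimensions_t       grid;
    block_dimensions_t block;
};

constexpr bool operator==(const complete_dimensions_t& a,
                          const complete_dimensions_t& b) {
    return a.grid == b.grid && a.block == b.block;
}

constexpr bool operator!=(const complete_dimensions_t& a,
                          const complete_dimensions_t& b) {
    return !(a == b);
}

// Hypothetical helper: total number of threads across the whole grid.
constexpr std::size_t total_threads(const complete_dimensions_t& d) {
    return d.grid.volume() * d.block.volume();
}

} // namespace sketch
```

With the pair, a caller has to remember that `.first` is the grid and `.second` is the block; with the struct, the same values read as `dims.grid` and `dims.block`, which is harder to mix up when both members have similar types.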


* For #284: Introduced a grid-and-block-dimensions structure, `grid::complete_dimensions_t`. Using it when returning both grid and block dimensions instead of an `std::pair`; it has equals
* For #285: Changed the construction pattern for `kernel_t`:
  * Dropped the templated, wrapping, direct constructor.
  * Added `kernel::detail_::wrap()` taking a device ID and an arbitrary (function) pointer, and a `kernel::wrap()` taking a device ID and a type-erased `const void*` pointer.
  * Made the lower-level `wrap()` a friend of the `kernel_t` class.
* Now using the default destructor for `kernel_t` (has nothing to do with the construction changes).
* Spacing tweaks.
* Comment typo fixes.
* Added not-equal operators for launch configurations.
* Added some comments to some `#endif`'s, reminding the reader of the condition used in the `#if` or `#ifdef`.
* Made some narrowing casts explicit, to clarify their intentionality to static analysis tools.
* Added two aliases to the sync/async boolean enum in `cuda::stream`.
* A bit of comment rephrasing.

Example program changes:

* Adapted examples for the use of `grid::complete_dimensions_t`.
* Now creating wrapped kernels using `cuda::kernel::wrap()` rather than by direct construction.
* Spacing tweaks.
* Changes to the `cudaChooseDevice()` function in `helper_cuda.h`; mainly:
  * Now returning a `cuda::device_t`.
  * No longer making the returned device current. In particular, that means that `simpleStreams.cu` may now be using a device that's not the current one.

eyalroz added the task label Dec 10, 2021

eyalroz self-assigned this Dec 10, 2021

eyalroz added the resolved-on-development label Dec 11, 2021

eyalroz closed this as completed Jan 14, 2022
