Make alignment consistent #809
Comments
In addition, the user needs greater control over the alignment/padding decisions. For example, by my reading of the padding-size calculation in View_Mapping.hpp (starting around line 1700), a 2-D, layout-right, double-precision view would only be padded if it had >= 64 columns, even though the performance benefits can be observed with a much smaller number of columns. For example, in a tensor project I am working on, the dominant cost is a kernel that processes a tall, skinny, layout-right matrix. With the current alignment/padding logic, I get about 30% less performance for smaller numbers of columns if I force padding versus not (e.g., by setting Kokkos::MEMORY_ALIGNMENT_THRESHOLD = 0 in Kokkos_MemoryTraits.hpp, instead of 4).

At the very minimum, the user should be able to change Kokkos::MEMORY_ALIGNMENT_THRESHOLD at compile time, similar to how it is done with Kokkos::MEMORY_ALIGNMENT. However, I think an even better approach would be to allow the user to specify these constants on a per-view basis, e.g., by passing them in as arguments to the AllowPadding ctor property.

Also, the user needs control over the alignment of scratch memory space views. For example, for KNL, I want scratch views 64-byte aligned for aligned vector loads/stores.
To clarify, I meant 30% less performance if the views are not padded (which is what happens now).
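To make the threshold discussion concrete, here is a minimal sketch of a threshold-gated padding rule of the kind described above. It is not the Kokkos implementation: `padded_stride` and all of its parameters are hypothetical names chosen for illustration, and the exact comparison Kokkos uses may differ.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Kokkos.
// Returns the row stride (in elements) for a row of n elements, padding it up
// to a multiple of the alignment only when n exceeds threshold * (elements per
// alignment unit).
inline std::size_t padded_stride(std::size_t n,
                                 std::size_t scalar_bytes,
                                 std::size_t alignment_bytes,
                                 std::size_t threshold) {
  // Number of scalars that fit in one alignment unit; 0 means "cannot pad"
  // because the scalar size does not evenly divide the alignment.
  const std::size_t per_align =
      (scalar_bytes != 0 && alignment_bytes % scalar_bytes == 0)
          ? alignment_bytes / scalar_bytes
          : 0;
  if (per_align == 0) return n;

  // Short rows are left unpadded to avoid wasting memory.
  if (n <= threshold * per_align) return n;

  // Round n up to the next multiple of per_align.
  const std::size_t rem = n % per_align;
  return rem == 0 ? n : n + (per_align - rem);
}
```

Under these assumptions, a row of 10 doubles with a 64-byte alignment stays at stride 10 when the threshold is 4, but is padded to 16 when the threshold is 0, which is the effect of the threshold change mentioned in the comment.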
For kokkos#1092: Move SpaceAccessibility meta-function to Kokkos namespace. For kokkos#809: Always have KOKKOS_MEMORY_ALIGNMENT macro defined, replace KOKKOS_ALIGN_SIZE with KOKKOS_MEMORY_ALIGNMENT, and static_assert that it is a power of two. Rename KOKKOS_ALIGN_PTR macro as KOKKOS_IMPL_ALIGN_PTR. Remove KOKKOS_ALIGN macro and just use __attribute__((aligned(size))); it is used in only one soon-to-be-deprecated location.
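The power-of-two check mentioned in the commit message can be sketched as follows. This is only an illustration of the idea, not the code from the commit: `EXAMPLE_MEMORY_ALIGNMENT`, `is_power_of_two`, and `align_up` are hypothetical stand-ins, and the default of 64 is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a single, always-defined alignment macro.
#ifndef EXAMPLE_MEMORY_ALIGNMENT
#define EXAMPLE_MEMORY_ALIGNMENT 64
#endif

// A nonzero value is a power of two exactly when it has a single bit set.
constexpr bool is_power_of_two(std::size_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

static_assert(is_power_of_two(EXAMPLE_MEMORY_ALIGNMENT),
              "memory alignment must be a power of two");

// Round an address up to the next multiple of the alignment.
// The mask trick below is only valid because the alignment is a power of two,
// which is why the static_assert above matters.
inline std::uintptr_t align_up(std::uintptr_t addr) {
  constexpr std::uintptr_t mask = EXAMPLE_MEMORY_ALIGNMENT - 1;
  return (addr + mask) & ~mask;
}
```

Keeping one macro and asserting on it at compile time means every place that rounds addresses with a bit mask can rely on the same invariant.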
In Kokkos_Macros.hpp we have KOKKOS_ALIGN_SIZE with inconsistent values. In Kokkos_MemoryTraits.hpp we have KOKKOS_MEMORY_ALIGNMENT with a different value, and scratch memory space Views are 8-byte aligned.
Need to have consistent alignment of global memory throughout; or clear comment/documentation for different alignments.
It is OK if the level 0 scratch alignment is smaller than the global memory alignment for CUDA.
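As an illustration of what a caller-selectable scratch alignment could look like, here is a minimal sketch of a bump allocator that hands out aligned blocks from a scratch buffer using `std::align`. `ScratchArena` and `get_aligned` are hypothetical names; this is not the Kokkos scratch memory space, only a sketch of the technique.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical scratch arena, not part of Kokkos.
struct ScratchArena {
  void* cursor;           // next free byte in the scratch buffer
  std::size_t remaining;  // bytes left after cursor

  // Returns a block of `bytes` bytes aligned to `alignment` (a power of two),
  // or nullptr if the request does not fit in the remaining space.
  void* get_aligned(std::size_t bytes, std::size_t alignment) {
    void* p = cursor;
    std::size_t space = remaining;
    // std::align advances p to the next aligned address and shrinks space by
    // the bytes skipped; it returns nullptr when the block cannot fit.
    if (std::align(alignment, bytes, p, space) == nullptr) return nullptr;
    cursor = static_cast<char*>(p) + bytes;
    remaining = space - bytes;
    return p;
  }
};
```

With an interface of this shape, a kernel targeting wide vector units could request 64-byte alignment, while a backend with small level 0 scratch could keep a smaller alignment to save space.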