Implement/check proper treatment of over-aligned Vec and Vector data #154

Closed
3 tasks done
KyleVaughn opened this issue May 15, 2024 · 1 comment

KyleVaughn commented May 15, 2024

The Vec<D, T> class typically stores its data in a plain array T[D], which is aligned only to alignof(T). However, when UM2_ENABLE_SIMD_VEC is on, if D is a power of 2 and T is an arithmetic type, GCC vector extensions are used as the underlying storage instead. This enables very nice SIMD optimizations on Vec. It also increases the alignment from alignof(T) to D * sizeof(T). See https://godbolt.org/z/or73xrxbh.

However, in Vector<T>, we allocate memory to store T using overload (1) of operator new (https://en.cppreference.com/w/cpp/memory/new/operator_new). It is unclear whether this memory will be appropriately aligned, since we do not explicitly request an alignment. Therefore, when using over-aligned types or GCC vector extensions, we want to verify that the memory, accesses to the memory, and related pointers are appropriately aligned.
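For context, a direct call to overload (1) only has to honor the default new alignment (__STDCPP_DEFAULT_NEW_ALIGNMENT__), whereas the C++17 overloads taking std::align_val_t let the caller request more. One possible way to request the alignment explicitly is sketched below. allocateAligned, deallocateAligned, and Block are hypothetical names for illustration; this is not how Vector<T> is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: raw storage for n objects of type T, with alignof(T)
// requested explicitly (C++17). No objects are constructed here.
template <class T>
T * allocateAligned(std::size_t n)
{
  void * p = ::operator new(n * sizeof(T), std::align_val_t{alignof(T)});
  return static_cast<T *>(p);
}

// Hypothetical helper: storage obtained with an alignment argument must be
// released with the matching aligned operator delete.
template <class T>
void deallocateAligned(T * p) noexcept
{
  ::operator delete(p, std::align_val_t{alignof(T)});
}

// Example over-aligned type: 64 bytes exceeds the usual default new alignment.
struct alignas(64) Block {
  double data[8];
};

inline void allocationExample()
{
  Block * p = allocateAligned<Block>(3);
  // The returned address is a multiple of the requested alignment.
  assert(reinterpret_cast<std::uintptr_t>(p) % alignof(Block) == 0);
  deallocateAligned(p);
}
```

The allocation and deallocation calls have to agree on the alignment argument, so pairing them in two small helpers keeps that invariant in one place.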

Failure to align properly results in undefined behavior, which in practice can show up as incorrect reads and, likely, segfaults.

Tasks related to this issue are:

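  • When UM2_ENABLE_SIMD_VEC is off, ensure that T[D] is still aligned for types which map to SIMD vectors. Use something like:
#include <type_traits> // std::is_arithmetic_v

// True only for positive powers of 2. The x > 0 check is needed because
// (0 & -1) == 0 would otherwise classify 0 as a power of 2.
static consteval auto
isPowerOf2(Int x) noexcept -> bool
{
  return x > 0 && (x & (x - 1)) == 0;
}

// Alignment to request for the storage of Vec<D, T>: the SIMD vector
// alignment when D is a power of 2 and T is arithmetic, otherwise the
// natural alignment of the array.
template <Int D, class T>
static consteval auto
vecAlignment() noexcept -> Int
{
  if constexpr (isPowerOf2(D) && std::is_arithmetic_v<T>) {
    return static_cast<Int>(D * sizeof(T));
  } else {
    return static_cast<Int>(alignof(T[D]));
  }
}

template <Int D, class T>
class Vec
{

  using Data = typename VecData<D, T>::Data;
  alignas(vecAlignment<D, T>()) Data _data;
...
};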
  • Investigate usage of new and delete in Vector and ensure that all pointers use properly aligned memory for over-aligned types. It should be sufficient to check that the address stored in the pointer is a multiple of the alignment: reinterpret_cast<std::uintptr_t>(pointer) % alignof(T) == 0
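The pointer check from the second task can be wrapped in a small helper. This is a minimal sketch; isAligned is a hypothetical name, not an existing function in the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address held in p is a multiple of
// `alignment`. Note that this inspects the pointed-to address, not the
// address of the pointer variable itself.
inline bool isAligned(void const * p, std::size_t alignment) noexcept
{
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Convenience overload that takes the alignment from the pointee type.
template <class T>
bool isAligned(T const * p) noexcept
{
  return isAligned(static_cast<void const *>(p), alignof(T));
}

inline void alignmentCheckExample()
{
  alignas(32) unsigned char buffer[64] = {};
  assert(isAligned(buffer, 32));      // start of a 32-byte-aligned buffer
  assert(!isAligned(buffer + 8, 32)); // 8 bytes in: not a multiple of 32
  assert(isAligned(buffer + 32, 32)); // next 32-byte boundary
}
```

A check like this could be placed in debug assertions wherever Vector obtains or advances a pointer to its storage.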

A potential add-on task:

  • When T is not an arithmetic type, but the underlying representation still maps to a SIMD vector, investigate usage of that SIMD vector as the storage. Example: Vec<2, Vec<4, double>> can be stored as __m512d. When UM2_ENABLE_SIMD_VEC is off and the storage is aligned, Clang 18 is able to perform optimizations like this, but GCC 14 is not. Testing addition of two Vec<2, Vec<4, double>> shows a single 512-bit add for aligned array storage in Clang 18, but two 256-bit adds when using GCC vector extensions.
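To make the add-on task concrete, the aligned array layout for the nested case could look roughly like the sketch below. Nested2x4 and addNested are hypothetical names, and whether the loop compiles to a single 512-bit add depends on the compiler and on target flags such as AVX-512 support, so this only illustrates the layout, not a guaranteed optimization.

```cpp
#include <cassert>

// Hypothetical flattened storage for Vec<2, Vec<4, double>>: 8 contiguous
// doubles, aligned to the full 64 bytes so the object can map onto one
// 512-bit register.
struct alignas(64) Nested2x4 {
  double data[8];
};

// Element-wise addition over the flattened storage. A vectorizing compiler
// may turn this loop into a single 512-bit add when the target supports it.
inline Nested2x4 addNested(Nested2x4 const & a, Nested2x4 const & b) noexcept
{
  Nested2x4 r{};
  for (int i = 0; i < 8; ++i) {
    r.data[i] = a.data[i] + b.data[i];
  }
  return r;
}

inline void nestedAddExample()
{
  Nested2x4 const a{{1, 2, 3, 4, 5, 6, 7, 8}};
  Nested2x4 const b{{10, 10, 10, 10, 10, 10, 10, 10}};
  Nested2x4 const r = addNested(a, b);
  for (int i = 0; i < 8; ++i) {
    assert(r.data[i] == a.data[i] + 10);
  }
}
```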

KyleVaughn commented

Implemented in the "format" branch, which will be merged into main in the next few days.
