Libraries for explicit vectorization that might be usable for the Alpaka element layer #652

Open
bussmann opened this issue Sep 24, 2018 · 8 comments

bussmann commented Sep 24, 2018

Vectorization is still an open issue (Does it belong in Alpaka at all? How do we enforce it?).
I want to use this issue to create a list of libraries that might help.
Please extend to your own liking.

Update: I have started ordering the list to my own liking in terms of usability, sustainability, etc.
My current take is that VecCore from CERN uses Vc as a backend, while xsimd is a genuinely independent approach. Vc also comes from within Helmholtz (Volker Lindenstruth) and thus has, in principle, long-term support, and there seems to be activity toward including it in the C++ standard.

  1. Vc: https://github.com/VcDevel/Vc
  2. xsimd: https://github.com/QuantStack/xsimd
  3. VecCore: https://github.com/root-project/veccore

These projects seem to be mostly one-person efforts or not very active at all:
Inastemp: https://gitlab.mpcdf.mpg.de/bbramas/inastemp
boost.simd: https://github.com/NumScale/boost.simd
VCL: https://www.agner.org/optimize/#vectorclass
VCL KNC: https://github.com/mancoast/vclknc
QuickVec (a student project): https://www.andrew.cmu.edu/user/mkellogg/15-418/final.html#

@ax3l
Member

ax3l commented Sep 24, 2018

xsimd: https://github.com/QuantStack/xsimd

@sbastrakov
Member

I have never tried it, but this could be good (although it is currently not part of Boost): https://github.com/NumScale/boost.simd

@BenjaminW3 BenjaminW3 removed this from the Future milestone May 26, 2019
@j-stephan j-stephan added this to To do in Release 0.7 via automation Jan 26, 2021
@j-stephan j-stephan added this to the Version 0.7.0 milestone Jan 26, 2021
@j-stephan
Member

We would like to get this into alpaka 0.7.0. However, this requires #38 to be resolved.

@bernhardmgruber
Member

One of the main issues with SIMD libraries and alpaka is that you want to write your kernel code using such SIMD facilities and have it emit good vector code for CPU targets, while the same code must also compile for GPUs. With existing libraries, this is not trivial.

LLAMA contains such an approach using Vc in: alpaka-group/llama#128. The key idea here is that for GPU targets, the kernel code compiles down to a scalar version and does not use the SIMD library at all, because SIMD library functions are usually not annotated with __host__ or __device__, so they cannot be referenced when we compile for CUDA or HIP.

CERN's VecCore solves exactly that by offering a vector type that can be switched, at compile time, between a Vc vector and a scalar, thus also keeping Vc out when compiling for CUDA. So VecCore could be a potential off-the-shelf solution.

We could also hand-roll our own small SIMD wrapper that compiles either to a scalar, to a loop over a vector of elements, or to a type from a SIMD library such as Vc. But I guess this is a significant effort.
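To make the hand-rolled option a bit more concrete, here is a minimal sketch of what such a wrapper could look like. All names (`Pack`, `kLanes`, `SKETCH_FN`, `axpy`) are hypothetical and not part of alpaka, LLAMA, Vc or VecCore. The sketch assumes that CUDA and HIP compilers define `__CUDACC__` or `__HIPCC__`, and it picks 8 lanes on the CPU purely as an example width. The pack is just a loop over a fixed number of elements that the compiler may auto-vectorize; with one lane it degenerates to scalar code.

```cpp
#include <cstddef>

// Hypothetical annotation macro: device compilers need __host__ __device__,
// while plain host compilers must not see these keywords.
#if defined(__CUDACC__) || defined(__HIPCC__)
#    define SKETCH_FN __host__ __device__
constexpr std::size_t kLanes = 1; // scalar fallback for GPU targets
#else
#    define SKETCH_FN
constexpr std::size_t kLanes = 8; // assumed CPU width, e.g. 8 floats
#endif

// Hand-rolled pack: a fixed-size array with element-wise operations.
template <typename T, std::size_t N>
struct Pack
{
    T lane[N];

    SKETCH_FN static Pack broadcast(T value)
    {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.lane[i] = value;
        return r;
    }

    SKETCH_FN static Pack load(const T* p)
    {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.lane[i] = p[i];
        return r;
    }

    SKETCH_FN void store(T* p) const
    {
        for (std::size_t i = 0; i < N; ++i) p[i] = lane[i];
    }

    SKETCH_FN friend Pack operator+(Pack a, const Pack& b)
    {
        for (std::size_t i = 0; i < N; ++i) a.lane[i] += b.lane[i];
        return a;
    }

    SKETCH_FN friend Pack operator*(Pack a, const Pack& b)
    {
        for (std::size_t i = 0; i < N; ++i) a.lane[i] *= b.lane[i];
        return a;
    }
};

// y = a * x + y, written once against the pack type.
// Full packs are processed first, the remainder element by element.
template <typename T, std::size_t N>
SKETCH_FN void axpy(T a, const T* x, T* y, std::size_t n)
{
    const Pack<T, N> va = Pack<T, N>::broadcast(a);
    std::size_t i = 0;
    for (; i + N <= n; i += N)
    {
        const Pack<T, N> vx = Pack<T, N>::load(x + i);
        const Pack<T, N> vy = Pack<T, N>::load(y + i);
        (va * vx + vy).store(y + i);
    }
    for (; i < n; ++i) y[i] = a * x[i] + y[i];
}

// Convenience entry point using the lane count chosen for this target.
SKETCH_FN inline void axpyDefault(float a, const float* x, float* y, std::size_t n)
{
    axpy<float, kLanes>(a, x, y, n);
}
```

Calling `axpy<float, 1>(...)` gives the scalar variant and `axpy<float, 4>(...)` a four-lane variant of the same kernel body. Swapping the loops inside `Pack` for a SIMD library type would be the third option mentioned above.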

As for the API design, it seems like some implementations are converging on the std::simd design, which you can find here: https://en.cppreference.com/w/cpp/experimental/simd/simd. For a detailed rationale of the design, you can read Matthias Kretz's PhD thesis.
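For readers who have not seen the std::simd style yet, here is a small sketch of an axpy loop written against `std::experimental::simd` from the Parallelism TS v2. It assumes a standard library that ships `<experimental/simd>` and defines the `__cpp_lib_experimental_parallel_simd` feature macro (libstdc++ in GCC 11 or newer does, as far as I know). If that is not the case, only the scalar loop at the end is compiled. The function name `axpy` and the macro `SKETCH_HAS_STD_SIMD` are made up for this example.

```cpp
#include <cstddef>

// Only pull in the TS header when it exists and C++17 is enabled.
#if defined(__has_include)
#    if __has_include(<experimental/simd>) && __cplusplus >= 201703L
#        include <experimental/simd>
#    endif
#endif

#if defined(__cpp_lib_experimental_parallel_simd) && __cplusplus >= 201703L
#    define SKETCH_HAS_STD_SIMD 1
namespace stdx = std::experimental;
#else
#    define SKETCH_HAS_STD_SIMD 0
#endif

// y = a * x + y
inline void axpy(float a, const float* x, float* y, std::size_t n)
{
    std::size_t i = 0;
#if SKETCH_HAS_STD_SIMD
    using V = stdx::native_simd<float>; // widest vector for the target
    const V va(a);                      // broadcast the scalar to all lanes
    for (; i + V::size() <= n; i += V::size())
    {
        V vx(x + i, stdx::element_aligned); // load without alignment demands
        V vy(y + i, stdx::element_aligned);
        vy = va * vx + vy;
        vy.copy_to(y + i, stdx::element_aligned);
    }
#endif
    // Scalar tail, and the complete loop when std::simd is unavailable.
    for (; i < n; ++i) y[i] = a * x[i] + y[i];
}
```

The lane count never appears in the kernel; `V::size()` is decided by the target, which is the property that makes this design attractive for portable kernels.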

Also interesting: the Kokkos SIMD library uses exactly this approach as well: https://github.com/kokkos/simd-math. Also see the tutorial slides here: https://github.com/kokkos/kokkos-tutorials/blob/main/LectureSeries/KokkosTutorial_05_SIMDStreamsTasking.pdf.

Kokkos SIMD also shows the interaction with Kokkos Views: it seems that you declare your SIMD types already in your views, as in Kokkos::View<Kokkos::SIMD<float>>. There are more interesting options for the SIMD ABI parameter, which I have not studied in detail yet. So we also need to consider how the SIMD types interact with memory views.
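As a rough illustration of what "SIMD types in the view" could mean for the memory layout, here is a sketch of a container whose storage element is a pack rather than a scalar. This is not the Kokkos API; `Pack`, `PackedView` and `scale` are hypothetical names, and the pack is a plain `std::array` stand-in for a real SIMD type.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Stand-in for a SIMD type with N lanes.
template <typename T, std::size_t N>
using Pack = std::array<T, N>;

// A 1D "view" that stores packs; scalar indices are mapped onto (pack, lane).
template <typename T, std::size_t N>
class PackedView
{
public:
    // Rounds up to whole packs; padding lanes are zero-initialized.
    explicit PackedView(std::size_t n) : size_(n), packs_((n + N - 1) / N) {}

    std::size_t size() const { return size_; }
    std::size_t packCount() const { return packs_.size(); }

    // Scalar access for code that does not care about packs.
    T& operator[](std::size_t i) { return packs_[i / N][i % N]; }

    // Pack access for vectorized kernels.
    Pack<T, N>& pack(std::size_t p) { return packs_[p]; }

private:
    std::size_t size_;
    std::vector<Pack<T, N>> packs_;
};

// Kernel that iterates pack by pack instead of element by element.
template <typename T, std::size_t N>
void scale(PackedView<T, N>& view, T factor)
{
    for (std::size_t p = 0; p < view.packCount(); ++p)
        for (T& lane : view.pack(p)) lane *= factor;
}
```

With this layout the kernel needs no tail handling because the view is padded to a whole number of packs. The price is that the lane count becomes part of the data type, which is exactly the coupling between SIMD types and views that would need to be decided.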

@sbastrakov
Member

Just to add to the list: https://github.com/google/highway

@j-stephan j-stephan added this to To do in Release 0.8 via automation May 11, 2021
@j-stephan j-stephan removed this from To do in Release 0.7 May 11, 2021
@j-stephan j-stephan removed this from To do in Release 0.8 Nov 10, 2021
@j-stephan j-stephan added this to To do in Release 0.9 via automation Nov 10, 2021
@j-stephan j-stephan removed this from To do in Release 0.9 Mar 29, 2022
@j-stephan j-stephan added this to To do in Release 1.0 via automation Mar 29, 2022
@j-stephan j-stephan removed this from the Version 0.9.0 (I/2022) milestone Mar 29, 2022
@bernhardmgruber
Member

Btw, I solved this recently in LLAMA. Here is the documentation: https://llama-doc.readthedocs.io/en/latest/pages/simd.html
I also presented it on my poster at ACAT22 last week: https://indico.cern.ch/event/1106990/contributions/4991311/attachments/2533306/4361386/LLAMA%20poster.pdf

@fwyzard
Contributor

fwyzard commented Nov 5, 2022

Btw, Intel is working on proposing a SIMD library based on xvec/simd for inclusion in Boost: https://lists.boost.org/Archives/boost/2022/09/253579.php

@bernhardmgruber
Member

IIUC, this is an implementation of std::simd by Intel. It's great to see more implementations appearing! And I am especially happy that they are trying to get it into Boost. That is going to be tough :)

Thanks for sharing the link!

@bernhardmgruber bernhardmgruber removed this from To do in Release 1.0 Dec 9, 2022