2 changes: 1 addition & 1 deletion README.md
@@ -142,7 +142,7 @@ This example outputs:

### Auto detection of the instruction set extension to be used

The same computation operating on vectors and using the most performant instruction set available:
The same computation operating on vectors and using the most performant instruction set available at compile time, based on the provided compiler flags (e.g. ``-mavx2`` for GCC and Clang to target AVX2):

```cpp
#include <cstddef>
// ...
```
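
As a rough illustration of the idea (a sketch, not the README's exact listing; the `mean` function and the use of unaligned loads are illustrative choices), the same kernel written against `xsimd::batch<double>` picks up whatever instruction set the compiler flags enable:

```cpp
#include <cstddef>
#include <vector>
#include "xsimd/xsimd.hpp"

// xsimd::batch<double> maps to the widest instruction set enabled by the
// compiler flags (e.g. batches of 4 doubles with -mavx2), so the same source
// vectorizes differently depending on how it is compiled.
void mean(const std::vector<double>& a, const std::vector<double>& b,
          std::vector<double>& res)
{
    using batch = xsimd::batch<double>;
    std::size_t size = res.size();
    std::size_t vec_size = size - size % batch::size;

    for (std::size_t i = 0; i < vec_size; i += batch::size)
    {
        batch ba = batch::load_unaligned(&a[i]);
        batch bb = batch::load_unaligned(&b[i]);
        ((ba + bb) / 2.).store_unaligned(&res[i]);
    }
    for (std::size_t i = vec_size; i < size; ++i)  // scalar tail
        res[i] = (a[i] + b[i]) / 2.;
}
```
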
23 changes: 20 additions & 3 deletions docs/source/index.rst
@@ -12,15 +12,27 @@ C++ wrappers for SIMD intrinsics.
Introduction
------------

SIMD (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
`SIMD`_ (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor
vendors and compilers.

`xsimd` provides a unified means for library authors to use these features. Namely, it enables manipulation of batches of scalar and complex numbers with the same arithmetic
operators and common mathematical functions as for single values.

`xsimd` makes it easy to write a single algorithm, generate one version of the algorithm per micro-architecture and pick the best one at runtime, based on the
running processor capability.
There are several ways to use `xsimd`:

- one can write a generic, vectorized algorithm and compile it as part of their
application build, with the right architecture flag;

- one can write a generic, vectorized algorithm and compile several versions of
it by just changing the architecture flags, then pick the best version at
runtime;

- one can write a vectorized algorithm specialized for a given architecture and
still benefit from the high-level abstraction provided by `xsimd`.

Of course, nothing prevents combining several of those approaches; more on this
in the :ref:`Writing vectorized code` section. A short sketch of the pattern
underlying all three approaches follows.
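
All three approaches build on the same mechanism: the target architecture is a
template parameter of ``xsimd::batch``, so a kernel templated on it can be pinned
to one architecture, instantiated several times, or left to the compile-time
default. The ``scale`` kernel below is an illustrative sketch, not part of `xsimd`:

.. code-block:: cpp

    #include <cstddef>
    #include <vector>
    #include "xsimd/xsimd.hpp"

    // Illustrative kernel: the architecture is a template parameter, so the
    // same code can target the compile-time default, be instantiated once per
    // architecture, or be pinned to a specific one such as xsimd::avx2.
    template <class Arch>
    void scale(Arch, const std::vector<float>& in, std::vector<float>& out, float factor)
    {
        using batch = xsimd::batch<float, Arch>;
        std::size_t vec_size = in.size() - in.size() % batch::size;
        for (std::size_t i = 0; i < vec_size; i += batch::size)
            (batch::load_unaligned(&in[i]) * factor).store_unaligned(&out[i]);
        for (std::size_t i = vec_size; i < in.size(); ++i)  // scalar tail
            out[i] = in[i] * factor;
    }

    // scale(xsimd::default_arch{}, in, out, 2.f);  // best ISA enabled at compile time
    // scale(xsimd::avx2{}, in, out, 2.f);          // explicit choice (needs -mavx2)
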

You can find out more about this implementation of C++ wrappers for SIMD intrinsics on `The C++ Scientist`_ blog. The mathematical functions are a
lightweight implementation of the algorithms also used in `boost.SIMD`_.
@@ -52,6 +64,10 @@ The following SIMD instruction set extensions are supported:
+--------------+---------------------------------------------------------+
| WebAssembly | WASM |
+--------------+---------------------------------------------------------+
| RISC-V | Vector ISA |
+--------------+---------------------------------------------------------+
| PowerPC | VSX |
+--------------+---------------------------------------------------------+

Licensing
---------
@@ -104,6 +120,7 @@ This software is licensed under the BSD-3-Clause license. See the LICENSE file f



.. _SIMD: https://fr.wikipedia.org/wiki/Single_instruction_multiple_data
.. _The C++ Scientist: http://johanmabille.github.io/blog/archives/
.. _boost.SIMD: https://github.com/NumScale/boost.simd

6 changes: 6 additions & 0 deletions docs/source/vectorized_code.rst
@@ -69,5 +69,11 @@ as a template parameter:

.. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set_mean_arch_independent.cpp

Then you just need to ``#include`` that file, force its instantiation for a
specific architecture, and pass the appropriate flag to the compiler. For instance:

.. literalinclude:: ../../test/doc/sum_sse2.cpp


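The included file itself is not reproduced here. As a purely hypothetical
illustration of the pattern (the ``sum`` kernel, its signature, and the choice of
SSE2 are assumptions, not the content of the file above), an explicit
instantiation compiled with ``-msse2`` could look like:

.. code-block:: cpp

    #include <cstddef>
    #include "xsimd/xsimd.hpp"

    // Architecture-generic definition, normally placed in a shared header.
    template <class Arch>
    void sum(Arch, const float* data, std::size_t size, float& res)
    {
        using batch = xsimd::batch<float, Arch>;
        batch acc(0.f);
        std::size_t vec_size = size - size % batch::size;
        for (std::size_t i = 0; i < vec_size; i += batch::size)
            acc += batch::load_unaligned(&data[i]);
        res = xsimd::reduce_add(acc);  // horizontal sum of the batch
        for (std::size_t i = vec_size; i < size; ++i)
            res += data[i];
    }

    // Forced instantiation for SSE2, placed in a translation unit compiled
    // with -msse2 (GCC/Clang):
    template void sum<xsimd::sse2>(xsimd::sse2, const float*, std::size_t, float&);
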
This can be useful to implement runtime dispatching based on the instruction set detected on the running processor. `xsimd` provides generic machinery, :cpp:func:`xsimd::dispatch()`, to implement
this pattern. Based on the above example, instead of calling ``mean{}(arch, a, b, res, tag)``, one can use ``xsimd::dispatch(mean{})(a, b, res, tag)``. More about this can be found in the :ref:`Arch Dispatching` section.
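
For concreteness, here is a sketch of what such a functor and the dispatched call
can look like; only the ``xsimd::dispatch(mean{})(a, b, res, tag)`` call shape
comes from the documentation above, the functor body is illustrative:

.. code-block:: cpp

    #include <cstddef>
    #include <vector>
    #include "xsimd/xsimd.hpp"

    // Functor whose call operator takes the architecture as its first argument,
    // which is the shape xsimd::dispatch expects.
    struct mean
    {
        template <class Arch, class Tag>
        void operator()(Arch, const std::vector<double>& a, const std::vector<double>& b,
                        std::vector<double>& res, Tag) const
        {
            using batch = xsimd::batch<double, Arch>;
            std::size_t size = res.size();
            std::size_t vec_size = size - size % batch::size;
            for (std::size_t i = 0; i < vec_size; i += batch::size)
            {
                batch ba = batch::load(&a[i], Tag{});
                batch bb = batch::load(&b[i], Tag{});
                ((ba + bb) / 2.).store(&res[i], Tag{});
            }
            for (std::size_t i = vec_size; i < size; ++i)
                res[i] = (a[i] + b[i]) / 2.;
        }
    };

    // The dispatched callable probes the running CPU and forwards the call to
    // the best available architecture:
    // xsimd::dispatch(mean{})(a, b, res, xsimd::unaligned_mode{});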