Home

Povilas Kanapickas edited this page Dec 14, 2017 · 25 revisions

Latest release

The library is developed in C++11. A separate branch that uses C++03, and releases based on it, are provided for compatibility with older compilers.

For older releases, please check out this page

2.1

C++11 version

C++03-compatible version

The library supports the following architectures and instruction sets:

  • x86, x86-64: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA3, FMA4, AVX512F, AVX512BW, AVX512DQ, AVX512VL, XOP
  • ARM 32-bit: NEON, NEONv2
  • ARM 64-bit: NEON, NEONv2
  • PowerPC 32-bit big-endian: Altivec, VSX v2.06, VSX v2.07
  • PowerPC 64-bit little-endian: Altivec, VSX v2.06, VSX v2.07
  • MIPS 32-bit little-endian: MSA
  • MIPS 64-bit little-endian: MSA
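
Which of these instruction sets a given build actually targets depends on the compiler flags. As a rough illustration, the sketch below lists the sets enabled for the current translation unit by checking predefined macros. It assumes a GCC- or Clang-style compiler (MSVC and ICC on Windows expose a different, smaller set of macros), it does not distinguish NEONv2 or the VSX versions, and `enabled_instruction_sets()` is a hypothetical helper that is not part of the library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of libsimdpp: returns the names of the
// instruction sets the compiler was told to target, based on macros that
// GCC and Clang predefine. The list is illustrative, not exhaustive.
std::vector<std::string> enabled_instruction_sets()
{
    std::vector<std::string> sets;
#if defined(__SSE2__)
    sets.push_back("SSE2");
#endif
#if defined(__SSE3__)
    sets.push_back("SSE3");
#endif
#if defined(__SSSE3__)
    sets.push_back("SSSE3");
#endif
#if defined(__SSE4_1__)
    sets.push_back("SSE4.1");
#endif
#if defined(__AVX__)
    sets.push_back("AVX");
#endif
#if defined(__AVX2__)
    sets.push_back("AVX2");
#endif
#if defined(__FMA__)
    sets.push_back("FMA3");
#endif
#if defined(__FMA4__)
    sets.push_back("FMA4");
#endif
#if defined(__AVX512F__)
    sets.push_back("AVX512F");
#endif
#if defined(__AVX512BW__)
    sets.push_back("AVX512BW");
#endif
#if defined(__AVX512DQ__)
    sets.push_back("AVX512DQ");
#endif
#if defined(__AVX512VL__)
    sets.push_back("AVX512VL");
#endif
#if defined(__XOP__)
    sets.push_back("XOP");
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    sets.push_back("NEON");
#endif
#if defined(__ALTIVEC__)
    sets.push_back("Altivec");
#endif
#if defined(__VSX__)
    sets.push_back("VSX");
#endif
#if defined(__mips_msa)
    sets.push_back("MSA");
#endif
    return sets;
}
```

With GCC or Clang on x86, for example, compiling with `-mavx2` makes the returned list include AVX2, AVX and the SSE sets.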

Supported compilers:

  • C++11 version:

    • GCC: 4.8-7.x
    • Clang: 3.3-4.0
    • Xcode 7.0-9.x
    • MSVC: 2013, 2015, 2017
    • ICC (on both Linux and Windows): 2013, 2015, 2016, 2017
  • C++98 version:

    • GCC: 4.4-7.x
    • Clang: 3.3-4.0
    • Xcode 7.0-9.x
    • MSVC: 2013, 2015, 2017
    • ICC (on both Linux and Windows): 2013, 2015, 2016, 2017

Newer versions of the aforementioned compilers will generally work with either the C++11 or the C++98 version of the library. Older versions of these compilers will generally work with the C++98 version of the library.

Changes since v2.0:

  • Various bug fixes
  • Documentation has been significantly improved. The public API is now almost fully documented.
  • Added support for MIPS MSA instruction set.
  • Added support for PowerPC VSX v2.06 and v2.07 instruction sets.
  • Added support for x86 AVX512BW, AVX512DQ and AVX512VL instruction sets.
  • Added support for 64-bit little-endian PowerPC.
  • Added support for arbitrary width vectors in extract() and insert().
  • Added support for arbitrary source vectors to to_int8(), to_uint8(), to_int16(), to_uint16(), to_int32(), to_uint32(), to_int64(), to_uint64(), to_float32(), to_float64().
  • Added support for per-element integer shifts to shift_r() and shift_l(). Fallback paths are provided for SSE2-AVX instruction sets that lack hardware per-element integer shift support.
  • Made shuffle_bytes16(), shuffle_zbytes16(), permute_bytes16() and permute_zbytes() more generic.
  • New functions: popcnt(), reduce_popcnt(), for_each(), to_mask().
  • Xcode is now supported.
  • The library has been refactored so that older compilers can optimize vector emulation code paths much better than before.
  • Deprecation: implicit conversion operators to native vector types have been deprecated and a replacement method has been provided instead. The implicit conversion operators may lead to wrong code being accepted without a compile error on Clang.
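
To make two of the items above concrete, the following scalar sketch shows what a per-element shift and a population-count reduction compute. It is plain C++ and not the library's implementation. The `*_ref` names are hypothetical, and returning 0 for shift counts of 32 or more is an assumption of this sketch, not documented library behavior.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reference helpers, not libsimdpp API.

// Per-element left shift: element i of v is shifted by element i of count.
template<std::size_t N>
std::array<std::uint32_t, N> shift_l_each_ref(
    const std::array<std::uint32_t, N>& v,
    const std::array<std::uint32_t, N>& count)
{
    std::array<std::uint32_t, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        // Shifting a 32-bit value by 32 or more is undefined in C++,
        // so this sketch returns 0 for such counts (an assumption).
        r[i] = count[i] < 32 ? v[i] << count[i] : 0u;
    }
    return r;
}

// Per-element logical right shift, same conventions as above.
template<std::size_t N>
std::array<std::uint32_t, N> shift_r_each_ref(
    const std::array<std::uint32_t, N>& v,
    const std::array<std::uint32_t, N>& count)
{
    std::array<std::uint32_t, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = count[i] < 32 ? v[i] >> count[i] : 0u;
    }
    return r;
}

// Total number of set bits across all elements of v.
template<std::size_t N>
unsigned reduce_popcnt_ref(const std::array<std::uint32_t, N>& v)
{
    unsigned total = 0;
    for (std::uint32_t x : v) {
        while (x != 0) {
            x &= x - 1;  // clear the lowest set bit
            ++total;
        }
    }
    return total;
}
```

For instance, shifting the elements {1, 2, 3, 4} left by {0, 1, 2, 3} gives {1, 4, 12, 32}, and the population-count reduction of {1, 2, 3, 4} is 5.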

Open source projects using libsimdpp