@hfp released this on Mar 15, 2018

This release enables JIT code generation of small matrix multiplications for SSE3 targets. Previously, JIT code generation supported only AVX and later instruction set extensions. SSE JIT code generation is supported only in the MM domain (matrix multiplication). Compatibility of the library has been further refined. The application binary interface (ABI) has been narrowed from more than 500 exported functions to about half that number by adjusting symbol visibility. This revision prepares for a smooth transition to v2.0: low-level internals (descriptor handling, etc.) are now hidden from the public interface, and two deprecated functions have been removed. Most prominently, the prefetch enumerators have been renamed, e.g., LIBXSMM_PREFETCH_AL2 is now LIBXSMM_GEMM_PREFETCH_AL2.
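To illustrate how adjusting symbol visibility can shrink an ABI, here is a minimal sketch of decorating functions as exported or internal. The macro names `MYLIB_API` and `MYLIB_API_INTERN` and both functions are hypothetical and are not LIBXSMM's actual decorations; the sketch only assumes the GCC/Clang `visibility` attribute.

```c
/* Hypothetical decoration macros (illustration only, not LIBXSMM's names). */
#if defined(__GNUC__) && !defined(_WIN32)
# define MYLIB_API        __attribute__((visibility("default")))
# define MYLIB_API_INTERN __attribute__((visibility("hidden")))
#else
/* Fallback: no decoration on toolchains without the attribute. */
# define MYLIB_API
# define MYLIB_API_INTERN
#endif

/* Internal helper: callable inside the library, but not exported. */
MYLIB_API_INTERN int mylib_internal_scale(int value, int factor);

/* Public entry point: remains part of the exported ABI. */
MYLIB_API int mylib_scale_by_two(int value);

int mylib_internal_scale(int value, int factor)
{
  return value * factor;
}

int mylib_scale_by_two(int value)
{
  return mylib_internal_scale(value, 2);
}
```

When such code is built as a shared library with GCC or Clang, the hidden helper should no longer appear among the exported dynamic symbols (which can be checked with a tool such as `nm -D`), while the public function stays available to applications.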


  • ABI specification improved: exported functions are decorated for visibility/internal use (issue #205).
  • Math functions to eventually avoid the LIBM dependency, or to control specific requirements (libxsmm_math.h).
  • MM: enabled JIT-generation of SSE code for small matrix multiplications (BE and FE support).
  • MM: extended FE to handle multiple flavors of low-precision GEMMs (C and C++).
  • Detect maintainer build and avoid target flags (GCC toolchain, STATIC=0).
  • SMM: I16I32 and I16F32 WGEMM for SKX and future processors.
  • Hardening all builds by default (Linux package requirements).
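As a rough illustration of the low-precision GEMM flavors listed above, the following plain C reference computes C := C + A * B with 16-bit integer inputs and 32-bit integer accumulation. Reading "I16I32" this way is an assumption about the naming. The function `ref_gemm_i16i32` is hypothetical, uses column-major storage with leading dimensions, and says nothing about the data layout or instructions the JIT-generated kernels actually use.

```c
#include <stdint.h>

/* Reference (non-optimized) GEMM: C := C + A * B.
 * A is m x k, B is k x n, C is m x n, all stored column-major.
 * Inputs are 16-bit integers; products are accumulated in 32 bits.
 * Hypothetical helper for illustration, not part of LIBXSMM. */
void ref_gemm_i16i32(int m, int n, int k,
                     const int16_t *a, int lda,
                     const int16_t *b, int ldb,
                     int32_t *c, int ldc)
{
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      int32_t acc = c[j * ldc + i];
      for (int p = 0; p < k; ++p) {
        /* Widen before multiplying so the product is a 32-bit value. */
        acc += (int32_t)a[p * lda + i] * (int32_t)b[j * ldb + p];
      }
      c[j * ldc + i] = acc;
    }
  }
}
```

Accumulating in a wider type than the inputs is the point of such a flavor: a single product of two 16-bit values can already exceed the 16-bit range.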


  • MM domain: renamed prefetch enumerators; kept "generic" names SIGONLY, NONE, and AUTO (FE).
  • Build system presents final summary (similar to initial summary); also mentions VTune (if enabled).
  • Adjusted TF scratch allocator to adopt global rather than context's allocator (limited memory).
  • Combined JIT-kernel samples with respective higher level samples (xgemm, transpose).
  • Enabled extra (even more pedantic) warnings, and adjusted the code base accordingly.
  • Adjusted Fortran samples for PGI compiler (failed to deduce generic procedures).
  • Removed deprecated libxsmm_[create/release]_dgemm_descriptor functions.
  • Included validation and compatibility information into PDF (Appendix).
  • MinGW: automatically apply certain compiler flags (workaround).
  • Internalized low-level descriptor setup (opaque type definitions).
  • Moved LIBXSMM_DNN_INTERNAL_API into internal API.
  • Fixed dynamic linkage with CCE (CRAY compiler).


  • Take prefetch requests in libxsmm_xmmdispatch (similar to libxsmm_[s|d|w]mmdispatch).
  • SpMM: prevent generating (unsupported) SP-kernels (incorrect condition).
  • Fixed code-gen. bug in GEMM/KNM, corrected K-check in WGEMM/KNM.
  • MinGW: correctly parse path of library requirements ("drive letter").
  • Fixed VC projects to build DLLs if requested.