Version 1.9
This release enables JIT-code generation of small matrix multiplications for SSE3 targets. Previously, JIT code was only supported for AVX and beyond. SSE JIT-code generation is only supported for the MM domain (matrix multiplication). The compatibility of the library has been further refined. The application binary interface (ABI) has been narrowed from more than 500 functions to roughly half that number due to adjusted symbol visibility. This revision prepares for a smooth transition to v2.0: low-level details (descriptor handling, etc.) are now internal, and two deprecated functions have been removed. Most prominently, the prefetch enumerators have been renamed, e.g., LIBXSMM_PREFETCH_AL2 is now LIBXSMM_GEMM_PREFETCH_AL2.
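Application code that has to build against both older releases and v1.9 may want to hide the prefetch enumerator rename behind its own alias. The following is a minimal sketch, not part of LIBXSMM: MYAPP_PREFETCH_AL2 and myapp_prefetch_al2 are hypothetical names, the version macros LIBXSMM_VERSION_MAJOR and LIBXSMM_VERSION_MINOR are assumed to be provided by the library header, and the stand-in definitions (including the enumerator value) are placeholders that only exist so the sketch compiles without LIBXSMM installed.

```c
/* Stand-ins so this sketch compiles without LIBXSMM (assumption: real code
 * would include <libxsmm.h> instead, which is expected to provide the version
 * macros and the prefetch enumerators). The value below is a placeholder. */
#ifndef LIBXSMM_VERSION_MAJOR
# define LIBXSMM_VERSION_MAJOR 1
# define LIBXSMM_VERSION_MINOR 9
enum { LIBXSMM_GEMM_PREFETCH_AL2 = 2 /* placeholder, not the real value */ };
#endif

/* Hypothetical application-side alias: select the enumerator name by version.
 * v1.9 and later use the LIBXSMM_GEMM_PREFETCH_ prefix, earlier releases
 * use the LIBXSMM_PREFETCH_ prefix. */
#if (LIBXSMM_VERSION_MAJOR > 1) || \
    (LIBXSMM_VERSION_MAJOR == 1 && LIBXSMM_VERSION_MINOR >= 9)
# define MYAPP_PREFETCH_AL2 LIBXSMM_GEMM_PREFETCH_AL2
#else
# define MYAPP_PREFETCH_AL2 LIBXSMM_PREFETCH_AL2
#endif

/* Hypothetical helper: returns the prefetch strategy the application uses. */
int myapp_prefetch_al2(void)
{
  return (int)MYAPP_PREFETCH_AL2;
}
```

The rest of the application would then refer only to MYAPP_PREFETCH_AL2, so the rename is handled in one place.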
INTRODUCED
- ABI specification improved: exported functions are decorated for visibility/internal use (issue #205).
- Math functions to eventually avoid the LIBM dependency, or to control specific requirements (libxsmm_math.h).
- MM: enabled JIT-generation of SSE code for small matrix multiplications (BE and FE support).
- MM: extended FE to handle multiple flavors of low-precision GEMMs (C and C++).
- Detect maintainer build and avoid target flags (GCC toolchain, STATIC=0).
- SMM: I16I32 and I16F32 WGEMM for SKX and future processors.
- Hardening all builds by default (Linux package requirements).
IMPROVEMENTS / CHANGES
- MM domain: renamed prefetch enumerators; kept "generic" names SIGONLY, NONE, and AUTO (FE).
- Build system presents final summary (similar to initial summary); also mentions VTune (if enabled).
- Adjusted TF scratch allocator to adopt global rather than context's allocator (limited memory).
- Combined JIT-kernel samples with respective higher level samples (xgemm, transpose).
- Enabled extra (even more pedantic) warnings, and adjusted the code base accordingly.
- Adjusted Fortran samples for PGI compiler (failed to deduce generic procedures).
- Removed deprecated libxsmm_[create/release]_dgemm_descriptor functions.
- Included validation and compatibility information into PDF (Appendix).
- MinGW: automatically apply certain compiler flags (workaround).
- Internalized low-level descriptor setup (opaque type definitions).
- Moved LIBXSMM_DNN_INTERNAL_API into internal API.
- Fixed dynamic linkage with CCE (Cray compiler).
FIXES
- Take prefetch requests in libxsmm_xmmdispatch (similar to libxsmm_[s|d|w]mmdispatch).
- SpMM: prevent generating (unsupported) SP-kernels (incorrect condition).
- Fixed code-gen. bug in GEMM/KNM, corrected K-check in WGEMM/KNM.
- MinGW: correctly parse path of library requirements ("drive letter").
- Fixed VC projects to build DLLs if requested.