Description
Here's the actual use case: Intel has recently released a "Strict CNR mode" for MKL (in the 2019.3 release), see: https://software.intel.com/en-us/mkl-linux-developer-guide-reproducibility-conditions
Basically, for a small subset of BLAS functions, it guarantees strict bit-wise reproducibility regardless of the number of threads used etc:
In strict CNR mode, Intel MKL provides bitwise reproducible results for a limited set of functions and code branches even when the number of threads changes. These routines and branches support strict CNR mode (64-bit libraries only):
- ?gemm, ?symm, ?hemm, ?trsm and their CBLAS equivalents (cblas_?gemm, cblas_?symm, cblas_?hemm, and cblas_?trsm).
- Intel® Advanced Vector Extensions 2 (Intel® AVX2) or Intel® Advanced Vector Extensions 512 (Intel® AVX-512).
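For reference, this is roughly how I enable it on my side - via the `MKL_CBWR` environment variable, set before numpy (and hence MKL) is loaded. The exact `"AVX2,STRICT"` value is my reading of the linked guide (strict CNR is only documented for the AVX2/AVX-512 branches), so treat this as a sketch rather than the canonical incantation:

```python
import os

# Request MKL's conditional numerical reproducibility in strict mode.
# NOTE: the "AVX2,STRICT" value is my interpretation of the linked Intel
# guide; it has to be set before MKL is initialized, i.e. before numpy
# is imported.
os.environ.setdefault("MKL_CBWR", "AVX2,STRICT")

import numpy as np

a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)

# With strict CNR, this *gemm-backed product should be bitwise identical
# regardless of how many MKL threads are used.
c = a @ b
```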
For whatever reason, *gemv functions are not listed, but *gemm functions are - so all your matrix-by-matrix dot products can now be bitwise reproducible, but matrix-vector ones cannot...
Is there a way to force numpy to use e.g. dgemm/sgemm and not dgemv/sgemv when multiplying (1 x n) by (n x m)?
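As a point of comparison, the closest thing I've found to forcing a *gemm call from Python is going through SciPy's low-level BLAS wrappers directly - a workaround rather than a fix, and it assumes SciPy is built against the same MKL:

```python
import numpy as np
from scipy.linalg.blas import dgemm

x = np.random.rand(1, 5)    # (1 x n) row "vector"
m = np.random.rand(5, 3)    # (n x m) matrix

# np.dot() takes the *gemv branch for a single-row operand,
# while calling dgemm explicitly keeps it on the *gemm path.
via_dot = np.dot(x, m)
via_gemm = dgemm(1.0, x, m)

assert via_dot.shape == via_gemm.shape == (1, 3)
```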
I've stumbled upon this piece of code which, IIUC, tries to be "smart" and treats single-row/single-column matrices as vectors when dispatching to BLAS routines, so you can't force it to use the *mm versions of the BLAS functions instead of the *mv ones even if you want to:
numpy/numpy/core/src/common/cblasfuncs.c, lines 156 to 178 (at commit 5000356)
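For what it's worth, one way to check which routine actually gets called (assuming numpy is linked against MKL) is MKL's verbose mode - run the snippet below with MKL_VERBOSE=1 in the environment and look for DGEMV vs DGEMM lines in the output:

```python
# Run as: MKL_VERBOSE=1 python check_dispatch.py
# Each MKL call is then traced to stdout, so the (1 x n) @ (n x m) product
# below should show up as a DGEMV (not DGEMM) call with current numpy.
import numpy as np

x = np.random.rand(1, 1000)
m = np.random.rand(1000, 1000)

np.dot(x, m)
```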