Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

ExBLAS -- Exact (fast, accurate, and reproducible) BLAS library v1.0

ExBLAS aims at providing new algorithms and implementations for fundamental linear algebra operations -- like those included in the BLAS library -- that deliver reproducible and accurate results with small or without losses to their performance on modern parallel architectures such as Intel desktop and server processors, Intel Xeon Phi co-processors, and GPU accelerators. We construct our approach in such a way that it is independent from data partitioning, order of computations, thread scheduling, or reduction tree schemes.

The current version of the ExBLAS library is equipped with

  • ExSUM -- Reproducible and accurate parallel reduction (parallel summation);
  • ExDOT -- Reproducible and accurate parallel dot product;
  • ExGEMV -- Reproducible and accurate parallel matrix-vector product for various sizes (m = n, m >= n, and m <= n) for both transpose and non-transpose matrices;
  • ExTRSV -- Reproducible and accurate parallel triangular solver for both transpose and non-transpose, lower and unit triangular matrices with unit and non-unit diagonals;
  • ExGEMM -- Reproducible and accurate parallel matrix-matrix multiplication for squeare matrices for a moment. These routines can be executed on a set of architectures: Intel Core i7 and Sandy Bridge processors; Intel Xeon Phi co-processors; both AMD and NVIDIA GPUs using the OpenCL framework.


In order to use ExBLAS, the following software is needed:

  • [Required] Cmake of version 2.8.8 or higher
  • [Required] For CPUs, support of AVX instructions
  • [Required] For CPUs, Intel TBB library of version 4.0 or higher
  • [Required] For CPUs, support of C++11
  • [Required] For MIC, Intel C/C++ compilers
  • [Required] For GPUs only, OpenCL of version 1.1 or higher
  • [Optional] GMP of version 6.0
  • [Optional] MPFR of version 3.1.2 The later two are required to verify the accuracy and reproducibility of the results.



  • Download the latest version of source code from
  • Extract the sources from a tar.gz archive (tar -zxvf) to {EXBLAS_ROOT} folder
  • Create a build dir in the root of the library and move there

Compilation Options

  • -DEXBLAS_MPI=ON -- enables compilation with MPI. By default, only shared-memory parallelism is activated
  • -DEXBLAS_MIC=ON -- enables compilation for Intel MIC architectures
  • -DEXBLAS_GPU=ON -- enables compilation for GPUs -DEXBLAS_GPU_AMD=ON -- for AMD GPUs -DEXBLAS_GPU_NVIDIA=ON -- for NVIDIA GPUs
  • -DEXBLAS_VS_MPFR=ON -- compares the results against the ones produced by MPFR


  • For Intel CPU with AVX instructions: CC=icc CXX=icpp cmake .. make make install

  • With MPI: CC=mpicc CXX=mpicxx cmake -DEXBLAS_MPI=ON .. make make install

  • For GPU: CC=mpicc CXX=mpicxx cmake -DEXBLAS_GPU=ON -DEXBLAS_AMD|NVIDIA_GPU=ON .. make make install

  • For MIC: CC=mpicc CXX=mpicxx cmake -DEXBLAS_MIC=ON .. make make install


  • make test -- it will run several tests using naive, loguniform, and ill-cond number distribution comparing the results either against the ones produced by superaccumulators only or by MPFR

  • All tests are stored in the tests dir


  • [Required] Doxygen of version 1.8.0

  • [Optional] LaTeX in case you want to generate documentation in pdf format

  • -DEXBLAS_DOC=ON -- enables generationg of the ExBLAS documentation using Doxygen and LaTeX

  • make doc -- command to generate an html version of the ExBLAS documentation

  • make pdf -- command to generate a pdf version of the ExBLAS documentation


In the example dir you can find a simple test that shows how the ExBLAS routines can be called and how the ExBLAS library can be linked from your project.


ExBLAS: fast, accurate, and reproducible BLAS







No releases published


No packages published