Parallelization

kaskr edited this page Jun 25, 2016 · 5 revisions

BLAS

TMB uses the following BLAS kernels when calculating the function value and its derivatives:

| Function | Gradient |
|----------|----------|
| dgemm    | dgemm    |
| dsyrk    | dsymm    |
| dtrsm    | dtrsm    |
| dpotrf   | dpotri   |

If your model spends a significant amount of time in these BLAS operations, you may benefit from an optimized BLAS library, e.g. MKL or OpenBLAS on the CPU, or nvblas on the GPU. For a good result it is critical that:

  1. All required BLAS kernels are part of the library (currently not the case for nvblas?).
  2. The library does not add significant overhead for small matrices (OpenBLAS has had problems with this; is that still the case?).
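Since R loads BLAS as a shared library, you can usually swap in an optimized implementation without recompiling R or TMB. A sketch for a Debian-like system (package and alternative names vary by distribution and architecture; this assumes R was built against the shared system BLAS):

```shell
# Install OpenBLAS and select it as the system BLAS/LAPACK (Debian/Ubuntu).
sudo apt-get install libopenblas-dev
sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu
sudo update-alternatives --config liblapack.so.3-x86_64-linux-gnu

# Verify which BLAS/LAPACK R actually loads (reported by sessionInfo()
# in recent R versions):
R -q -e 'sessionInfo()'
```

After switching, rerun your model and compare timings; if most of the runtime is outside the kernels listed above, an optimized BLAS will make little difference.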