# NumPy

The following NumPy functions call BLAS and LAPACK routines and may therefore run in parallel across threads automatically:

Matrix and vector products

    numpy.dot()
    numpy.linalg.multi_dot()
    numpy.vdot()
    numpy.inner()
    numpy.outer()
    numpy.matmul()
    numpy.tensordot()
    numpy.einsum()
    numpy.einsum_path()
    numpy.linalg.matrix_power()
    numpy.kron()

Decompositions

    numpy.linalg.cholesky()
    numpy.linalg.qr()
    numpy.linalg.svd()

Matrix eigenvalues

    numpy.linalg.eig()
    numpy.linalg.eigh()
    numpy.linalg.eigvals()
    numpy.linalg.eigvalsh()

Norms and other numbers

    numpy.linalg.norm()
    numpy.linalg.cond()
    numpy.linalg.det()
    numpy.linalg.matrix_rank()
    numpy.linalg.slogdet()
    numpy.trace()

Solving equations and inverting matrices

    numpy.linalg.solve()
    numpy.linalg.tensorsolve()
    numpy.linalg.lstsq()
    numpy.linalg.inv()
    numpy.linalg.pinv()
    numpy.linalg.tensorinv()


In [70]:
import os
import numpy as np

## NumPy config

Show the libraries and system information with which NumPy was built and is currently running.

In [71]:
np.show_config()

Build Dependencies:
  blas:
    detection method: pkgconfig
    found: true
    include directory: /usr/local/include
    lib directory: /usr/local/lib
    name: openblas64
    openblas configuration: USE_64BITINT=1 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS=
      NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= HASWELL MAX_THREADS=2
    pc file directory: /usr/local/lib/pkgconfig
    version: 0.3.23.dev
  lapack:
    detection method: internal
    found: true
    include directory: unknown
    lib directory: unknown
    name: dep139863411681952
    openblas configuration: unknown
    pc file directory: unknown
    version: 1.26.4
Compilers:
  c:
    args: -fno-strict-aliasing
    commands: cc
    linker: ld.bfd
    linker args: -Wl,--strip-debug, -fno-strict-aliasing
    name: gcc
    version: 10.2.1
  c++:
    commands: c++
    linker: ld.bfd
    linker args: -Wl,--strip-debug
    name: gcc
    version: 10.2.1
  cython:
    commands: cython
    linker: cython
    name: cython
    versio

Depending on the BLAS library NumPy is using, set the corresponding environment variable to control the number of threads:
 * OMP_NUM_THREADS: OpenMP
 * OPENBLAS_NUM_THREADS: OpenBLAS
 * MKL_NUM_THREADS: MKL
 * VECLIB_MAXIMUM_THREADS: Accelerate
 * NUMEXPR_NUM_THREADS: NumExpr
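When it is not known in advance which backend a NumPy build uses, one option is to set every variable in the list to the same value. The sketch below does this with the standard library only. The helper name `set_blas_threads` and the tuple `THREAD_ENV_VARS` are hypothetical names introduced for illustration, not part of NumPy.

```python
import os

# Environment variables from the list above, one per threading backend.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def set_blas_threads(n_threads):
    """Set every listed thread-count variable to n_threads (hypothetical helper)."""
    value = str(n_threads)  # environment values must be strings
    for name in THREAD_ENV_VARS:
        os.environ[name] = value
    # Return the resulting settings so they can be inspected or logged.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# Call this before the first `import numpy` in the process.
print(set_blas_threads(2))
```

Setting all five variables is a blunt approach: it also limits any other library in the process that reads them, so setting only the variable that matches the backend reported by `np.show_config()` may be preferable.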

## Examples of Multithreaded NumPy Functions

If NumPy uses OpenBLAS, as in the configuration shown above, set the environment variable 'OPENBLAS_NUM_THREADS' to control the number of threads used.

The variable is read when the library is loaded, so it takes effect only if it is set before NumPy is imported.
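Because the variable has to be in place before NumPy is imported, comparing several thread counts from one session means starting a fresh interpreter for each run. The following is a minimal sketch of that idea using `subprocess`. The names `CHILD_CODE`, `child_env` and `time_matmul` are hypothetical, and it assumes NumPy is installed for the interpreter given by `sys.executable`.

```python
import os
import subprocess
import sys

# Benchmark executed in the child process; n arrives on the command line.
CHILD_CODE = """
import sys, time
import numpy as np
n = int(sys.argv[1])
a = np.random.rand(n, n)
b = np.random.rand(n, n)
start = time.perf_counter()
a.dot(b)
print(time.perf_counter() - start)
"""


def child_env(n_threads):
    """Copy the current environment and set OPENBLAS_NUM_THREADS in the copy."""
    env = dict(os.environ)
    env["OPENBLAS_NUM_THREADS"] = str(n_threads)
    return env


def time_matmul(n_threads, n=4000):
    """Return the wall time in seconds of an n x n product in a fresh interpreter."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(n)],
        env=child_env(n_threads),  # the child sees the variable before importing NumPy
        capture_output=True,
        text=True,
        check=True,
    )
    return float(out.stdout.strip())
```

Calling `time_matmul(1)` and then `time_matmul(8)` should give a sequential and a parallel timing without restarting the kernel. The values depend on the machine and on the BLAS build.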

### Matrix-matrix multiplication

**Sequential execution**

In [1]:
import os
os.environ['OPENBLAS_NUM_THREADS'] = '1'
import numpy as np
n = 4000
# create an array of random values
data1 = np.random.rand(n, n)
data2 = np.random.rand(n, n)
%time result = data1.dot(data2)

CPU times: user 12.9 s, sys: 20.8 ms, total: 12.9 s
Wall time: 12.9 s


**Parallel execution**

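Restart the kernel before running this cell, so that the new value of 'OPENBLAS_NUM_THREADS' is set before NumPy is imported.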

In [1]:
import os
os.environ['OPENBLAS_NUM_THREADS'] = '8'
import numpy as np
n = 4000
# create an array of random values
data1 = np.random.rand(n, n)
data2 = np.random.rand(n, n)
%time result = data1.dot(data2)

CPU times: user 29 s, sys: 3.37 s, total: 32.4 s
Wall time: 4.1 s
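The two wall times can be turned into a speedup and a parallel efficiency. The short calculation below uses only the figures reported by the two runs above. The variable names are chosen for illustration.

```python
# Wall times reported by the two %time runs above, in seconds.
t_sequential = 12.9  # OPENBLAS_NUM_THREADS = 1
t_parallel = 4.1     # OPENBLAS_NUM_THREADS = 8
n_threads = 8

speedup = t_sequential / t_parallel  # how many times faster the parallel run is
efficiency = speedup / n_threads     # fraction of ideal linear scaling

print(f"speedup: {speedup:.2f}x")      # speedup: 3.15x
print(f"efficiency: {efficiency:.0%}")  # efficiency: 39%
```

The parallel run finishes about three times sooner, but total CPU time rises from 12.9 s to 32.4 s, so the eight threads are far from perfectly efficient on this problem size.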
