# Tests with `scipy.linalg.blas`

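This notebook compares `numpy.dot` with the low-level BLAS functions in `scipy.linalg.blas`<sup> 1 2 </sup> on a dense matrix-vector product. A pure-Python double loop and its Numba-compiled version serve as baselines.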

* <sup> 1 </sup>[Low-level BLAS functions (scipy.linalg.blas)](https://docs.scipy.org/doc/scipy/reference/linalg.blas.html)

* <sup> 2 </sup>[BLAS (Basic Linear Algebra Subprograms)](http://www.netlib.org/blas/#_blas_routines)

In [1]:
import numpy as np
import numba as nb
from timeit import timeit

In [2]:
from scipy.linalg import blas

In [3]:
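def matvec(A, x):
    """Matrix-vector product y = A @ x written with explicit loops."""
    N, M = A.shape
    y = np.zeros(N)
    for i in range(N):          # loop over rows of A
        for j in range(M):      # accumulate the dot product of row i with x
            y[i] += A[i, j] * x[j]
    return y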

In [4]:
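# Wrap matvec with Numba's nopython JIT; no signature is given,
# so compilation happens lazily on the first call.
matvec_jit = nb.njit()(matvec)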

In [5]:
N = 500
M = 460
A = np.ones((N,M))

In [6]:
x = np.random.rand(M)
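
Before comparing timings, it can help to confirm that the implementations return the same vector. The following is a minimal sketch under stated assumptions: the names `matvec_loops`, `A_small`, and `x_small` are illustrative and not part of the notebook, the sizes are kept small so the pure-Python loop stays fast, and Numba is left out so the check depends only on NumPy and SciPy.

```python
import numpy as np
from scipy.linalg import blas


def matvec_loops(A, x):
    """Reference matrix-vector product with explicit loops (illustrative copy)."""
    N, M = A.shape
    y = np.zeros(N)
    for i in range(N):
        for j in range(M):
            y[i] += A[i, j] * x[j]
    return y


# Small, arbitrary test data (hypothetical sizes chosen only for the check).
rng = np.random.default_rng(0)
A_small = rng.random((5, 4))
x_small = rng.random(4)

y_dot = np.dot(A_small, x_small)            # NumPy
y_loops = matvec_loops(A_small, x_small)    # explicit loops
y_blas = blas.dgemv(1.0, A_small, x_small)  # BLAS: alpha * A @ x with alpha = 1

# Both comparisons should report agreement up to floating-point rounding.
print(np.allclose(y_dot, y_loops), np.allclose(y_dot, y_blas))
```

`np.allclose` is used rather than exact equality because the three routines may sum the terms in a different order.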

In [7]:
%timeit np.dot(A, x)

40.9 µs ± 1.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [8]:
%timeit matvec(A, x)

223 ms ± 12.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [9]:
%timeit matvec_jit(A, x)

297 µs ± 35 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [10]:
%timeit blas.dgemv(1., A, x)

565 µs ± 50 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [11]:
B = np.ones((N,M), order='F')

In [12]:
%timeit blas.dgemv(1., B, x)

39.8 µs ± 394 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
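
The gap between the two `dgemv` timings is consistent with a memory-layout effect: BLAS routines work on column-major (Fortran-ordered) arrays, so the SciPy wrapper presumably has to copy a C-ordered `A` before each call, while `B` can be passed through directly. This is an interpretation of the numbers above, not something the notebook measures. The sketch below inspects the layout flags and, as an assumption about how one might avoid the copy, passes the transposed view `A.T` with `trans=1`; the names `At`, `y_view`, and `y_ref` are illustrative.

```python
import numpy as np
from scipy.linalg import blas

N, M = 500, 460
A = np.ones((N, M))             # C (row-major) order, NumPy's default
B = np.ones((N, M), order='F')  # Fortran (column-major) order
x = np.random.rand(M)

# Layout flags of the two matrices.
print(A.flags['C_CONTIGUOUS'], A.flags['F_CONTIGUOUS'])
print(B.flags['C_CONTIGUOUS'], B.flags['F_CONTIGUOUS'])

# A.T is a view on the same memory, and that view is Fortran-contiguous.
At = A.T
print(At.flags['F_CONTIGUOUS'], np.shares_memory(A, At))

# trans=1 asks dgemv for a^T @ x; with a = A.T this is A @ x.
y_view = blas.dgemv(1.0, At, x, trans=1)
y_ref = np.dot(A, x)
print(np.allclose(y_view, y_ref))
```

Whether this variant actually matches the timing obtained with `B` would need to be checked with `%timeit` on the same machine.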
