## HW1-P1 Checkpoint

This notebook runs a few benchmarks so that we can gauge how your computer performs and confirm that everything installed correctly. They should take a minute or two to run, depending on your machine.

To complete this checkpoint, run all of the cells (put your cursor in each cell and press Ctrl+Enter/Command+Enter, or go to Kernel->Restart and Run All), save the notebook (Ctrl+S/Command+S), and upload it to Gradescope.

In [1]:
import numpy as np
from time import time

# Roughly based on: http://stackoverflow.com/questions/11443302/compiling-numpy-with-openblas-integration
np.random.seed(0)

size = 4096
A, B = np.random.random((size, size)), np.random.random((size, size))
C, D = np.random.random((size * 128,)), np.random.random((size * 128,))
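# Use integer division so the shapes stay integers without int() casts
E = np.random.random((size // 2, size // 4))
F = np.random.random((size // 2, size // 2))
# F @ F.T is symmetric positive semidefinite (positive definite for a full-rank F), as Cholesky requires
F = np.dot(F, F.T)
G = np.random.random((size // 2, size // 2))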

Run a few quick NumPy benchmarks.

In [2]:
# Matrix multiplication
N = 20
t = time()
for i in range(N):
    np.dot(A, B)
delta = time() - t
print('Dotted two %dx%d matrices in %0.2f s.' % (size, size, delta / N))
del A, B

# Vector multiplication
N = 5000
t = time()
for i in range(N):
    np.dot(C, D)
delta = time() - t
print('Dotted two vectors of length %d in %0.2f ms.' % (size * 128, 1e3 * delta / N))
del C, D

# Singular Value Decomposition (SVD)
N = 3
t = time()
for i in range(N):
    np.linalg.svd(E, full_matrices=False)
delta = time() - t
print("SVD of a %dx%d matrix in %0.2f s." % (size / 2, size / 4, delta / N))
del E

# Cholesky Decomposition
N = 3
t = time()
for i in range(N):
    np.linalg.cholesky(F)
delta = time() - t
print("Cholesky decomposition of a %dx%d matrix in %0.2f s." % (size / 2, size / 2, delta / N))

# Eigendecomposition
N = 3
t = time()
for i in range(N):
    np.linalg.eig(G)
delta = time() - t
print("Eigendecomposition of a %dx%d matrix in %0.2f s." % (size / 2, size / 2, delta / N))

Dotted two 4096x4096 matrices in 0.87 s.
Dotted two vectors of length 524288 in 0.12 ms.
SVD of a 2048x1024 matrix in 0.40 s.
Cholesky decomposition of a 2048x2048 matrix in 0.09 s.
Eigendecomposition of a 2048x2048 matrix in 3.38 s.
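
The benchmark cell above repeats one pattern for each operation: record a start time, run the operation N times, and divide the elapsed time by N. A minimal sketch of that pattern as a reusable helper follows. The name `benchmark` is hypothetical and not part of the notebook, and the sketch uses `time.perf_counter` (the standard-library clock intended for measuring short intervals) instead of the `time.time` call used above. A cheap pure-Python workload stands in for the NumPy calls so the example runs without any arrays.

```python
from time import perf_counter

def benchmark(fn, repeats):
    """Return the mean wall-clock seconds per call of fn over `repeats` calls.

    `benchmark` is an illustrative helper, not part of the original notebook.
    """
    start = perf_counter()
    for _ in range(repeats):
        fn()
    return (perf_counter() - start) / repeats

# A cheap stand-in workload; in the notebook this would be e.g. lambda: np.dot(A, B).
mean_seconds = benchmark(lambda: sum(range(10_000)), repeats=100)
print('Summed 10000 integers in %0.3f ms.' % (1e3 * mean_seconds))
```

Averaging over several repeats, as the notebook does, smooths out one-off delays such as cache warm-up, which is why the fast vector dot product uses many more repeats than the slow decompositions.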


The following cell prints some system information to confirm that everything is set up properly.

In [3]:
import platform
print(platform.platform())
print(platform.machine())
print(platform.processor())

# np.show_config() is the public entry point for the same build information
np.show_config()

macOS-10.14.6-x86_64-i386-64bit
x86_64
i386
blas_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/include']
blas_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/include']
lapack_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/pablocastanobasurto/opt/anaconda3/envs/physcat/include']
lapack_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/pablocastanobasu

Test Numba, a just-in-time (JIT) compiler for Python.

In [4]:
from numba import njit

def fib_python(n):
    last = 1
    current = 1
    if n == 0 or n == 1:
        return 1
    for i in range(0, n - 2):
        last, current = current, last + current
    return current

@njit
def fib_numba(n):
    last = 1
    current = 1
    if n == 0 or n == 1:
        return 1
    for i in range(0, n - 2):
        last, current = current, last + current
    return current

# Call each function once so that Numba compiles fib_numba before timing starts
fib_numba(5)
fib_python(5)

%timeit fib_numba(1500)
%timeit fib_python(1500)

858 ns ± 3.58 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
97.4 µs ± 580 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
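
When reading these timings, keep in mind that the two functions do not do the same arithmetic for n = 1500. Python integers have arbitrary precision, while Numba on a typical 64-bit platform compiles this loop to fixed-width 64-bit integers, so the compiled result wraps around once the values exceed the signed 64-bit range. The sketch below uses only pure Python to find where that happens. `INT64_MAX` and `first_overflow` are hypothetical names introduced for illustration, and the 64-bit integer width is an assumption about the platform.

```python
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer (assumed width of Numba's integers)

def fib_python(n):
    # Same definition as in the notebook cell above.
    last = 1
    current = 1
    if n == 0 or n == 1:
        return 1
    for i in range(0, n - 2):
        last, current = current, last + current
    return current

def first_overflow(limit=INT64_MAX):
    """Return the smallest n for which fib_python(n) exceeds `limit`."""
    n = 0
    while fib_python(n) <= limit:
        n += 1
    return n

n_overflow = first_overflow()
print(n_overflow)                      # 93
print(fib_python(1500) > INT64_MAX)    # True
```

The benchmark still measures loop speed fairly, since both versions perform the same number of iterations, but part of the speedup comes from fixed-width additions being cheaper than big-integer additions.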
