<a href="https://colab.research.google.com/github/daniel-falk/ai-ml-principles-exercises/blob/main/ML-training/intro-to-libraries/intro_to_numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NumPy, a library for working with matrices
`numpy` is a Python library for working with arrays and matrices. The examples below show how to create arrays and multiply them.

In [None]:
import numpy as np

In [None]:
np.array([[1,2,3], [4,5,6]])

In [None]:
np.zeros((3,5))

In [None]:
np.eye(3)

In [None]:
np.arange(start=5, stop=10, step=2)

In [None]:
np.linspace(start=1, stop=10, num=3)
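`np.arange` and `np.linspace` both build evenly spaced sequences but treat the endpoint differently. A minimal sketch of the difference, reusing the arguments from the cells above (the variable names `stepped` and `spaced` are only illustrative):

```python
import numpy as np

# arange advances by a fixed step and excludes the stop value
stepped = np.arange(start=5, stop=10, step=2)   # values 5, 7, 9

# linspace returns exactly `num` points and includes the stop value
spaced = np.linspace(start=1, stop=10, num=3)   # values 1.0, 5.5, 10.0

print(stepped, stepped.dtype)
print(spaced, spaced.dtype)
```

Note that `arange` with integer arguments gives an integer array, while `linspace` returns floats.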

In [None]:
# Random integers in a 3x5 array; the upper bound (255) is exclusive
arr = np.random.randint(0, 255, (3,5))
arr

In [None]:
arr.shape

In [None]:
arr.T

In [None]:
arr * 0.1

In [None]:
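# Element-wise multiplication of every row with the vector (broadcasting).
# Note that .T has no effect on a 1-D array.
arr * np.array([1, 0, 0, 0, 0]).T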
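The `*` operator multiplies element by element and broadcasts shapes where needed; it is not a matrix product. The sketch below contrasts the two on a small fixed array (the names `a`, `row` and `col` are hypothetical and chosen only for this example):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)
row = np.array([1, 0, 0])          # shape (3,): broadcast across every row
col = np.array([[1], [0]])         # shape (2, 1): broadcast across every column

by_row = a * row                   # keeps only the first column
by_col = a * col                   # keeps only the first row
product = a @ a.T                  # true matrix product, (2, 3) @ (3, 2) -> (2, 2)

print(by_row)
print(by_col)
print(product)
```

To scale rows instead of columns, the vector needs an explicit second dimension, as `col` has here.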

# Statistics

`numpy` also has functions to calculate statistical measures from arrays.

In [None]:
# Calculate the mean over all values in the array
np.mean(arr)

In [None]:
# Calculate the variance of the values in the array
np.var(arr)

In [None]:
# Calculate the mean for each row in the matrix
np.mean(arr, axis=1)

In [None]:
# Find the minimum and maximum values
arr.min(), arr.max()
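The `axis` argument decides which dimension is collapsed. Since `arr` is random, the following sketch uses a small fixed array (a hypothetical `a`) so the results can be checked by hand:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

col_means = np.mean(a, axis=0)    # collapse the rows: one mean per column
row_means = np.mean(a, axis=1)    # collapse the columns: one mean per row
pop_var = np.var(a)               # population variance, divides by N
sample_var = np.var(a, ddof=1)    # sample variance, divides by N - 1

print(col_means)    # means 2.5, 3.5, 4.5
print(row_means)    # means 2.0, 5.0
print(pop_var, sample_var)
```

By default `np.var` computes the population variance; pass `ddof=1` for the sample variance.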

# Data types
The values in an array can have one of several different data types. Unlike the native types in Python, the `numpy` types have a specific kind and size. In Python there is no limit to how large a number you can store in an `int`, whereas in `numpy`, just like in C, each type has a fixed range.

In [None]:
arr.dtype

In [None]:
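# Negative values are outside the range of an unsigned type.
# Older NumPy versions wrap them around; NumPy 2.x raises an OverflowError.
np.array([-100, 1, 100], dtype=np.uint64)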

In [None]:
# Floats lose their fractional part when stored in an integer type
np.array([1.5, 100.123], dtype=np.uint64)

In [None]:
np.array([1.5, 100.123], dtype=np.float32)

In [None]:
np.array([1.5, 100.123]).dtype

In [None]:
# True division always gives a float result, even for an integer array
(np.array([1, 2, 3, 4]) / 2).dtype
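The fixed range of each type can be inspected with `np.iinfo`, and arithmetic that exceeds it wraps around silently. A minimal sketch using `uint8` (the arrays `a` and `b` are made up for this example):

```python
import numpy as np

info = np.iinfo(np.uint8)
print(info.min, info.max)               # range of uint8: 0 to 255

a = np.array([250, 255], dtype=np.uint8)
b = np.array([10, 10], dtype=np.uint8)

wrapped = a + b                         # 260 and 265 do not fit, stored modulo 256
widened = a.astype(np.uint16) + b       # convert to a wider type first to avoid the wrap

print(wrapped, wrapped.dtype)
print(widened, widened.dtype)
```

This matters in practice for image data, which is commonly stored as `uint8`.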

# Operations on matrices
There are many operations in the `numpy` library that can be applied to matrices, such as trigonometric functions.

In [None]:
x = np.linspace(start=0, stop=2*np.pi, num=8)
x

In [None]:
y = np.sin(x)
y

In [None]:
# Comparing an array with a threshold creates a boolean mask
y > 0

In [None]:
# A boolean mask can be used to index the array
y[y > 0]

In [None]:
# A single value can be indexed
y[1]

In [None]:
# A range of values can be indexed
y[0:5]

In [None]:
# An array can be reshaped
y.reshape((-1, 2))

In [None]:
# Arrays can be stacked along a new axis, here as two columns
np.stack((x, y), axis=1)
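Reshaping and stacking are easiest to follow by looking at the resulting shapes. The sketch below repeats the `x` and `y` definitions from above; the names `pairs`, `as_columns` and `as_rows` are only illustrative:

```python
import numpy as np

x = np.linspace(start=0, stop=2 * np.pi, num=8)
y = np.sin(x)

pairs = y.reshape((-1, 2))               # -1 lets numpy infer the row count: 8 / 2 = 4
as_columns = np.stack((x, y), axis=1)    # new axis last: one (x, y) pair per row
as_rows = np.stack((x, y), axis=0)       # new axis first: x is row 0, y is row 1

print(pairs.shape)        # (4, 2)
print(as_columns.shape)   # (8, 2)
print(as_rows.shape)      # (2, 8)
```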

In [None]:
# Index of the smallest value
np.argmin(y)

In [None]:
# Indices that would sort the array in ascending order
np.argsort(y)

In [None]:
# Indices where the condition is true (returned as a tuple, one array per dimension)
np.where(y > 0.5)
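The index-returning functions above are easier to verify on a short, fixed array. The following sketch uses a hypothetical array `v` and also shows the three-argument form of `np.where`, which selects between two values per element:

```python
import numpy as np

v = np.array([0.0, 0.8, 0.3, -0.9, 0.6])

smallest = np.argmin(v)              # position of -0.9
order = np.argsort(v)                # positions in ascending order of value
(hits,) = np.where(v > 0.5)          # unpack the 1-tuple to get the index array
labels = np.where(v > 0.5, 1, 0)     # 1 where the condition holds, otherwise 0

print(smallest)   # 3
print(order)      # indices 3, 0, 2, 4, 1
print(hits)       # indices 1, 4
print(labels)     # values 0, 1, 0, 0, 1
```

Indexing with the result of `argsort`, as in `v[order]`, gives the sorted array.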

# Speed up by vectorization
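`numpy` can often significantly speed up element-wise operations, such as dividing each element in a vector, by applying them to the whole array at once (vectorization) instead of looping over the elements in Python.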

In [None]:
TEST_VECTOR_SIZE = 1_000_000

In [None]:
py_list = list(range(TEST_VECTOR_SIZE))

def divide_list_loop():
  new_list = []
  for i in py_list:
    new_list.append(i / 2)
  return new_list

%timeit new_list = divide_list_loop()

In [None]:
py_list = list(range(TEST_VECTOR_SIZE))

%timeit new_list = [i / 2 for i in py_list]

In [None]:
np_list = np.arange(TEST_VECTOR_SIZE)

def divide_list_np():
  return np_list / 2

%timeit new_list = divide_list_np()
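`%timeit` is an IPython magic and only works inside a notebook or IPython shell. In a plain Python script, a comparable measurement can be sketched with the standard `timeit` module. The vector size and repetition count below are arbitrary choices for this example, and the function names are hypothetical:

```python
import timeit

import numpy as np

SIZE = 100_000      # smaller than in the cells above, to keep the run short
REPEATS = 10        # number of calls that are timed in total

py_list = list(range(SIZE))
np_arr = np.arange(SIZE)


def divide_comprehension():
    # Pure Python: one division per loop iteration
    return [i / 2 for i in py_list]


def divide_numpy():
    # Vectorized: the loop runs inside numpy's compiled code
    return np_arr / 2


loop_seconds = timeit.timeit(divide_comprehension, number=REPEATS)
numpy_seconds = timeit.timeit(divide_numpy, number=REPEATS)

# Both approaches should produce the same values
same_result = np.allclose(divide_comprehension(), divide_numpy())

print(f"list comprehension: {loop_seconds:.4f} s for {REPEATS} runs")
print(f"numpy:              {numpy_seconds:.4f} s for {REPEATS} runs")
print(f"same result: {same_result}")
```

The exact timings depend on the machine, so no expected numbers are given here.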