# Machine Learning: Tools of the Trade

#### CSCI-UA 473 Introduction to Machine Learning

## Introduction to NumPy

NumPy (_Numerical Python_) provides an efficient interface to store and operate on data. This library is widely compatible with most other popular numerical libraries within the Python ecosystem.

In [3]:
import numpy as np

The object of interest in NumPy will almost always be the `array` (an instance of `np.ndarray`).

### Creation

In [4]:
x = [1,2,3,4,5]  ## The usual Python list
x = np.array([1,2,3,4,5]) ## A NumPy array

x   ## The value of the last expression in a notebook cell is displayed by default, much like print(x)

**TIP**: Pull up documentation (when available) directly from the Jupyter notebook using `?np.array`. This will print the Python docstrings.

In [None]:
x.dtype

In [None]:
x.astype(np.float64)

Look for other possible types [here](https://numpy.org/doc/stable/user/basics.types.html).
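
As a small illustration of type conversion, here is a sketch (the array names are made up for this example). It shows that `astype` returns a new array, and that converting floats to integers truncates rather than rounds.

```python
import numpy as np

ints = np.array([1, 2, 3, 4, 5])
floats = ints.astype(np.float64)   # astype returns a new array; ints keeps its dtype

# Converting floats to integers truncates toward zero rather than rounding.
truncated = np.array([1.9, -1.9]).astype(np.int64)

print(ints.dtype, floats.dtype, truncated)
```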

In [None]:
x.shape

#### Multi-dimensional arrays

In [6]:
M = np.array([[1,2,3,4,5], [6,7,8,9,10]]).astype(np.float64)
print(M)
print(f'Shape of M: {M.shape}')
print(f'Number of dimensions in M: {M.ndim}')

[[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10.]]
Shape of M: (2, 5)
Number of dimensions in M: 2


In [7]:
M = np.random.random((3,4,5))
print(M)
print(f'Shape of M: {M.shape}')
print(f'Number of dimensions in M: {M.ndim}')

[[[0.60088009 0.80896513 0.17040742 0.39038036 0.62604164]
  [0.50254653 0.07892314 0.10578484 0.33660597 0.79329496]
  [0.26420581 0.30994277 0.31225862 0.35788746 0.73767658]
  [0.87355725 0.95523726 0.1502337  0.9601097  0.1005065 ]]

 [[0.27565806 0.89082902 0.97250036 0.7515316  0.70463802]
  [0.70776029 0.35398524 0.38303294 0.29734202 0.73166109]
  [0.86430327 0.91066752 0.21493226 0.44807059 0.3969464 ]
  [0.19733175 0.07136026 0.61966906 0.06331979 0.1837867 ]]

 [[0.79713593 0.23506067 0.23256497 0.3384677  0.91717951]
  [0.3202295  0.36001323 0.80929225 0.58803133 0.71515088]
  [0.46629363 0.26512902 0.80754717 0.68931963 0.82503039]
  [0.59696798 0.39049764 0.39068775 0.02381431 0.5827232 ]]]
Shape of M: (3, 4, 5)
Number of dimensions in M: 3


### Reshaping

Often, we need to reshape the arrays for downstream operations. For instance, if we don't care about the spatial relations between the pixels of an image, we can reshape a $100 \times 100$ grayscale image as a large $10000$-dimensional vector instead.

In [None]:
M.shape

In [None]:
np.reshape(M, (3, 20))

Or, we can let NumPy calculate one dimension for us by passing $-1$ as its size; NumPy infers it from the total number of elements.

In [None]:
np.reshape(M, (3, -1))
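
To make the image example concrete, here is a minimal sketch using stand-in arrays (`image` and `cube` are illustrative names, not part of the lab). It also shows what happens when the inferred dimension cannot be a whole number.

```python
import numpy as np

image = np.arange(100 * 100).reshape(100, 100)  # stand-in for a 100 x 100 grayscale image
vector = image.reshape(-1)                      # NumPy infers the single dimension
print(vector.shape)

cube = np.zeros((3, 4, 5))
print(cube.reshape(3, -1).shape)                # 60 elements / 3 rows gives 20 columns

failed = False
try:
    cube.reshape(7, -1)                         # 60 is not divisible by 7
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```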

#### Flattening

By default, all reshapes happen in row-major (C-style) order, i.e. the last axis varies fastest, so the reshape effectively collects items along the "rows".

In [None]:
print(np.reshape(M, -1))
np.reshape(M, -1).shape

Or directly call the `flatten` method of the array.

In [None]:
M.flatten().shape
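
The sketch below, on a tiny made-up array `grid`, contrasts row-major with column-major ordering. It also illustrates one practical difference: `flatten` always returns a copy, whereas `reshape` returns a view whenever it can.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

row_major = grid.reshape(-1)             # default order="C": walk along each row in turn
col_major = grid.reshape(-1, order="F")  # column-major: walk down each column in turn
print(row_major, col_major)

flat_copy = grid.flatten()               # flatten always returns a copy
flat_copy[0] = 99                        # so the original array is untouched
print(grid[0, 0], np.shares_memory(grid, row_major))
```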

### Indexing and Slicing

As usual in Python, everything is still 0-indexed.

In [None]:
M[0,1,2]

Intuitively, think of indexing as a sequence of transforms $M_0 = M[0]$ $\to$ $M_{0,1} = M_0[1]$ $\to$ $M_{0,1,2} = M_{0,1}[2]$. This can also be written as a chain of indexing operations in Python: `M[0][1][2]`.
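
A quick sketch to check this equivalence, using an illustrative array `T` filled with consecutive integers so that the expected element is easy to compute by hand:

```python
import numpy as np

T = np.arange(3 * 4 * 5).reshape(3, 4, 5)  # entry [i, j, k] holds 20*i + 5*j + k

direct = T[0, 1, 2]    # a single indexing operation
chained = T[0][1][2]   # three successive indexing operations
print(direct, chained)
```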

For any dimension not explicitly mentioned in the indexing schemes below, everything is returned.

In [1]:
print(M[0:1])
M[:].shape


In [9]:
print(M[:2])  ## Get only the first two elements from the first dimension.
M[:2].shape

[[[0.60088009 0.80896513 0.17040742 0.39038036 0.62604164]
  [0.50254653 0.07892314 0.10578484 0.33660597 0.79329496]
  [0.26420581 0.30994277 0.31225862 0.35788746 0.73767658]
  [0.87355725 0.95523726 0.1502337  0.9601097  0.1005065 ]]

 [[0.27565806 0.89082902 0.97250036 0.7515316  0.70463802]
  [0.70776029 0.35398524 0.38303294 0.29734202 0.73166109]
  [0.86430327 0.91066752 0.21493226 0.44807059 0.3969464 ]
  [0.19733175 0.07136026 0.61966906 0.06331979 0.1837867 ]]]


(2, 4, 5)

In [10]:
print(M[:2, -1:])  ## Get only the first two elements from the first dimension, and the last one from the second.
M[:2, -1:].shape

[[[0.87355725 0.95523726 0.1502337  0.9601097  0.1005065 ]]

 [[0.19733175 0.07136026 0.61966906 0.06331979 0.1837867 ]]]


(2, 1, 5)
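
Note that a slice such as `-1:` keeps the dimension (with size 1), while a plain integer index drops it. A small sketch on an illustrative array `T`:

```python
import numpy as np

T = np.arange(3 * 4 * 5).reshape(3, 4, 5)

kept = T[:2, -1:]     # slicing the second dimension keeps it, with size 1
dropped = T[:2, -1]   # an integer index removes that dimension
print(kept.shape, dropped.shape)
```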

## Algebra with NumPy Arrays

In [None]:
x = np.array([1,2,3,4,5])
x.shape

Arithmetic operations with scalars carry over to NumPy arrays, and are applied element-wise.

In [None]:
(5 * x + 10)**2

For our usual matrix algebra, we are going to reshape the array into a $5 \times 1$ column vector, and call it `v`.

In [None]:
v = np.reshape(x, (x.shape[0], 1))
v.shape

Here are a few alternatives, which I invite you to explore, that are equivalent in effect to the reshape operation.

In [None]:
## Alternatively, think in terms of adding a dimension of size 1.

print(x[:, np.newaxis].shape)
print(np.expand_dims(x, axis=-1).shape)  

To compute the $L_2$-norm of this $5$-dimensional column vector, we use the inner-product form

$$
\lVert v \rVert_2 = \sqrt{v_1^2 + v_2^2 + v_3^2 + v_4^2 + v_5^2} = \sqrt{v^Tv}
$$

In [None]:
l2 = np.sqrt(np.sum(v**2, axis=0, keepdims=True))
print(l2)
l2.shape

Importantly, notice the `keepdims` argument. What happens if `keepdims=False`?

Better yet, let's use the `np.dot` and `np.matmul` functions to compute $v^Tv$ directly.

In [None]:
l2 = np.sqrt(np.dot(v.T,v))
print(l2)
l2.shape

In [None]:
l2 = np.sqrt(np.matmul(v.T, v))
print(l2)
l2.shape
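
For comparison, here is a sketch that puts these approaches side by side with `np.linalg.norm`, NumPy's built-in norm function (it computes the $L_2$-norm by default). The variable names are illustrative.

```python
import numpy as np

v = np.arange(1, 6).reshape(-1, 1)   # the 5 x 1 column vector [1, 2, 3, 4, 5]^T

via_sum = np.sqrt(np.sum(v**2))      # a scalar
via_matmul = np.sqrt(v.T @ v)        # a 1 x 1 array; @ is shorthand for np.matmul
via_norm = np.linalg.norm(v)         # a scalar

print(via_sum, via_matmul, via_norm)
```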

Let's also do an outer product $vv^T$ to create a rank-1 matrix,

In [None]:
v_outer = np.matmul(v, v.T)
v_outer

and find the mean value of each row.

In [None]:
v_outer_row_mean = np.mean(v_outer, axis=1, keepdims=True)  ## axis=1 averages across the columns, giving one mean per row
print(v_outer_row_mean)
v_outer_row_mean.shape

Similarly, most reductions (`np.sum`, `np.mean`, `np.max`, etc.) can be applied along any dimension (usually referred to as `axis` in arguments to NumPy functions).
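
Since the meaning of `axis` is easy to mix up, here is a small sketch on a made-up $2 \times 3$ array `R`. The `axis` you pass is the one that gets collapsed.

```python
import numpy as np

R = np.array([[1., 2., 3.],
              [4., 5., 6.]])

col_means = R.mean(axis=0)                 # collapse the rows: one mean per column
row_means = R.mean(axis=1)                 # collapse the columns: one mean per row
row_means_2d = R.mean(axis=1, keepdims=True)

print(col_means, row_means, row_means_2d.shape)
```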

### Broadcasting

NumPy applies a small, strict set of "automatic" rules, known as broadcasting, when operating on arrays of different shapes. This is elegant and powerful, so use it with care!

In [None]:
A = np.ones((4, 5))
b = np.arange(5)
print(A)
print(b)
print(A.shape, b.shape)

In [None]:
A + b

NumPy left-pads the shape of `b` with ones, from `(5,)` to `(1, 5)`, then "broadcasts" (virtually repeats) it along the first dimension to match the `(4, 5)` shape of `A`, and finally applies the addition element-wise.
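
The sketch below spells out the same broadcast explicitly with `np.broadcast_to`. It also shows a case where the shapes are incompatible, and how adding a dimension fixes it. The name `col` is made up for this example.

```python
import numpy as np

A = np.ones((4, 5))
b = np.arange(5)

# What A + b does implicitly: view b as a (4, 5) array, then add.
explicit = A + np.broadcast_to(b, A.shape)

col = np.arange(4)
mismatch = False
try:
    A + col                        # (4, 5) vs (4,): trailing dimensions 5 and 4 differ
except ValueError as err:
    mismatch = True
    print("cannot broadcast:", err)

shifted = A + col[:, np.newaxis]   # (4, 5) vs (4, 1): broadcasts along the columns
print(shifted)
```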

### Numerical Linear Algebra with NumPy

NumPy has a battle-tested numerical linear algebra library under `np.linalg` for all the common operations: matrix decompositions, solving linear systems, etc. Let's consider solving the following linear system

$$
\begin{aligned}
6x + 2y + z &= 13 \\
2x + 9y + 3z &= 29 \\
x + 3y + 8z &= 31
\end{aligned}
$$

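This can be written more concisely as a system $Ax = b$, where $A$ holds the coefficients, $b$ the right-hand side, and $x$ now denotes the vector of unknowns $(x, y, z)$.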

In [None]:
A = np.array([[6, 2, 1],
              [2, 9, 3],
              [1, 3, 8]])
b = np.array([13, 29, 31])

x = np.linalg.solve(A, b)

We use `np.allclose` to verify our result.

**TIP**: Take a look at the documentation using `?np.allclose` to see the method signature and a detailed description.

In [None]:
assert np.allclose(np.matmul(A, x), b)
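
Why not simply compare with `==`? Floating-point results are rarely exact, so we compare up to a tolerance instead. A minimal sketch, where `rtol` is the relative tolerance parameter of `np.allclose`:

```python
import numpy as np

total = 0.1 + 0.2

exact = (total == 0.3)                        # False: rounding error in floating point
close = np.allclose(total, 0.3)               # True: equal up to the default tolerances

strict = np.allclose(1.0, 1.001)              # False with the default tolerances
loose = np.allclose(1.0, 1.001, rtol=1e-2)    # True with a looser relative tolerance

print(exact, close, strict, loose)
```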

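Since `A` is symmetric and positive definite, we can also compute its Cholesky decomposition $A = LL^T$, where $L$ is lower triangular.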

In [None]:
L = np.linalg.cholesky(A)
assert np.allclose(np.matmul(L, L.T), A)
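
One use of the Cholesky factor is solving $Ax = b$ in two steps: first $Ly = b$, then $L^Tx = y$. The sketch below uses `np.linalg.solve` for both steps to stay within NumPy. That function is a general-purpose solver and does not exploit the triangular structure; a dedicated routine such as `scipy.linalg.solve_triangular` would.

```python
import numpy as np

A = np.array([[6., 2., 1.],
              [2., 9., 3.],
              [1., 3., 8.]])
b = np.array([13., 29., 31.])

L = np.linalg.cholesky(A)       # lower-triangular factor with A = L @ L.T

y = np.linalg.solve(L, b)       # step 1: solve L y = b
x = np.linalg.solve(L.T, y)     # step 2: solve L^T x = y

print(x)
```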

## Introduction to Matplotlib

Matplotlib is an extremely flexible plotting library. We'll discuss some basic elements here. 

In [None]:
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

In [None]:
x = np.linspace(- 2. * np.pi, 2. * np.pi)
sin_x = np.sin(x)
cos_x = np.cos(x)

In [None]:
fig, ax = plt.subplots(figsize=(10,5))  ## Start a new figure with specified size.

## Add a line plot, with some properties like width, color, markers.
## Raw strings (r'...') keep the LaTeX backslashes from being read as escape sequences.
ax.plot(x, cos_x,
        linewidth=2.,
        marker='.',
        color='forestgreen',
        label=r'$\cos{x}$')

## Add another line plot to the same figure.
ax.plot(x, sin_x,
        linewidth=0.5,
        marker='.',
        color='orangered',
        label=r'$\sin{x}$')

ax.set_title("Two Waves")
ax.set_xlabel("$x$")
ax.set_ylabel("$y$")
ax.legend(loc="lower left")
plt.show()

### Plotting 3-D Surfaces

We can also plot in 3-D. Let us plot

$$
z = f(x,y) = x^2 + y^2
$$

In [None]:
x, y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
z = x**2 + y**2

fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  ## fig.gca(projection=...) no longer works in recent Matplotlib releases
surf = ax.plot_surface(x, y, z, cmap=cm.viridis)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.title(r'$z = f(x,y) = x^2 + y^2$');  ## Notice we can use LaTeX for math expressions!
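
To see what `np.meshgrid` produces, here is a sketch on a deliberately tiny grid (the names `xs`, `ys`, `X`, `Y`, `Z` are illustrative). Each output array has one entry per grid point, so $z$ can be computed element-wise.

```python
import numpy as np

xs = np.linspace(-1, 1, 3)    # 3 values along x: -1, 0, 1
ys = np.linspace(0, 1, 2)     # 2 values along y: 0, 1

X, Y = np.meshgrid(xs, ys)    # both have shape (len(ys), len(xs)) = (2, 3)
Z = X**2 + Y**2               # evaluated at every grid point

print(X)
print(Y)
print(Z)
```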

## Optimizing functions with SciPy

SciPy (_Scientific Python_) provides many utilities for scientific computation in Python. Let us minimize the function $f(x,y)$ we just plotted, and visualize the optimizer trajectory.

In [None]:
import scipy as sp
from scipy import optimize

In [None]:
def f(x):
    return np.sum(x**2, axis=-1)

res = sp.optimize.minimize(f, [-5, -5], options=dict(return_all=True))
trajectory = np.array(res['allvecs'])
trajectory.shape
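
By default, `minimize` estimates the gradient numerically. As a sketch, we can instead pass the analytical gradient $\nabla f(x) = 2x$ through the `jac` argument. The helper `grad_f` is our own addition for this example, and we name the default unconstrained method, BFGS, explicitly.

```python
import numpy as np
from scipy import optimize

def f(x):
    return np.sum(x**2)

def grad_f(x):
    return 2 * x    # analytical gradient of the sum of squares

res = optimize.minimize(f, np.array([-5., -5.]), jac=grad_f,
                        method='BFGS', options=dict(return_all=True))
path = np.array(res.allvecs)   # one row per iterate, starting with the initial point

print(res.x, path.shape)
```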

In [None]:
x, y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
z = x**2 + y**2

fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  ## fig.gca(projection=...) no longer works in recent Matplotlib releases
ax.view_init(30, 25)

surf = ax.plot_surface(x, y, z, cmap=cm.viridis, zorder=1)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')

ax.plot3D(trajectory[:, 0], trajectory[:, 1], f(trajectory), c='red', linewidth=2.0, marker='o', zorder=10)

plt.title(f'Optimizer Trajectory for $f(x,y)$, x = {trajectory[-1,0]:.04f}, y = {trajectory[-1,1]:.04f}');

## Optimizing functions with PyTorch

PyTorch is another popular framework, primarily providing easy-to-use tools for building and training neural networks. The main object manipulated here is a `Tensor`, which in many respects is analogous to the NumPy `array`.

PyTorch allows many of the same numerical operations as NumPy, including a linear algebra library. A key distinction is the ability to take derivatives (_automatic differentiation_) through computational graphs. This does not require hand-coding the derivatives, and is in essence simply an efficient implementation of the chain rule from calculus.

We will see this in more detail in a future lab on neural networks. For now, let us reproduce the optimization of $f(x,y)$.

In [None]:
import torch
import torch.nn as nn


## The PyTorch equivalent of our f(x,y)
class F(nn.Module):    
    def __init__(self):
        super().__init__()
        
        ## In the future, a neural network would be constructed here.
        
    def forward(self, v):
        return v.pow(2).sum(dim=-1)

An important feature of PyTorch is that operations on `Tensor`s can be differentiated. We will use a simple gradient descent optimizer to keep improving the initial value $x_0$.

In [None]:
f = F()
x0 = torch.tensor([[-5., -5.]], requires_grad=True)  ## A float tensor that tracks gradients

optim = torch.optim.SGD([x0], lr=0.1)

trajectory = [x0.detach().clone()]
for _ in range(100):
    y = f(x0)
    optim.zero_grad()
    y.backward()
    optim.step()

    trajectory.append(x0.detach().clone())  ## Get a clone of the current value.
    
trajectory = torch.cat(trajectory, dim=0)
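
For this particular $f$, the gradient is known in closed form, $\nabla f(x) = 2x$, so the loop above can be mirrored by hand. The following is a NumPy sketch of the same update rule (plain gradient descent with a learning rate of 0.1), not a replacement for the PyTorch version. The names `point` and `path` are illustrative.

```python
import numpy as np

lr = 0.1
point = np.array([-5., -5.])
path = [point.copy()]

for _ in range(100):
    grad = 2 * point               # hand-coded gradient of f(x) = sum(x**2)
    point = point - lr * grad      # the same update that SGD (without momentum) applies
    path.append(point.copy())

path = np.stack(path)

# Each step scales the point by (1 - 2 * lr) = 0.8, so it shrinks towards the origin.
print(path[:3])
print(path[-1])
```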

In [None]:
x, y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
z = x**2 + y**2

fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  ## fig.gca(projection=...) no longer works in recent Matplotlib releases
ax.view_init(30, 25)

surf = ax.plot_surface(x, y, z, cmap=cm.viridis, zorder=1)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')

## Note that Matplotlib prefers numpy arrays, and hence we make the type conversion.
ax.plot3D(trajectory[:, 0].numpy(), trajectory[:, 1].numpy(), f(trajectory).numpy(), c='red', linewidth=2.0, marker='o', zorder=10)

plt.title(f'Optimizer Trajectory for $f(x,y)$, x = {trajectory[-1,0]:.04f}, y = {trajectory[-1,1]:.04f}');