[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/COGS118A/demo_notebooks/blob/main/lecture_04_pre_numpy.ipynb)

# Machine learning is just linear algebra and vector calculus

A generic ML algorithm
1. define an equation with parameters you can set, such that optimizing the equation (picking the best parameters) means being better at the task you want to solve
1. [optional] transform your data from its raw form to make it easier to learn with
1. do one of two things to make the ML system learn on your training data:
    - if the equation has a unique minimum that you can solve for analytically (e.g. OLS), then do so.
    - otherwise find the gradient of the equation with respect to the parameters, and iteratively move closer to a better solution by nudging the parameters a small step in the direction that makes you better at the task (the negative gradient, if you are minimizing)
1. once you have the best/good enough parameters, you can predict with the ML system by using the parameters on new inputs (validation or testing data)

All steps involve matrix algebra; step 3 involves vector calculus.
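The steps above can be sketched on a toy regression problem. This is a minimal illustration under assumed choices: the synthetic data, the mean-squared-error objective, the learning rate, and every variable name are made up for the example and are not part of the course material.

```python
import numpy as np

# assumed toy data: y is roughly 2 - 3 * feature, plus a little noise
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])  # bias column + one feature
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.01 * rng.normal(size=50)

# step 3, option A: the least-squares problem has a unique minimum, so solve it directly
w_analytic, *_ = np.linalg.lstsq(X, y, rcond=None)

# step 3, option B: gradient descent on the mean squared error
w_gd = np.zeros(2)
learning_rate = 0.1  # assumed value, chosen small enough to converge on this data
for _ in range(2000):
    gradient = (2 / len(y)) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient  # step against the gradient to reduce the error

# step 4: predict on a new input using the learned parameters
x_new = np.array([1.0, 0.5])  # bias term + feature value 0.5
prediction = x_new @ w_gd

print('analytic weights:        ', w_analytic)
print('gradient descent weights:', w_gd)
print('prediction for feature=0.5:', prediction)
```

Both routes should land on nearly the same weights here because this objective has a single minimum; the iterative route is the one that still works when no closed-form solution exists.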

# NumPy is how you do linear algebra in python
- the fundamental package for scientific computing with Python
- numpy is highly optimized (fast + memory efficient)
- numpy is used to build many commonly used scientific tools in python:
    - scipy: fundamental algorithms for scientific computing in Python (numeric integration, interpolation, transforms, statistics, etc)
    - scikit-learn: the "default" library for machine learning
    - scikit-image: as above, but for image processing
    - pandas: as above, but for data wrangling
    - statsmodels: as above, but for frequentist statistical modeling
    - pystan: as above, but for Bayesian statistical modeling
    - jax: Google's high performance ML library that includes automatic differentiation (vector calculus) of functions you write, compilation to run on GPU/TPU, and parallelization
  
    
In this class we will sometimes have you use raw numpy and write your own algorithms the first time, so you can understand how things work at a very low level.

However, when you really want to get things done in ML you will use a higher level library like scikit-learn. These libraries help you with
- prebuilt algorithms that are well-debugged and fast
- all kinds of extras beyond the base algorithm (e.g., regularization to prevent overfitting of your data)
- multiple solvers (different ways to implement the algorithm, often with different extras and speed tradeoffs)


# NumPy tutorials you should check out

- [NumPy: the absolute basics for beginners](https://numpy.org/devdocs/user/absolute_beginners.html)

- [NumPy Illustrated: The Visual Guide to NumPy](https://betterprogramming.pub/3b1d4976de1d?sk=57b908a77aa44075a49293fa1631dd9b)



In [None]:
import numpy as np
data = np.array([1, 2, 3])

You can visualize your array this way:
![numpy array 1,2,3](https://numpy.org/devdocs/_images/np_array.png)

# Indexing and slicing
You can index and slice NumPy arrays in the same ways you can slice Python lists.

NOTE 1: Everything is 0-indexed!

NOTE 2: Arrays use the same slice syntax as Python lists, which roughly speaking works like `[start here : stop before here : step size going from start to stop]`. You can use negative numbers, and any of those 3 slots you omit from the indexer falls back to a default.

![numpy array slicing](https://numpy.org/devdocs/_images/np_indexing.png)

Please take a look at all of the indexing operations below... predict what you will see before you execute them, then see if the predictions match reality. If you can't predict what they will do then please review list/array indexing tutorials on the web 
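As a warm-up for the cells below, here is a small sketch of the `start : stop : step` pattern on a different array. The `letters` array and the variable names are made up for this illustration.

```python
import numpy as np

letters = np.array(['a', 'b', 'c', 'd', 'e', 'f'])

middle = letters[1:5:2]    # start at index 1, stop before index 5, step by 2
first_three = letters[:3]  # omitted start defaults to the beginning
from_third = letters[3:]   # omitted stop defaults to the end
last_two = letters[-2:]    # negative indices count back from the end

print(middle, first_three, from_third, last_two)
```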



In [None]:
print( data[1] )

In [None]:
print( data[:2] )

In [None]:
print( data[1:] )

In [None]:
print( data[-2::] )

In [None]:
print( data[::2] )

In [None]:
# Your code here... 
# how would you reverse the order of the array?
raise NotImplementedError

# Why NumPy and not just lists?

How long does it take to compute the Hadamard product of two large vectors?

You know what the Hadamard product is, yeah? ;)

In [None]:
size = 1000000

# declaring big lists
# (range() alone is lazy in Python 3, so wrap it in list() to get real lists)
list1 = list(range(size))
list2 = list(range(size))
  
# declaring big arrays with the same numbers
array1 = np.arange(size)
array2 = np.arange(size)

In [None]:
# here we use python's zip 
# to do element wise multiplication on the lists
%timeit [(a * b) for a, b in zip(list1, list2)]

In [None]:
# in numpy arrays, * is overloaded to do Hadamard product
# that is element wise multiplication on the arrays
%timeit array1 * array2

WOW! numpy is more than 60x faster! But is the output identical? 

BTW, checking if two arrays/lists have exactly the same contents is the kind of task you will probably have to do quite often... so maybe this is a useful prototype for you

In [None]:
# matched is an array the same size as the others
# it contains bools for every array element, True indicating a match
matched = (array1 * array2) == [(a * b) for a, b in zip(list1, list2)]
print('shape:',matched.shape, 'sample:', matched)

# np.all returns True if all elements are True
print('is every element identical?',np.all( matched ))
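NumPy also ships helpers for this kind of check. The sketch below uses a much smaller array than the timing test, and the variable names are made up for the illustration: `np.array_equal` tests for an exact match in shape and contents, while `np.allclose` allows a small numerical tolerance, which is usually what you want for floating-point results.

```python
import numpy as np

from_array = np.arange(5) * np.arange(5)
from_list = [a * b for a, b in zip(range(5), range(5))]

# True only if shapes match and every element is equal
exactly_equal = np.array_equal(from_array, from_list)

# True if every element matches within a small tolerance
approximately_equal = np.allclose(from_array, from_list)

print('exact match:', exactly_equal)
print('match within tolerance:', approximately_equal)
```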

# Multidimensional arrays (e.g., matrices)

In [None]:
array_example = np.array([    [[0, 1, 2, 3],
                               [4, 5, 6, 7]],
   
                              [[1, 2, 3, 4],
                               [5, 6, 7, 8]],
   
                              [[2 ,3, 4, 5],
                               [6, 7, 8, 9]]    ])

array_example

In [None]:
array_example.ndim

In [None]:
array_example.size

In [None]:
array_example.shape

In [None]:
array_example[0,1,2]

In [None]:
array_example[:,1,2]

In [None]:
array_example[1,:,2]

In [None]:
array_example[0,1,:]

In [None]:
a = np.arange(6)
print(a)
b = a.reshape(3, 2)
print(b)

In [None]:
array_example.reshape(24,)

In [None]:
array_example.flatten()

In [None]:
np.zeros(4)

In [None]:
np.ones(4)

In [None]:
np.ones((2,4))

In [None]:
np.zeros((3,2,4))

In [None]:
np.ones_like(array_example)

# Useful NumPy operations

## finding things out about data

In [None]:
data

In [None]:
data.min(), data.max(), data.mean()

In [None]:
array_example

In [None]:
array_example.min(), array_example.max()

In [None]:
array_example.argmin(), array_example.argmax()

In [None]:
array_example.flatten()[ array_example.argmin() ], array_example.flatten()[ array_example.argmax() ]

In [None]:
array_example.min(axis=0)

In [None]:
array_example.min(axis=1)

In [None]:
array_example.min(axis=2)
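One way to read the `axis` argument in the cells above: the reduction collapses the axis you name and keeps the others. A small sketch, using a made-up `cube` array with the same shape as `array_example`:

```python
import numpy as np

cube = np.arange(24).reshape(3, 2, 4)

# reducing over axis k removes dimension k from the shape
shapes = []
for axis in range(cube.ndim):
    reduced = cube.min(axis=axis)
    shapes.append(reduced.shape)
    print(f'min over axis={axis}: shape {cube.shape} -> {reduced.shape}')
```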

## doing linear algebra

In [None]:
# taking 2 vectors of same dimensionality
v1 = np.array([2 , 6])
v2 = np.array([5 , 7])

v3 = v1 + v2 # addition
print('v2 + v1 =',v3)

v4 = v2 - v1 # subtraction
print('v2 - v1 =',v4)

print('0.7 * (v2 - v1) = ', 0.7*v4) # scalar multiplication

# two ways to get Hadamard multiplication
print('element-wise mult method #1', np.multiply(v1, v2))
print('element-wise mult method #2', v1 * v2)

# four ways to get the dot product
print('dot prod method #1', np.sum( v1 * v2 ))  # what a dot product is 
print('dot prod method #2', np.dot(v1,v2))      # explicitly dot product
print('dot prod method #3', v1.dot(v2))         # explicitly dot product
print('dot prod method #4', np.matmul(v1,v2))   # for 1D vectors, matrix mult is a dot product

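# cross product: theta is the angle between v1 and v2; n is the unit vector normal to them both
# NumPy 2.0 deprecates np.cross on 2-element vectors, so add a zero z-component first
v1_3d, v2_3d = np.append(v1, 0), np.append(v2, 0)
print('cross product v1 x v2 = ||v1|| ||v2|| sin(theta) n =', np.cross(v1_3d, v2_3d))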


In [None]:
mat_b = np.array([[1, 2, 3],
                  [4, 5, 6]])
mat_a = np.array([[1, 1, 1],
                  [2, 2, 2]])

mat_a

In [None]:
# transpose of a matrix
mat_b.T

In [None]:
# in matrix multiplication
# 1st matrix (mxn) determines number of rows (m)
# 2nd matrix (nxp) determines number of columns (p)
# note inner size (n) must match!
# outputs are (mxp)
print('matrix mult method #1\n', mat_a @ mat_b.T)
print('matrix mult method #2\n', np.matmul(mat_a, mat_b.T))
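The shape rule in the comments above can be checked directly. In this sketch the arrays `left` and `right` are made up for the illustration; the second multiplication is expected to fail because the inner sizes do not match.

```python
import numpy as np

left = np.ones((2, 3))   # m x n
right = np.ones((3, 4))  # n x p

product = left @ right   # inner sizes match (3 and 3), result is m x p
print('product shape:', product.shape)

mismatch_caught = False
try:
    left @ left          # (2, 3) @ (2, 3): inner sizes 3 and 2 differ
except ValueError as err:
    mismatch_caught = True
    print('shape mismatch:', err)
```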

In [None]:
mat_c = np.array([[1., 2.], [3., 4.]])
# a matrix inverse is like a reciprocal for a scalar
# the original matrix times its inverse equals I (diagonal ones)
# not every matrix has an inverse (must be square & full rank) 
# but if it does then you can find it like this
inv_c = np.linalg.inv(mat_c)
close_to_I = mat_c @ inv_c
print(mat_c, '\n x \n', inv_c, '\n = \n', close_to_I)

# note that c @ inv_c will probably be not quite the same as I
# closeness-but-not-exact can be checked if you desire
np.allclose(np.dot(mat_c, inv_c), np.eye(2))

# why is it close but not exact?
# the inverse solver (LIKE MOST NUMERIC COMPUTATIONS!) works in
# floating point, which keeps only about 16 significant decimal digits.
# np.linalg.inv uses a direct factorization (LU, via LAPACK) rather than
# an iterative search, but every arithmetic step rounds its result,
# and those tiny rounding errors carry through to the answer
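Small discrepancies of this kind show up in ordinary floating-point arithmetic too, not only in matrix inversion. A minimal sketch with made-up variable names: compare with a tolerance instead of `==` whenever floats are involved.

```python
import numpy as np

total = 0.1 + 0.2

exact_match = (total == 0.3)          # exact comparison of two floats
close_match = np.isclose(total, 0.3)  # comparison within a small tolerance

print('0.1 + 0.2 == 0.3 ?', exact_match)
print('np.isclose(0.1 + 0.2, 0.3) ?', close_match)
print('spacing between 1.0 and the next float:', np.finfo(float).eps)
```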

# Using matrix algebra in ML

Suppose we have a 2D image, where the values of `inputs` are simply the brightness of the pixels.

Our ML system is set up to produce a prediction from the image. For instance, suppose it's trying to detect a vertical bright line in the center of the image. In our toy example here we will pretend that learning is done. We only want to use our ML system to detect vertical lines in the input.

We will predict using the following linear algebra

$ y = w^T x$ 

where $y$ is the output prediction, $w$ is a column vector of parameters (weights learned by the ML system to do the task), and the input $x$ is a column vector. Because $w$ and $x$ are column vectors of the same length, the output prediction $y$ is a scalar.

To turn our 2D image into a 1D vector we will reshape it using numpy.  We will define a weight vector by hand that produces larger $y$ values for a vertical line in the center, and smaller values for other kinds of lines 
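To make the shapes in $y = w^T x$ concrete, here is a small sketch with made-up numbers (the values of `w` and `x` are arbitrary, not learned weights). With column vectors, NumPy returns a 1x1 array, and `.item()` pulls out the scalar.

```python
import numpy as np

w = np.array([[0.5], [-1.0], [2.0]])  # column vector, shape (3, 1)
x = np.array([[4.0], [3.0], [1.0]])   # column vector, shape (3, 1)

y = w.T @ x            # (1, 3) @ (3, 1) -> shape (1, 1)
y_scalar = y.item()    # extract the single number

print('y shape:', y.shape)
print('y as a scalar:', y_scalar)
```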

In [None]:
# here's an "image" with a bright vertical line against a dark background
inputs = np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]])
    
print(inputs)
print('shape:',inputs.shape)

In [None]:
inputs.shape

In [None]:
# flatten isn't quite what we need to make the image a column vector
# this is a 1-D vector, not a column!
print(inputs.flatten())
print('shape:',inputs.flatten().shape)

In [None]:
columnar_inputs = inputs.reshape(9,1)
print(columnar_inputs)
print('shape:',columnar_inputs.shape)

In [None]:
# ok let's make a quick function to columnarize any 2d array/image
def columnar(image):
    n_els = np.prod(image.shape)
    return image.reshape(n_els,1)

col_ins = columnar(inputs)
print(col_ins)
print('shape:', col_ins.shape)

In [None]:
# here are some weights I hand designed to detect VERTICAL lines
# in the CENTER of a 3x3 image
weights = np.array([[-1],[+1],[-1],[-1],[+1],[-1],[-1],[+1],[-1]])
print(weights)
print('shape:',weights.shape)

In [None]:
# Your code here... 
# how would you use weights, inputs, and columnar() 
# and matrix multiplication to predict 
# whether the original image has a vertical line
def predict(img, wts):
    '''
    inputs: 
        img - input image, 2d array
        wts - weights of predictor, columnar array
    returns:
        bool describing if img contains a vertical line
    uses:
        matrix multiplication of inputs, you must determine how
        and columnar()
    '''
    raise NotImplementedError

In [None]:
# to test your function above here's some images
# one with a bright VERTICAL CENTERED line against a dark background
# this is positive class (predictor should say TRUE)
inputs_vc = np.array([[0, 1, 0],
                      [0, 1, 0],
                      [0, 1, 0]])

# one with a bright HORIZONTAL CENTERED line against a dark background
# this is negative class (predictor should say FALSE)
inputs_hc = np.array([[0, 0, 0],
                      [1, 1, 1],
                      [0, 0, 0]])

# also negative class (VERTICAL LEFT)
inputs_vl = np.array([[1, 0, 0],
                      [1, 0, 0],
                      [1, 0, 0]])

# also negative class (DIAGONAL)
inputs_diag = np.eye(3)
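Once you have tried the exercise yourself, you can compare against the sketch below. It is only one possible approach: the name `predict_sketch` and the decision threshold of 0 are assumptions made for this illustration, and other thresholds would also separate these four test images.

```python
import numpy as np

def columnar(image):
    """Reshape a 2D image into an (n, 1) column vector."""
    return image.reshape(image.size, 1)

def predict_sketch(img, wts, threshold=0):
    """Hypothetical solution: True when w^T x exceeds the assumed threshold."""
    score = (wts.T @ columnar(img)).item()
    return score > threshold

weights = np.array([[-1], [+1], [-1], [-1], [+1], [-1], [-1], [+1], [-1]])

test_images = {
    'vertical center': np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    'horizontal center': np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
    'vertical left': np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0]]),
    'diagonal': np.eye(3),
}

results = {name: predict_sketch(img, weights) for name, img in test_images.items()}
for name, is_vertical_center in results.items():
    print(f'{name}: {is_vertical_center}')
```

The weights reward bright pixels in the center column and penalize bright pixels everywhere else, so only the vertical centered line gets a positive score.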

