# Lecture 08: Linear Algebra with NumPy

### Please note: This lecture will be recorded and made available for viewing online. If you do not wish to be recorded, please adjust your camera settings accordingly. 

# Reminders/Announcements:
- Assignment 2 has been collected; Assignment 3 has been posted.
- Grades for Assignment 1 are out.
- Solutions to the Practice Quiz are out.
- Quiz 1 is on Jan 25th. More details are in your Quiz folder.

## The SciPy Stack

From the SciPy website: https://www.scipy.org/

"SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering."

There are 6 core packages:

- NumPy: N-dimensional array package (think numerical linear algebra, but more powerful)
- IPython: An enhanced Python console (we have seen this! It is part of Jupyter Notebooks)
- Matplotlib: 2D plotting software (we have seen this! CoCalc's `plot()` runs on Matplotlib)
- SymPy: Symbolic computations and mathematics (we have seen this! Symbolic expressions, derivatives, etc.)
- SciPy: Base library for scientific computing
- Pandas: Toolkit for data analysis (we will see this later!)

Thanks to the SciPy developers, volunteers, and supporters!

## NumPy
From the NumPy website: https://numpy.org/

![numpy](numpy.png)

NumPy is a toolkit for linear algebra (among many other things). It wraps optimized C code in a user-friendly Python coating, so that operations are easy to implement and efficient to run.

To use NumPy we will have to use an *import* statement. This tells the kernel "please give me the extra functionality from this library; thank you for your time."

In [0]:
import numpy

Once you have imported NumPy, you can call NumPy functions using the syntax `numpy.SomeFunctionName`. Because we are always adding `numpy` to the beginning, often we shorten it:

In [0]:
import numpy as np

Now you can call the same function as `np.SomeFunctionName`. A savings of 3 whole keystrokes!

The key data structure in NumPy is an *array*.

In [0]:
L = [1,2,3,4,5]   #This is a Python list
Arr = np.array(L) #This casts L to a NumPy array.

In [0]:
print(L)
print(type(L))

In [0]:
print(Arr)
print(type(Arr))

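The main differences between a Python list and a NumPy array are that
- NumPy arrays use less memory, because the entries are stored together in one block with a single data type.
- NumPy arrays are "faster": operations on them run in optimized C code rather than in a Python loop.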

In other words, NumPy has more efficient handling of "ordered sequences of data". 
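
To make the memory claim concrete, here is a rough sketch that compares the two. The names `values`, `list_bytes`, and `array_bytes` are just for illustration, and `sys.getsizeof` only gives an estimate of what a list costs, so treat the list number as a ballpark figure.

```python
import sys
import numpy as np

values = list(range(1000))                # a plain Python list of ints
arr = np.array(values, dtype=np.int64)    # the same data as a NumPy array

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = arr.nbytes                  # raw data buffer: 8 bytes per int64

print(list_bytes, array_bytes)
```

The array figure is exactly 8 bytes per entry; the list figure depends on your Python version but should be several times larger.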

Be careful!

In [0]:
L = [1,2,3,4,5, 'String']   #This is a Python list
Arr = np.array(L) #This casts L to a NumPy array.

In [0]:
print(L)
print(Arr)

In [0]:
type(Arr[0])

Whereas a list in Python can have arbitrary data types mixed together, a NumPy array has a *single* data type in it.
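
You can ask an array which data type it settled on with `.dtype`. The sketch below (variable names `mixed` and `floats` are made up for this example) shows that one string turns every entry into a string, and one float turns every entry into a float.

```python
import numpy as np

mixed = np.array([1, 2, 3, 'String'])   # ints and a string together
print(mixed.dtype)                      # a Unicode string dtype
print(mixed[0] + mixed[1])              # string concatenation, not addition

floats = np.array([1, 2, 3.0])          # one float promotes the whole array
print(floats.dtype)
```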

In [0]:
L = [1,2,3,4,5.0]   #This is a Python list
Arr = np.array(L) #This casts L to a NumPy array.

In [0]:
print(L)
print(Arr)

In [0]:
integerArr = Arr.astype(int)
print(integerArr)

NumPy has a few builtins to initialize large arrays quickly:

In [0]:
import time
a = time.time()
for i in range(10):
    np.zeros((50000), int)
print(time.time()-a)

In [0]:
import time
a = time.time()
for i in range(10):
    [0 for i in range(50000)]
print(time.time()-a)

In [0]:
np.zeros((2,2))
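
`np.zeros` is not the only initializer. Here is a short, non-exhaustive sketch of a few others that NumPy provides; the variable names are only for illustration.

```python
import numpy as np

ones = np.ones((2, 3))         # 2x3 array filled with 1.0
identity = np.eye(3)           # 3x3 identity matrix
steps = np.arange(0, 10, 2)    # like range(): 0, 2, 4, 6, 8
grid = np.linspace(0, 1, 5)    # 5 evenly spaced points from 0 to 1, inclusive

print(ones)
print(identity)
print(steps)
print(grid)
```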

NumPy also uses *componentwise array operations*:

In [0]:
L = [1,2,3] #Python list
Arr = np.asarray(L)  #NumPy array

In [0]:
print(L)
print(Arr)

In [0]:
L+L

In [0]:
Arr+Arr

In [0]:
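Arr**2   #Componentwise square. Plain Python reads ^ as bitwise XOR; only Sage's preparser reads it as a power.

In [0]:
Arr+Arr**2

In [0]:
np.exp(Arr)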

In [0]:
np.array([1,2,3])+np.array([1,2,3,4])

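Be careful! The cell above raises an error because the two shapes do not match. Mismatched shapes do not always raise an error, though: when we get to matrices, NumPy will sometimes stretch the smaller array to fit the larger one. This is called *broadcasting*, and I will not go over it in detail right now.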

In [0]:
np.array([1,2,3])+np.array([[0,0,0],[0,0,0]])
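
Without going into the full rules, here is a minimal sketch of the two behaviours side by side. The names `row`, `grid`, and `mismatch_error` are made up for this example.

```python
import numpy as np

row = np.array([1, 2, 3])              # shape (3,)
grid = np.zeros((2, 3), dtype=int)     # shape (2, 3)

# Shapes are compared from the right: (3,) against (2, 3) is compatible,
# so `row` is added to every row of `grid`.
print(row + grid)

# (3,) against (4,) is not compatible, so NumPy raises a ValueError.
mismatch_error = None
try:
    np.array([1, 2, 3]) + np.array([1, 2, 3, 4])
except ValueError as err:
    mismatch_error = err
    print('ValueError:', err)
```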

In [0]:
def f(x):
    return x**2+3   #Applied componentwise when x is an array
f(Arr)
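
A note on exponentiation, assuming you may run this code outside of a SageMath kernel at some point: in plain Python the power operator is `**`, and `^` means bitwise XOR. Only Sage's preparser rewrites `^` as a power. The sketch below is plain Python and shows how different the two results are.

```python
import numpy as np

arr = np.array([1, 2, 3])

squares = arr ** 2      # componentwise power: [1 4 9]
xor = arr ^ 2           # plain Python: componentwise bitwise XOR: [3 0 1]
exps = np.exp(arr)      # componentwise exponential, using NumPy's own exp

def f(x):
    return x ** 2 + 3   # works on single numbers and on arrays alike

print(squares, xor, exps, f(arr))
```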

The real power of NumPy is in creating *multidimensional arrays*. You do this by passing in a list of lists of ... of lists. That is, a one-dimensional array is formed by passing in a list; an n-dimensional array is formed by passing in a list of (n-1)-dimensional arrays.

A 2D array is just a matrix; instantiate one by passing in a list of rows:

In [0]:
print(np.array([[1,1,1],[1,2,2]]))

A 3D array is like a hotel. You have multiple floors, and on each floor you have a grid of rooms. In each room is a data point.

In [0]:
hotel = np.array([[[3,3,3],[3,3,3]],[[2,2,2],[2,2,2]],[[1,1,1],[1,1,1]]])
print(hotel)

In [0]:
hotel.shape   #The shape of the hotel

In [0]:
hotel.size    #The number of rooms in the hotel 
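
To find a particular room you give one index per axis: floor first, then row, then column. A small sketch using the same hotel:

```python
import numpy as np

hotel = np.array([[[3, 3, 3], [3, 3, 3]],
                  [[2, 2, 2], [2, 2, 2]],
                  [[1, 1, 1], [1, 1, 1]]])

print(hotel.ndim)        # 3 axes: floor, row, column
print(hotel.shape)       # (3, 2, 3): 3 floors, each a 2x3 grid of rooms
print(hotel[0])          # the first floor listed (the one filled with 3s)
print(hotel[2, 1, 0])    # one room: floor index 2, row 1, column 0
```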

## ***** Participation Check ***************************
In the code cell below I have defined two matrices, `M` and `N`. Try to multiply them. Does the result look correct?!

In [0]:
M = np.array([[1,0],[0,1]])
N = np.array([[0,6],[13,0]])
print(M)
print(N)
#Try to multiply them below:


Now try calling `numpy.matmul(M,N)`. Does this look better?

## *********************************************************

## Linear Algebra in NumPy

NumPy takes a bit to get used to, but once you're used to the operations it is very nice.

In [0]:
M = np.array([[2,6],[13,-1]])
print(M)

In [0]:
np.transpose(M)

In [0]:
M+M

In [0]:
np.matmul(M,M)
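
Python also has an `@` operator, which for NumPy arrays does the same thing as `np.matmul`. The sketch below contrasts it with `*`, which multiplies entry by entry (the names `product` and `entrywise` are just for illustration).

```python
import numpy as np

M = np.array([[2, 6], [13, -1]])

product = M @ M        # matrix product, same as np.matmul(M, M)
entrywise = M * M      # componentwise product: each entry squared

print(product)
print(entrywise)
```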

Indexing in a NumPy matrix works using "iterated slice indexing": you give one index or slice per axis, separated by commas.

In [0]:
myMat = np.array([['00','01','02'],[10,11,12],[20,21,22]])
print(myMat)

In [0]:
myMat[0,2]

In [0]:
myMat[2,0]

In [0]:
myMat[:,0]

In [0]:
myMat[0,:]

In [0]:
myMat[:2,:2]

In [0]:
myMat[0,:] = ['My','new','elements']
myMat

In [0]:
myMat[1,1] = 1
myMat

In [0]:
myMat[:2,:2]=np.zeros((2,2))
myMat
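
One more thing to watch out for, shown here on a made-up integer matrix called `grid`: a slice is a *view* into the original array, not a copy. Writing to the slice changes the original. If you want an independent array, call `.copy()`.

```python
import numpy as np

grid = np.array([[0, 1, 2],
                 [10, 11, 12],
                 [20, 21, 22]])

first_col = grid[:, 0]       # every row, column 0
block = grid[:2, :2]         # top-left 2x2 block

block[0, 0] = 99             # writing to the slice...
print(grid[0, 0])            # ...also changes the original array
print(first_col)             # and every other view of it

safe = grid[:2, :2].copy()   # .copy() gives an independent array
safe[0, 0] = -1
print(grid[0, 0])            # unchanged by the write to `safe`
```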

Additional functionality is found in the `numpy.linalg` submodule. You can find the full documentation here: https://numpy.org/doc/stable/reference/routines.linalg.html .

Note! NumPy is a *numerical* linear algebra machine; for symbolic/exact linear algebra, this is not the right tool. We will see how to get around this with Sage on Monday.

In [0]:
from numpy import linalg 
print(M)
linalg.det(M)

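Note that the exact determinant is an integer: $-(13\cdot 6 - 2\cdot(-1)) = -(78 + 2) = -80$. NumPy computes the determinant in floating-point arithmetic, so the printed value may differ from $-80$ by a tiny rounding error. You will see some of the consequences of this in the first problem of this week's homework.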
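
Here is one hedged way to cope with this, assuming you already know the true answer should be an integer: compare with a tolerance using `np.isclose` rather than `==`, and round only at the very end.

```python
import numpy as np
from numpy import linalg

M = np.array([[2, 6], [13, -1]])
d = linalg.det(M)

print(d)                     # a float close to -80, possibly not exactly -80.0
print(np.isclose(d, -80))    # compare floats with a tolerance, not with ==
print(int(round(d)))         # recover the integer once you trust the answer
```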

In [0]:
linalg.matrix_power(M,4)

You can apply a matrix to a vector (that is, compute the linear transformation that the matrix defines) as follows:

In [0]:
print(M)
v = np.array([2,1])
print()
print(v)
print()
print(np.matmul(M,v))

In [0]:
np.dot(v,v)

Note that this considers $v$ as a *column vector*:

$$
\begin{pmatrix}
2&6\\
13&-1
\end{pmatrix}
\begin{pmatrix}
2\\1
\end{pmatrix} = 
\begin{pmatrix}
10\\
25
\end{pmatrix}
$$
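
A one-dimensional array has no built-in notion of "row" or "column"; NumPy decides based on which side of the product it sits on. The sketch below (with made-up names `as_column` and `as_row`) shows both cases for the same `M` and `v`.

```python
import numpy as np

M = np.array([[2, 6], [13, -1]])
v = np.array([2, 1])

as_column = M @ v    # v on the right: treated as a column vector
as_row = v @ M       # v on the left: treated as a row vector

print(as_column)     # [10 25], matching the computation above
print(as_row)        # [17 11]
```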

Recall that an *eigenvalue* of $M$ is a number $\lambda$ such that there exists a nonzero vector $v$ with $Mv = \lambda v$. In this setting $v$ is a *right eigenvector* corresponding to $\lambda$. You can (numerically) compute both using `linalg.eig`:

In [0]:
M = np.array([[1,2],[0,4]])
print(M)
print()
print(linalg.eig(M))

How to parse the result? The first array is the list of eigenvalues. The second array is a *matrix* whose *ith column vector is the eigenvector for the ith eigenvalue listed*.

In [0]:
eigData = linalg.eig(M)
print('The eigenvalues are: ', eigData[0])
print('The right eigenvectors are the columns of: \n', eigData[1])

## ***** Participation Check ***************************
In the code cell below I have copied the matrix `M`. By extracting the columns of the eigenvector matrix of $M$ (calculated by `linalg.eig(M)`), create two vectors `v` and `w` which are eigenvectors of `M`. Then multiply each of them by `M` to verify that the eigenvalue/eigenvector relationship holds.

In [0]:
M = np.array([[1,2],[0,4]])
#Rest of your code here

In [0]:
print(np.matmul(M,v)/v)

## *********************************************************
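
Dividing `np.matmul(M,v)` by `v` entry by entry is a quick check, but it produces `nan` wherever `v` has a zero entry. A sketch of a sturdier check, assuming you are happy with NumPy's default tolerance, is to compare $Mv$ with $\lambda v$ directly using `np.allclose`. The list `checks` is just a name chosen for this example.

```python
import numpy as np
from numpy import linalg

M = np.array([[1, 2], [0, 4]])
eigenvalues, eigenvectors = linalg.eig(M)

checks = []
for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]                 # i-th column, not i-th row
    ok = np.allclose(M @ v, lam * v)       # is M v equal to lambda v, up to rounding?
    checks.append(ok)
    print(lam, v, ok)
```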

## Singular Value Decomposition: The most important piece of linear algebra you probably never learned

Singular value decomposition is one of the most important things in life, next to food and water. Unfortunately it is infrequently taught. Prior warning: if this is new, there will not be enough time to fully understand it here. That is ok! Just try to pick out the key ideas. You don't have to know how internal combustion engines work in order to drive a car. You just have to know how to put gas in the tank and how to put the pedal to the metal.

Recall that a matrix $M$ is *orthogonal* if $M\cdot M^T = I$. You should think of these as "like rotation matrices (kinda)."

In [0]:
M = np.array([[1,1],[1,-1]])/np.sqrt(2)   #np.sqrt works with or without Sage
print(M)
print()
np.matmul(M,M.transpose())
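
To back up the "like rotation matrices" intuition, here is a small sketch with an honest rotation matrix. The angle `theta` and the test vector `v` are arbitrary choices for illustration. It checks that $R\cdot R^T = I$ up to rounding, and that applying $R$ does not change the length of a vector.

```python
import numpy as np

theta = np.pi / 6                                  # rotate by 30 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(R @ R.T, np.eye(2)))             # orthogonal: R R^T = I

v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(R @ v))    # the two lengths agree
```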

Given an $m\times n$ matrix $M$ with real entries, the singular value decomposition (SVD) of $M$ is a factorization
$$
M = U\Sigma V^T,
$$
where $U$ is an $m\times m$ orthogonal matrix, $\Sigma$ is an $m\times n$ "diagonal" matrix whose diagonal entries are the *singular values* of $M$ (nonnegative numbers, listed from largest to smallest), and $V$ is an $n\times n$ orthogonal matrix.

When $M$ is $2\times 2$ there is a very helpful picture to keep in mind. Here is an image from Wikipedia (licensing info: By Georg-Johann - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=11342212 )

![svd](svd.png)


In [0]:
M = np.array([[1,1,1],[2,3,4]])
print(M)

SVD on $M$ will give 
$$
\begin{pmatrix}
1&1&1\\
2&3&4
\end{pmatrix}
=
\begin{pmatrix}
u_{1,1}&u_{1,2}\\
u_{2,1}&u_{2,2}
\end{pmatrix}
\begin{pmatrix}
\sigma_1&0&0\\
0&\sigma_2&0
\end{pmatrix}
\begin{pmatrix}
v_{1,1}&v_{1,2}&v_{1,3}\\
v_{2,1}&v_{2,2}&v_{2,3}\\
v_{3,1}&v_{3,2}&v_{3,3}
\end{pmatrix}^T.
$$

In [0]:
U, Sigma, VT = linalg.svd(M)

In [0]:
print(U)
print()
print(np.matmul(U,U.transpose()))

In [0]:
print(Sigma)

In [0]:
print(VT)
print()
print(np.matmul(VT,VT.transpose()))

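To recreate $M$, we need to turn `Sigma` into the right matrix: `linalg.svd` returns the singular values as a one-dimensional array, but the formula needs the $2\times 3$ "diagonal" matrix $\Sigma$.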

In [0]:
S = np.zeros((2,3))
S

In [0]:
S[:2,:2] = np.diag(Sigma)
S

In [0]:
np.matmul(U,np.matmul(S,VT))
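
Because of rounding, the product will usually not print as exactly $M$. A minimal sketch of how to confirm that the factorization worked, assuming NumPy's default tolerance is acceptable, is to compare with `np.allclose`:

```python
import numpy as np
from numpy import linalg

M = np.array([[1, 1, 1], [2, 3, 4]])
U, Sigma, VT = linalg.svd(M)

S = np.zeros(M.shape)                           # same shape as M: 2x3
S[:len(Sigma), :len(Sigma)] = np.diag(Sigma)    # singular values on the diagonal

print(np.allclose(U @ S @ VT, M))               # equal up to rounding error
print(Sigma[0] >= Sigma[1] >= 0)                # sorted and nonnegative
```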

Why is this useful? You will see one application in Exercise 2 of Assignment 3! The short version is as follows: the singular values of $M$ tell you the "most important" information about it. In other words, if you want to *approximate* M in the best way possible, using a matrix of lower rank, SVD tells you how to do so. This is *very important* in data compression/data analysis/dimensionality reduction.

In [0]:
M = np.array([[1,1,1],[2,3,4],[4,5,3]])
print(M)
linalg.det(M)

In [0]:
U, Sigma, VT = linalg.svd(M)
print(U)
print()
print(Sigma)
print()
print(VT)

In [0]:
U2 = U[:,:2]
Sigma2 = Sigma[:2]
VT2 = VT[:2,:]
print(U2)
print()
print(Sigma2)
print()
print(VT2)


In [0]:
S2 = np.zeros((2,2))
S2[:2,:2]=np.diag(Sigma2)
print(np.matmul(U2,np.matmul(S2,VT2)))
print()
print(M)

In [0]:
print(linalg.matrix_rank(M))
print(linalg.matrix_rank(np.matmul(U2,np.matmul(S2,VT2))))

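Theorem (Eckart-Young): This is the "best" rank 2 approximation of $M$, in the sense that no other matrix of rank at most 2 is closer to $M$ in the Frobenius norm.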
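
One way to get a feel for what "best" means is to measure how far the approximation is from $M$. The sketch below uses the Frobenius norm (the square root of the sum of the squared entries), which is my choice of distance and not something fixed by the lecture so far. A standard fact, part of the Eckart-Young theorem, is that this distance equals the singular value that was thrown away.

```python
import numpy as np
from numpy import linalg

M = np.array([[1, 1, 1], [2, 3, 4], [4, 5, 3]])
U, Sigma, VT = linalg.svd(M)

# Keep the two largest singular values, as in the cells above.
M2 = U[:, :2] @ np.diag(Sigma[:2]) @ VT[:2, :]

error = linalg.norm(M - M2)    # Frobenius norm is the default for matrices
print(error, Sigma[2])         # the two numbers agree up to rounding
```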

You will be exploring "rank k" approximations of $M$ in Homework 3, Problem 2. Please note that even though this looks new and scary, it's mainly just terminology. You don't have to fully understand it in order to apply it. In particular, the code above *essentially* solves one of the parts; you will just have to generalize it slightly and turn it into a function.