# Topic 22: Linear Algebra

- 04/28/21
- onl01-dtsc-ft-022221


## Announcements

- **One on Ones Resume Next Week**
- We have our **Cohort Phase 2 Project Presentation Sharing session tomorrow at 4 pm EST**
    - Sharing is optional!
    
- **Make sure to check the new "Notes from James" page at the start of topics 25, 26, 27, and 32.**
    - In summary: topic 25 will be split into 2 study groups.
    - Topics 27 and 32 are combined into 1 study group.
    - Topic 26 is a bit redundant and includes more gradient-descent-from-scratch labs.

## Learning Objectives

- Be able to explain the difference(s) between vectors, matrices, tensors, etc. and their dimensionality.

- Understand the difference between the shape and size of an array

- Discuss linear algebra operations with numpy

- Learn about using linear algebra to solve systems of equations
- Simple linear algebra regression analysis

## Questions/Comments?

- 

### Resources:

- **Videos (added to Canvas Topic 22 Supplemental Videos):**
    - **[YouTube Playlist: 3Blue1Brown - Essence of Linear Algebra](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)**
    - [YouTube: Intuitive Explanation of Dot Product](https://youtu.be/FrDAU2N0FEg)
    - [YouTube: Vector Dot Products](https://www.youtube.com/watch?v=0iNrGpwZwog)
    - [Khan Academy: Intro to Matrix Multiplication](https://youtu.be/kT4Mp9EdVqs)

- www.desmos.com (linear equation grapher)

# What & Why of Linear Algebra


### What is Linear Algebra?



- Study of "vector spaces" with linear relationships.
- Uses vectors, matrices, and tensors

<!-- - Mapping & dimensionality (PCA) -->
- Used in lots of ML applications

# Different Tensors

## Scalars, Vectors, Matrices: It's all about the dimension

<img src="https://raw.githubusercontent.com/jirvingphd/flatiron-school-data-science-curriculum-resources/master/Mathematics/LinearAlgebra/images/different_tensors.png" width=50%>




## Why Linear Algebra?


- Representing data as an N-dimensional tensor is used in many areas of Machine Learning.

#### Images are 3-D tensors.
<img src='https://raw.githubusercontent.com/jirvingphd/dsc-lingalg-motivation-online-ds-pt-100719/master/images/rgb.png' width=40%>

#### Natural Language Processing represents text using matrices
<img src="https://raw.githubusercontent.com/learn-co-students/dsc-lingalg-motivation-online-ds-pt-100719/master/images/NLP_matrix2.png" width=80%>


#### Dimensionality Reduction (PCA)


<img src="https://raw.githubusercontent.com/learn-co-students/dsc-lingalg-motivation-online-ds-pt-100719/master/images/PCA_img.png">


#### Artificial Neural Networks
<img src="https://miro.medium.com/max/2500/1*ZB6H4HuF58VcMOWbdpcRxQ.png" width=50%>

## Creating with NumPy

In [None]:
# Imports needed for this notebook
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from IPython.display import display  # explicit import so display() also works outside a notebook

In [None]:
def show_matrix(matrix, name='matrix'):
    """Prints the name and shape of the matrix and displays the matrix"""
    print(f'{name} (shape={matrix.shape}):')
    display(matrix)
    print('\n')

In [None]:
# Scalar (a 0-D array; note that np.arange(1) would instead give a 1-element vector of shape (1,))
s = np.array(0)
display(s)

In [None]:
# Vector
v = np.arange(4)
show_matrix(v,'v')

In [None]:
# Other ways to define vector
x = np.linspace(-np.pi, np.pi, 10)
show_matrix(x,'x')

In [None]:
# Matrix
# np.arange(8) on its own is still a 1-D vector...
M0 = np.arange(8)
show_matrix(M0,'M0')

M = np.arange(8).reshape((4, 2))
show_matrix(M,"M")

In [None]:
# 3D Tensor
n = 24 
T_3d = np.arange(n).reshape((4, 2, 3))

show_matrix(T_3d,'T_3d')

### Indexing with NumPy

<img src="https://raw.githubusercontent.com/learn-co-students/dsc-scalars-vectors-matrices-tensors-codealong-online-ds-pt-100719/master/images/new_tensors.png" width=80%>

#### Different parts of a vector

In [None]:
show_matrix(v,'v')

In [None]:
# For Vectors
display(v[1:4])  # second to fourth element.
display(v[::2])  # every other element
display(v[:])    # print the whole vector
display(v[::-1]) # reverse the vector!

#### Different parts of a matrix

In [None]:
show_matrix(M)

In [None]:
## Entire first row of matrix
M[0] 

In [None]:
# first row and all columns
display(M[0, :])   

In [None]:
# element at first row and first column
display(M[0, 0])   

In [None]:
# element at last row and last column
display(M[-1, -1]) 

In [None]:
# all rows and first column 
display(M[:, 0])   

In [None]:
# all rows and all columns
display(M[:])     

#### Different parts of a tensor

In [None]:
show_matrix(T_3d,'T_3d')

In [None]:
show_matrix(T_3d[0])      # 2D: First matrix

In [None]:
show_matrix(T_3d[3])  

In [None]:
display(T_3d[0, 0])   # 1D: First matrix's first vector

In [None]:
display(T_3d[0,0, 0]) # 0D: First matrix's first vector's first element

In [None]:
display(T_3d[0, 0, :])  # 1D: first matrix, first vector, all elements

In [None]:
display(T_3d)

In [None]:
display(T_3d[0, :, 0])  # 1D: first matrix, all the vectors, just the first element

In [None]:
display(T_3d[0, :, 1:]) # 2D: first matrix, all the vectors, all elements after the first

### Example 3D Tensor - An Image

In [None]:
## Color Images Are 3d Matrices/Tensors
img = 'images/neuron.jpg'
IMG = mpl.image.imread(img)

## Image is a 1220 x 2880 pixel color image
show_matrix(IMG)

In [None]:
## image is made of 3 channels of 1220 x 2880 pixels
for i in range(3):
    print(IMG[:,:,i].shape)

In [None]:
## Visualizing a single channel (dim) of image
show_matrix(IMG[:,:,0],'Dim 0')
plt.imshow(IMG[:,:,0],'gray')
plt.axis('off')

In [None]:
## View All Channels
fig, ax = plt.subplots(nrows=4,figsize=(5,10))

ax[0].imshow(IMG[:,:,0],cmap= 'gray')
ax[0].set_title("Dim 0")

ax[1].imshow(IMG[:,:,1],cmap='gray')
ax[1].set_title("Dim 1")

ax[2].imshow(IMG[:,:,2],cmap='gray')
ax[2].set_title("Dim 2")

ax[3].imshow(IMG)
ax[3].set_title('Combined Color Image')


for a in ax:
    a.axis('off')

plt.tight_layout()

# Basic Properties

## Shape

`shape` gives the length of each dimension, while `size` gives the total number of elements (the product of the values in the shape).

In [None]:
print('Scalar:')
s = np.array(100)
display(s)
display(s.shape)
display(s.size)

In [None]:
print('Vector:')
display(v)
display(v.shape)
display(v.size)

In [None]:
print('Matrix:')
display(M)
display(M.shape)
display(M.size)

In [None]:
print('3D Tensor:')
print(T_3d)
display(T_3d.shape)
display(T_3d.size)

## Transpose

<img src="https://raw.githubusercontent.com/jirvingphd/flatiron-school-data-science-curriculum-resources/master/Mathematics/LinearAlgebra/images/transpose_tensors.png">


<img src="https://raw.githubusercontent.com/learn-co-students/dsc-scalars-vectors-matrices-tensors-codealong-online-ds-pt-100719/master/images/new_matrix.png" width=40%>

In [None]:
## Transpose of a matrix
show_matrix(M,'M')

Mt = M.T
show_matrix(Mt,'M.T')

In [None]:
## Transpose of a Transpose is Original 
show_matrix(Mt,'Mt')

MtT = Mt.T
show_matrix(MtT,'Mt.T')

show_matrix(M,'M')


In [None]:
## Proving they're equal
(M.T.T == M).all()

# Combining Tensors

> Note: NumPy is pretty smart when you combine tensors; it will attempt to combine them even if their shapes don't match, as long as the shapes are compatible. This is called broadcasting & you can read about it in [the documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).
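
As a rough sketch of when broadcasting works: NumPy compares shapes starting from the right-most dimension, and two dimensions are compatible when they are equal or one of them is 1. The arrays `A_demo`, `row`, and `col` below are made up for illustration, and `np.broadcast_shapes` assumes NumPy 1.20 or newer.

```python
import numpy as np

A_demo = np.arange(6).reshape(3, 2)    # shape (3, 2)
row = np.array([0, 100])               # shape (2,): stretched across every row
col = np.array([[0], [10], [20]])      # shape (3, 1): stretched across every column

# Ask NumPy what shape the combined result would have
shape_row = np.broadcast_shapes(A_demo.shape, row.shape)
shape_col = np.broadcast_shapes(A_demo.shape, col.shape)
print(shape_row, shape_col)

print(A_demo + row)
print(A_demo + col)

# Shapes (3, 2) and (2, 3) are not compatible, so NumPy raises a ValueError
try:
    np.broadcast_shapes((3, 2), (2, 3))
    compatible = True
except ValueError as err:
    compatible = False
    print("Cannot broadcast:", err)
```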

In [None]:
# A = np.arange(6).reshape(3,2)
# B = 10 * A

# show_matrix(A,'A')
# show_matrix(B,'B')


## Addition

In [None]:
A = np.arange(6).reshape(3,2)
B = 10 + np.arange(6).reshape(3,2)

show_matrix(A,'A')
show_matrix(B,'B')


In [None]:
# We can add arrays of the same shape! (element-wise)
A + B

In [None]:
# Broadcasting: We can even add scalars to the whole array (as you might expect)
A + 100

## What happens when we have different dimensions? Broadcasting happens

In [None]:
# 3-by-2 add 1-by-2
x = 100*np.arange(2).reshape(2)
show_matrix(x,'x')

In [None]:
show_matrix(A,'A')

In [None]:
## Broadcasted [0,100] across A
show_matrix(A + x)

In [None]:
# 3-by-2 add 3-by-2
A = np.arange(6).reshape(3,2)
show_matrix(A,'A')

In [None]:
# 3-by-2 add 2-by-3 --> Will this work?
x = 100*np.arange(2*3).reshape(2,3)
show_matrix(x,'x')

In [None]:
show_matrix(A,'A')

In [None]:
display(A + x)

## Multiplication (Hadamard Product & Dot Product)

### Hadamard Product a.k.a. Element-wise multiplication

> $C = A \circ B$

- The Hadamard product can be calculated in Python using the $*$ operator between two NumPy arrays.

- Result: 
    - Each element in A is multiplied by its corresponding element (same row/column) in B
    - Same dimensions (after broadcasting)



$$ A \circ B = 
   \left[ {\begin{array}{cc}
   A_{1,1} * B_{1,1} & A_{1,2} * B_{1,2}\\
   A_{2,1} * B_{2,1}& A_{2,2} * B_{2,2} \\
   A_{3,1} * B_{3,1} & A_{3,2} * B_{3,2} \\
  \end{array} } \right] 
$$

In [None]:
show_matrix(A,'A')
show_matrix(B,'B')

In [None]:
## use * for hadamard product
C = A*B
show_matrix(C,'C')

In [None]:
# 3-by-2 multiply 1-by-2
show_matrix(A,'A')

x = 100*np.arange(2).reshape(2)
show_matrix(x,'x')

In [None]:
## will broadcast if same # of columns 
C = A * x
show_matrix(C,'C')

In [None]:
# 3-by-2 multiply 3-by-2
show_matrix(A,'A')

x = 100*np.arange(3*2).reshape(3,2)
show_matrix(x,'x')

In [None]:
C = A * x
show_matrix(C,'C')

In [None]:
# 3-by-2 multiply 2-by-3 --> Will this work?
show_matrix(A,'A')

x = 100*np.arange(3*2).reshape(2,3)
show_matrix(x,'x')

In [None]:
C = A * x
show_matrix(C,'C')

### Dot Product

[Khan Academy: Intro to Matrix Multiplication](https://youtu.be/kT4Mp9EdVqs)

Result: (m-by-n) DOT (n-by-k) ==> (m-by-k)

$$A \cdot B = C$$

Likely the most common operation when we think of "multiplying" matrices.

- Matrix $A$ has $m$ rows and **$n$ columns**
- Matrix $B$ has **$n$ rows** and $k$ columns. 



>- Provided the number of columns in $A$ equals the number of rows in $B$ (both $n$), the result is a new matrix with $m$ rows and $k$ columns. 

  
- The dot product can be shown using (.) or (dot). 

> $ C_{(m, k)} = A_{(m, n)} \cdot B_{(n, k)}$ OR $ C_{(m, k)} = A_{(m, n)} \text{  dot  } B_{(n, k)}$

The product operation is defined by

$$ \large C_{i, j}= \sum_{l=1}^{n} A_{i, l}B_{l, j}$$

The intuition for matrix multiplication is that you calculate the dot product between each row in matrix $A$ and each column in matrix $B$. 

<!---
- When using the dot product, the number of columns in the first matrix must be equal the number of rows in the second matrix.

- We basically take the a column from B and transpose  it and perform broadcasted multiplication with A.

- end shape is # of rows from A and number of columns from B.
--->

$$ A = 
   \left[ {\begin{array}{cc}
   A_{1,1}& A_{1,2} \\
   A_{2,1}& A_{2,2}  \\
   A_{3,1} & A_{3,2} \\
  \end{array} } \right] 
$$

$$ B = 
   \left[ {\begin{array}{cc}
   B_{1,1}&  B_{1,2} \\
   B_{2,1} & B_{2,2} \\
  \end{array} } \right] 
$$

$$ C = 
  \left[ {\begin{array}{cc}
   A_{1,1}* B_{1,1}+ A_{1,2}*B_{2,1} & A_{1,1}* B_{1,2}+ A_{1,2}*B_{2,2} \\
   A_{2,1}* B_{1,1}+ A_{2,2}*B_{2,1} & A_{2,1}* B_{1,2}+ A_{2,2}*B_{2,2} \\
   A_{3,1}* B_{1,1}+ A_{3,2}*B_{2,1} & A_{3,1}* B_{1,2}+ A_{3,2}*B_{2,2} \\
  \end{array} } \right]
$$
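
The definition above can be sketched with explicit loops. This is only an illustration, not how NumPy computes it internally; the helper `matmul_loops` and the arrays `A_demo` and `B_demo` are hypothetical names introduced here.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A (m-by-n) and B (n-by-k) with explicit loops (illustration only)."""
    m, n = A.shape
    n_b, k = B.shape
    if n != n_b:
        raise ValueError(f"columns of A ({n}) must equal rows of B ({n_b})")
    C = np.zeros((m, k))
    for i in range(m):          # each row of A
        for j in range(k):      # each column of B
            for l in range(n):  # sum over the shared dimension
                C[i, j] += A[i, l] * B[l, j]
    return C

A_demo = np.arange(6).reshape(3, 2)     # 3-by-2
B_demo = np.array([[1, 2], [3, 4]])     # 2-by-2
C_demo = matmul_loops(A_demo, B_demo)   # 3-by-2
print(C_demo)

# The loop version should agree with NumPy's built-in matrix product
print(np.allclose(C_demo, A_demo @ B_demo))
```

In practice `A @ B` or `np.dot(A, B)` is the better choice, since the vectorized version is much faster than Python loops.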

In [None]:
show_matrix(A,'A')
show_matrix(B,'B')
show_matrix(B.T,'B.T')

In [None]:
C = A.dot(B.T) 

show_matrix(C,'C')

In [None]:
## Multiple ways for dot-product
Z = np.dot(A, B.T)
Z = A.dot(B.T)
Z = A @ B.T

show_matrix(Z,'Z')

<!-- ### Cross Product

Produces another tensor of the same shape (Note broadcasting can still work)

- [Kahn Academy: Intro to Cross Product](https://youtu.be/pJzmiywagfY)

<img src="https://raw.githubusercontent.com/learn-co-students/dsc-linalg-mat-multiplication-codealong-online-ds-pt-100719/master/images/cross.png">
```python
display(A[:,0])
display(A[:,1])
print()

result= np.cross(A[:,0],A[:,1])
show_matrix(result,'result')
```
 -->

# Manipulating Matrices (Identity & Inverse)

## Identity Matrix

Square matrix of diagonal 1's, rest are 0's

In [None]:
I5 = np.identity(5)
print(I5)

When you take the dot product of a matrix with the identity matrix, you always get the original matrix back (note that the shapes still have to be compatible).

In [None]:
A = np.arange(25).reshape(5,5)
show_matrix(A,'A')

In [None]:
IA = I5.dot(A)
show_matrix(IA,'IA')

In [None]:
AI = A.dot(I5)
show_matrix(AI,'AI')

In [None]:
(IA==AI).all()

## Inverse Matrix

Remember that we can't divide by a matrix, but we can do something similar by multiplying by its **inverse matrix**.

In [None]:
# Define a 3x3 matrix
X = np.array([1,-2,3,2,-5,10,0,0,1]).reshape(3,3)
show_matrix(X,'X')

We can also find the inverse of a matrix with NumPy

In [None]:
# A = np.array([4,2,1,4,8,3,1,1,0]).reshape(3,3)
# Finding the inverse matrix
X_inv = np.linalg.inv(X)
show_matrix(X_inv,'X_inv')

In [None]:
# X times its inverse gives the identity matrix (note the floating-point rounding)
X_X_inv = X.dot(X_inv)
show_matrix(X_X_inv,'X_X_inv')

However, not all matrices have an inverse:

In [None]:
A = np.arange(9).reshape(3,3)
show_matrix(A,'A')
inv_A = np.linalg.inv(A)
show_matrix(inv_A)
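
One way to handle matrices without an inverse is to catch the error NumPy raises. The sketch below uses a hypothetical helper `safe_inverse` and two made-up 2x2 matrices. Keep in mind that, because of floating-point rounding, a nearly singular matrix may not raise an error and can instead return very large, unreliable values.

```python
import numpy as np

def safe_inverse(matrix):
    """Return the inverse of a square matrix, or None if NumPy reports it as singular."""
    try:
        return np.linalg.inv(matrix)
    except np.linalg.LinAlgError:
        return None

# Second row is exactly 2x the first row, so the rows are not independent
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
invertible = np.array([[2.0, 0.0],
                       [0.0, 4.0]])

# A square matrix is invertible only if its rank equals its number of rows
rank_singular = np.linalg.matrix_rank(singular)
print(rank_singular)

print(safe_inverse(singular))
print(safe_inverse(invertible))
```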

# Solving Systems of Equations

Solving a system of equations can take a lot of work:

$$ x - 2y + 3z = 9 $$
$$ 2x - 5y + 10z = 4 $$
$$ 0x + 0y + 6z = 0 $$

- **But we can make it easier by writing it in matrix form**

    - The x-coefficients for the 3 equations are [1, 2, 0]
    - The y-coefficients for the 3 equations are [-2, -5, 0]
    - The z-coefficients for the 3 equations are [3, 10, 6]
    - The outcomes for the 3 equations are [9, 4, 0]


- Below, each row of the matrix (A) contains the x, y, and z coefficients of one equation.
- The dot product of matrix A and the vector [x, y, z] will produce the outcomes [9, 4, 0]. 
$$ 
\begin{pmatrix} 
    1 & -2 & 3 \\
    2 & -5 & 10 \\
    0 & 0 & 6
\end{pmatrix}
\cdot
\begin{pmatrix} 
    x \\
    y \\
    z
\end{pmatrix}
=
\begin{pmatrix} 
    9 \\
    4 \\
    0
\end{pmatrix}
$$


<!--- We can think of this in the abstract:
$$ A \cdot X = B $$
$$ A^{-1} \cdot A \cdot X = A^{-1} \cdot B $$
$$ I \cdot X = A^{-1} \cdot B $$
$$ X = A^{-1} \cdot B $$ --->

## Using NumPy

In [None]:
# Define the system's matrices
# eqn = [x,y,z]
eqn1 = np.array([1,-2,3])
eqn2 = np.array([2,-5,10])
eqn3 = np.array([0,0,6])

# Stack the three equations as the rows of A
A = np.stack([eqn1, eqn2, eqn3])
show_matrix(A,'A')

In [None]:
B = np.array([9,4,0]).reshape(3,1)
show_matrix(B,'B')

In [None]:
# Find the inverse
A_inv = np.linalg.inv(A)
show_matrix(A_inv,'A_inv')

In [None]:
# Solutions:
solution = A_inv @ B
show_matrix(solution,'solution')
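
As an alternative sketch, `np.linalg.solve` solves the same system directly without explicitly forming the inverse, which is generally the preferred approach numerically. The names `A_sys`, `b_sys`, and `xyz` are introduced here only to avoid overwriting the variables above.

```python
import numpy as np

# Same coefficients and outcomes as the system above
A_sys = np.array([[1, -2, 3],
                  [2, -5, 10],
                  [0, 0, 6]])
b_sys = np.array([9, 4, 0])

# Solve A_sys @ [x, y, z] = b_sys
xyz = np.linalg.solve(A_sys, b_sys)
print(xyz)

# Check: plugging the solution back in should reproduce the outcomes
print(np.allclose(A_sys @ xyz, b_sys))
```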

# Activity 1: Solving Systems of Linear Equations with NumPy - Lab

> ## Exercise 1
A coffee shop is having a sale on coffee and tea. 
- On day 1, 29 bags of coffee and 41 bags of tea were sold, for a total of 490 dollars.
- On day 2, they sold 23 bags of coffee and 41 bags of tea, for which customers paid a total of 448 dollars.  
- How much does each bag cost?

In [None]:
# Create and solve the relevant system of equations


In [None]:
## solve with numpy.linalg.solve


In [None]:
# ## long way
# A_inv = np.linalg.inv(A)
# X = A_inv.dot(B.T)
# show_matrix(X)

> ## Exercise 3
You want to make a soup containing tomatoes, carrots, and onions.
- Suppose you don't know the exact mix to put in, but you know there are 7 individual pieces of vegetables, 
- and there are twice as many tomatoes as onions, and that the 7 pieces of vegetables cost 5.25 USD in total. 
- You also know that onions cost 0.5 USD each, tomatoes cost 0.75 USD and carrots cost 1.25 USD each.
- Create a system of equations to find out exactly how many of each of the vegetables are in your soup.

> ## Exercise 4
A landlord owns 3 properties: a 1-bedroom, a 2-bedroom, and a 3-bedroom house.
- The total rent he receives is 1240 USD.
- He needs to make some repairs. The repairs cost 10% of the 1-bedroom house's rent, 20% of the 2-bedroom house's rent, and 30% of the 3-bedroom house's rent.
- The total repair bill for all three houses was 276 USD.
- The 3-bedroom house's rent is twice the 1-bedroom house’s rent.
<br>
How much is the individual rent for each of the three houses?

In [None]:
## write out eqn
#

In [None]:
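# Outcomes for the three equations: total rent, total repair bill, rent constraint
N = np.array([1240, 276, 0])
N

In [None]:
# NOTE: M must be the 3x3 coefficient matrix from the equations you wrote out above;
# the 4x2 M defined earlier in this notebook will not work here.
np.linalg.solve(M, N.T)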

# Linear Regression with OLS

> We'll find that this is actually computationally expensive for large systems 😭

### Ordinary Least Squares

Ordinary least squares tells us that our linear regression equation can be represented as the sum of a linear term and an error term: 

$$y = X\beta + error$$

To solve for the best estimate of $\beta$, we are going to assume that on average, the error is equal to 0, thus:  

$$ y = X \beta $$


To solve for $\beta$,  we need to make $X$ into a square matrix by multiplying both sides of the equation from the left by $X^T$ : 

$$X^T y = X^T X \beta $$


Now we have a square matrix that with any luck has an inverse, which we will call $(X^T X)^{-1}$. 

Multiply both sides from the left by this inverse, and we have

$$(X^T X)^{-1} X^T y =(X^T X)^{-1} X^T X\beta $$

It turns out that a matrix multiplied by its inverse is the identity matrix, so $(X^T X)^{-1} X^T X = I$:

$$(X^T X)^{-1} X^T y =I \beta $$


You know that $I\beta= \beta$. 

So, if you want to solve for $\beta$ (which, remember, is equivalent to finding the slope $m$ and intercept $b$ in this case), you find that:

$$ \beta = (X^T X)^{-1} X^T y $$

Find $\beta$ using Ordinary Least Squares.
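
A minimal sketch of this formula in NumPy, assuming hypothetical noise-free data generated from $y = 2x + 1$ (the data and the names `x_demo`, `y_demo`, and `X_demo` are made up for illustration). The design matrix gets a column of ones so that $\beta$ contains both the intercept $b$ and the slope $m$.

```python
import numpy as np

# Hypothetical data that follows y = 2x + 1 exactly (no noise)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2 * x_demo + 1

# Design matrix: a column of ones (intercept b) next to the x values (slope m)
X_demo = np.column_stack([np.ones_like(x_demo), x_demo])

# Normal equation: beta = (X^T X)^{-1} X^T y
beta = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo
b, m = beta
print(f"intercept b = {b:.2f}, slope m = {m:.2f}")

# Cross-check against NumPy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)
print(np.allclose(beta, beta_lstsq))
```

Explicitly inverting $X^T X$ mirrors the derivation, but for larger or poorly conditioned problems a solver such as `np.linalg.lstsq` is usually the safer choice.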

#### Resources for Linear Algebra behind OLS

- [The Linear Algebra of Least Squares Regression](https://medium.com/@andrew.chamberlain/the-linear-algebra-view-of-least-squares-regression-f67044b7f39b)
    - You must first understand [vector projection](https://web.archive.org/web/20131210220337/http://en.wikipedia.org/wiki/Vector_resolute) to follow their explanation.

> Jump to `codealong-linear-algebra-regression.ipynb` for question