# 1. Introduction

This notebook includes a review of linear algebra to refer back to later.

# More on linear algebra
## Instructions:
* Go through the notebook and complete the tasks. 
* Make sure you understand the examples given. If you need help, refer to the Essential readings or the documentation link provided, or go to the Topic 2 discussion forum. 
* When a question allows a free-form answer (e.g. what do you observe?), create a new markdown cell below and answer the question in the notebook. 
* Save your notebooks when you are done.
 
**Task 1:**
Go through the tutorial on Linear Algebra in Python here: http://ml-cheatsheet.readthedocs.io/en/latest/linear_algebra.html.
Paste and run the code from the tutorial in the empty cell below (or create new empty cells below) to get more familiar with the material.


In [2]:
import numpy as np

## Vectors

Vectors are 1-dimensional arrays of numbers or terms. In geometry, vectors store the magnitude and direction of a potential change to a point. The vector [3, -2] says go right 3 and down 2. A 2-dimensional array of numbers is called a matrix.

### Scalar operations

Scalar operations involve a vector and a number. The number is added to, subtracted from, or multiplied with every value in the vector, producing a new vector of the same length.


In [11]:
y = np.array([1,2,3])
x = 1
y + x

array([2, 3, 4])

In [12]:
y - x

array([0, 1, 2])
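It can help to check whether a scalar operation changes the original array. The sketch below assumes nothing beyond NumPy; the names `v`, `shifted` and `w` are illustrative and not part of the tutorial. An expression such as `v + 1` builds a new array, while an augmented assignment such as `w += 1` updates the existing one.

```python
import numpy as np

v = np.array([1, 2, 3])

# v + 1 returns a new array; v itself is left unchanged
shifted = v + 1

# Augmented assignment updates the existing array in place
w = v.copy()
w += 1

print(v, shifted, w)
```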

### Elementwise operations

In elementwise operations like addition, subtraction, and division, values that correspond positionally are combined to produce a new vector. The 1st value in vector A is paired with the 1st value in vector B. The 2nd value is paired with the 2nd, and so on. This means the vectors must have equal dimensions to complete the operation.

In [14]:
y = np.array([1,2,3])
x = np.array([4,5,6])
y + x

array([5, 7, 9])

In [15]:
y - x

array([-3, -3, -3])

In [16]:
y / x

array([0.25, 0.4 , 0.5 ])
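The equal-dimensions requirement can be checked with a small experiment. This is a minimal sketch with illustrative names (`p`, `q`, `mismatch_raised`): adding a length-3 vector to a length-2 vector is expected to fail with a `ValueError`, because the values cannot be paired up positionally.

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([4, 5])

# Lengths 3 and 2 cannot be paired element by element
try:
    p + q
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
```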

### Dot product

The dot product of two vectors is a scalar. Dot product of vectors and matrices (matrix multiplication) is one of the most important operations in deep learning.

$$
\begin{bmatrix}
a_1 \\
a_2 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
b_1 \\
b_2 \\
\end{bmatrix}
= a_1 b_1 + a_2 b_2
$$

In [22]:
y = np.array([1,2,3])
x = np.array([2,3,4])
np.dot(y,x)

20
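To connect the formula with the NumPy call, the sketch below computes the same dot product three ways: with a plain Python loop over paired values, by summing an elementwise product, and with `np.dot`. The variable names are illustrative.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([2, 3, 4])

# Multiply corresponding entries and add them up, as in the formula
manual = sum(ui * vi for ui, vi in zip(u, v))

# Same idea using NumPy's elementwise product followed by a sum
via_sum = np.sum(u * v)

# Built-in dot product
via_dot = np.dot(u, v)

print(manual, via_sum, via_dot)
```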

### Hadamard product

Hadamard Product is elementwise multiplication and it outputs a vector.

$$
\begin{split}\begin{bmatrix}
a_1 \\
a_2 \\
\end{bmatrix}
 \odot
\begin{bmatrix}
b_1 \\
b_2 \\
\end{bmatrix}
=
\begin{bmatrix}
a_1 \cdot b_1 \\
a_2 \cdot b_2 \\
\end{bmatrix}\end{split}
$$

In [24]:
y = np.array([1,2,3])
x = np.array([4,5,6])
y * x

array([ 4, 10, 18])

### Vector fields

A vector field shows how far the point (x,y) would hypothetically move if we applied a vector function to it like addition or multiplication. Given a point in space, a vector field shows the power and direction of our proposed change at a variety of points in a graph.

This vector field moves in different directions depending on the starting point. The reason is that the vector stores terms like $2x$ or $x^2$ instead of scalar values like -2 and 5. For each point on the graph, we plug the x-coordinate into $2x$ or $x^2$ and draw an arrow from the starting point to the new location. Vector fields are extremely useful for visualizing machine learning techniques like Gradient Descent.
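As a rough illustration, the sketch below evaluates a vector field on a small grid. It assumes the field attaches the vector $[2x, x^2]$ to each point $(x, y)$; the grid range and the variable names are arbitrary choices, and the figure from the original tutorial is not reproduced here.

```python
import numpy as np

# Grid of starting points; the range -2..2 is an arbitrary choice
xs, ys = np.meshgrid(np.arange(-2, 3), np.arange(-2, 3))

# Vector attached to each point: both components depend only on x
dx = 2 * xs
dy = xs ** 2

# End point of each arrow = starting point + vector
end_x = xs + dx
end_y = ys + dy

# Example: the point (1, -2) gets the vector (2, 1) and lands on (3, -1)
print(dx[0, 3], dy[0, 3], end_x[0, 3], end_y[0, 3])
```

If matplotlib is available, `plt.quiver(xs, ys, dx, dy)` is one way to draw the arrows.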

## Matrices

A matrix is a rectangular grid of numbers or terms (like an Excel spreadsheet) with special rules for addition, subtraction, and multiplication.

### Dimensions

We describe the dimensions of a matrix in terms of rows by columns.

$$
\begin{split}\begin{bmatrix}
2 & 4 \\
5 & -7 \\
12 & 5 \\
\end{bmatrix}
\begin{bmatrix}
a^2 & 2a & 8\\
18 & 7a-4 & 10\\
\end{bmatrix}\end{split}
$$

The first has dimensions (3,2). The second (2,3).

In [31]:
a = np.array([
    [1,2,3],
    [4,5,6]
])
a.shape

(2, 3)

In [34]:
b = np.array([
    [4,5,6]
])
b.shape

(1, 3)

In [36]:
c = np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9],
])
c.shape

(3, 3)

### Scalar operations

Scalar operations with matrices work the same way as they do for vectors. Simply apply the scalar to every element in the matrix (add, subtract, divide, etc).

In [39]:
a = np.array([
    [1,2,3],
    [4,5,6]
])
a + 1

array([[2, 3, 4],
       [5, 6, 7]])

### Elementwise operations

In order to add, subtract, or divide two matrices they must have equal dimensions. We combine corresponding values in an elementwise fashion to produce a new matrix.

$$
\begin{split}\begin{bmatrix}
a & b \\
c & d \\
\end{bmatrix}
+
\begin{bmatrix}
1 & 2\\
3 & 4 \\
\end{bmatrix}
=
\begin{bmatrix}
a+1 & b+2\\
c+3 & d+4 \\
\end{bmatrix}\end{split}
$$

In [58]:
a = np.array([
    [1,2],
    [3,4]
])

In [59]:
b = np.array([
    [5,6],
    [7,8]
])

In [60]:
a + b

array([[ 6,  8],
       [10, 12]])

In [61]:
a - b

array([[-4, -4],
       [-4, -4]])

### Hadamard product

Hadamard product of matrices is an elementwise operation. Values that correspond positionally are multiplied to produce a new matrix.

$$
\begin{split}\begin{bmatrix}
a_1 & a_2 \\
a_3 & a_4 \\
\end{bmatrix}
\odot
\begin{bmatrix}
b_1 & b_2 \\
b_3 & b_4 \\
\end{bmatrix}
=
\begin{bmatrix}
a_1 \cdot b_1 & a_2 \cdot b_2 \\
a_3 \cdot b_3 & a_4 \cdot b_4 \\
\end{bmatrix}\end{split}
$$

In [47]:
a = np.array([
    [1,2],
    [1,2]
])

In [50]:
b = np.array([
    [3,4],
    [5,6]
])

In [63]:
# Uses Python's multiply operator

a * b

array([[ 3,  8],
       [ 5, 12]])

In NumPy you can take the Hadamard product of a matrix and a vector as long as their dimensions meet the requirements of broadcasting.

### NumPy and broadcasting

In NumPy, when you perform element-wise operations like addition, subtraction, multiplication, or division on arrays (matrices and vectors), NumPy automatically broadcasts the arrays to perform the operation if their dimensions are compatible.

For example, if you have a matrix of shape (3, 2) and a vector of shape (2,), NumPy will automatically broadcast the vector to match the shape of the matrix and perform element-wise multiplication.

In [53]:
import numpy as np

# Define a matrix
matrix = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])

# Define a vector
vector = np.array([2, 3])

# Perform element-wise multiplication
result = matrix * vector

print(result)

[[ 2  6]
 [ 6 12]
 [10 18]]
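To make the implicit step visible, the sketch below stretches the vector to the matrix shape explicitly with `np.broadcast_to` and confirms the product is the same. It also tries a length-3 vector, which does not line up with the matrix's trailing dimension of 2 and is expected to raise a `ValueError`. Variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])        # shape (3, 2)
vector = np.array([2, 3])          # shape (2,)

# Explicit version of what broadcasting does implicitly:
# the vector is repeated along the rows to reach shape (3, 2)
stretched = np.broadcast_to(vector, matrix.shape)
same = np.array_equal(matrix * vector, matrix * stretched)

# A length-3 vector is not compatible with a trailing dimension of 2
try:
    matrix * np.array([1, 2, 3])
    incompatible = False
except ValueError:
    incompatible = True

print(stretched.shape, same, incompatible)
```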


### Matrix transpose

Neural networks frequently process weights and inputs of different sizes where the dimensions do not meet the requirements of matrix multiplication. Matrix transposition (often denoted by a superscript ‘T’, e.g. $M^T$) provides a way to “rotate” one of the matrices so that the operation complies with multiplication requirements and can continue. Transposing swaps rows and columns: the entry in row i, column j moves to row j, column i. There are two steps to transpose a matrix:

- Rotate the matrix right 90°
- Reverse the order of elements in each row (e.g. [a b c] becomes [c b a])

As an example, transpose matrix M into T:

$$
\begin{split}\begin{bmatrix}
a & b \\
c & d \\
e & f \\
\end{bmatrix}
\quad \Rightarrow \quad
\begin{bmatrix}
a & c & e \\
b & d & f \\
\end{bmatrix}\end{split}
$$

In [55]:
a = np.array([
    [1,2,3],
    [4,5,6]
])
a.T

array([[1, 4],
       [2, 5],
       [3, 6]])
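The two-step recipe (rotate right, then reverse each row) can be checked against the built-in transpose. This is a sketch using `np.rot90`, where `k=-1` requests one clockwise quarter turn; the names `m`, `rotated` and `two_step` are illustrative.

```python
import numpy as np

m = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
])

# Step 1: rotate the matrix right (clockwise) by 90 degrees
rotated = np.rot90(m, k=-1)

# Step 2: reverse the order of elements in each row
two_step = rotated[:, ::-1]

# The built-in transpose should give the same result
matches = np.array_equal(two_step, m.T)

print(rotated)
print(two_step)
print(matches)
```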

### Matrix multiplication

Matrix multiplication specifies a set of rules for multiplying matrices together to produce a new matrix. Not all matrices are eligible for multiplication. In addition, there is a requirement on the dimensions of the resulting matrix output.

- The number of columns of the 1st matrix must equal the number of rows of the 2nd
- The product of an M x N matrix and an N x K matrix is an M x K matrix. The new matrix takes the rows of the 1st and columns of the 2nd
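The two rules above can be verified on shapes alone. The sketch below uses placeholder matrices filled with ones (the names `left` and `right` are illustrative): a (3, 2) matrix times a (2, 4) matrix gives a (3, 4) result, while reversing the order is expected to raise a `ValueError` because 4 columns do not match 3 rows.

```python
import numpy as np

left = np.ones((3, 2))    # M x N with M=3, N=2
right = np.ones((2, 4))   # N x K with N=2, K=4

# Inner dimensions match (2 == 2), so the product is M x K
product = np.dot(left, right)

# Reversed order: (2, 4) times (3, 2) has mismatched inner dimensions
try:
    np.dot(right, left)
    order_error = False
except ValueError:
    order_error = True

print(product.shape, order_error)
```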

#### Steps

Matrix multiplication relies on dot product to multiply various combinations of rows and columns. In the image below (taken from Khan Academy’s linear algebra course) each entry in Matrix C is the dot product of a row in matrix A and a column in matrix B.

![Image](matrix_prod)

The operation a1 · b1 means we take the dot product of the 1st row in matrix A (1, 7) and the 1st column in matrix B (3, 5).

$$
\begin{split}a_1 \cdot b_1 =
\begin{bmatrix}
1 \\
7 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
3 \\
5 \\
\end{bmatrix}
= (1 \cdot 3) + (7 \cdot 5) = 38\end{split}
$$

In [70]:
a = np.array([
    [1,7],
    [2,4]
])

b = np.array([
    [3,3],
    [5,2]
])

In [72]:
np.dot(a,b)

array([[38, 17],
       [26, 14]])

Another way to understand it is as follows:

$$
\begin{split}\begin{bmatrix}
a & b \\
c & d \\
e & f \\
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 2 \\
3 & 4 \\
\end{bmatrix}
=
\begin{bmatrix}
1a + 3b & 2a + 4b \\
1c + 3d & 2c + 4d \\
1e + 3f & 2e + 4f \\
\end{bmatrix}\end{split}
$$
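The row-by-column rule described in this section can also be spelled out with explicit loops. The helper `matmul_by_dot` below is hypothetical, written only for illustration; in practice `np.dot` does this work. Each entry of the result is the dot product of one row of the first matrix and one column of the second.

```python
import numpy as np

def matmul_by_dot(A, B):
    """Illustrative helper: build the product entry by entry from dot products."""
    rows, inner = A.shape
    inner_b, cols = B.shape
    if inner != inner_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((rows, cols), dtype=np.result_type(A, B))
    for i in range(rows):
        for j in range(cols):
            # Entry (i, j) = row i of A dotted with column j of B
            C[i, j] = np.dot(A[i, :], B[:, j])
    return C

A = np.array([[1, 7], [2, 4]])
B = np.array([[3, 3], [5, 2]])

C = matmul_by_dot(A, B)
print(C)
print(np.array_equal(C, np.dot(A, B)))
```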

### Exercises

In [77]:
#1

a = np.array([
    [1,2],
    [5,6]
])

b =np.array([
    [1,2,3],
    [4,5,6]
])

np.dot(a,b)

array([[ 9, 12, 15],
       [29, 40, 51]])

In [78]:
#2

a = np.array([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
])

b = np.array([
    [1,2],
    [5,6],
    [3,0],
    [2,1]
])

np.dot(a,b)

array([[ 28,  18],
       [ 72,  54],
       [116,  90]])

In [81]:
#3

a = np.array([
    [2,3],
    [1,4]
])

b = np.array([
    [5,4],
    [3,5]
])

np.dot(a,b)

array([[19, 23],
       [17, 24]])

In [83]:
#4

a = np.array([
    [3],
    [5]
])

b = np.array([
    [1,2,3]
])

np.dot(a,b)

array([[ 3,  6,  9],
       [ 5, 10, 15]])

In [86]:
#5

a = np.array([
    [1,2,3]
])

b = np.array([
    [4],
    [5],
    [6]
])

np.dot(a,b)

array([[32]])

**Task 2:**
Create the following matrices:

$X=\begin{bmatrix} 2 & 3 & 4 \\ 1 & 2 &3 \end{bmatrix},Y=\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 &0\\ 0 & 0 & 1 \end{bmatrix}$ 

Then do the following:
1.	Multiply ```X``` with ```Y``` (```Z1=X*Y=XY```) and print the result. What do you observe? 
2.	Multiply ```X``` with ```Y``` transpose (```Z2=X*Y.T```) and print the result. What do you observe? 
3.	What can you tell about matrix ```Y```, given ```Z1``` and ```Z2```? 

In [None]:
#Code here

In [88]:
x = np.array([
    [2,3,4],
    [1,2,3]
])

y = np.array([
    [0,1,0],
    [1,0,0],
    [0,0,1]
])

z1 = np.dot(x,y)

In [89]:
z1

array([[3, 2, 4],
       [2, 1, 3]])

In [93]:
z2 = np.dot(x,y.T)

In [94]:
z2

array([[3, 2, 4],
       [2, 1, 3]])

In [98]:
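# Equal products are consistent with symmetry; the direct test compares y with its transpose
if np.array_equal(z1,z2) and np.array_equal(y, y.T):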
    print("Matrix y is symmetric.")
else:
    print("Matrix y is not symmetric.")

Matrix y is symmetric.


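```Z1``` and ```Z2``` are identical, and both are ```X``` with its first two columns swapped. Matrix ```Y``` is therefore a permutation matrix, and it is symmetric: it is equal to its transpose.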
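These properties of `Y` can be tested directly rather than inferred from the products. The sketch below is one possible check, with illustrative variable names: it compares `Y` with its transpose, compares `XY` with a column-swapped copy of `X`, and confirms that applying the swap twice restores the identity.

```python
import numpy as np

X = np.array([[2, 3, 4],
              [1, 2, 3]])
Y = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])

# Symmetric: Y equals its transpose
is_symmetric = np.array_equal(Y, Y.T)

# Multiplying by Y on the right swaps the first two columns of X
swaps_columns = np.array_equal(np.dot(X, Y), X[:, [1, 0, 2]])

# Swapping twice undoes the swap, so Y times Y is the identity
is_own_inverse = np.array_equal(np.dot(Y, Y), np.eye(3))

print(is_symmetric, swaps_columns, is_own_inverse)
```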

**Task 3:** 
Create a 2x2 identity matrix ```I``` using the NumPy function ```eye(dim)``` $$ I=\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$


In [None]:
#Code here

In [115]:
I = np.eye(2)

**Task 4:** 
1. Multiply (using dot) matrix ```I``` with the vector ```x``` defined below: $$Ix$$
2. Print the output.
3. What do you observe? 



In [116]:
x=np.array([2,3])
print(x)
#Code here

[2 3]


In [117]:
np.dot(I,x)

array([2., 3.])

The vector is unchanged, apart from being displayed as floats because ```np.eye``` returns a floating-point array. This is because identity matrices behave in a similar way to the number 1 in scalar multiplication.

**Task 5:**
1.	Multiply (using dot) matrix ```I``` with the matrix ```X``` defined below: $$IX$$
2.	Print the output.
3.	What do you observe? 


In [118]:
X=np.array([[2,3],[4,5]])
print(X)
#Code here

[[2 3]
 [4 5]]


In [120]:
np.dot(I,X)

array([[2., 3.],
       [4., 5.]])

As in Task 4, multiplying by the identity matrix leaves ```X``` unchanged.

**Task 6:**
1. Multiply (using dot) matrix $X$ with the inverse matrix, $X^{-1}$ (use the NumPy function ```linalg.inv``` to do so): $XX^{-1}$
2. What do you observe? 

In [125]:
X=np.array([[1,2],[0,1]])
#Code here

inv = np.linalg.inv(X)

In [126]:
np.dot(X, inv)

array([[1., 0.],
       [0., 1.]])

When multiplying a matrix by its inverse, you get the identity matrix as the result. This property is fundamental to the concept of matrix inverses.
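Two practical points are worth a quick check. With floating-point arithmetic the product may differ from the identity by tiny rounding errors, so `np.allclose` is a safer comparison than exact equality. Also, not every matrix has an inverse: the second matrix below is an illustrative singular example (its second row is twice its first), and `np.linalg.inv` is expected to raise `np.linalg.LinAlgError` for it.

```python
import numpy as np

X = np.array([[1, 2],
              [0, 1]])
X_inv = np.linalg.inv(X)

# Compare with the identity up to floating-point tolerance
is_identity = np.allclose(np.dot(X, X_inv), np.eye(2))

# A singular matrix (determinant 0) has no inverse
singular = np.array([[1, 2],
                     [2, 4]])
try:
    np.linalg.inv(singular)
    inverse_failed = False
except np.linalg.LinAlgError:
    inverse_failed = True

print(is_identity, inverse_failed)
```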