# 2.2.0 Linear Algebra: An Introduction

### Learning Objectives:
- [Matrix Introduction](#Matrix-Introduction)
- [Matrix Addition & Subtraction](#Matrix-Addition-&-Subtraction)
- [Matrix Transpose](#Matrix-Transpose)
- [Matrix Multiplication](#Matrix-Multiplication)
- [Linear Transformations](#Linear-Transformations)

# Matrix Introduction

Now that we have covered vectors, we can extend this idea to the concept of a __matrix__. Matrices are rectangular ordered arrays of numbers, symbols or expressions arranged in rows and columns. Their components are also referred to as __entries__. They are considered ordered because every entry position holds a different meaning. It is often useful to treat a matrix as a sequence of __column vectors__ or __row vectors__, depending on the operation. In fact, given our definition, individual vectors can also be treated as matrices! For example, consider the three matrices below:

$$ A = \begin{bmatrix} 1 \\ 2  \end{bmatrix} \text{, }B = \begin{bmatrix} 1 & 2 & 3\\ 3 & 2 & 1 \end{bmatrix} \text{, } C =\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix}$$

We denote the dimensions of a matrix by first its number of rows, then its number of columns. In this case, A is a 2x1 matrix, B is a 2x3 matrix and C is a 3x2 matrix. The matrix A is also a column vector, and the matrix B can also be conceptualized as three 2-D column vectors or even two 3-D row vectors! If we let $M, N$ be two integers, where $M, N \geq 1$, and $a_{ij}, i=1,...,M, j=1,...,N$ are $MN$ numbers, we can write the general representation of a matrix as follows:

$$ 
A = \begin{bmatrix} a_{11} & ... & a_{1j} & ... & a_{1N} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{i1} & ... & a_{ij} & ... & a_{iN} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{M1} & ... & a_{Mj} & ... & a_{MN} 
\end{bmatrix}
$$

Matrices are a meaningful way of representing data. For instance, in the previous two notebooks, we looked at a small dataset of animals containing information on their cuteness, size and ferocity. All these vectors can be represented as a single matrix! Generally, moving along rows holds a specific meaning, which is different from moving along columns. 

For example, suppose we are gathering data on a population and record the height, weight and sex of each person in a sample of 8 people. A simple matrix representation places one person in each row and one feature in each column: moving along a row takes us from one feature to the next, and moving down a column takes us from one person to the next, as shown in the matrix below. Once again, we can treat the matrix as a list of vectors, where each row vector contains the height, weight and sex of one person.

$$ 
A = \begin{bmatrix}   169 & 62 & F\\
                       \vdots & \vdots & \vdots \\
                         185 & 78 & M 
\end{bmatrix}
$$

Another reason matrices are useful is that they simplify the representation of large operations. Rather than carrying out individual operations on different vectors or even elements, we can carry out one operation on an entire matrix. This is especially useful when we have to process large volumes of data.

To represent matrices in standard Python, we can use __nested lists__, which are lists of lists. With NumPy, we can create multi-dimensional NumPy arrays to represent matrices. Using the _np.array(  )_ function, we can create a NumPy array with a nested list as an argument. Multi-dimensional NumPy arrays are indexed differently from nested lists: a nested list needs one pair of brackets per level (B[0][2]), whereas a NumPy array accepts a single pair of brackets with comma-separated indices (B[0, 2]), as shown in the example below. This difference will be further elaborated upon in Chapter 3:

In [1]:
import numpy as np

# Standard Python
B = [[1, 2, 3], [3, 2, 1]]
print("Standard Python:")
print(B, type(B))
print(B[0][2], B[1][1])
print()

# NumPy
B = np.array(B)
print("NumPy:")
print(B, type(B))
print(B[0, 2], B[1, 1])
print()

Standard Python:
[[1, 2, 3], [3, 2, 1]] <class 'list'>
3 2

NumPy:
[[1 2 3]
 [3 2 1]] <class 'numpy.ndarray'>
3 2
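
As a small complement to the cell above, the sketch below reads the dimensions of B (rows first, then columns) from the array's `shape` attribute and pulls out one row vector and one column vector with slicing. It reuses the matrix B from the text; nothing else is assumed.

```python
import numpy as np

B = np.array([[1, 2, 3], [3, 2, 1]])

# shape is a tuple (number of rows, number of columns)
print(B.shape)   # (2, 3)

# A colon selects everything along that axis
print(B[0, :])   # first row vector: [1 2 3]
print(B[:, 1])   # second column vector: [2 2]
```

With a nested list, the row is available as `B[0]`, but there is no equally direct way to take a column.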



# Matrix Addition & Subtraction
Since a matrix can be treated as a list of vectors, the intuition from vector addition carries over to __matrix addition__ and __matrix subtraction__. Given two matrices with the same number of rows and columns, we simply add or subtract the entries in the same position. Two examples are shown below:
$$ \text{(1)}\begin{bmatrix} 1 & 2\\ 3 & 2 \end{bmatrix} + \begin{bmatrix} 3 & 1\\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 3\\ 4 & 3 \end{bmatrix}$$

$$ \text{(2)}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix} - \begin{bmatrix} 3 & 1\\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}$$


These operations are also shown below in NumPy:

In [2]:
# Defining our matrices 
A = np.array([[1,2],[3,2]])
B = np.array([[3,1],[1,1]])
C = np.array([[4,2],[2,2]])

print("NumPy:")
print("Addition:", A + B)
print("Subtraction:", C - B)
print()

NumPy:
Addition: [[4 3]
 [4 3]]
Subtraction: [[1 1]
 [1 1]]
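
For comparison with the NumPy version, here is a minimal sketch of the same element-wise rule in standard Python. The function name `matrix_add` and its `sign` parameter are illustrative choices, not something defined elsewhere in this notebook; the check at the top enforces the requirement that both matrices have the same number of rows and columns.

```python
def matrix_add(mat1, mat2, sign=1):
    """Return mat1 + sign * mat2, entry by entry, for equally shaped nested lists."""
    same_rows = len(mat1) == len(mat2)
    same_cols = same_rows and all(len(r1) == len(r2) for r1, r2 in zip(mat1, mat2))
    if not same_cols:
        raise ValueError("matrices must have the same number of rows and columns")
    # Pair up rows, then pair up entries within each row
    return [[a + sign * b for a, b in zip(r1, r2)] for r1, r2 in zip(mat1, mat2)]

A = [[1, 2], [3, 2]]
B = [[3, 1], [1, 1]]
C = [[4, 2], [2, 2]]

print("Addition:", matrix_add(A, B))              # [[4, 3], [4, 3]]
print("Subtraction:", matrix_add(C, B, sign=-1))  # [[1, 1], [1, 1]]
```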



# Matrix Transpose
One of the simplest, yet most useful, matrix operations is the __matrix transpose__. To transpose a matrix, we simply turn its rows into columns and its columns into rows, which is equivalent to flipping it about its __main diagonal__ (top-left to bottom-right). As a result, an MxN matrix (M rows and N columns) becomes an NxM matrix. Some examples of a matrix transpose are shown below:
$$A =\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix}, \;\;\;\;\;\;   A^{T} = A' = 
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{bmatrix}
$$

$$B =\begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix}, \;\;\;\;\;\;   B^{T} = B' = 
\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}
$$

We will expand on the importance of the matrix transpose once we investigate other operations. Let us code up our own function to return the transpose of a matrix, and then compare it with the result obtained with NumPy:

In [8]:
# Defining matrices
A = [[1,2],[2,3],[3,4]]
B = [[2,2],[1,1]]

# Function that computes the transpose
def transpose(mat):
    mat_transpose = [] # initialise transposed matrix
    for idx2 in range(len(mat[0])): # iterate through columns
        new_row = []
        for idx1 in range(len(mat)): # iterate through rows
            new_row.append(mat[idx1][idx2])
        mat_transpose.append(new_row)
    return mat_transpose

print("Original")
print("A =", A)
print("B =", B)
print()

print("Standard Python Transpose")
print("A' =", transpose(A))
print("B' =", transpose(B))
print()

A = np.array(A)
B = np.array(B)

print("NumPy Transpose:")
print("A' =", A.T)
print("B' =", np.transpose(B))

Original
A = [[1, 2], [2, 3], [3, 4]]
B = [[2, 2], [1, 1]]

Standard Python Transpose
A' = [[1, 2, 3], [2, 3, 4]]
B' = [[2, 1], [2, 1]]

NumPy Transpose:
A' = [[1 2 3]
 [2 3 4]]
B' = [[2 1]
 [2 1]]
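
As an aside, the explicit double loop can also be written with Python's built-in `zip`, and NumPy makes it easy to confirm two facts stated above: the shape flips from MxN to NxM, and transposing twice returns the original matrix. This is only a sketch of an alternative; the variable names `A_t` and `A_np` are illustrative.

```python
import numpy as np

A = [[1, 2], [2, 3], [3, 4]]

# zip(*A) groups the i-th entry of every row, i.e. it yields the columns of A
A_t = [list(col) for col in zip(*A)]
print(A_t)  # [[1, 2, 3], [2, 3, 4]]

A_np = np.array(A)
print(A_np.shape, A_np.T.shape)         # (3, 2) (2, 3)
print(np.array_equal(A_np.T.T, A_np))   # True
```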


# Matrix Multiplication
Just as we have seen vector multiplication, we can also carry out __matrix multiplication__. In this operation, order is important, and the second matrix must have the same number of rows as the first matrix has columns. Given an MxN matrix A and an NxK matrix B, their __matrix product__, $C = A\times B$, will be an MxK matrix. This means it will have as many rows as the first matrix, and as many columns as the second matrix. It is crucial to remember that __for matrix multiplication, the first matrix must have as many columns as the second matrix has rows__. Given the general definition of A and B below, their matrix product is given as follows:

$$ 
A = \begin{bmatrix} a_{11} & ... & a_{1j} & ... & a_{1N} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{i1} & ... & a_{ij} & ... & a_{iN}\\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{M1} & ... & a_{Mj} & ... & a_{MN} 
\end{bmatrix}, \;
B = \begin{bmatrix} b_{11} & ... & b_{1j} & ... & b_{1K} \\
                        \vdots &    & \vdots &     & \vdots \\
                        b_{i1} & ... & b_{ij} & ... & b_{iK} \\
                        \vdots &    & \vdots &     & \vdots \\
                        b_{N1} & ... & b_{Nj} & ... & b_{NK} 
\end{bmatrix}
$$

$$ 
C = A\times B = \begin{bmatrix} c_{11} & ... & c_{1j} & ... & c_{1K} \\
                        \vdots &    & \vdots &     & \vdots \\
                        c_{i1} & ... & c_{ij} & ... & c_{iK}\\
                        \vdots &    & \vdots &     & \vdots \\
                        c_{M1} & ... & c_{Mj} & ... & c_{MK} 
\end{bmatrix}
$$

Where:

$$c_{ij} = \sum_{k=1}^{N}a_{ik}b_{kj}$$

This general definition can be difficult to understand, so to simplify this process, we will treat matrices as lists of column and row vectors. In this case, the entry of the matrix product, $c_{ij}$ is simply __the inner product of the__ $\mathbf{i^{th}}$ __row of the first matrix, A, and the__ $\mathbf{j^{th}}$ __column of the second matrix, B__. Two examples of the matrix product are shown below:

$$ 
\begin{bmatrix} 1 & 2 \\ 3 & 4  \end{bmatrix} \times \begin{bmatrix} 5\\ 6 \end{bmatrix} = 
\begin{bmatrix} 
\begin{bmatrix}1 & 2 \end{bmatrix}\cdot \begin{bmatrix}5 \\ 6 \end{bmatrix}\\
\begin{bmatrix}3 & 4 \end{bmatrix}\cdot \begin{bmatrix}5 \\ 6 \end{bmatrix}
\end{bmatrix} 
= \begin{bmatrix} 17 \\ 39 \end{bmatrix}
$$ 

$$
\begin{bmatrix} 2 & 3 \\ 1 & 4 \\ 1 & 0 \end{bmatrix} \times \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} = 
\begin{bmatrix}
\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} &
\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
\begin{bmatrix} 1 & 4 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} & 
\begin{bmatrix} 1 & 4 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
\begin{bmatrix} 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} &
\begin{bmatrix} 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\end{bmatrix} =
\begin{bmatrix} 8 & 3 \\ 9 & 4 \\ 1 & 0 \end{bmatrix}
$$

As you have seen above, even with the help of the vector inner product, this is a time-consuming task to do by hand, and practically impossible for extremely large matrices. Given their ordered structure and the repetitive nature of this operation, we can write programs to compute matrix products.
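
To make the shape rule and the importance of order concrete, here is a short sketch using NumPy's `@` operator, which for 2-D arrays computes the same matrix product as `np.matmul`. C and D are the 3x2 and 2x2 matrices from the second example; P and Q are hypothetical 2x2 matrices chosen only to show that swapping the order changes the result.

```python
import numpy as np

C = np.array([[2, 3], [1, 4], [1, 0]])  # 3x2
D = np.array([[1, 0], [2, 1]])          # 2x2

# (3x2) times (2x2) is defined and gives a 3x2 result
print((C @ D).shape)
print(C @ D)

# (2x2) times (3x2) is not defined: 2 columns do not match 3 rows
try:
    D @ C
except ValueError as err:
    print("D @ C is not defined:", err)

# Even when both orders are defined, the results generally differ
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
print(P @ Q)  # swaps the columns of P
print(Q @ P)  # swaps the rows of P
```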

We can now even reformulate the vector inner product as a matrix product of a row vector and a column vector, as shown in the example below:

$$
\begin{bmatrix} 1 & 2  \end{bmatrix} \times \begin{bmatrix} 5\\ 6 \end{bmatrix} = 1\cdot 5 + 2 \cdot 6 = 17
$$
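
The same reformulation can be sketched in NumPy. The names `row`, `col` and `product` are illustrative. Note that the matrix product of a 1x2 and a 2x1 matrix is a 1x1 matrix, whereas `np.dot` on two 1-D vectors returns a plain scalar.

```python
import numpy as np

row = np.array([[1, 2]])     # 1x2 row vector stored as a matrix
col = np.array([[5], [6]])   # 2x1 column vector stored as a matrix

product = np.matmul(row, col)
print(product, product.shape)   # [[17]] (1, 1)

# Inner product of the same two vectors stored as 1-D arrays
print(np.dot([1, 2], [5, 6]))   # 17
```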

Let us write a function that computes the matrix product of two matrices. Since standard Python has no direct way to access a column of a nested list, we will reuse the previously written _transpose(  )_ function. Hint: first define a function to calculate the inner product of two vectors.

In [9]:
# Defining matrices
A = [[1, 2],[2, 1]]
B = [[1], [2]]
C = [[2, 3], [1, 4], [1, 0]]
D = [[1, 0], [2, 1]]

## Standard Python
def inner_product(v1, v2): # function that computes algebraic inner product of two vectors
    product = 0
    for value1, value2 in zip(v1, v2):
        product += value1*value2
    return product

def matrix_product(mat1, mat2): # function that computes the matrix product between two matrices
    result = []
    mat2 = transpose(mat2)
    for i in range(len(mat1)): # iterate over rows of first matrix
        new_row = []
        for j in range(len(mat2)): # iterate columns of the second matrix
            product = inner_product(mat1[i][:], mat2[j][:]) # inner product between row and column vectors
            new_row.append(product)
        result.append(new_row)
    return result

print("Standard Python")
print("Product of A and B:", matrix_product(A, B))
print("Product of C and D:", matrix_product(C, D))
print()


# NumPy
A = np.array(A)
B = np.array(B)
C = np.array(C)
D = np.array(D)

print("NumPy")
print("Product of A and B:", np.matmul(A, B))
print("Product of C and D:", np.matmul(C, D))

Standard Python
Product of A and B: [[5], [4]]
Product of C and D: [[8, 3], [9, 4], [1, 0]]

NumPy
Product of A and B: [[5]
 [4]]
Product of C and D: [[8 3]
 [9 4]
 [1 0]]


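A special square matrix worth knowing is the __identity matrix__, $I$, which has ones on its main diagonal and zeros everywhere else. Multiplying any matrix by an identity matrix of compatible size leaves it unchanged:

In [12]: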
# Defining our matrix
A = np.array([[1, 2], [2, 1]])

I = np.eye(2,2)
print("Original Matrix:", A)
print("2x2 identity matrix:", I)
print("Matrix product:", np.matmul(A, I))

Original Matrix: [[1 2]
 [2 1]]
2x2 identity matrix: [[1. 0.]
 [0. 1.]]
Matrix product: [[1. 2.]
 [2. 1.]]
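
The cell above multiplies A by the identity matrix from the right. As a brief sketch, the check below confirms that multiplying from the left gives the same result, and that a non-square matrix needs identity matrices of different sizes on each side. The 2x3 matrix B is reused from the start of this notebook purely for illustration.

```python
import numpy as np

A = np.array([[1, 2], [2, 1]])
I2 = np.eye(2)

print(np.array_equal(A @ I2, A))   # True
print(np.array_equal(I2 @ A, A))   # True

# For a 2x3 matrix, the left identity is 2x2 and the right identity is 3x3
B = np.array([[1, 2, 3], [3, 2, 1]])
print(np.array_equal(np.eye(2) @ B, B))   # True
print(np.array_equal(B @ np.eye(3), B))   # True
```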


# Linear Transformations
Another useful interpretation of matrices is to treat them as __linear transformations__. This means that when we multiply a matrix with one or more input vectors, we get one or more output vectors; for a square matrix, the outputs live in the same vector space as the inputs. For example, consider the matrix product below. By multiplying a vector with a matrix we get another vector, both of which are displayed in the diagram below:

$$ \begin{bmatrix}1 & 2 \\ 3 & 1 \end{bmatrix} \times \begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}3 \\4\end{bmatrix}$$

In [2]:
# Visualization Code

import plotly.graph_objects as go

x1 = [0, 1, 3]
x2 = [0, 1, 4]


fig = go.Figure(data=[go.Scatter(
    x=x1, y=x2,
    mode='markers',
    marker=dict(size=[10, 30, 30],
            color=["black","black", "orange"])
    )])

fig.update_layout(
    title="Linear Transformation",
    xaxis_title="$x_{1}$",
    yaxis_title="$x_{2}$",
)
fig.add_trace(go.Scatter(x=[0, 1], y=[0, 1],marker_color="black", name="Original Vector"))
fig.add_trace(go.Scatter(x=[0, 3], y=[0, 4],marker_color="orange", name="Transformed Vector"))
fig.update_layout(showlegend=True)
fig.show()
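
The word "linear" has a precise meaning that can be checked numerically: transforming a weighted sum of vectors gives the same result as taking the weighted sum of the transformed vectors. The sketch below uses the matrix and input vector from the example above; the second vector `v` and the scalars `a` and `b` are arbitrary values chosen for illustration.

```python
import numpy as np

M = np.array([[1, 2], [3, 1]])
u = np.array([1, 1])
v = np.array([2, -1])   # arbitrary second vector (illustrative)
a, b = 3, -2            # arbitrary scalars (illustrative)

print(M @ u)   # [3 4], the transformed vector from the plot

# Linearity: M(a*u + b*v) equals a*(M u) + b*(M v)
lhs = M @ (a * u + b * v)
rhs = a * (M @ u) + b * (M @ v)
print(lhs, rhs, np.array_equal(lhs, rhs))   # [9 2] [9 2] True
```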

This is a really powerful intuition behind matrix multiplication, and can really help in understanding more complex concepts, such as multilinear regression, as you will find in the field of data science. Let us consider once again the animals example. We can now treat this dataset as a 13x3 matrix.

<table>
  <tr>
    <th>Animal</th>
    <th>Cuteness</th>
    <th>Size</th>
    <th>Ferocity</th>
  </tr>
  <tr>
    <td>Lion</td>
    <td>80</td>
    <td>50</td>
    <td>85</td>
  </tr>
  <tr>
    <td>Elephant</td>
    <td>75</td>
    <td>95</td>
    <td>20</td>
  </tr>
  <tr>
    <td>Hyena</td>
    <td>10</td>
    <td>30</td>
    <td>90</td>
  </tr>
  <tr>
    <td>Mouse</td>
    <td>60</td>
    <td>8</td>
    <td>1</td>
  </tr>
  <tr>
    <td>Pig</td>
    <td>30</td>
    <td>30</td>
    <td>10</td>
  </tr>
  <tr>
    <td>Horse</td>
    <td>50</td>
    <td>65</td>
    <td>30</td>
  </tr>
  <tr>
    <td>Dolphin</td>
    <td>90</td>
    <td>45</td>
    <td>20</td>
  </tr>
  <tr>
    <td>Wasp</td>
    <td>2</td>
    <td>1</td>
    <td>100</td>
  </tr>
  <tr>
    <td>Giraffe</td>
    <td>60</td>
    <td>80</td>
    <td>85</td>
  </tr>
  <tr>
    <td>Dog</td>
    <td>95</td>
    <td>20</td>
    <td>15</td>
  </tr>
  <tr>
    <td>Alligator</td>
    <td>8</td>
    <td>40</td>
    <td>90</td>
  </tr>
  <tr>
    <td>Mole</td>
    <td>30</td>
    <td>12</td>
    <td>15</td>
  </tr>
  <tr>
    <td>Black Widow</td>
    <td>100</td>
    <td>30</td>
    <td>69</td>
  </tr>
</table>

Now let's say I want to compute a score for each animal representing how much I like them. The score, $y$, I give for a given animal is based on:

$$y = 0.5x_{1} - 0.05x_{2} - 0.5x_{3}$$

Where $x_{1}, x_{2}, x_{3}$ represent the animal's cuteness, size and ferocity respectively. In this case, I want to reward very cute animals, I'm almost indifferent to an animal's size (it carries only a small penalty) and I want to penalise the animal for its ferocity. But as you can probably imagine, computing the score for every single animal one at a time is quite tedious, so first we can represent each calculation as a vector inner product:

$$y = x \cdot w$$

Where $w=[0.5, -0.05, -0.5]$ and $x=[x_{1}, x_{2}, x_{3}]$. Can we extend this idea further? Yes! We can compute the likeness score of all animals in one single matrix multiplication:

In [11]:
X = np.array([[80, 50, 85], [75, 95, 20], [10, 30, 90], [60, 8, 1], [30, 30, 10], [50, 65, 30], [90, 45, 20], [2, 1, 100], [60, 80, 85], [95, 20, 15], [8, 40, 90], [30, 12, 15], [100, 30, 69]])
w = np.array([[0.5], [-0.05], [-0.5]])
Y = np.matmul(X, w).flatten()
print(Y)

[ -5.    22.75 -41.5   29.1    8.5    6.75  32.75 -49.05 -16.5   39.
 -43.     6.9   14.  ]
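
To see that the single matrix product is just the per-animal inner product $y = x \cdot w$ repeated for every row, the sketch below scores the first three animals (Lion, Elephant, Hyena) both ways. The names `X_small`, `scores_loop` and `scores_matmul` are illustrative; the comparison uses `np.allclose` because the weights are floating-point numbers.

```python
import numpy as np

# First three rows of X: Lion, Elephant, Hyena
X_small = np.array([[80, 50, 85], [75, 95, 20], [10, 30, 90]])
w = np.array([0.5, -0.05, -0.5])

# One inner product per animal
scores_loop = [float(np.dot(x, w)) for x in X_small]

# All animals at once with a single matrix-vector product
scores_matmul = X_small @ w

print(scores_loop)
print(scores_matmul)
print(np.allclose(scores_loop, scores_matmul))
```

Both approaches give approximately -5, 22.75 and -41.5, matching the first three entries of the output above.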


$$
Y = X\cdot w =
\begin{bmatrix}
80 & 50 & 85 \\
75 & 95 & 20 \\
10 & 30 & 90 \\
60 & 8 & 1 \\
30 & 30 & 10 \\
50 & 65 & 30 \\
90 & 45 & 20 \\
2 & 1 & 100 \\
60 & 80 & 85 \\
95 & 20 & 15 \\
8 & 40 & 90 \\
30 & 12 & 15 \\
100 & 30 & 69 \\
\end{bmatrix} \cdot 
\begin{bmatrix} 0.5 \\ -0.05 \\ -0.5 \end{bmatrix} = 
\begin{bmatrix} -5 \\ 22.75 \\ -41.5 \\ 29.1 \\ 8.5 \\ 6.75 \\ 32.75 \\ -49.05 \\ -16.5 \\ 39 \\ -43 \\ 6.9 \\ 14 \end{bmatrix}
$$

Where $Y$ is a vector containing the likeness score of all animals and $X$ is the matrix corresponding to the data in the table above. In one simple calculation, we are able to capture the fact that lions are scary, I hate wasps and love dogs and dolphins! What we have done is use our data, represented by $X$, to map animals to a likeness score based on the weights of each feature of the animals, $w$. Let us visualize this mapping:

In [14]:
# just to remind us ;)
animal_labels = ["Lion", "Elephant", "Hyena", "Mouse", "Pig", "Horse", "Dolphin", "Wasp", "Giraffe", "Dog", "Alligator", "Mole", "Black Widow"]
animal_cuteness = [80, 75, 10, 60, 30, 50, 90, 2, 60, 95, 8, 30, 100]
animal_size = [50, 95, 30, 8, 30, 65, 45, 1, 80, 20, 40, 12, 30]
animal_ferocity = [85, 20, 90, 1, 10, 30, 20, 100, 85, 15, 90, 15, 69]


# nothing particularly important... just used for visualisation purposes
animal_mean_stats = [np.mean(k) for k in zip(animal_cuteness, animal_size, animal_ferocity)]

fig = go.Figure(data=[go.Scatter3d(
    x=animal_cuteness, y=animal_size, z=animal_ferocity,
    text=animal_labels,
    mode='markers+text',
    marker=dict(
        size=12,
        color=animal_mean_stats,                # set color to an array/list of desired values
        colorscale='Viridis',   # choose a colorscale
        opacity=0.8
    ))
])

fig.update_layout(title="Animal Cuteness vs Animal Size vs Animal Ferocity",
    scene = dict(
    xaxis_title='Animal Cuteness',
    yaxis_title='Animal Size',
    zaxis_title='Animal Ferocity')
)


fig.show()


In [19]:
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=Y, y=np.zeros_like(Y), mode='markers+text', 
    marker=dict(
        size=20,
        color=animal_mean_stats,                # set color to an array/list of desired values
        colorscale='Viridis',   # choose a colorscale
        opacity=0.8
    ), 
    text=animal_labels
))
fig.update_xaxes(showgrid=False)
fig.update_yaxes(showgrid=False, 
                 zeroline=True, zerolinecolor='black', zerolinewidth=3,
                 showticklabels=False)
fig.update_layout(height=200, plot_bgcolor='white')
fig.show()

As we can see from the plots above, __every animal vector was mapped to a likeness score via the weights__. This concept is very powerful, especially when dealing with multilinear and multivariate linear regression, which you will learn later on in your careers.

Linear transformations can affect an input vector by changing its length and/or its direction. Some linear transformations _only_ change the length of any vector, whereas others _only_ change the direction of any vector. However, for many matrices, there are some special vectors that are only scaled by the linear transformation.
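
The three cases described above can be sketched with small 2x2 matrices. The matrices `scale`, `rotate` and `stretch` and the test vectors are illustrative choices, not taken from the course material.

```python
import numpy as np

v = np.array([1.0, 1.0])

scale = np.array([[2.0, 0.0], [0.0, 2.0]])     # doubles the length, keeps the direction
rotate = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotates by 90 degrees counter-clockwise, keeps the length
stretch = np.array([[2.0, 0.0], [0.0, 3.0]])   # stretches each axis by a different factor

for name, M in [("scale", scale), ("rotate", rotate), ("stretch", stretch)]:
    out = M @ v
    print(name, out, "length:", np.linalg.norm(out))

# A special vector for `stretch`: it is only scaled (by 2), not turned
e1 = np.array([1.0, 0.0])
print(stretch @ e1)   # [2. 0.]
```

Here `stretch` changes both the length and the direction of `v`, yet it only rescales `e1`, which is the kind of special vector the paragraph above refers to.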

We will see later that this geometric understanding of matrices will help when dealing with eigenvalues and eigenvectors. For the scope of this course, just think of linear transformations as matrices that take an input vector and return an output vector, which lies in the same space when the matrix is square.

# Challenges

__Question 1__: Create a 'Matrix' class that has the following properties:
- Takes in a standard nested list as input containing the respective entries
- Has a magic method \__add\__ that adds another Matrix object to the current Matrix
- Has a magic method \__sub\__ that subtracts another Matrix instance from the current Matrix
- Has a method called shape that returns a tuple of the form (rows, columns)
- Has a method called transpose that transposes the Matrix instance
- Has a magic method \__mul\__ that computes the matrix product of the current Matrix instance with an input Matrix