# Matrices

### Learning Objectives:
- [Matrix Introduction](#Matrix-Introduction)
- [Matrix Addition & Subtraction](#Matrix-Addition-&-Subtraction)
- [Matrix Transpose](#Matrix-Transpose)
- [Matrix Multiplication](#Matrix-Multiplication)
- [Identity Matrices](#Identity-Matrices)
- [Diagonal Matrices](#Diagonal-Matrices)
- [Linear Transformations](#Linear-Transformations)
- [Determinant & Matrix Inverse](#Determinant-&-Matrix-Inverse)
- [Orthogonal Matrices](#Orthogonal-Matrices)
- [Eigenvalues & Eigenvectors](#Eigenvalues-&-Eigenvectors)


# Matrix Introduction

Now that we have covered vectors, we can extend this idea to the concept of a __matrix__. A matrix is a rectangular, ordered array of numbers, symbols or expressions arranged in rows and columns. Its components are also referred to as __entries__. A matrix is considered ordered because every entry position holds a different meaning. For certain operations, it is often useful to treat a matrix as a sequence of __column vectors__ or __row vectors__. In fact, given our definition, individual vectors can also be treated as matrices! For example, consider the three matrices below:

$$ A = \begin{bmatrix} 1 \\ 2  \end{bmatrix} \text{, }B = \begin{bmatrix} 1 & 2 & 3\\ 3 & 2 & 1 \end{bmatrix} \text{, } C =\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix}$$

We denote the dimensions of a matrix by first its number of rows, then its number of columns. In this case, A is a 2x1 matrix, B is a 2x3 matrix and C is a 3x2 matrix. The matrix A is also a column vector, and the matrix B can also be conceptualized as three 2-D column vectors or even two 3-D row vectors! If we let M and N be two integers, where $M, N \geq 1$, and $a_{ij},i=1,...,M, j=1,...,N$ are $MN$ numbers, the general representation of a matrix is as follows:

$$ A = \begin{bmatrix} a_{11} & ... & a_{1j} & ... & a_{1N} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{i1} & ... & a_{ij} & ... & a_{iN} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{M1} & ... & a_{Mj} & ... & a_{MN} 
\end{bmatrix}$$

Matrices are a meaningful way of representing data. Generally, moving along rows holds a specific meaning, which is different from moving along columns. For example, let us say that we are gathering data on a population, and we collect height, weight and gender from each person in a sample of 8 people. A simple matrix representation of this data is such that moving along rows we move to a different feature, and moving along columns we move to a different person's data, as shown in the diagram below. In this case we see, once again, how we can treat the matrix as a list of vectors, where each vector contains the weight, height and gender of one person.

<img src="images/matrix.png"
     alt="matrix"
     width="500px"
     height="400px"/>

Another reason matrices are useful is that they enable the automation of large operations. Rather than carrying out individual operations for different vectors or even elements, we can carry out an operation on an entire matrix. This is especially useful when we have to process large volumes of data.

To represent matrices in standard Python, we can use __nested lists__, which are lists of lists. With NumPy, we can create multi-dimensional NumPy arrays by passing a nested list as an argument to the _np.array(  )_ function. Indexing differs between the two: a nested list needs one pair of square brackets per level, whereas a NumPy array accepts a single pair of square brackets with comma-separated indices, as shown in the example below. This difference will be further elaborated upon in Chapter 3:

In [1]:
import numpy as np

# Standard Python
B = [[1, 2, 3], [3, 2, 1]]
print("Standard Python:")
print(B, type(B))
print(B[0][2], B[1][1])
print()

# NumPy
B = np.array(B)
print("NumPy:")
print(B, type(B))
print(B[0, 2], B[1, 1])
print()

Standard Python:
[[1, 2, 3], [3, 2, 1]] <class 'list'>
3 2

NumPy:
[[1 2 3]
 [3 2 1]] <class 'numpy.ndarray'>
3 2
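
As a small illustration of the row-then-column convention described above, the sketch below reads the dimensions of the three example matrices from the NumPy _.shape_ attribute and slices one row and one column out of B. The loop and the variable names are only illustrative choices.

```python
import numpy as np

# The three example matrices from the start of this section
A = np.array([[1], [2]])
B = np.array([[1, 2, 3], [3, 2, 1]])
C = np.array([[1, 2], [2, 3], [3, 4]])

for name, mat in [("A", A), ("B", B), ("C", C)]:
    rows, cols = mat.shape  # shape is (number of rows, number of columns)
    print(name, "is a", rows, "x", cols, "matrix")

# Slicing a single row or column out of B gives a vector
first_row = B[0, :]      # row vector with 3 components
second_column = B[:, 1]  # column of B with 2 components
print("First row of B:", first_row)
print("Second column of B:", second_column)
```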



# Matrix Addition & Subtraction
Given the claim that a matrix can be treated as a list of vectors, the same intuition is applied for __matrix addition__ and __matrix subtraction__. Given that we have two matrices with an equal number of rows and columns, we simply add/subtract the elements in the same position from both matrices. Two examples are shown below:
$$ \text{(1)}\begin{bmatrix} 1 & 2\\ 3 & 2 \end{bmatrix} + \begin{bmatrix} 3 & 1\\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 3\\ 4 & 3 \end{bmatrix}$$

$$ \text{(2)}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix} - \begin{bmatrix} 3 & 1\\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}$$


These operations are also shown below in NumPy:

In [2]:
# Defining our matrices 
A = np.array([[1,2],[3,2]])
B = np.array([[3,1],[1,1]])
C = np.array([[4,2],[2,2]])

print("NumPy:")
print("Addition:",A+B)
print("Subtraction:",C-B)
print()

NumPy:
Addition: [[4 3]
 [4 3]]
Subtraction: [[1 1]
 [1 1]]
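
For comparison, here is a minimal sketch of the same element-wise operations on nested lists in standard Python. The helper names _matrix_add_ and _matrix_subtract_ are hypothetical and are not used elsewhere in this chapter; the dimension check reflects the requirement that both matrices have the same number of rows and columns.

```python
def check_same_dimensions(mat1, mat2):
    """Raise an error unless both nested lists have identical dimensions."""
    same_rows = len(mat1) == len(mat2)
    same_cols = all(len(r1) == len(r2) for r1, r2 in zip(mat1, mat2))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same dimensions")

def matrix_add(mat1, mat2):
    """Add entries in the same position of two nested-list matrices."""
    check_same_dimensions(mat1, mat2)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(mat1, mat2)]

def matrix_subtract(mat1, mat2):
    """Subtract entries in the same position of two nested-list matrices."""
    check_same_dimensions(mat1, mat2)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(mat1, mat2)]

A = [[1, 2], [3, 2]]
B = [[3, 1], [1, 1]]
C = [[4, 2], [2, 2]]

print("Addition:", matrix_add(A, B))
print("Subtraction:", matrix_subtract(C, B))
```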



# Matrix Transpose
One of the simplest, yet one of the most useful matrix operations is the __matrix transpose__. For a matrix transpose, we simply turn its rows into columns, and columns into rows, which is equivalent to flipping it about its __main diagonal__ (top-left to bottom-right). In this case, an MxN matrix (M rows and N columns), becomes an NxM matrix. Some examples of a matrix transpose are shown below:
$$A =\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix}, \;\;\;\;\;\;   A^{T} = A' = 
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{bmatrix}
$$

$$B =\begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix}, \;\;\;\;\;\;   B^{T} = B' = 
\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}
$$

We will expand on the importance of the matrix transpose once we investigate other operations. Below we show how to compute the matrix transpose in standard Python, as well as in NumPy, where we can use either the _np.transpose(  )_ function or the _.T_ attribute:

In [8]:
# Defining matrices
A = [[1,2],[2,3],[3,4]]
B = [[2,2],[1,1]]

def transpose(mat):
    mat_transpose = [] # initialise transposed matrix
    for idx2 in range(len(mat[0])): # iterate through columns
        new_row = []
        for idx1 in range(len(mat)): # iterate through rows
            new_row.append(mat[idx1][idx2])
        mat_transpose.append(new_row)
    return mat_transpose

print("Original")
print("A =", A)
print("B =", B)
print()

print("Standard Python Transpose")
print("A' =", transpose(A))
print("B' =", transpose(B))
print()

A = np.array(A)
B = np.array(B)

print("NumPy Transpose:")
print("A' =", A.T)
print("B' =", np.transpose(B))

Original
A = [[1, 2], [2, 3], [3, 4]]
B = [[2, 2], [1, 1]]

Standard Python Transpose
A' = [[1, 2, 3], [2, 3, 4]]
B' = [[2, 1], [2, 1]]

NumPy Transpose:
A' = [[1 2 3]
 [2 3 4]]
B' = [[2 1]
 [2 1]]
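
As an aside, the nested-loop transpose can also be written more compactly with the built-in _zip(  )_ function. The sketch below shows this alternative, confirms that an MxN matrix becomes an NxM matrix, and checks that transposing twice returns the original matrix.

```python
import numpy as np

A = [[1, 2], [2, 3], [3, 4]]

# zip(*A) groups the first entries of every row, then the second entries, etc.
A_t = [list(column) for column in zip(*A)]
print("A' =", A_t)

A_np = np.array(A)
print("Shape of A:", A_np.shape)     # 3 rows, 2 columns
print("Shape of A':", A_np.T.shape)  # 2 rows, 3 columns

# Transposing twice gives back the original matrix
print("(A')' equals A:", np.array_equal(A_np.T.T, A_np))
```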


# Matrix Multiplication
Just as we have seen vector multiplication, we can also carry out __matrix multiplication__. In this operation, order is important, and the second matrix must have the same number of rows as the first matrix has columns. Given an MxN matrix A and an NxK matrix B, their __matrix product__, $C = A\times B$, will be an MxK matrix. This means it will have as many rows as the first matrix, and as many columns as the second matrix. It is crucial to remember that __for matrix multiplication, the first matrix must have as many columns as the second matrix has rows__. Given the general definition of A and B below, their matrix product is given as follows:

$$ A = \begin{bmatrix} a_{11} & ... & a_{1j} & ... & a_{1N} \\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{i1} & ... & a_{ij} & ... & a_{iN}\\
                        \vdots &    & \vdots &     & \vdots \\
                        a_{M1} & ... & a_{Mj} & ... & a_{MN} 
\end{bmatrix}, \;
B = \begin{bmatrix} b_{11} & ... & b_{1j} & ... & b_{1K} \\
                        \vdots &    & \vdots &     & \vdots \\
                        b_{i1} & ... & b_{ij} & ... & b_{iK} \\
                        \vdots &    & \vdots &     & \vdots \\
                        b_{N1} & ... & b_{Nj} & ... & b_{NK} 
\end{bmatrix}$$

$$ C = A\times B = \begin{bmatrix} c_{11} & ... & c_{1j} & ... & c_{1K} \\
                        \vdots &    & \vdots &     & \vdots \\
                        c_{i1} & ... & c_{ij} & ... & c_{iK}\\
                        \vdots &    & \vdots &     & \vdots \\
                        c_{M1} & ... & c_{Mj} & ... & c_{MK} 
\end{bmatrix}$$

Where:

$$c_{ij} = \sum_{k=1}^{N}a_{ik}b_{kj}$$

This general definition can be difficult to understand, so to simplify this process, we will treat matrices as lists of column and row vectors. In this case, the entry of the matrix product, $c_{ij}$ is simply __the inner product of the__ $\mathbf{i^{th}}$ __row of the first matrix, A, and the__ $\mathbf{j^{th}}$ __column of the second matrix, B__. Two examples of the matrix product are shown below:

$$ \begin{bmatrix} 1 & 2 \\ 2 & 1  \end{bmatrix} \times \begin{bmatrix} 1\\ 2 \end{bmatrix} = 
\begin{bmatrix} 
\begin{bmatrix}1 & 2 \end{bmatrix}\cdot \begin{bmatrix}1 \\ 2 \end{bmatrix}\\
\begin{bmatrix}2 & 1 \end{bmatrix}\cdot \begin{bmatrix}1 \\ 2 \end{bmatrix}
\end{bmatrix} 
= \begin{bmatrix} 5 \\ 4 \end{bmatrix}$$ 

$$\begin{bmatrix} 2 & 3 \\ 1 & 4 \\ 1 & 0 \end{bmatrix} \times \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} = 
\begin{bmatrix}
\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} &
\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
\begin{bmatrix} 1 & 4 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} & 
\begin{bmatrix} 1 & 4 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
\begin{bmatrix} 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} &
\begin{bmatrix} 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\end{bmatrix} =
\begin{bmatrix} 8 & 3 \\ 9 & 4 \\ 1 & 0 \end{bmatrix}
$$

As you have seen above, even with the help of the vector inner product, this is a time-consuming task to do by hand, and practically impossible for extremely large matrices. Given their ordered structure and the repetitive nature of this operation, we can write programs to compute matrix products. Below, we show how to compute the matrix product examples above in standard Python and in NumPy. Since in standard Python, there is no way to directly access a column of a nested list, we will borrow the previously written _transpose(  )_ function.

In [9]:
# CODING CHALLENGE

# Defining matrices
A = [[1, 2],[2, 1]]
B = [[1], [2]]
C = [[2, 3], [1, 4], [1, 0]]
D = [[1, 0], [2, 1]]

# Standard Python
def inner_product(v1, v2): # function that computes algebraic inner product of two vectors
    product = 0
    for value1, value2 in zip(v1, v2):
        product += value1*value2
    return product

def matrix_product(mat1, mat2): # function that computes the matrix product between two matrices
    result = []
    mat2 = transpose(mat2)
    for i in range(len(mat1)): # iterate over rows of first matrix
        new_row = []
        for j in range(len(mat2)): # iterate columns of the second matrix
            product = inner_product(mat1[i], mat2[j]) # row i of mat1 with column j of the original mat2
            new_row.append(product)
        result.append(new_row)
    return result

print("Standard Python")
print("Product of A and B:", matrix_product(A, B))
print("Product of C and D:", matrix_product(C, D))
print()


# NumPy
A = np.array(A)
B = np.array(B)
C = np.array(C)
D = np.array(D)

print("NumPy")
print("Product of A and B:", np.matmul(A, B))
print("Product of C and D:", np.matmul(C, D))

Standard Python
Product of A and B: [[5], [4]]
Product of C and D: [[8, 3], [9, 4], [1, 0]]

NumPy
Product of A and B: [[5]
 [4]]
Product of C and D: [[8 3]
 [9 4]
 [1 0]]
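
To make the point about order and dimensions concrete, the sketch below uses the _@_ operator (NumPy's shorthand for _np.matmul(  )_) and shows two things: swapping the operands of a valid product can make the dimensions incompatible, and even for two square matrices the two orders generally give different results. The matrices P and Q are arbitrary examples chosen for illustration.

```python
import numpy as np

C = np.array([[2, 3], [1, 4], [1, 0]])  # 3x2
D = np.array([[1, 0], [2, 1]])          # 2x2

# (3x2) times (2x2) is valid and gives a 3x2 matrix
CD = C @ D
print("C @ D =", CD, "with shape", CD.shape)

# (2x2) times (3x2) is not valid: D has 2 columns but C has 3 rows
try:
    D @ C
    swap_failed = False
except ValueError as err:
    swap_failed = True
    print("D @ C is not defined:", err)

# Even when both orders are defined, the results usually differ
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
print("P @ Q =", P @ Q)
print("Q @ P =", Q @ P)
```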


# Identity Matrices
Now that we have covered matrix multiplication, we can cover some interesting properties and types of matrices. A crucial type of matrix is known as an __identity matrix__. Identity matrices must be __square matrices__, meaning they have as many rows as they have columns. By definition, the matrix product of any square matrix and the identity matrix of the same dimensions is the original square matrix. This property makes the identity matrix useful when dealing with the matrix inverse, as well as eigenvalues and eigenvectors, which we will cover shortly. Identity matrices, denoted as __I__, have 1s on their main diagonal, and 0s elsewhere as shown below:

$$ \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}, \; \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\0 & 0 & 1\end{bmatrix}, \;
\begin{bmatrix} 1 & 0 & ... & 0 \\
                0 & 1 & ... & 0 \\
                \vdots &    & \ddots &  \\
                0 & 0 & ... & 1 
\end{bmatrix}
$$

$$A\times I = I\times A = A$$

You can check this for yourself: whatever square matrix you multiply with an identity matrix of the same size, you will get the original matrix back! Here we show how to generate an identity matrix in NumPy, along with a demonstration that we get the original matrix when multiplying with an identity matrix.

In [12]:
# Defining our matrix
A = np.array([[1, 2], [2, 1]])

I = np.eye(2,2)
print("Original Matrix:", A)
print("2x2 identity matrix:", I)
print("Matrix product:", np.matmul(A, I))

Original Matrix: [[1 2]
 [2 1]]
2x2 identity matrix: [[1. 0.]
 [0. 1.]]
Matrix product: [[1. 2.]
 [2. 1.]]
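
The example above multiplies by the identity matrix on the right only. As a quick check of both orders, the sketch below compares $A \times I$ and $I \times A$ with the original matrix. Passing _dtype=int_ to _np.eye(  )_ is just an illustrative choice that keeps the entries as integers.

```python
import numpy as np

A = np.array([[1, 2], [2, 1]])
I = np.eye(2, dtype=int)  # 2x2 identity matrix with integer entries

right_product = A @ I  # A times I
left_product = I @ A   # I times A

print("A x I equals A:", np.array_equal(right_product, A))
print("I x A equals A:", np.array_equal(left_product, A))
```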


# Diagonal Matrices
Another useful type of matrix is a __diagonal matrix__, a square matrix in which every entry off the main diagonal is zero. A general MxM diagonal matrix is shown below:

$$D = diag(a_{1}, a_{2}, ..., a_{M}) = \begin{bmatrix} a_{1} & 0 & ... & 0 \\
                      0 & a_{2} & ... & 0  \\
                      \vdots & \vdots & \ddots & \vdots \\
                       0 & 0 & ... & a_{M}
\end{bmatrix}$$
Where $a_{i}, i=1,...,M$ are scalar numbers.

Diagonal matrices have interesting properties:
- the order of matrix multiplication does not matter between two diagonal matrices of the same size
- the transpose of a diagonal matrix is itself
- multiplying a matrix on the left by a diagonal matrix scales each row of that matrix by the respective diagonal component

Some examples of matrix multiplication with diagonal matrices are shown below:
$$\begin{bmatrix}2 & 0 \\ 0 & 3\end{bmatrix}\times \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} = \begin{bmatrix}2 & 2 \\ 3 & 3\end{bmatrix} $$

$$\begin{bmatrix}2 & 0 \\ 0 & 4\end{bmatrix}\times \begin{bmatrix}1 & 2 & 3\\ 1 & 2 & 3\end{bmatrix} = \begin{bmatrix}2 & 4 & 6 \\ 4 & 8 & 12\end{bmatrix} $$

After understanding some of its properties, we can even treat an identity matrix as a diagonal matrix that scales each row by 1.
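
The properties above can be checked numerically. The sketch below builds diagonal matrices with _np.diag(  )_ and shows that a diagonal matrix on the left scales rows, that the same diagonal matrix on the right scales columns instead, and that two diagonal matrices can be multiplied in either order. The matrices used are arbitrary examples.

```python
import numpy as np

D1 = np.diag([2, 3])  # diagonal matrix with 2 and 3 on the main diagonal
D2 = np.diag([5, 7])
A = np.array([[1, 2], [3, 4]])

row_scaled = D1 @ A     # row 1 times 2, row 2 times 3
column_scaled = A @ D1  # column 1 times 2, column 2 times 3
print("D1 x A =", row_scaled)
print("A x D1 =", column_scaled)

# Two diagonal matrices give the same product in either order
print("D1 x D2 equals D2 x D1:", np.array_equal(D1 @ D2, D2 @ D1))

# The transpose of a diagonal matrix is itself
print("D1' equals D1:", np.array_equal(D1.T, D1))
```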

# Linear Transformations
Another useful interpretation of matrices is to treat them as __linear transformations__. This means that when we multiply a matrix with one or more input vectors, we get one or more output vectors; for a square matrix, the output vectors live in the same vector space as the input vectors. For example, consider the matrix product below. By multiplying a vector with a matrix we get another vector, both of which are displayed in the diagram below:

$$ \begin{bmatrix}1 & 2 \\ 3 & 1 \end{bmatrix} \times \begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}3 \\4\end{bmatrix}$$

In [2]:
# Visualization Code

import plotly.graph_objects as go

x1 = [0, 1, 3]
x2 = [0, 1, 4]


fig = go.Figure(data=[go.Scatter(
    x=x1, y=x2,
    mode='markers',
    marker=dict(size=[10, 30, 30],
            color=["black","black", "orange"])
    )])

fig.update_layout(
    title="Linear Transformation",
    xaxis_title="$x_{1}$",
    yaxis_title="$x_{2}$",
)
fig.add_trace(go.Scatter(x=[0, 1], y=[0, 1],marker_color="black", name="Original Vector"))
fig.add_trace(go.Scatter(x=[0, 3], y=[0, 4],marker_color="orange", name="Transformed Vector"))
fig.update_layout(showlegend=True)
fig.show()

Linear transformations can affect an input vector by changing its length and/or its direction. Some linear transformations _only_ change the length of any vector, whereas others _only_ change the direction of any vector. However, for many linear transformations, there are some special vectors that are only scaled by the transformation, so they stay on the same line through the origin.

We will see later that this geometric understanding of matrices will help when dealing with eigenvalues and eigenvectors. For the scope of this course, just think of linear transformations as matrices that take an input vector and return an output vector in the same space.
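
To illustrate transformations that only change length or only change direction, here is a minimal sketch with two hand-picked matrices: a uniform scaling by 2 and a 90-degree counter-clockwise rotation. Both matrices are assumptions chosen for illustration rather than examples taken from the text above.

```python
import numpy as np

v = np.array([1, 1])

scaling = np.array([[2, 0], [0, 2]])    # doubles the length, keeps the direction
rotation = np.array([[0, -1], [1, 0]])  # rotates by 90 degrees, keeps the length

scaled = scaling @ v
rotated = rotation @ v

print("Original:", v, "length", np.linalg.norm(v))
print("Scaled:  ", scaled, "length", np.linalg.norm(scaled))
print("Rotated: ", rotated, "length", np.linalg.norm(rotated))

# A zero inner product confirms the rotated vector is perpendicular to the original
print("v . rotated =", np.dot(v, rotated))
```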

# Determinant & Matrix Inverse
A __determinant__ is a function of a square matrix that reduces it to a single number. The determinant of a matrix A is denoted |A| or det(A). If A consists of one element a, then |A| = a. For a 2x2 matrix it is given by:

$$A = \begin{bmatrix}a & b \\ c & d\end{bmatrix}, \; det(A) = \begin{vmatrix}a & b \\ c & d\end{vmatrix} = ad-bc$$

Finding the determinant of an NxN square matrix for N > 2 can be done by recursively deleting rows and columns to create successively smaller matrices until they are all 2 × 2 dimensions, and then applying the 2x2 definition above. The most basic technique is __expansion by row__, which we show here for a 3 × 3 matrix:

$$A = \begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix} \; det(A) = a\begin{vmatrix} e & f \\ h & i\end{vmatrix} - b\begin{vmatrix} d & f \\ g & i\end{vmatrix} + c\begin{vmatrix} d & e \\ g & h\end{vmatrix}$$

<img src="images/determinant.png"
     alt="matrix"
     width="500px"
     height="400px"/>

In this case we are expanding by row 1, which means deleting row 1 and, in turn, column 1, column 2, and column 3 to create three 2×2 matrices. The determinant of each smaller matrix is multiplied by the entry at the intersection of the deleted row and column. The expansion alternately adds and subtracts each successive term.
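
The expansion by row 1 described above can be sketched as a short recursive function for nested lists. The function name _determinant_ and the 3x3 test matrix are illustrative assumptions; this version is meant to mirror the hand calculation and is far slower than _np.linalg.det(  )_ for large matrices.

```python
import numpy as np

def determinant(mat):
    """Determinant of a square nested-list matrix via expansion by row 1."""
    n = len(mat)
    if n == 1:
        return mat[0][0]
    if n == 2:
        return mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    total = 0
    for col in range(n):
        # Delete row 1 and the current column to get the smaller matrix
        minor = [row[:col] + row[col + 1:] for row in mat[1:]]
        sign = 1 if col % 2 == 0 else -1  # alternate +, -, +, ...
        total += sign * mat[0][col] * determinant(minor)
    return total

M = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
print("Expansion by row:", determinant(M))
print("NumPy:", np.linalg.det(np.array(M)))
```

For this matrix the expansion gives $1(0 - 24) - 2(0 - 20) + 3(0 - 5) = 1$, and the NumPy result agrees up to floating-point rounding.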


We will generally use a computer for these calculations, especially for higher dimensions. So why do we need the determinant? Well, it is a useful tool in determining whether a square matrix has an __inverse__, and if it does, is also used in computing the matrix inverse.

If we have a square matrix, A, its inverse, $\mathbf{A^{-1}}$, is defined such that:
$$ A\times A^{-1} = A^{-1} \times A = I $$

The inverse of a square matrix is a matrix of the same dimension that, when multiplied with the original matrix, returns the identity matrix, as shown above. Hence, we see the importance of identity matrices! In terms of linear transformations, it is the opposite transformation: it takes the output vector(s) and returns the input vector(s). However, not all square matrices have an inverse. Matrices that have no inverse are known as __singular matrices__, and have a determinant equal to 0. Thus, to determine whether a square matrix has an inverse, we simply compute its determinant! Some examples of this concept are shown below:
$$A = \begin{bmatrix}1 & 0 \\ 0 & 2\end{bmatrix}, \; det(A) = 2, \; A^{-1} = \begin{bmatrix}1 & 0 \\ 0 & \frac{1}{2}\end{bmatrix}, \; A \times A^{-1} = I$$

$$B = \begin{bmatrix}1 & 2 \\ 2 & 4\end{bmatrix}, \; det(B) = 0,\; \text{(singular, no inverse)} $$

In general, we compute matrix determinants and inverses with different programs rather than by hand due to their complexity. Below, we show how to compute determinants and inverses with NumPy.

In [7]:
# Defining our matrices
A = np.array([[1,0],[0,2]])
B = np.array([[1,2],[2,4]])


# Case 1: An inverse exists!
A_det = np.linalg.det(A)
A_inv = np.linalg.inv(A)
product = np.matmul(A,A_inv)
print("Case 1:")
print("det(A) =",A_det)
print("Inverse of A:",A_inv)
print("Matrix product:",product)
print()


# Case 2: Singular matrix, inverse does NOT exist
B_det = np.linalg.det(B)
try:
    B_inv = np.linalg.inv(B) # NumPy raises LinAlgError for a singular matrix
except np.linalg.LinAlgError:
    B_inv = "inverse not existent"

print("Case 2:")
print("det(B) =",B_det)
print("Inverse of B:",B_inv)

Case 1:
det(A) = 2.0
Inverse of A: [[1.  0. ]
 [0.  0.5]]
Matrix product: [[1. 0.]
 [0. 1.]]

Case 2:
det(B) = 0.0
Inverse of B: inverse not existent
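
To show how the determinant enters the computation of an inverse, here is a minimal sketch of the standard closed-form inverse of a 2x2 matrix, $A^{-1} = \frac{1}{ad-bc}\begin{bmatrix}d & -b \\ -c & a\end{bmatrix}$. This formula is not derived in this chapter, and the function name _inverse_2x2_ and the test matrix are illustrative assumptions.

```python
import numpy as np

def inverse_2x2(mat):
    """Inverse of a 2x2 nested-list matrix using the closed-form formula."""
    (a, b), (c, d) = mat
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    # Swap the diagonal entries, negate the off-diagonal entries, divide by det
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[4, 7], [2, 6]]  # det = 4*6 - 7*2 = 10
M_inv = inverse_2x2(M)
print("Closed-form inverse:", M_inv)
print("NumPy inverse:", np.linalg.inv(np.array(M)))
print("M x M_inv is I:", np.allclose(np.array(M) @ np.array(M_inv), np.eye(2)))
```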


# Orthogonal Matrices
Another type of matrix is an __orthogonal matrix__. Orthogonal matrices have the interesting property that their transpose is also their matrix inverse! In formal terms, given an orthogonal matrix A:

$$ A \times A^{T} = A^{T} \times A = I \implies A^{T} = A^{-1}$$

This is yet another example of why it is useful to visualize matrices as composed of column/row vectors. In this case, a matrix is orthogonal if its columns (and, equivalently, its rows) are orthogonal unit vectors (orthonormal). This means that the inner product of any column with itself yields 1, and the inner product of two different columns yields 0. Thus, to check if a matrix is orthogonal, we simply have to show that the inner product of each pair of different columns is 0 and that every column has length 1. An example of an orthogonal matrix is shown below.

$$A = \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{bmatrix} = \begin{bmatrix} \mathbf{v_{1}} & \mathbf{v_{2}} \end{bmatrix}, \;\; ||\mathbf{v_{1}}|| = 1, ||\mathbf{v_{2}}|| = 1,  \mathbf{v_{1}} \cdot \mathbf{v_{2}} = 0 \implies \text{orthogonal} $$

$$A \times A^{T} = 
\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{bmatrix} 
\times
\begin{bmatrix}\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{bmatrix} 
 = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} = I$$
 
Computing the inverse of a matrix is generally a complex process, which means that exploiting matrix orthogonality allows us to simplify manual calculations and reduce computational complexity.
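
As a numerical check of the example above, the sketch below verifies that the columns of A are orthonormal and that the transpose matches the inverse computed by NumPy. _np.allclose(  )_ is used instead of exact equality because $\frac{1}{\sqrt{2}}$ cannot be stored exactly as a floating-point number.

```python
import numpy as np

s = 1 / np.sqrt(2)
A = np.array([[s, s], [-s, s]])

v1 = A[:, 0]  # first column
v2 = A[:, 1]  # second column

print("Length of v1:", np.linalg.norm(v1))
print("Length of v2:", np.linalg.norm(v2))
print("v1 . v2:", np.dot(v1, v2))

# The transpose acts as the inverse
print("A x A' is I:", np.allclose(A @ A.T, np.eye(2)))
print("A' equals the inverse of A:", np.allclose(A.T, np.linalg.inv(A)))
```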

# Eigenvalues & Eigenvectors
We will now put together everything we have covered to understand one of the most important concepts in linear algebra: __eigenvalues__ and __eigenvectors__. Given a square matrix, A, its eigenvector(s), denoted by $\vec{\mathbf{v_{i}}}$, is/are vector(s) such that:

$$A\vec{\mathbf{v_{i}}} = \lambda_{i}\vec{\mathbf{v_{i}}}$$

Where $\lambda_{i}$ is the corresponding eigenvalue, a scalar.

__Eigenvectors are any possible non-zero solutions for__ $\mathbf{\vec{v_{i}}}$ __of the equation above__. What does this actually mean? Well, this is one of the instances where it makes sense to visualize a matrix as a linear transformation. As we saw earlier in the Linear Transformations section, we can apply a linear transformation to any vector with the appropriate number of components, and for a few vectors, the linear transformation only scales them by a given scalar value (their eigenvalue). These are the eigenvectors of that matrix.

So how do we calculate the eigenvalues and eigenvectors of a matrix? Some of you have probably already guessed that a vector with all zero components always satisfies the equation above. This is known as the __trivial solution__, and since it applies to all matrices regardless of the eigenvalue, it doesn't tell us about any interesting properties of individual matrices, which is why it is not counted as an eigenvector. We are, therefore, interested in the __non-trivial solution(s)__ for the eigenvectors of a matrix. We can rewrite the general expression in the form below, which allows us to do the following:

$$A\vec{\mathbf{v_{i}}} = \lambda_{i}I\vec{\mathbf{v_{i}}}$$
$$(A - \lambda_{i}I)\vec{\mathbf{v_{i}}} = 0 $$

Some of you may be thinking: "We can multiply both sides of the equation by $(A - \lambda_{i}I)^{-1}$, but this results in the trivial solution we had intuitively found before. So what do we do now?". Well, for $\vec{\mathbf{v_{i}}}$ to be a non-zero vector, $(A - \lambda_{i}I)$ must be singular, meaning it CANNOT have an inverse. This is why there are so few "exceptions" that lead to the few eigenvectors of the matrix out of the infinitely many other vectors in the same vector space. Going one step further, we can use the determinant condition we came across before to solve for the eigenvalues of the matrix:

$$det(A - \lambda_{i}I) = 0$$

If we keep $\lambda_{i}$ as a variable, the expression above gives us a polynomial equation, known as the __characteristic polynomial__ of the matrix, and solving it will give you the eigenvalues of the matrix! To find their corresponding eigenvectors, we plug each eigenvalue into the original expression and solve for the vector. A simple, 2x2 example is shown below:

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \implies 
A - \lambda_{i}I = \begin{bmatrix} 1-\lambda_{i} & 2 \\ 2 & 1-\lambda_{i} \end{bmatrix}$$

$$det(A - \lambda_{i}I) = 1 - 2 \lambda_{i} + \lambda_{i}^{2} - 4 = \lambda_{i}^{2} - 2 \lambda_{i} - 3 = 0$$
$$ \lambda_{1} = -1, \lambda_{2} = 3$$
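
As a check on the hand calculation, the roots of the characteristic polynomial $\lambda^{2} - 2\lambda - 3$ can be found numerically with _np.roots(  )_, which takes the polynomial coefficients from the highest power down. The sketch below also confirms that $det(A - \lambda I)$ is (numerically) zero at each root.

```python
import numpy as np

A = np.array([[1, 2], [2, 1]])

# Coefficients of lambda^2 - 2*lambda - 3, highest power first
coefficients = [1, -2, -3]
eigenvalues = np.roots(coefficients)
print("Roots of the characteristic polynomial:", eigenvalues)

# Each root should make A - lambda*I singular (determinant of zero)
determinants = [np.linalg.det(A - lam * np.eye(2)) for lam in eigenvalues]
for lam, det in zip(eigenvalues, determinants):
    print("lambda =", lam, "gives det(A - lambda I) =", det)
```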

Now we can find the respective eigenvectors of each eigenvalue. We will start with the eigenvector of $\lambda_{1}$:
$$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix}v_{x} \\ v_{y}\end{bmatrix} = 
\begin{bmatrix}-v_{x} \\ -v_{y}\end{bmatrix}$$

$$ v_{x} + 2v_{y} = -v_{x}$$
$$ 2v_{x} + v_{y} = -v_{y}$$
This means that the eigenvectors corresponding to $\lambda_{1}$ are all non-zero vectors such that $v_{x} = -v_{y}$. We can generalize these to obtain our eigenvectors (which are all scalar multiples of each other):

$$\mathbf{\vec{v_{1}}} = t\begin{bmatrix} 1 \\ -1 \end{bmatrix}, \text{where } t \in \mathbb{R}, \; t \neq 0$$

The same procedure can be applied for the second eigenvalue, resulting in the following eigenvector:
$$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix}v_{x} \\ v_{y}\end{bmatrix} = 
\begin{bmatrix}3v_{x} \\ 3v_{y}\end{bmatrix}$$

$$ v_{x} + 2v_{y} = 3v_{x}$$
$$ 2v_{x} + v_{y} = 3v_{y}$$
$$v_{x}=v_{y}$$
$$\mathbf{\vec{v_{2}}} = \alpha\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \text{where } \alpha \in \mathbb{R}, \; \alpha \neq 0$$

And that's it! This procedure can be applied to a square matrix of any size. However, as you can see from the 2x2 case, even a simple example can be time-consuming. For larger matrices, it is highly impractical to manually compute the eigenvalues and eigenvectors of the matrix, so, as usual, we write programs to do the hard stuff for us. Below we show how to use NumPy to calculate the eigenvalues/eigenvectors of any square matrix. We will use _np.linalg.eig(  )_, which returns the eigenvalues together with eigenvectors normalized to unit length, stored as the columns of the returned array. In reality, any non-zero scalar multiple of these vectors is also a valid eigenvector!

In [3]:
import numpy as np

# 2x2 matrix
A = np.array([[1,2],[2,1]])
w,v = np.linalg.eig(A) # w: eigenvalues, v: eigenvectors stored as columns
print("Eigenvalues:",w)
print("Eigenvectors:",v)
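
To connect the NumPy result back to the definition, the sketch below checks $A\vec{\mathbf{v_{i}}} = \lambda_{i}\vec{\mathbf{v_{i}}}$ for every eigenpair returned by _np.linalg.eig(  )_. Note that the i-th eigenvector is the i-th column of the returned array, so it is read with _v[:, i]_. The order in which NumPy returns the eigenvalues is not assumed here.

```python
import numpy as np

A = np.array([[1, 2], [2, 1]])
w, v = np.linalg.eig(A)

checks = []
for i in range(len(w)):
    eigenvalue = w[i]
    eigenvector = v[:, i]  # i-th column, not i-th row
    # A times the eigenvector should equal the eigenvalue times the eigenvector
    holds = np.allclose(A @ eigenvector, eigenvalue * eigenvector)
    checks.append(holds)
    print("lambda =", eigenvalue,
          "| unit length:", np.isclose(np.linalg.norm(eigenvector), 1.0),
          "| A v = lambda v:", holds)
```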

# Congratulations!