# <ins>**Linear Algebra**</ins>

## **Importance to Data Science**
Data is often represented as vectors and matrices, and linear algebra is the tool for handling and manipulating them. It plays an important role in machine learning. For example, one of the simplest and most common machine learning algorithms is *linear regression*, which uses linear algebra to find the best-fit line for predicting outcomes. Linear algebra also appears in *optimization*, *neural networks*, *image recognition*, *recommendation systems* and many other areas. Knowing it is essential to computer and data science.

****

## **Vectors**

### <ins>What are Vectors?</ins>
A vector is essentially an ordered list of numbers. Vectors are used to represent data points, measurements or any kind of numeric information in a structured way.

### <ins>Characteristics of Vectors</ins>

#### 1. Dimension
- The number of elements in a vector is called its dimension. For example a vector with 3 elements is called a 3-dimensional vector.

#### 2. Notation
- Vectors are often written as a column of numbers like this:

$$
\mathbf{v} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;or as a row of numbers:

$$
\mathbf{v} = (1, 2, 3)
$$

#### 3. Components
- Each number in the vector is called a component. For example in the vector $\mathbf{v} = (1, 2, 3)$, the components are 1, 2 and 3.

### <ins>Python Examples of Creating Vectors</ins>
In Python we can use the NumPy library to create vectors. Creating a row vector is easy, but NumPy treats any flat list of numbers as a regular 1D array with no notion of row or column. To get a column vector we have to shape it explicitly, which makes it a 2D array with a single column.

#### Row Vector Example:

In [1]:
import numpy as np

row_vector = np.array([1, 2, 3])
print(f"Row vector: \n{row_vector}")

Row vector: 
[1 2 3]


#### Column Vector Example:

In [2]:
column_vector = np.array([[1], [2], [3]])
print(f"Column vector: \n{column_vector}")

Column vector: 
[[1]
 [2]
 [3]]
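
The difference between the two arrays above is easiest to see from their shapes. The following is a small sketch, assuming only NumPy; the use of `reshape(-1, 1)` is one possible way to turn a 1D array into a column and is not taken from the cells above.

```python
import numpy as np

row_vector = np.array([1, 2, 3])

# reshape(-1, 1) asks for one column and lets NumPy infer the number of rows
column_vector = row_vector.reshape(-1, 1)

print(row_vector.shape)     # (3,)   -> 1D array with 3 elements
print(column_vector.shape)  # (3, 1) -> 2D array with 3 rows and 1 column
print(row_vector.ndim, column_vector.ndim)
```

The 1D array has a single axis, while the column vector has two, which is why NumPy prints it with nested brackets.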


### <ins>How Vectors are Used in Data Science</ins>
Here are some common use cases of vectors in data science:

#### 1. Data Representation:
- Features of a Data point: Each data point in a dataset can be represented as a vector. For example, if you are working with a dataset of bulking pandas where each panda is described by its height (in meters), weight (in kilograms) and age (in years), each panda can be represented as a 3-dimensional vector:

$$
\text{Panda 1} = (1.0, 125.0, 8) \\
\text{Panda 2} = (1.1, 140.0, 15)
$$
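
As a rough illustration of the panda example, the two vectors can be stored as NumPy arrays and stacked into one table-like array. The variable names (`panda_1`, `panda_2`, `panda_features`) are made up for this sketch.

```python
import numpy as np

# Each vector holds: height (m), weight (kg), age (years)
panda_1 = np.array([1.0, 125.0, 8])
panda_2 = np.array([1.1, 140.0, 15])

# Stack the vectors so that each row is one panda and each column one feature
panda_features = np.stack([panda_1, panda_2])

print(panda_features.shape)  # (number of pandas, number of features)
print(panda_features[:, 1])  # the weight of every panda
```

Stacking data points row by row like this is how a whole dataset ends up as a matrix, which is the topic of a later section.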

#### 2. Operations on Vectors:
- **Addition:** Vectors can be added together by adding their corresponding components. If $a = (1, 2)$ and $b = (3, 4)$, then: 
$$
a + b = (1 + 3, 2 + 4) = (4, 6)
$$

- **Scalar Multiplication:** A vector can be multiplied by a scalar (a single number). You multiply each component with the scalar number. If $\mathbf{v} = (2, 3)$ and the scalar is 4, then:
$$
4\mathbf{v} = 4(2, 3) = (4 \times{} 2, 4 \times{} 3) = (8, 12)
$$ 

#### 3. Distance and Similarity:
- **Euclidean Distance:** The Euclidean distance between two vectors measures how far apart they are, and it is often used as a rough indication of how similar two data points are. Strictly speaking, distance and similarity are separate things. Distance tells you how far apart the vectors are, while similarity tells you how similar or aligned they are. Distance ranges from 0 to infinity, while some similarity measures can also take negative values. Which one to use depends on the application: for example, distance for clustering algorithms and similarity for information retrieval and text analysis. We are not going that deep here yet and will focus on simpler things like Euclidean distance. For vectors $a = (x_1, y_1)$ and $b = (x_2, y_2)$, the distance is given by:
$$
\text{Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
$$
$$
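
To make the difference between distance and similarity a little more concrete, here is a hedged sketch. The text above does not name a specific similarity measure, so cosine similarity is used here purely as an assumed example; the helper functions `euclidean_distance` and `cosine_similarity` are written for this illustration.

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
opposite = -a  # same length as a, but pointing the opposite way

def euclidean_distance(u, v):
    """Square the component differences, sum them, take the square root."""
    return np.sqrt(np.sum((v - u) ** 2))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (assumed similarity measure)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(euclidean_distance(a, b))       # never negative
print(cosine_similarity(a, b))        # close to 1: nearly the same direction
print(cosine_similarity(a, opposite)) # negative: opposite directions
```

The distance is never below zero, while this particular similarity measure ranges from -1 to 1.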

#### 4. Direction and Magnitude:
- **Direction:** The direction of the vector is the way it points in space. This is important in multiple disciplines like physics and engineering but also in understanding the orientation of data points in data science.

- **Magnitude:** In this context magnitude means the measured length of the vector $\mathbf{v} = (x, y)$ in space and is given by:
$$
||\mathbf{v}|| = \sqrt{x^2 + y^2}
$$

&nbsp;&nbsp;&nbsp;&nbsp;and more generally, for a vector $\mathbf{v} = (v_1, v_2, \dots, v_n)$ with $n$ dimensions:

$$
||\mathbf{v}|| = \sqrt{v^2_1 + v^2_2 + ... + v^2_n}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Example: consider $\mathbf{v} = (3, 4)$:

$$
||\mathbf{v}|| = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5
$$

&nbsp;&nbsp;&nbsp;&nbsp;This means the magnitude for the vector is 5.

- **Further Understanding the Magnitude:** In the context of vectors, the notation $||\mathbf{v}||$ (read as *"norm of v"* or *"magnitude of v"*) is just a fancy way of saying magnitude. The double vertical bars $|| \; ||$ denote the magnitude of a vector. You do not have to read anything deeper into it!

### <ins>Python Examples of Vector Operations and Calculations</ins>
For the sake of simplicity we are going to use NumPy functions and not write everything from scratch. This is how it would work in real life as well, unless you need to build your own custom function. **Do not re-invent the wheel!** Here are a few examples of manipulating vectors in Python:

#### Addition, Subtraction and Scalar Multiplication:
I know there was nothing about subtraction above, but it works the same way as addition: component by component. Let's use the vectors from above, $a = (1, 2)$ and $b = (3, 4)$:

In [3]:
# Define the vectors
a = np.array([1, 2])
b = np.array([3, 4])

# Addition
vectors_added = a + b
print(f"Addition of vectors a and b is: {vectors_added}")

# Subtraction 
vectors_subtracted = a - b
print(f"Subtraction of vectors a and b is: {vectors_subtracted}")

Addition of vectors a and b is: [4 6]
Subtraction of vectors a and b is: [-2 -2]


Next for scalar multiplication we use $\mathbf{v} = (2, 3)$ with the scalar 4:

In [4]:
# Define the vector and scalar
v = np.array([2, 3])
scalar = 4

# Scalar multiplication
vector_multiplied = v * scalar
print(f"Scalar multiplication of vector is: {vector_multiplied}")

Scalar multiplication of vector is: [ 8 12]


#### Euclidean Distance
For this example let's use the vectors $c = (1, 2, 3)$ and $d = (4, 5, 6)$:

In [5]:
# Define vectors
c = np.array([1, 2, 3])
d = np.array([4, 5, 6])

# Calculate Euclidean distance
euclidean_distance = np.linalg.norm(c - d)
print(f"Euclidean distance between c and d is: {euclidean_distance}")

Euclidean distance between c and d is: 5.196152422706632


#### Magnitude
For magnitude let's use one of the vectors from the previous example:

In [6]:
magnitude = np.linalg.norm(c)
print(f"Magnitude for vector c is: {magnitude}")

Magnitude for vector c is: 3.7416573867739413


#### Direction a.k.a. Normalization
Now here is something new before the Python code. In most cases, finding the direction of a vector and normalizing it mean the same thing. When we talk about the direction of a vector, we are usually interested in the *unit vector* (a vector of magnitude 1) that points in the same direction as the original vector. The unit vector is obtained by normalizing the vector, that is, dividing it by its magnitude:
$$
\mathbf{\hat{v}} = \frac{\mathbf{v}} {||\mathbf{v}||}
$$

In [7]:
# Using the c vector from above again
direction = c / np.linalg.norm(c)
print(f"Direction of the vector c is: {direction}")

Direction of the vector c is: [0.26726124 0.53452248 0.80178373]
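
A quick way to sanity-check a normalization is to confirm that the result has magnitude 1. The sketch below also writes the magnitude out from the formula instead of calling `np.linalg.norm` directly; the names `magnitude_manual` and `unit_c` are only illustrative.

```python
import numpy as np

c = np.array([1, 2, 3])

# Magnitude from the formula: square each component, sum, take the square root
magnitude_manual = np.sqrt(np.sum(c ** 2))

# Unit vector: divide every component by the magnitude
unit_c = c / magnitude_manual

# Both checks should print True (compared with a small floating-point tolerance)
print(np.isclose(magnitude_manual, np.linalg.norm(c)))
print(np.isclose(np.linalg.norm(unit_c), 1.0))
```

`np.isclose` is used instead of `==` because floating-point arithmetic can leave the magnitude a tiny fraction away from exactly 1.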


### <ins>Final Notes About Vectors</ins>
By final notes I mean one final example of a vector use case in data science. I am not going into details, just giving this example as an extra.
- Suppose that we have a dataset of athletic pandas who have scores (from 0 to 100) of two performances: Jumping and Running.

$$
\text{Panda 1} = (85, 78) \\
\text{Panda 2} = (92, 88) \\
\text{Panda 3} = (45, 60) \\
\text{Panda 4} = (50, 65)
$$

- Each panda is represented as a simple 2-dimensional vector based on their scores.

- We could use a clustering algorithm like K-means to group pandas with similar scores together. The algorithm repeatedly calculates the distance between each vector and the center (centroid) of each cluster, and assigns the vector to the nearest one.

- The pandas might be clustered in two groups: high-performing and low-performing based on their vector scores. 
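
The sketch below is not K-means itself. It only computes the pairwise Euclidean distances that such an algorithm relies on, to show why these four pandas fall naturally into two groups. The array name `scores` and the broadcasting approach are choices made for this illustration.

```python
import numpy as np

# One row per panda: (jumping score, running score)
scores = np.array([
    [85, 78],  # Panda 1
    [92, 88],  # Panda 2
    [45, 60],  # Panda 3
    [50, 65],  # Panda 4
])

# Broadcasting builds every pairwise difference: shape (4, 4, 2)
differences = scores[:, np.newaxis, :] - scores[np.newaxis, :, :]

# Euclidean norm along the last axis gives a 4x4 distance table
distances = np.linalg.norm(differences, axis=-1)

print(np.round(distances, 1))
```

In the resulting table, Panda 1 and Panda 2 are about 12.2 apart and Panda 3 and Panda 4 about 7.1 apart, while every distance between the two pairs is above 37.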

****

## **Matrices**

### <ins>What is a Matrix?</ins>
Other than a great sci-fi movie trilogy, a matrix is a rectangular array of numbers arranged in rows and columns. Each number in a matrix is called an element. Just like vectors, matrices are a fundamental concept in linear algebra and are widely used in various fields, including data science, to represent and manipulate data. They are used for various tasks such as data transformations and solving systems of linear equations. A matrix is typically denoted by a capital letter (e.g., $A, B, C$). Here is an example of a 3x3 matrix:

$$
\mathbf{A} = 
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
$$

### <ins>Basic Matrix Operations</ins>
Matrices support similar operations to vectors, but the operations work across both rows and columns.

#### 1. Addition and Subtraction:
- You can add and subtract two or more matrices if they have the same dimensions. This means that, for example, you cannot add a 3x2 and a 2x3 matrix together because they have a different number of rows and columns. Both addition and subtraction are performed element-wise.

$$
\mathbf{A} + \mathbf{B} =
\begin{pmatrix}
a_{1,1} + b_{1,1} & a_{1,2} + b_{1,2}\\
a_{2,1} + b_{2,1} & a_{2,2} + b_{2,2}
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Here is an example with something less abstract than letters:
$$
\mathbf{A} =
\begin{pmatrix}
1 & 2\\
3 & 4
\end{pmatrix},

\mathbf{B} =
\begin{pmatrix}
5 & 6\\
7 & 8
\end{pmatrix},

\mathbf{A} + \mathbf{B} =
\begin{pmatrix}
1 + 5 & 2 + 6\\
3 + 7 & 4 + 8
\end{pmatrix} =

\begin{pmatrix}
6 & 8\\
10 & 12
\end{pmatrix}
$$

#### 2. Scalar Multiplication:
- Scalar multiplication works the same way as with vectors. You take a single number, the scalar, and multiply each element in the matrix by it.

$$
\mathbf{c} \cdot \mathbf{A} =
\begin{pmatrix}
\mathbf{c} \cdot a_{1,1} & \mathbf{c} \cdot a_{1,2} \\
\mathbf{c} \cdot a_{2,1} & \mathbf{c} \cdot a_{2,2}
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Let's multiply the $\mathbf{A}$ matrix from above by the scalar $\mathbf{c} = 4$:

$$
4 \cdot \mathbf{A} =
\begin{pmatrix}
4 \cdot 1 & 4 \cdot 2 \\
4 \cdot 3 & 4 \cdot 4
\end{pmatrix} = 

\begin{pmatrix}
4 & 8\\
12 & 16
\end{pmatrix}
$$

#### 3. Matrix Multiplication:
- This one is a bit more tricky. To be able to multiply two matrices with each other, the first matrix must have the same number of columns as the second matrix has rows. The number of rows in the resulting matrix equals the number of rows in the first matrix, and the number of columns in the resulting matrix equals the number of columns in the second matrix. So if we had a 3x6 and a 6x2 matrix, the resulting matrix would be 3x2. You can think of the 6 in 3x6 and 6x2 as a sort of "same-number-link" between the two. You would not be able to multiply a 6x3 matrix with a 2x6 matrix, because 3 does not equal 2. Here is the abstract form of the multiplication:

$$
\mathbf{C} = \mathbf{A} \cdot \mathbf{B} \implies c_{ij} = \displaystyle\sum_{k} a_{ik} \cdot b_{kj}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Next let's look at a more concrete example for clarity:

$$
\mathbf{A} =
\begin{pmatrix}
1 & 2 & 3\\
4 & 5 & 6
\end{pmatrix},

\mathbf{B} =
\begin{pmatrix}
7 & 8\\
9 & 10\\
11 & 12
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Since $\mathbf{A}$ is 2x3 and $\mathbf{B}$ is 3x2, the product matrix $\mathbf{C}$ is a 2x2 matrix. Now the most tedious part:

&nbsp;&nbsp;&nbsp;&nbsp;**Calculate** $c_{1,1}$:

$$
c_{1,1} = a_{1,1} \cdot b_{1,1} + a_{1,2} \cdot b_{2,1} + a_{1,3} \cdot b_{3,1}\\
c_{1,1} = 1 \cdot 7 + 2 \cdot 9 + 3 \cdot 11\\
c_{1,1} = 7 + 18 + 33 = 58
$$

&nbsp;&nbsp;&nbsp;&nbsp;**Calculate** $c_{1,2}$:

$$
c_{1,2} = a_{1,1} \cdot b_{1,2} + a_{1,2} \cdot b_{2,2} + a_{1,3} \cdot b_{3,2}\\
c_{1,2} = 1 \cdot 8 + 2 \cdot 10 + 3 \cdot 12\\
c_{1,2} = 8 + 20 + 36 = 64
$$

&nbsp;&nbsp;&nbsp;&nbsp;**Calculate** $c_{2,1}$:

$$
c_{2,1} = a_{2,1} \cdot b_{1,1} + a_{2,2} \cdot b_{2,1} + a_{2,3} \cdot b_{3,1}\\
c_{2,1} = 4 \cdot 7 + 5 \cdot 9 + 6 \cdot 11\\
c_{2,1} = 28 + 45 + 66 = 139
$$

&nbsp;&nbsp;&nbsp;&nbsp;**Calculate** $c_{2,2}$:

$$
c_{2,2} = a_{2,1} \cdot b_{1,2} + a_{2,2} \cdot b_{2,2} + a_{2,3} \cdot b_{3,2}\\
c_{2,2} = 4 \cdot 8 + 5 \cdot 10 + 6 \cdot 12\\
c_{2,2} = 32 + 50 + 72 = 154
$$

&nbsp;&nbsp;&nbsp;&nbsp;This results in:

$$
\mathbf{C} =
\begin{pmatrix}
58 & 64\\
139 & 154
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;As you can see, this is quite a lot of work to do by hand. Luckily technology comes to the rescue and this can be done with a few lines of Python code and NumPy.

### <ins>Python Examples of Basic Matrix Operations</ins>
Next we will see how easy it is to do matrix calculations in Python. 

#### 1. Matrix Addition and Subtraction:


In [8]:
# Define matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Addition
addition_matrix = A + B
print(f"Matrix Addition: \n{addition_matrix}")

# Subtraction
subtraction_matrix = A - B
print(f"Matrix Subtraction: \n{subtraction_matrix}")

Matrix Addition: 
[[ 6  8]
 [10 12]]
Matrix Subtraction: 
[[-4 -4]
 [-4 -4]]


#### 2. Matrix Scalar Multiplication:

In [9]:
# Define scalar
scalar = 4
scalar_matrix = scalar * A
print(f"Scalar Multiplication: \n{scalar_matrix}")

Scalar Multiplication: 
[[ 4  8]
 [12 16]]


#### 3. Matrix Multiplication:
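*TRIVIA:* Each element of a matrix product is the dot product of a row of the first matrix and a column of the second, which is where the NumPy function ``dot()`` used here gets its name. For 2D arrays, ``np.dot(A, B)`` gives the same result as ``np.matmul(A, B)`` or the ``@`` operator.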

In [10]:
# Redefine A and B to match what was taught above for clarity
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8], [9, 10], [11, 12]])

# Product matrix C
C = np.dot(A, B)

print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("Matrix C = (A * B):\n", C)

Matrix A:
 [[1 2 3]
 [4 5 6]]
Matrix B:
 [[ 7  8]
 [ 9 10]
 [11 12]]
Matrix C = (A * B):
 [[ 58  64]
 [139 154]]
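
To connect the NumPy result back to the summation formula $c_{ij} = \sum_{k} a_{ik} \cdot b_{kj}$, here is a minimal sketch that spells the formula out as plain loops. The function name `matmul_loops` is made up for this illustration, and in practice NumPy's built-in multiplication is what you would use.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply two 2D arrays using c_ij = sum over k of a_ik * b_kj."""
    n_rows, n_inner = A.shape
    n_inner_b, n_cols = B.shape
    if n_inner != n_inner_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((n_rows, n_cols), dtype=np.result_type(A, B))
    for i in range(n_rows):           # row of the result
        for j in range(n_cols):       # column of the result
            for k in range(n_inner):  # walk along row i of A and column j of B
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8], [9, 10], [11, 12]])

print(matmul_loops(A, B))
print(np.array_equal(matmul_loops(A, B), A @ B))  # compare with NumPy's result
```

The three nested loops do exactly the work done by hand earlier: one pass of the innermost loop produces one element of $\mathbf{C}$.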


### <ins>Special Matrices</ins>
There are a couple of matrices which are exceptional and for that reason important to know about. You will encounter them in more complex calculations down the road.

#### Identity Matrix:
- The identity matrix is a square matrix with ones on the diagonal and zeros everywhere else. Square means it always has the same number of rows and columns. It is commonly denoted with a capital $\mathbf{I}$. For example:

$$
\mathbf{I} =
\begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1
\end{pmatrix}
$$

#### Zero Matrix:
- This matrix is exactly what it sounds like: full of zeros. You might also hear someone call it the "null matrix" or "zero tensor". Both are valid, and the latter is heard more in the context of Python machine learning libraries such as PyTorch. Unlike the identity matrix, a zero matrix can be of any size and is usually denoted with a capital letter $\mathbf{O}$ (not zero). Here is a shocking example of a 2x3 zero matrix you did not expect:

$$
\mathbf{O} =
\begin{pmatrix}
0 & 0 & 0\\
0 & 0 & 0
\end{pmatrix}
$$

### <ins>Python Examples of Special Matrices</ins>
Creating identity and zero matrices with NumPy is extremely easy:

In [11]:
# Identity matrix
I = np.identity(3)
print(f"Identity Matrix: \n{I}")

# Zero matrix
O = np.zeros((2, 3))
print(f"Zero Matrix: \n{O}")

Identity Matrix: 
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Zero Matrix: 
[[0. 0. 0.]
 [0. 0. 0.]]
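
What makes these two matrices special is how they behave in calculations: multiplying by the identity matrix and adding the zero matrix both leave a matrix unchanged. The following sketch checks this on an arbitrary example matrix `A`, which is an assumption of this illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # arbitrary 2x3 example

I3 = np.identity(3)   # 3x3 identity, matching the number of columns of A
O = np.zeros((2, 3))  # 2x3 zero matrix, same shape as A

# Both comparisons should print True
print(np.array_equal(A @ I3, A))  # A times I is still A
print(np.array_equal(A + O, A))   # A plus O is still A
```

In this sense the identity matrix plays the role of the number 1 in multiplication, and the zero matrix the role of 0 in addition.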


### <ins>Matrix Properties</ins>

#### 1. Transpose of a Matrix:
The transpose of a matrix swaps its rows and columns, turning an $m \times{} n$ matrix into an $n \times{} m$ one. This is useful in many cases. For example, if you remember the rule for multiplying two matrices, transposing one of them can make the shapes compatible when multiplication was not possible otherwise (keep in mind that the transposed matrix is a different matrix, so this only helps when the transposed product is what you actually need). Say we have a matrix $\mathbf{A}$ and we want to transpose it to $\mathbf{A^T}$. This is how it looks before and after:

$$
\mathbf{A} =
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3}\\
a_{2,1} & a_{2,2} & a_{2,3}
\end{pmatrix} \implies

\mathbf{A^T} =
\begin{pmatrix}
a_{1,1} & a_{2,1}\\
a_{1,2} & a_{2,2}\\
a_{1,3} & a_{2,3}
\end{pmatrix}
$$

#### 2. Determinant of a Matrix:
The determinant is a scalar value that can be computed from the elements of a square matrix. It is a useful value in linear algebra for various reasons, including determining whether a matrix is invertible and solving systems of linear equations. Here are some properties of the determinant:

- **Invertibility**: A matrix is invertible if and only if its determinant is non-zero. 

- **Multiplicative Property**: The determinant of the product of two matrices is the product of their determinants. 
$$
\mathbf{det}(\mathbf{A}\mathbf{B}) = \mathbf{det}(\mathbf{A}) \cdot \mathbf{det}(\mathbf{B})
$$

- **Transpose**: The determinant of a matrix is equal to the determinant of its transpose.

$$
\mathbf{det}(\mathbf{A^T}) = \mathbf{det}(\mathbf{A})
$$

&nbsp;&nbsp;&nbsp;&nbsp;Next is the calculation of the determinant. I will not share an example with numbers because it is a pain to do by hand. At this point you should be able to follow the abstract form already.

- **Determinant of a $2 \times{} 2$ matrix**
$$
\mathbf{A} =
\begin{pmatrix}
a & b\\
c & d
\end{pmatrix}
$$

- **The determinant is calculated as**:
$$
\mathbf{det}(\mathbf{A}) = ad - bc
$$

- **For $3 \times{} 3$ it is slightly more complex but hopefully you are able to grasp the logic**:

$$
\mathbf{A} =
\begin{pmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{pmatrix}
$$

- **The determinant is calculated using**:

$$
\mathbf{det}(\mathbf{A}) = a(ei - fh) - b(di - fg) + c(dh - eg)
$$

Going larger you will notice a pattern in the signs along the first row: the term for the first element is positive, the second negative, the third positive, and the fourth would be negative again in a $4 \times{} 4$ matrix. Each first-row element is multiplied by the determinant of the smaller matrix that is left when its row and column are removed. This method is called *Laplace expansion* (or cofactor expansion). If this seems too difficult to understand, do not worry, there are other methods, such as row reduction. Also note that you will probably never have to do these by hand. Later in the Python examples you will see how simple it is to find the determinant using NumPy.
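
For readers who would still like to see numbers, here is a minimal sketch that evaluates the $3 \times{} 3$ formula term by term and compares it with NumPy. The function name `det_3x3` and the example matrix `M` are made up for this illustration.

```python
import numpy as np

def det_3x3(M):
    """Determinant of a 3x3 matrix from a(ei - fh) - b(di - fg) + c(dh - eg)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = np.array([[2, 0, 1],
              [1, 3, 2],
              [1, 1, 4]])

# 2*(3*4 - 2*1) - 0*(1*4 - 2*1) + 1*(1*1 - 3*1) = 20 - 0 - 2 = 18
print(det_3x3(M))
print(np.isclose(det_3x3(M), np.linalg.det(M)))  # agrees with NumPy
```

The comparison uses `np.isclose` because `np.linalg.det` works in floating point and may be off from the exact integer by a tiny rounding error.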

#### 3. Inverse of a Matrix:
The inverse of a matrix $\mathbf{A}$, denoted as $\mathbf{A^{-1}}$, is the matrix that yields the identity matrix when multiplied by $\mathbf{A}$. Only square matrices have inverses, and the matrix must be non-singular, meaning it has a non-zero determinant. The inverse of a matrix can be found using the following formula:

$$
\mathbf{A^{-1}} = \frac{1} {\mathbf{det(\mathbf{A})}} \mathbf{Adj(\mathbf{A})}
$$

Now what is $\mathbf{Adj(\mathbf{A})}$? It is the *adjugate* of $\mathbf{A}$: a matrix built from $\mathbf{A}$ by rearranging its elements and adjusting their signs so that they line up properly for inversion. For me it was hard to comprehend, so I will use a sort of metaphor to make it simpler to understand where the adjugate matrix comes from. For example:

$$
\mathbf{A} =
\begin{pmatrix}
a & b\\
c & d
\end{pmatrix} \implies

\mathbf{Adj(\mathbf{A})} =
\begin{pmatrix}
d & -b\\
-c & a
\end{pmatrix}
$$

How did we get there? There is a 3-stage process. First we find *minors*, then we apply *cofactors* and lastly the matrix is transposed. Let's get to that metaphor I mentioned above. Imagine a house with 4 apartments and a panda living in each. The house is 2 stories tall, so it resembles a $2 \times{} 2$ matrix like the one above. All the pandas are very sensitive to sounds but also like to make noises themselves, so they find their neighbors irritating. If we ask each panda how they would prefer to live, they would tell you that they do not want any wall neighbors. This means that the panda living in apartment $a$, for example, would not mind whether the panda in apartment $d$ was home or not, but wants $b$ and $c$ gone. All pandas would give you a similar answer, and if we make a matrix out of the neighbors they would not mind, it would look like this:

$$
\begin{pmatrix}
d & c\\
b & a
\end{pmatrix}
$$

This is what we call finding minors for each element, which sounds more mathematical than pandas wanting to live peacefully. Now the first stage is done and we can apply cofactors. Each cofactor sign is either positive or negative and follows this checkerboard pattern, which continues the same way however large the matrix is:

$$
\begin{pmatrix}
+ & -\\
- & +
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Applying to minors:

$$
\begin{pmatrix}
d & -c\\
-b & a
\end{pmatrix}
$$

Now that the second stage is done, all that is left is to transpose the matrix the way that was taught earlier and we get the adjugate matrix:

$$
\mathbf{Adj(\mathbf{A})} =
\begin{pmatrix}
d & -b\\
-c & a
\end{pmatrix}
$$
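
Putting the formula and the adjugate together, a $2 \times{} 2$ inverse can be sketched in a few lines. The function name `inverse_2x2` and the example matrix are assumptions of this illustration, and the sketch only covers the $2 \times{} 2$ case.

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2x2 matrix as (1 / det) * adjugate."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (determinant is zero)")
    adjugate = np.array([[d, -b], [-c, a]])  # swap a and d, negate b and c
    return adjugate / det

A = np.array([[2, 6], [7, 1]])
A_inv = inverse_2x2(A)

print(A_inv)
print(np.allclose(A @ A_inv, np.identity(2)))  # A times its inverse gives I
```

Multiplying the matrix by the result and getting the identity matrix back is the defining check for an inverse.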

Now you should have a full understanding of how to find the inverse of a matrix using the formula mentioned at the beginning of this section. A common application of the inverse matrix is solving systems of linear equations like:

$$
\mathbf{Ax} = \mathbf{b}
$$

Where:
- $\mathbf{A}$: is the coefficient matrix
- $\mathbf{x}$: is the vector of unknowns
- $\mathbf{b}$: is the vector of constants

If $\mathbf{A}$ is invertible, the solution can be found as:

$$
\mathbf{x} = \mathbf{A^{-1}}\mathbf{b}
$$
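
As a small worked case, the sketch below solves a hypothetical system of two equations in both ways: through the explicit inverse, as in the formula above, and with `np.linalg.solve`, which solves the system without forming the inverse. The system itself is invented for this example.

```python
import numpy as np

# Hypothetical system:
#   2x + 6y = 20
#   7x + 1y = 10
A = np.array([[2.0, 6.0], [7.0, 1.0]])  # coefficient matrix
b = np.array([20.0, 10.0])              # vector of constants

x_via_inverse = np.linalg.inv(A) @ b    # x = A^-1 b
x_via_solve = np.linalg.solve(A, b)     # same system, no explicit inverse

print(x_via_inverse)
print(x_via_solve)
print(np.allclose(A @ x_via_solve, b))  # plug the solution back in
```

Both approaches give $x = 1$ and $y = 3$ up to floating-point rounding. For larger systems, `np.linalg.solve` is generally the preferred choice because it tends to be faster and numerically more reliable than computing the inverse first.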

### <ins>Python Examples of Matrix Properties</ins>
Now you get to see the beauty of simplicity and how little you have to write to solve some of these when you use NumPy.

#### 1. Transpose Matrix:

In [12]:
# Define the matrix to transpose
A = np.array([[1, 2, 3], [4, 5, 6]])

# Transpose the matrix
A_transposed = np.transpose(A)
print(f"Transpose of A: \n{A_transposed}")

Transpose of A: 
[[1 4]
 [2 5]
 [3 6]]
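
The remark about transposing to make a multiplication possible can be sketched as follows. The matrices are hypothetical, and whether the transposed product is the one you actually want depends on the problem at hand.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])     # shape (2, 3)
B = np.array([[7, 8, 9], [10, 11, 12]])  # shape (2, 3)

try:
    A @ B  # (2, 3) times (2, 3): inner dimensions 3 and 2 do not match
except ValueError as error:
    print(f"Cannot multiply: {error}")

product = A @ B.T  # (2, 3) times (3, 2) gives a (2, 2) result
print(product)
```

`B.T` is shorthand for `np.transpose(B)` on a 2D array.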


#### 2. Determinant of a Matrix:

In [13]:
# Define a square matrix
A = np.array([[1, 2], [3, 4]])

# Calculate the determinant
det_A = np.linalg.det(A)
print(f"Determinant of A: {det_A}")

Determinant of A: -2.0000000000000004
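
The result above is not exactly -2 because `np.linalg.det` works with floating-point numbers, so a tiny rounding error is expected. The sketch below compares it with the $ad - bc$ formula and checks the multiplicative and transpose properties numerically; the second matrix `B` is an assumption of this example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

det_A = np.linalg.det(A)
det_B = np.linalg.det(B)

# The 2x2 formula ad - bc with exact integer arithmetic: 1*4 - 2*3
print(A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0])

# Properties, compared with a tolerance because of floating-point rounding
print(np.isclose(np.linalg.det(A @ B), det_A * det_B))  # det(AB) = det(A) det(B)
print(np.isclose(np.linalg.det(A.T), det_A))            # det(A^T) = det(A)
```

When comparing determinants in code, `np.isclose` is safer than `==` for the same rounding reason.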


#### 3. Inverse of a Matrix:

In [14]:
# Define a square matrix to invert
A = np.array([[2, 6], [7, 1]])

# Invert the matrix
A_inverted = np.linalg.inv(A)
print("Inverse of A:\n", A_inverted)

Inverse of A:
 [[-0.025  0.15 ]
 [ 0.175 -0.05 ]]


### <ins>Advanced Matrix Operations</ins>
You can go way deeper into linear algebra than what is discussed here, even with the basics, but this notebook is supposed to summarize the basics of the basics for understanding the fundamentals of data science. Even so, in this section we talk about a couple of more advanced matrix operations.

#### Eigenvalues and Eigenvectors:
These two are advanced yet fundamental concepts in linear algebra, and they are used extensively in data science and machine learning.

- **Eigenvector**: A non-zero vector $\mathbf{v}$ that changes only in scale, not in direction, when a linear transformation is applied to it. Mathematically, for a square matrix $\mathbf{A}$, $\mathbf{v}$ is an eigenvector if:

$$
\mathbf{Av} = λ\mathbf{v}
$$

where λ (pronounced lambda) is a scalar known as the eigenvalue corresponding to the eigenvector $\mathbf{v}$.

- **Eigenvalue**: A scalar λ such that there exists a non-zero vector $\mathbf{v}$ (eigenvector) that satisfies the equation above.

To explain these two simply, I will put this concept in "panda-terms". Imagine a dance floor populated by dancing pandas. In this context the dancing pandas are eigenvectors. These pandas are special in that they always stay on the same line of dance. When the transformation $\mathbf{A}$ is applied, its effect on them is the same as multiplying by the eigenvalue λ: the pandas might stretch or compress, but they won't turn to face a new line. For example, if the eigenvalue is 2, the pandas double in length, and if it is 0.5, they shrink to half their length. (A negative eigenvalue flips a panda to face the opposite way along the same line.)

***Special Note:*** $\mathbf{A}$ does not equal λ. At first glance it might seem logical to cancel $\mathbf{v}$ from both sides of the equation, but $\mathbf{A}$ is a matrix and λ is just a single scalar number, a factor.

Knowing all this we can conclude that eigenvectors are just regular vectors that earn a special name because a given matrix transformation only scales them. Eigenvalues are the scaling factors, and they help describe complex transformations in simple terms.

To find the eigenvalues we need to use something called the *characteristic equation*, which looks like this:

$$
\mathbf{det}(\mathbf{A}-λ\mathbf{I}) = 0
$$

For us to be able to subtract the eigenvalue from our matrix, it needs to be turned into a matrix itself, and this conveniently happens by multiplying it with the identity matrix. Let's have an example, with numbers this time!

Given the matrix:

$$
\mathbf{A} = 
\begin{pmatrix}
4 & 1\\
2 & 3
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;Applying the characteristic equation:

$$
\mathbf{det}
\begin{pmatrix}
4-λ & 1\\
2 & 3 - λ
\end{pmatrix} = 0
$$

&nbsp;&nbsp;&nbsp;&nbsp;This simplifies to:

$$
(4 - λ)(3 - λ) - 2 \cdot{} 1 = 0 \implies λ^2 - 7λ + 10 = 0
$$

&nbsp;&nbsp;&nbsp;&nbsp;Once we solve the quadratic equation, we get:

$$
λ = 5 \ \mathbf{or} \ λ = 2
$$

Now that we know the eigenvalues, we can search for the eigenvectors, or dancing pandas if you are still thinking about them.

For each eigenvalue we solve:

$$
(\mathbf{A}-λ\mathbf{I})\mathbf{v} = 0
$$

&nbsp;&nbsp;&nbsp;&nbsp;For λ = 5 we get:

$$
\begin{pmatrix}
-1 & 1\\
2 & -2
\end{pmatrix}\mathbf{v} = 0 \implies

\mathbf{v_1} = 
\begin{pmatrix}
1\\
1
\end{pmatrix}
$$

&nbsp;&nbsp;&nbsp;&nbsp;For λ = 2 we get:

$$
\begin{pmatrix}
2 & 1\\
2 & 1
\end{pmatrix}\mathbf{v} = 0 \implies

\mathbf{v_2} = 
\begin{pmatrix}
-1\\
2
\end{pmatrix}
$$
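
The hand calculation above can be cross-checked with NumPy. This is a minimal sketch using `np.linalg.eig`, which returns the eigenvalues together with a matrix whose columns are the eigenvectors. NumPy scales its eigenvectors to length 1, so they are multiples of the hand-derived ones rather than identical to them, and the order of the eigenvalues is not guaranteed.

```python
import numpy as np

A = np.array([[4, 1], [2, 3]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)

# Check Av = λv for every eigenpair NumPy found (each should print True)
for index, eigenvalue in enumerate(eigenvalues):
    v = eigenvectors[:, index]  # eigenvectors are stored as columns
    print(np.allclose(A @ v, eigenvalue * v))

# The hand-derived eigenvectors satisfy the same equation (both print True)
v1 = np.array([1, 1])
v2 = np.array([-1, 2])
print(np.array_equal(A @ v1, 5 * v1))
print(np.array_equal(A @ v2, 2 * v2))
```

Any non-zero multiple of an eigenvector is also an eigenvector for the same eigenvalue, which is why both the hand-derived and the NumPy versions pass the check.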

Why are eigenvectors and eigenvalues important in data science? Here are a few example applications:

- **Principal Component Analysis (PCA)**: PCA is a dimensionality reduction technique that uses eigenvalues and eigenvectors. It transforms the data into a new coordinate system where the axes (principal components) are the directions of maximum variance. The eigenvectors of the covariance matrix of the data provide these directions, and the corresponding eigenvalues indicate the variance along these directions.

- **Graph Theory**: In graph analysis, the adjacency matrix of a graph has eigenvalues and eigenvectors that provide insights into the properties of the graph. For example, the largest eigenvalue can give an idea about the connectivity of the graph.

- **Stability Analysis**: In systems theory and control engineering, eigenvalues determine the stability of a system. For a continuous-time linear system, if all eigenvalues of the system's matrix have negative real parts, the system is stable.

- **Markov Chains**: The steady-state distribution of a Markov chain can be found using eigenvectors and eigenvalues. The transition matrix of the Markov chain has an eigenvector corresponding to the eigenvalue 1, which represents the steady-state distribution.
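
As one hedged illustration of these applications, the sketch below finds the steady-state distribution of a tiny Markov chain. The two-state transition matrix `P` is invented for this example, and it assumes the convention that each row of `P` sums to 1. Under that convention the steady state is an eigenvector of the transposed matrix for the eigenvalue 1.

```python
import numpy as np

# Hypothetical transition matrix: row i holds the probabilities of moving
# from state i to each state, so every row sums to 1
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# With rows summing to 1, the steady state is an eigenvector of P transposed
eigenvalues, eigenvectors = np.linalg.eig(P.T)

# Pick the eigenvector whose eigenvalue is closest to 1
index = np.argmin(np.abs(eigenvalues - 1.0))
steady_state = np.real(eigenvectors[:, index])

# Rescale so the entries sum to 1 and form a probability distribution
steady_state = steady_state / steady_state.sum()

print(steady_state)
print(np.allclose(steady_state @ P, steady_state))  # unchanged by one more step
```

For this matrix the chain spends about 5/6 of its time in the first state and 1/6 in the second. Dividing by the sum also removes any arbitrary sign or scale that NumPy gave the eigenvector.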


#### Singular Value Decomposition (SVD):