# 1. Introduction

#### **Definition of Linear Algebra**
Linear algebra is a branch of mathematics that deals with vectors, matrices, and linear transformations. It focuses on the study of vector spaces (also called linear spaces) and the linear equations that define relationships between elements in these spaces.

#### **Purpose of Linear Algebra**:
 Linear algebra forms the mathematical backbone of many machine learning algorithms, from basic linear regression to complex neural networks and dimensionality reduction techniques.

#### **Why Linear Algebra?**:<br>

- **Data Representation**: Vectors and matrices are fundamental structures for representing datasets, with rows as samples and columns as features.

- **Transformations**: Linear algebra helps in applying geometric transformations (e.g., rotations, scaling), crucial for feature engineering and understanding data relationships.

- **Dimensionality Reduction**: Techniques like Principal Component Analysis (PCA) rely on linear algebra (eigenvectors/eigenvalues) to reduce the complexity of data while retaining its essence.

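- **Optimization**: Many machine learning models are trained with optimization algorithms (e.g., gradient descent) whose gradients and parameter updates are vector and matrix operations. Some models, such as least-squares linear regression, reduce directly to solving a system of linear equations.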

- **Neural Networks and Deep Learning**: Operations in deep learning, such as computing weights and activations, are expressed as matrix multiplications and vector operations.

- **Efficiency**: Expressing computations as vector and matrix operations lets them run efficiently on large datasets, enabling faster model training and prediction.

# 2. Data Types

In the context of linear algebra and machine learning, data types represent different mathematical structures used to organize and manipulate data.

## **1. Scalars**:
- **Definition**: A single number, often representing a simple value.
- **Example**: 5, 3.14, -2.
- **Use in ML**: Scalars can represent individual coefficients, weights, or biases in models.

### Notes:
- **Constant Symbols**: α, β, λ, σ
- **Variable Symbols**: a, b, c, ..., y, z
- Scalars can be real or complex, but ML almost always uses real scalars: a ∈ ℝ
- A model's hyperparameters are usually scalars.
- Scalars are simple but important.

![Alt Text](img/scalar.png)

## **2. Vectors**:
- **Definition**: An ordered list of numbers (scalars), i.e. a 1D array, that represents a point in space or a set of features.
- **Example**: \( \mathbf{v} = [v_1, v_2, \dots, v_n] \)
- **Use in ML**: Features of a dataset or weights of a machine learning model.

### **Notes**:
- A horizontal vector is called a row vector.
- A vertical vector is called a column vector.
- The transpose operation converts a row vector to a column vector and vice versa.
- Vector symbols are the bold versions of scalar symbols (e.g., **v**).
- A model's features (inputs) are usually vectors.

![Alt Text](img/vector.png)

## **3. Matrices**:
- **Definition**: 
    - A rectangular array of numbers arranged in rows and columns (2D array).
    - Collection of Vectors.
- **Notation**:
\( A = [a_{ij}]_{m \times n} \)
    - **A** is the matrix.
    - **a** is element of the matrix.
    - **i** is the index to the matrix's row.
    - **j** is the index to the matrix's column.
    - **m** is number of rows.
    - **n** is number of cols.
- **Example**:
  \[
  A = \begin{bmatrix}
  a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\
  a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\
  \vdots & \vdots & \vdots & \ddots & \vdots \\
  a_{m1} & a_{m2} & a_{m3} & \dots & a_{mn}
  \end{bmatrix}
  \]

- **Use in ML**: Represent datasets (rows as samples, columns as features) or transformations.

### **Notes**:
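- Each row of an m×n matrix is a vector of size n.
- Each column of an m×n matrix is a vector of size m.
- The elements running diagonally down from the top left (**a**11, **a**22, ...) form the **main diagonal**. It reaches the bottom-right element **a**mn only when the matrix is square (m = n).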

![Alt Text](img/matrix.png)

## **4. Tensors**:
- **Definition**: A generalization of vectors and matrices to higher dimensions (n-dimensional arrays).
- **Example**: A 3D tensor could represent a color image where each pixel has RGB values.
- **Use in ML**: Tensors are widely used in deep learning frameworks like TensorFlow to store multidimensional data such as images, videos, or batches of data.

### **Notes**:
- Tensors are used heavily in DL, for example to represent RGB images.
- Tensors are a generalized representation of scalars, vectors, and matrices:
    - Scalar is 0-dim tensor.
    - Vector is 1-dim tensor.
    - Matrix is 2-dim tensor.

![Alt Text](img/tensor.png)

# 3. Vector Geometric Perspective

## **1. Representation of a Vector**
A vector is usually represented as an arrow in a space:
- **Starting point**: Called the tail (often the origin).
- **Ending point**: Called the head.
<br>

**Example** in 2D: A vector 𝑣=[3,2]
can be visualized as an arrow from the origin (0,0) to the point (3,2).

![Alt Text](img/vector_repr.png)

## **2. Magnitude (Length) & Direction of a Vector**
The magnitude (or norm) of a vector is its length, calculated using:

### L1 Norm:
The L1 norm is the sum of the absolute values of the vector components.<br>
- **Geometric Interpretation**: It represents the distance between points if you can only move along the grid (like in a city grid system).
- **Use in ML**: L1 regularization (Lasso) encourages sparsity in machine learning models by driving some coefficients to zero.

![Alt Text](img/l1norm.png)

### L2 Norm:
The L2 norm is the square root of the sum of the squared components of the vector.
- **Geometric Interpretation**: It represents the straight-line (Euclidean) distance from the origin to the point in space.
- **Use in ML**: L2 regularization (Ridge) penalizes large coefficients and helps prevent overfitting by making the solution smoother.

![Alt Text](img/l2norm.png)

### Max Norm:
The L∞ norm is the maximum absolute value of the vector components.
- **Geometric Interpretation**: It gives the largest distance along any single axis.
- **Use in ML**: It is used in optimization problems where you want to limit the largest deviation or outliers.

![ALT TEXT](img/maxnorm.png)

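The **p-norm** (also called the Lp-norm) generalizes these definitions to any real number 𝑝 ≥ 1: it is the 𝑝-th root of the sum of the 𝑝-th powers of the absolute values of the components. 𝑝 = 1 gives the L1 norm, 𝑝 = 2 gives the L2 norm, and the limit 𝑝 → ∞ gives the max norm.<br><br>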
![Alt Text](img/pnorm_.png)
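
The norms above can be sketched in a few lines of plain Python. The function name `p_norm` and the sample vector are illustrative assumptions, not part of the original notes; the code simply applies the definitions given in this section.

```python
import math

def p_norm(v, p):
    """Return the p-norm of v for p >= 1; p = math.inf gives the max norm."""
    if p == math.inf:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

v = [3.0, -4.0]
l1 = p_norm(v, 1)            # |3| + |-4|
l2 = p_norm(v, 2)            # sqrt(3^2 + (-4)^2)
l_max = p_norm(v, math.inf)  # max(|3|, |-4|)
print(l1, l2, l_max)
```

For the same vector the three norms differ (7, 5, and 4 here), which is why the choice of norm changes what "distance" means in a model.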

## **3. Unit Vector**
A unit vector is a vector that has a magnitude (length) of 1. It is typically used to indicate direction without scaling the vector's length. A unit vector in the direction of any vector can be obtained by normalizing that vector (i.e., dividing the vector by its magnitude).

### Formula:
If 𝑣 is a vector, the corresponding **unit vector 𝑣^** is given by:<br><pre>
𝑣^ = 𝑣 / ∣∣𝑣∣∣</pre>
Where:
- **𝑣** is the original vector.
- **∣∣𝑣∣∣** is the magnitude (or norm) of the vector **𝑣**.
- **𝑣^** is the resulting unit vector.

### Example:
- For a vector 𝑣=[3,4] the magnitude is:<pre>
∣∣𝑣∣∣=sqrt(3²+4²)=sqrt(9+16)=sqrt(25)=5</pre>
- The unit vector 𝑣^ is:<pre>
𝑣^ =[3,4]/5=[3/5,4/5]=[0.6,0.8]</pre>

This new vector [0.6,0.8] has a magnitude of 1, pointing in the same direction as 𝑣.
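
As a minimal sketch of the normalization step, assuming the L2 norm and a hypothetical helper named `normalize`:

```python
import math

def normalize(v):
    """Divide v by its L2 norm; the zero vector has no direction, so reject it."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / length for x in v]

unit = normalize([3.0, 4.0])
unit_length = math.sqrt(sum(x * x for x in unit))
print(unit, unit_length)
```

The zero-vector check is a design choice for this sketch: dividing by a zero norm is undefined, so raising an error is safer than returning a meaningless result.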

#### Shape created by different norms:
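The set of all vectors whose norm equals 1 traces a different shape for each norm: in 2D it is a diamond for the L1 norm and a circle for the L2 norm. The shape matters because it reflects how distances and magnitudes are measured in machine learning, optimization, and data analysis.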

##### L1 Norm
![ALT TEXT](img/l1_unit.png)

##### L2 Norm
![ALT TEXT](img/l2_unit.png)

## **4. Stretching & Shrinking Vectors**
In linear algebra, stretching and shrinking refer to changing the magnitude of a vector without altering its direction.

- **Stretching**: A vector is stretched when it is **multiplied by a scalar** greater than 1. This **increases its magnitude** (length).<pre>
𝑣′= 𝑐⋅𝑣 where 𝑐 > 1</pre>

- **Shrinking**: A vector is shrunk when **multiplied by a scalar** between 0 and 1, **reducing its magnitude**.<pre>
𝑣′= 𝑐⋅𝑣 where 0 < 𝑐 < 1</pre>

#### Key Point:
- Multiplying by a **negative scalar** reverses the vector's direction and scales its length by the scalar's absolute value.

## **5. Rotating a Vector (Linear Combination of Vectors)**
In linear algebra, a vector can be expressed as a linear combination of other vectors. Scaling a vector only moves it along its own line, so reaching a new direction (rotating) requires combining it with other vectors: each vector is scaled and the scaled copies are added.

- A vector **𝑣** can be expressed as a linear combination of two basis vectors **𝑎** and **𝑏**:<pre>
𝑣 = 𝑐1𝑎 + 𝑐2𝑏</pre>

where 𝑐1 and 𝑐2 are scalar coefficients.
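
A linear combination can be sketched as follows. The helper name `linear_combination` and the choice of the standard basis vectors are assumptions made for illustration.

```python
def linear_combination(coeffs, vectors):
    """Return c1*v1 + c2*v2 + ... computed component by component."""
    dim = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(coeffs, vectors)) for i in range(dim)]

a = [1.0, 0.0]
b = [0.0, 1.0]
v = linear_combination([3.0, 2.0], [a, b])  # 3*a + 2*b
print(v)
```

With these basis vectors the coefficients 3 and 2 reproduce the vector [3, 2] used earlier in this section.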

## **6. Span of a Set of Vectors**
The span of a set of vectors is a fundamental concept in linear algebra that describes all possible linear combinations of those vectors. It represents a subspace formed by combining the vectors in various ways.

![ALT TEXT](img/span.png)

#### 1. Definition
The span of a set of vectors 
{𝑣1,𝑣2,...,𝑣𝑛} is defined as:<pre>
Span({𝑣1,𝑣2,...,𝑣𝑛})={𝑐1𝑣1+𝑐2𝑣2+...+𝑐𝑛𝑣𝑛 ∣ 𝑐1,𝑐2,...,𝑐𝑛 ∈ 𝑅}</pre>

#### 2. Geometric Interpretation
  - **In 2D**: The span of two non-collinear vectors will form the entire 2D plane.
  - **In 3D**: The span of three non-coplanar vectors will fill the entire 3D space.
  - If vectors are linearly dependent, the span will be a lower-dimensional subspace (e.g., a line or plane)

## **7. Basis Vectors**
Basis vectors are a set of vectors in a vector space that are used to describe all other vectors in that space through linear combinations. They play a crucial role in linear algebra, enabling the representation of vectors in a systematic and efficient manner.


![ALT TEXT](img/2d_basis_vecs.png) ![ALT TEXT](img/3d_basis_vecs.png)

#### 1. Definition
A set of vectors 
{𝑏1,𝑏2,...,𝑏𝑛} is called a basis for a vector space 𝑉 if:
- **Spanning**: The span of the basis vectors equals the vector space 𝑉:<pre>
Span({𝑏1,𝑏2,…,𝑏𝑛})=𝑉</pre>

- **Linear Independence**: The vectors are linearly independent, meaning no vector in the set can be expressed as a linear combination of the others.

#### 2. Properties
- The number of vectors in a basis corresponds to the **dimension** of the vector space. For example:
    - A basis for 𝑅² consists of 2 vectors.
    - A basis for 𝑅³ consists of 3 vectors.
- Any vector in the space can be expressed as a unique linear combination of the basis vectors.
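
To illustrate the uniqueness property, the sketch below finds the coefficients of a 2D vector in a given basis. It uses Cramer's rule, which is one of several possible methods and is not named in the notes above; the function name and sample basis are hypothetical.

```python
def coordinates_in_basis(v, b1, b2):
    """Solve c1*b1 + c2*b2 = v for (c1, c2) with Cramer's rule (2D only)."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent, so they are not a basis")
    c1 = (v[0] * b2[1] - v[1] * b2[0]) / det
    c2 = (b1[0] * v[1] - b1[1] * v[0]) / det
    return c1, c2

c1, c2 = coordinates_in_basis([3.0, 2.0], [1.0, 1.0], [1.0, -1.0])
print(c1, c2)
```

A zero determinant means the two candidate vectors are collinear, so they fail the linear independence condition and cannot form a basis.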

## **8. Dot Product**
The dot product (also known as the scalar product or inner product) is a fundamental operation in linear algebra that takes two vectors and produces a scalar (a single number). It is widely used in various fields, including physics, engineering, and machine learning, to measure the angle between vectors, projection, and similarity.

![ALT TEXT](img/dot_product.png)

#### 1. Definition
For two vectors 𝑎 and 𝑏 in 𝑅ⁿ, the dot product is defined as:<pre>
𝑎⋅𝑏=𝑎1𝑏1+𝑎2𝑏2+...+𝑎𝑛𝑏𝑛</pre>
where **𝑎𝑖** and **𝑏𝑖** are the components of vectors **𝑎** and **𝑏**, respectively.

#### 2. Geometric Interpretation
The dot product can also be expressed in terms of the magnitudes of the vectors and the cosine of the angle θ between them:<pre>
𝑎⋅𝑏=∥𝑎∥∥𝑏∥cos(𝜃)</pre>
where:
- **∥𝑎∥** and **∥𝑏∥** are the magnitudes (lengths) of the vectors.
- **θ** is the angle between the vectors.

#### 3. Properties
- **Commutative**:<pre>𝑎⋅𝑏 = 𝑏⋅𝑎</pre>
- **Distributive**:<pre>𝑎⋅(𝑏+𝑐) = 𝑎⋅𝑏+𝑎⋅𝑐</pre>
- **Associative with Scalars**:<pre>(𝑐𝑎)⋅𝑏=𝑐(𝑎⋅𝑏)</pre> for any scalar 𝑐
- **Dot Product Between Basis Vectors**: The dot product of orthonormal basis vectors is 0 if the vectors are different, and 1 if they are the same:<pre>
e_i · e_j =
{
    1 if i = j
    0 if i ≠ j
}</pre>
![ALT TEXT](img/dot_product_basis.png)
- **Dot Product Between Vector & Itself**: The dot product of a vector with itself gives the square of its magnitude:<pre>
v · v = ||v||^2</pre>
![ALT TEXT](img/dot_product_itself.png)
- **How the dot product affects the angle**:
    - If 𝑎⋅𝑏>0 (positive), the angle between 𝑎 and 𝑏 is acute (less than 90°).
    - If 𝑎⋅𝑏=0, the angle between 𝑎 and 𝑏 is 90° (the vectors are orthogonal).
    - If 𝑎⋅𝑏<0 (negative), the angle between 𝑎 and 𝑏 is obtuse (greater than 90°).

#### 4. Applications
- **Angle Between Vectors**: The dot product can be used to find the cosine of the angle between two vectors. If 𝑎⋅𝑏=0, then the vectors are orthogonal (perpendicular).
- **Projection**: The dot product is used to project one vector onto another. The projection of 𝑎 onto 𝑏 divides by the squared magnitude of 𝑏:<pre>
Proj 𝑏(𝑎) = (𝑎⋅𝑏 / ∥𝑏∥²) 𝑏</pre>
- **Similarity Measurement**: In machine learning, the dot product is often used in algorithms like Support Vector Machines (SVMs) and in calculating cosine similarity for text and document comparisons.
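
The applications above can be sketched with a handful of helpers. The function names are illustrative assumptions. The projection uses the standard form that divides 𝑎⋅𝑏 by 𝑏⋅𝑏, the squared magnitude of 𝑏.

```python
import math

def dot(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """L2 magnitude, using the identity a.a = ||a||^2."""
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """cos(theta) between a and b; both must be non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))

def project(a, b):
    """Projection of a onto b: (a.b / b.b) * b."""
    scale = dot(a, b) / dot(b, b)
    return [scale * x for x in b]

a = [3.0, 4.0]
b = [1.0, 0.0]
cos_theta = cosine_similarity(a, b)
angle_deg = math.degrees(math.acos(cos_theta))
proj = project(a, b)
print(cos_theta, angle_deg, proj)
```

Projecting [3, 4] onto the x-axis keeps only the x component, which is a quick sanity check for the formula.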

## **9. Orthogonalization & Gram-Schmidt Algorithm**
### 1. Orthogonalization
Orthogonalization is the process of converting a set of vectors into an orthogonal set. This means transforming a collection of non-orthogonal vectors into vectors that are perpendicular (orthogonal) to each other. In machine learning and data science, orthogonalization can improve the numerical stability of algorithms by eliminating redundancy and dependencies among vectors.


### 2. Gram-Schmidt Algorithm
The **Gram-Schmidt process** is a widely used method for **orthogonalizing a set of vectors**. It takes a linearly independent set of vectors and transforms them into an orthogonal (or orthonormal) set while preserving the span of the original vectors.

#### Steps of the Gram-Schmidt Algorithm
Let’s assume we have a set of vectors {𝑣1,𝑣2,...,𝑣𝑛}.

- **Start with the first vector**:
Set the first orthogonal vector **𝑢1=𝑣1**.

- **Orthogonalize each subsequent vector**: For each vector **𝑣𝑘**, subtract the projections onto the previous orthogonal vectors to ensure orthogonality:
  \[
  u_k = v_k - \sum_{i=1}^{k-1} \operatorname{proj}_{u_i}(v_k)
  \]

- **Normalize (optional)**:
If an orthonormal set is desired, normalize each 𝑢𝑘 to get unit vectors:
  \[
  e_k = \frac{u_k}{\|u_k\|}
  \]

![ALT TEXT](img/schmidt.png)
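
The steps above can be sketched as follows. This is the classical form of the algorithm and assumes the input vectors are linearly independent; the function names are hypothetical. For larger or ill-conditioned inputs, the modified Gram-Schmidt variant is usually preferred for numerical stability.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, normalize=False):
    """Orthogonalize linearly independent vectors (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        u = list(v)
        for q in basis:
            # Subtract the projection of v onto the earlier orthogonal vector q.
            coeff = dot(v, q) / dot(q, q)
            u = [ui - coeff * qi for ui, qi in zip(u, q)]
        basis.append(u)
    if normalize:
        # Optional step: scale every vector to unit length.
        basis = [[x / math.sqrt(dot(u, u)) for x in u] for u in basis]
    return basis

u1, u2 = gram_schmidt([[3.0, 1.0], [2.0, 2.0]])
e1, e2 = gram_schmidt([[3.0, 1.0], [2.0, 2.0]], normalize=True)
print(u1, u2, dot(u1, u2))
```

The printed dot product should be zero up to rounding error, confirming that the two output vectors are orthogonal.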

# 4. Systems of Linear Equations

## **1. Equation of Line**
The equation of a line in a 2D plane can be written in several forms. The general one is:

### General Form:
The general form of a line is:<pre>
Ax+By+C=0</pre>
Where 𝐴, 𝐵, and 𝐶 are constants.
![ALT TEXT](img/line.png)

### Importance of Linear Equations
- Many natural phenomena can be modeled using linear equations.
- Complex functions appear linear when viewed at a small enough scale.

![ALT TEXT](img/import_line.png)

## **2. System of Linear Equations**
A **system of linear equations** consists of multiple linear equations that share common variables. The objective is to find values for these variables that satisfy all equations in the system simultaneously.

### General Representation
A system of **𝑚** linear equations in **𝑛** variables can be expressed as:<pre>
𝑎11𝑥1+𝑎12𝑥2+ ... +𝑎1𝑛𝑥𝑛=𝑏1
𝑎21𝑥1+𝑎22𝑥2+ ... +𝑎2𝑛𝑥𝑛=𝑏2
⋮
𝑎𝑚1𝑥1+𝑎𝑚2𝑥2+ ... +𝑎𝑚𝑛𝑥𝑛=𝑏𝑚</pre> 
Where:
- **𝑎𝑖𝑗**​ are coefficients,
- **𝑥𝑗** are the variables, and
- **𝑏𝑖** are constants.

### Types of Solutions
- **Unique Solution**: The system has exactly one solution. This occurs when the equations are independent, and the corresponding matrix 𝐴 is invertible.

- **Infinite Solutions**: The system has infinitely many solutions when it is consistent but has fewer independent equations than unknowns, for example when at least one equation can be derived from the others (dependent equations).

- **No Solution**: The system has no solution when the equations are contradictory (inconsistent).

![ALT TEXT](img/sys_lines.png)

### Solution of System of Linear Equations
There are several methods for solving a system of linear equations, each suited for different types of systems and contexts. Here are some common methods:

#### 1. Substitution Method
- Solve one equation for one variable, and then substitute that expression into the other equations. This method works well for systems with a small number of equations and variables.
![ALT TEXT](img/substitution_method.png)

#### 2. Elimination Method
Also known as the Gaussian elimination method, this approach involves adding or subtracting equations to eliminate variables systematically. The goal is to reduce the system to an upper triangular form and then solve using back substitution.<br>
![ALT TEXT](img/elimination_method.png)
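
A minimal sketch of elimination followed by back substitution is shown below. It assumes a square system with a unique solution. Partial pivoting (swapping in the row with the largest pivot candidate) is added here for numerical robustness and is not mentioned in the description above; the function name is hypothetical.

```python
def solve_gaussian(A, b):
    """Solve Ax = b for square A via elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix [A | b] as floats so the inputs are not modified.
    M = [[float(x) for x in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest pivot candidate to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# 2x + y = 5 and x + 3y = 10
x = solve_gaussian([[2, 1], [1, 3]], [5, 10])
print(x)
```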

#### Can every set of equations be solved?
No. A system does not have a unique solution in cases such as:
- The number of unknowns is greater than the number of equations.
- The given equations are not independent pieces of information.

## **3. Linear Equation as Vector Dot Multiplication**
A **linear equation** in 𝑛 variables can be expressed as a dot product of two vectors:<pre>
𝑎1𝑥1 + 𝑎2𝑥2 + ⋯ + 𝑎𝑛𝑥𝑛 = 𝑏</pre>

This is equivalent to the dot product of the coefficient vector 
𝑎=[𝑎1,𝑎2,…,𝑎𝑛] and the variable vector 
𝑥=[𝑥1,𝑥2,…,𝑥𝑛], resulting in:<pre>
𝑎⋅𝑥=𝑏</pre>

Where **𝑏** is the constant on the right-hand side. The dot product form simplifies understanding of the geometric interpretation of linear equations, making it easier to visualize in higher dimensions.

![ALT TEXT](img/matrix_dot.png)


## **4. System of Linear Equations as Matrix-Vector Multiplication**
A system of linear equations can be expressed as matrix-vector multiplication. Consider the system:<pre>
𝑎11𝑥1+𝑎12𝑥2+ ... +𝑎1𝑛𝑥𝑛=𝑏1
𝑎21𝑥1+𝑎22𝑥2+ ... +𝑎2𝑛𝑥𝑛=𝑏2
⋮
𝑎𝑚1𝑥1+𝑎𝑚2𝑥2+ ... +𝑎𝑚𝑛𝑥𝑛=𝑏𝑚</pre> 
This system can be represented as:<pre>
𝐴𝑥=𝑏</pre>
Where:
- **𝐴** is the coefficient matrix of size **𝑚×𝑛**.
- **𝑥** is the variable vector of size **𝑛×1**.
- **𝑏** is the constant vector of size **𝑚×1**.
 
This compact representation is useful for solving and analyzing systems of linear equations using methods like matrix inversion, Gaussian elimination, or numerical techniques.

![ALT TEXT](img/matrix_dot.png)
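
The matrix-vector form can be checked numerically: each entry of 𝑏 is the dot product of one row of 𝐴 with 𝑥. The helper name `matvec` and the sample values are assumptions for illustration.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x: one dot product per row."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[2, 1],
     [1, 3]]
x = [1, 3]
b = matvec(A, x)  # left-hand sides of 2x + y and x + 3y
print(b)
```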

## **5. Gauss-Jordan Elimination**
Gauss-Jordan Elimination is a method used to solve systems of linear equations by transforming the augmented matrix into **reduced row echelon form (RREF)**. This process eliminates the need for back substitution and provides a straightforward path to find the solution for the variables.

### Steps of Gauss-Jordan Elimination:
1. **Form the Augmented Matrix**: Write the system of linear equations as an augmented matrix \( [A|b] \), where \( A \) is the coefficient matrix and \( b \) is the constants vector.
   
2. **Row Operations**: Perform the following elementary row operations:
   - Swap two rows.
   - Multiply a row by a nonzero scalar.
   - Add or subtract a multiple of one row from another row.
   
3. **Transform to Row Echelon Form (REF)**:
   - Start by making the leading coefficient (pivot) in the first row, first column equal to 1 (if not already), by scaling the row.
   - Use the pivot to eliminate all other entries in the column below it by adding/subtracting suitable multiples of the pivot row from other rows.
   
4. **Transform to Reduced Row Echelon Form (RREF)**:
   - Make the pivot entry in each row equal to 1.
   - Use the pivot to eliminate all other entries in its column, both above and below.
   
5. **Extract the Solution**:
   - Once the matrix is in RREF, the solution to the system can be easily extracted from the matrix.

#### Applications:
- Gauss-Jordan Elimination is widely used for solving linear systems in fields such as machine learning, optimization problems, and computational mathematics.
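
The steps above can be sketched as a single pass that normalizes each pivot and clears its column above and below. The sketch uses `fractions.Fraction` for exact arithmetic, which is a choice made here to avoid rounding issues and is not part of the description above; the function name `rref` is hypothetical.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Scale the pivot row so the pivot becomes 1.
        scale = M[pivot_row][col]
        M[pivot_row] = [x / scale for x in M[pivot_row]]
        # Eliminate this column's entry in every other row, above and below.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M

# Augmented matrix for 2x + y = 5 and x + 3y = 10
reduced = rref([[2, 1, 5], [1, 3, 10]])
solution = [row[-1] for row in reduced]
print(reduced, solution)
```

When the system has a unique solution, the last column of the reduced matrix holds the values of the variables directly.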


# 5. Matrices

## **1. Rank of a Matrix**
The **rank** of a matrix is defined as the maximum number of linearly independent rows or columns in the matrix. It gives an indication of how much information the matrix holds and is closely related to the concept of linear independence in vector spaces.

### Key Points:
- **Row Rank**: The maximum number of linearly independent rows in the matrix.
- **Column Rank**: The maximum number of linearly independent columns in the matrix.
- **Fundamental Theorem of Ranks**: In any matrix, the row rank and column rank are always equal, so they both are referred to simply as the **rank of the matrix**.
- The rank of a matrix provides insight into the **dimensionality** of the vector space spanned by its rows or columns.
- A matrix has full rank when its rank equals min(m, n), the largest value possible. For a square matrix this means all rows and all columns are linearly independent.

#### How to Find the Rank:
1. **Gaussian Elimination**: Use Gaussian or Gauss-Jordan elimination to reduce the matrix to **row echelon form** or **reduced row echelon form (RREF)**. The number of non-zero rows in this form is the rank of the matrix.
   
2. **Determinant Method**: The rank of a matrix is the order of its largest square submatrix with a non-zero **determinant** (the largest non-zero minor). This also works for rectangular matrices, since the submatrices considered are always square.

### Applications of Matrix Rank:
- **Solving Linear Systems**: The rank helps determine if a system of linear equations has a unique solution, infinitely many solutions, or no solution at all.
- **Dimensionality Reduction**: Rank is used in techniques like Principal Component Analysis (PCA) to reduce the dimensionality of data while preserving as much information as possible.
- **Linear Transformations**: The rank tells us about the mapping power of a linear transformation, indicating how many dimensions of the output space are covered by the transformation.
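
As a sketch of the elimination approach to finding the rank, the code below counts pivots during row reduction. Exact arithmetic with `fractions.Fraction` is an assumption made here so that a zero row is detected reliably; the function name is hypothetical.

```python
from fractions import Fraction

def matrix_rank(matrix):
    """Count the pivots found while row-reducing a copy of the matrix."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    rank = 0
    for col in range(cols):
        # Look for a non-zero entry in this column among the unused rows.
        pivot = next((r for r in range(rank, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [x - factor * p for x, p in zip(M[r], M[rank])]
        rank += 1
    return rank

full = matrix_rank([[1, 2], [3, 4]])
deficient = matrix_rank([[1, 2], [2, 4]])  # second row is twice the first
print(full, deficient)
```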

## **2. Types of Matrices**

- **Square Matrix**: A matrix with the same number of rows and columns (𝑛×𝑛).<br>
![ALT TEXT](img/square.png)

- **Rectangular Matrix**: A matrix where the number of rows is different from the number of columns (𝑚×𝑛 with 𝑚 ≠ 𝑛).<br>
![ALT TEXT](img/rectangle.png)

- **Symmetric Matrix**: A square matrix that is equal to its transpose, i.e., A = A^T.<br>
![ALT TEXT](img/symmetric.png)

- **Diagonal Matrix**: A square matrix where all non-diagonal elements are zero. The diagonal elements can be non-zero.<br>
![ALT TEXT](img/diagonal.png)

- **Identity Matrix**: A diagonal matrix where all diagonal elements are 1. It acts as the multiplicative identity in matrix multiplication.<br>
![ALT TEXT](img/identity.png)

- **Upper Triangular Matrix**: A square matrix where all elements below the main diagonal are zero.<br>
![ALT TEXT](img/upper_tr.png)

- **Lower Triangular Matrix**: A square matrix where all elements above the main diagonal are zero.<br>
![ALT TEXT](img/lower_tr.png)

- **Orthogonal Matrix**: A square matrix where the rows and columns are orthonormal vectors, i.e., \( A \cdot A^T = I \), where \( I \) is the identity matrix.<br>
![ALT TEXT](img/orthogonal.png)


## **3. Transpose of a Matrix**
The **transpose** of a matrix is obtained by swapping its rows and columns. If **A** is an **𝑚×𝑛** matrix, its transpose, denoted by **A^T**, is an **𝑛×𝑚** matrix.


![ALT TEXT](img/transpose.png)
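
A short sketch of the transpose, with a hypothetical helper name and sample matrices:

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) of A becomes entry (j, i)."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]          # 2x3
A_T = transpose(A)       # 3x2

S = [[1, 7],
     [7, 2]]
is_symmetric = transpose(S) == S
print(A_T, is_symmetric)
```

Comparing a matrix with its transpose is also a direct way to test the symmetric-matrix property described earlier.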

## **4. Types of Matrices by Definiteness**

- **Positive Definite Matrix**: A symmetric matrix \( A \) is positive definite if for all non-zero vectors \( x \), the following condition holds:
  \[
  x^T A x > 0
  \]

- **Positive Semidefinite Matrix**: A symmetric matrix \( A \) is positive semidefinite if for all vectors \( x \):
  \[
  x^T A x \geq 0
  \]

- **Negative Definite Matrix**: A symmetric matrix \( A \) is negative definite if for all non-zero vectors \( x \):
  \[
  x^T A x < 0
  \]

- **Negative Semidefinite Matrix**: A symmetric matrix \( A \) is negative semidefinite if for all vectors \( x \):
  \[
  x^T A x \leq 0
  \]

- **Indefinite Matrix**: A symmetric matrix \( A \) is indefinite if there exists at least one vector \( x \) such that:
  \[
  x^T A x > 0
  \]
  and at least one vector \( y \) such that:
  \[
  y^T A y < 0
  \]

![ALT TEXT](img/def.png)
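
For a 2×2 symmetric matrix the classification can be sketched with eigenvalues. This relies on a standard equivalent test that is not stated in the definitions above: a symmetric matrix is positive definite exactly when all its eigenvalues are positive, and similarly for the other cases. The function name and the closed-form eigenvalue formula for the 2×2 case are assumptions of this sketch.

```python
import math

def classify_symmetric_2x2(a, b, d):
    """Classify the symmetric matrix [[a, b], [b, d]] by its eigenvalue signs."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    lo, hi = mean - radius, mean + radius  # smallest and largest eigenvalue
    if lo > 0:
        return "positive definite"
    if hi < 0:
        return "negative definite"
    if lo < 0 < hi:
        return "indefinite"
    # At least one eigenvalue is exactly zero. Exact comparison is fine for
    # this small illustration; real code would compare against a tolerance.
    return "positive semidefinite" if lo == 0 else "negative semidefinite"

print(classify_symmetric_2x2(2, 0, 1))
print(classify_symmetric_2x2(1, 2, 1))
print(classify_symmetric_2x2(1, 1, 1))
```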

## **5. Matrix Transformation on Vectors**:

- **Identity Matrix on a Vector**:
    - The identity matrix **𝐼𝑛** is a square matrix with ones on the diagonal and zeros elsewhere.
    - When the identity matrix multiplies a vector **𝑥**, it leaves the vector unchanged:<pre> 𝐼𝑛𝑥 = 𝑥</pre>

![ALT TEXT](img/identity_vector.png)

- **Scaled Identity Matrix on a Vector**:
    - A scaled identity matrix is the identity matrix multiplied by a scalar **𝜆**, denoted as **𝜆𝐼𝑛**.
    - When a scaled identity matrix multiplies a vector 𝑥, it scales the vector by 𝜆:<pre> 𝜆𝐼𝑛𝑥 = 𝜆𝑥</pre>

    - Every component of the vector is multiplied by the scalar 𝜆, effectively scaling the entire vector by that factor.
    
![ALT TEXT](img/scaled_identity_vector.png)

- **Diagonal Matrix on a Vector**:
    - A diagonal matrix 𝐷 has arbitrary values on the diagonal and zeros elsewhere.
    - When a diagonal matrix multiplies a vector 𝑥, each component of the vector is multiplied by the corresponding diagonal element, so the 𝑖-th component of 𝐷𝑥 is 𝑑𝑖𝑖𝑥𝑖.

![ALT TEXT](img/diagonal_vector.png)
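
The three cases above can be compared side by side. The matrices and the vector are made-up sample values, and `matvec` is a hypothetical helper.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

x = [2, 3]
identity = [[1, 0], [0, 1]]
scaled_identity = [[4, 0], [0, 4]]   # 4 * I
diagonal = [[2, 0], [0, -1]]

unchanged = matvec(identity, x)          # x itself
scaled = matvec(scaled_identity, x)      # every component times 4
per_axis = matvec(diagonal, x)           # each component times its own factor
print(unchanged, scaled, per_axis)
```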

## **6. Matrix Transformation on Space**:

- **Identity matrix**

![ALT TEXT](img/identity_space.png)

- **Diagonal matrix**

![ALT TEXT](img/diagonal_space.png)

- **Orthogonal matrix**

![ALT TEXT](img/orthogonal_space.png)

- **Symmetric matrix**

![ALT TEXT](img/symmetric_space.png)

- **Symmetric matrix** (when its two rows point in the same direction)

![ALT TEXT](img/symmetric_space1.png)

- **Upper Triangular matrix**

![ALT TEXT](img/up_tr_space.png)

- **Lower Triangular matrix**

![ALT TEXT](img/lw_tr_space.png)

- **Positive Definite matrix**

![ALT TEXT](img/pos_def_space.png)

## **7. Matrix Inverse**
The inverse of a matrix is a matrix that, when multiplied by the original matrix, results in the identity matrix. Not all matrices have inverses—only square matrices that are non-singular (i.e., their determinant is non-zero) have inverses.<br>

If 𝐴 is a square matrix, its inverse is denoted as 𝐴⁻¹, and the following holds:<pre>
𝐴 𝐴⁻¹ = 𝐴⁻¹ 𝐴 = 𝐼𝑛</pre>

Where:
- **𝐴** is an 𝑛×𝑛 matrix.
- **𝐴⁻¹** is the inverse of 𝐴.
- **𝐼𝑛** is the 𝑛×𝑛 identity matrix.

### Key Points:
- **Existence**: Not all matrices have an inverse. A matrix must be square and have a non-zero determinant to have an inverse.
- **Uniqueness**: If the inverse exists, it is unique.
- **Multiplicative Identity**: The product of a matrix and its inverse is the identity matrix.
- **Applications**: Matrix inverses are used in solving systems of linear equations, particularly in the equation **𝐴𝑥=𝑏**, where **𝑥=𝐴⁻¹𝑏**.

![ALT TEXT](img/inv.png)

#### Orthogonal Matrix Inverse:
![ALT TEXT](img/orthog_inv.png)
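
For a 2×2 matrix the inverse has a simple closed form, sketched below together with a check that the product is the identity. The closed-form (adjugate) formula is a standard result not spelled out above, and the function names and sample matrix are assumptions.

```python
def inverse_2x2(A):
    """Invert [[a, b], [c, d]] with the closed-form adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (determinant is zero)")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Standard matrix product of A (m x k) and B (k x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[4.0, 7.0], [2.0, 6.0]]
A_inv = inverse_2x2(A)
product = matmul(A, A_inv)  # should be the identity up to rounding
print(A_inv, product)
```

In practice, solving 𝐴𝑥=𝑏 by elimination is usually preferred over forming the inverse explicitly, because it is cheaper and less sensitive to rounding.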

# 6. Determinants 

## **1. Geometric Areas and Volumes with Vectors in 2D and 3D Space**

### **1. Area of Parallelogram Represented by Basis Vectors of 2D Space**:
In 2D space, the area of the parallelogram formed by two vectors 𝑎 and 𝑏 is given by the absolute value of the determinant of the matrix whose columns are 𝑎 and 𝑏. For the standard basis vectors [1,0] and [0,1] this area is 1.<pre>
Area = ∣𝑎1𝑏2−𝑎2𝑏1∣</pre>

![ALT TEXT](img/2d_basis.png)

### **2. Area of Parallelogram Represented by Linearly Dependent Vectors**:
- If the vectors are linearly dependent, the area of the parallelogram collapses to zero because the vectors are **collinear**, representing no "spread" in space.

![ALT TEXT](img/2d_dep.png)

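### **3. Volume of Parallelepiped Represented by Basis Vectors of 3D Space:**
In 3D, the area of the parallelogram formed by two vectors 𝑎 and 𝑏 is given by the magnitude of their cross product, and the volume of the parallelepiped formed by three vectors 𝑎, 𝑏, 𝑐 is the absolute value of their scalar triple product:<pre>
Area = ∣𝑎×𝑏∣
Volume = ∣𝑎⋅(𝑏×𝑐)∣</pre>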

![ALT TEXT](img/3d_basis.png)

### **4. Volume of Parallelepiped Represented by Linearly Dependent Vectors:**
In 3D, the volume of a parallelepiped formed by three vectors 𝑎,𝑏,𝑐 is zero if the vectors are linearly dependent, as they lie in the same plane:<pre>
Volume=∣𝑎⋅(𝑏×𝑐)∣=0</pre>

![ALT TEXT](img/3d_dep.png)

## **2. Scaling of Geometric Shapes**: 

### **1. How a Matrix Scales the Area of the Parallelogram:**

![ALT TEXT](img/2d_scale.png)

### **2. How a Matrix Scales the Volume of the Parallelepiped:**

![ALT TEXT](img/3d_scale.png)

## **3. Calculate the Determinant of a Matrix:**
### For a 2x2 matrix:<pre>
𝐴=( 𝑎  𝑏 
    𝑐  𝑑)</pre>
- The determinant is calculated as:<pre>
det(𝐴) = 𝑎𝑑−𝑏𝑐</pre>

### For a 3x3 matrix:<pre>
𝐴=(𝑎 𝑏 𝑐 
𝑑 𝑒 𝑓 
𝑔 ℎ 𝑖)</pre>
- The determinant is calculated as:<pre>
det(𝐴)=𝑎(𝑒𝑖−𝑓ℎ)−𝑏(𝑑𝑖−𝑓𝑔)+𝑐(𝑑ℎ−𝑒𝑔)</pre>

![ALT TEXT](img/deter.png)

- A matrix with a zero determinant does not have an inverse; such a matrix is called singular.

![ALT TEXT](img/deter0.png)

- A negative determinant indicates that the matrix transformation involves a reflection.

![ALT TEXT](img/neg_deter.png)
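
The two determinant formulas can be sketched directly. The function names and the sample matrices are illustrative assumptions; the three 2×2 examples correspond to a scaling, a singular matrix, and a reflection.

```python
def det2(A):
    """Determinant of [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

area_scale = det2([[2, 0], [0, 3]])      # stretches x by 2 and y by 3
singular = det2([[1, 2], [2, 4]])        # rows are linearly dependent
reflection = det2([[0, 1], [1, 0]])      # swaps the two axes
volume_scale = det3([[1, 2, 3], [0, 1, 4], [5, 6, 0]])
print(area_scale, singular, reflection, volume_scale)
```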

# 7. Eigen

## **1. Eigenvectors and Eigenvalues (Geometric View)**:
- Eigenvectors are special non-zero vectors associated with a matrix transformation. When a matrix 𝐴 acts on an eigenvector 𝑣, the vector stays on the same line through the origin: it is only scaled (and flipped if the scale factor is negative), never rotated off that line.
Mathematically, for matrix 𝐴 and eigenvector 𝑣, the equation is:<pre>
𝐴𝑣=𝜆𝑣</pre>

Where 𝜆 is the eigenvalue corresponding to the eigenvector 𝑣.
- **Geometric View**: Eigenvectors point in directions that are **invariant** under the matrix transformation, while eigenvalues describe how much the vector is stretched or compressed.

    - If 𝜆>1, the vector is **stretched**.
    - If 0<𝜆<1, the vector is **shrunk**.
    - If 𝜆=1, the vector's length remains the **same**.
    - If 𝜆<0, the vector is scaled and **flipped**.

![ALT TEXT](img/eigen_geo.png)

### 2. Computing Eigenvectors and Eigenvalues:
- To find the **eigenvalues** of a matrix 𝐴, solve the characteristic equation:<pre>
det(𝐴−𝜆𝐼)=0</pre>

Where **𝐼** is the identity matrix and **𝜆** represents the eigenvalues.

- Once the eigenvalues 𝜆 are found, substitute them into the equation:<pre>
(𝐴−𝜆𝐼)𝑣=0</pre>
to solve for the **eigenvectors 𝑣**. This system of linear equations determines the eigenvectors corresponding to each eigenvalue.

Eigenvectors and eigenvalues have many applications in machine learning and data science, particularly in **dimensionality reduction** techniques like PCA (Principal Component Analysis).

![ALT TEXT](img/eigen_0.png) ![ALT TEXT](img/eigen_1.png)

- Geometric View

![ALT TEXT](img/eigen_2.png) ![ALT TEXT](img/eigen_3.png) ![ALT TEXT](img/eigen_4.png)
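
For a 2×2 matrix the characteristic equation is a quadratic, 𝜆² − trace(𝐴)𝜆 + det(𝐴) = 0, so the eigenvalues can be sketched with the quadratic formula. The function names and the sample matrix are assumptions, and the sketch only handles the case of real eigenvalues.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from det(A - lambda*I) = 0."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace ** 2 - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex for this matrix")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2.0, 1.0], [1.0, 2.0]]
lam1, lam2 = eigenvalues_2x2(A)

# Eigenvectors found by hand from (A - lambda*I)v = 0 for this matrix.
v1 = [1.0, 1.0]    # for lam1
v2 = [1.0, -1.0]   # for lam2
print(lam1, lam2, matvec(A, v1), matvec(A, v2))
```

Multiplying 𝐴 by each eigenvector returns the same vector scaled by its eigenvalue, which is the defining property 𝐴𝑣=𝜆𝑣.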

## **2. Eigen Decomposition**:
Eigen decomposition is the process of breaking down a square matrix into its eigenvalues and eigenvectors. It allows us to express the matrix in a form that reveals its geometric and algebraic properties.

- Mathematical Formulation:
For a square matrix 𝐴 of size 𝑛×𝑛 that has 𝑛 linearly independent **eigenvectors 𝑣1,𝑣2,...,𝑣𝑛** (i.e., 𝐴 is diagonalizable), let **matrix 𝑉** have those eigenvectors as its columns and let the diagonal **matrix 𝐷** contain the corresponding **eigenvalues 𝜆1,𝜆2,...,𝜆𝑛**. The eigen decomposition is then:<pre>
𝐴=𝑉𝐷𝑉⁻¹</pre>

Where:
- **𝐴** is the original matrix.
- **𝑉** is the matrix of eigenvectors, where each column is an eigenvector corresponding to an eigenvalue.
- **𝐷** is a diagonal matrix where each diagonal entry is an eigenvalue.
- **𝑉⁻¹** is the inverse of the matrix of eigenvectors.

![ALT TEXT](img/dec_0.png)
![ALT TEXT](img/dec_1.png)

- Geometric overview

![ALT TEXT](img/dec_geo.png)
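
The decomposition can be verified numerically on a small example. The matrices below are hand-picked for illustration: 𝑉 holds the eigenvectors [1, 1] and [1, −1] as columns, 𝐷 holds the eigenvalues 3 and 1, and the inverse of 𝑉 was computed by hand.

```python
def matmul(A, B):
    """Standard matrix product of A (m x k) and B (k x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

V = [[1.0, 1.0],
     [1.0, -1.0]]        # columns are the eigenvectors
D = [[3.0, 0.0],
     [0.0, 1.0]]         # eigenvalues on the diagonal
V_inv = [[0.5, 0.5],
         [0.5, -0.5]]    # inverse of V, computed by hand

A = matmul(matmul(V, D), V_inv)  # rebuild A from its decomposition
identity_check = matmul(V, V_inv)
print(A, identity_check)
```

The product rebuilds the symmetric matrix with 2 on the diagonal and 1 off the diagonal, whose eigenvalues are indeed 3 and 1.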

## **3. The Spectral Theorem**:
![ALT TEXT](img/spect.png)

- Geometric view

![ALT TEXT](img/spect_geo.png)

- Matrix Inverse from eigen decomposition

![ALT TEXT](img/inv_spect.png)

# 8. Singular Value Decomposition

## **1. What is SVD**
Singular Value Decomposition (SVD) is a method of decomposing a matrix into three other matrices. For a given matrix 𝐴 of dimensions 𝑚×𝑛, SVD can be expressed as:<pre>
𝐴 = 𝑈𝐷𝑉ᵀ</pre>
Where:
- **𝑈** is an 𝑚×𝑚 orthogonal matrix whose columns are the left singular vectors of 𝐴.
- **𝐷** is an 𝑚×𝑛 diagonal matrix containing the singular values of 𝐴 (non-negative and sorted in descending order).
- **𝑉** is an 𝑛×𝑛 orthogonal matrix whose columns are the right singular vectors of A.

![ALT TEXT](img/svd.png)

## **2. Relationship between Eigen Decomposition and SVD**

![ALT TEXT](img/svd_1.png)
![ALT TEXT](img/svd_2.png)
![ALT TEXT](img/svd_3.png)
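
One standard link between the two decompositions, assumed here to be the one the figures illustrate, is that the singular values of 𝐴 are the square roots of the eigenvalues of 𝐴ᵀ𝐴. The sketch below applies this to a 2×2 matrix; the function name, the closed-form eigenvalue formula, and the sample matrix are assumptions.

```python
import math

def singular_values_2x2(A):
    """Singular values of a 2x2 matrix as square roots of the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + q * q)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

A = [[3.0, 0.0], [4.0, 5.0]]
s1, s2 = singular_values_2x2(A)
abs_det = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])
print(s1, s2, s1 * s2, abs_det)
```

As a sanity check, the product of the singular values of a square matrix equals the absolute value of its determinant.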

## **3. Dimensionality Reduction with SVD**

![ALT TEXT](img/dim_red.png)