# Linear Algebra in Data Science

- Introduction
- Scalars, Vectors, and Matrices
- Vector Operations
    - Arithmetic
    - Norm
- Matrix Operations
    - Arithmetic
    - Rank of a matrix
    - Inverse
    - Determinant
    - Identity
- Eigenvalues and Eigenvectors

In [40]:
import numpy as np

## What is Linear Algebra?
- Linear algebra is a branch of mathematics that underpins many data science and machine learning concepts.
- In linear algebra, data is represented by scalars, vectors, and matrices, and the relationships between them are expressed as linear equations.

## Scalars and Vectors

- A **scalar** represents a magnitude or measurement, e.g. length, area, volume, or speed.
- A **vector** is a quantity that has both direction and magnitude, e.g.
    - Wind velocity is a vector: it combines the speed and direction of the wind, such as 15 miles/hour northeast.
    - If you have an image array, you can use a vector for each color. Magnitudes represent the intensity of the color.
    - Weight in physics has a direction (that of gravity) and a magnitude.
- For a vector such as `v = [4,5,6]`, the norm of the vector is its magnitude.

### Vector Operations

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_3.png)

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_4.png)

### Multiplying 2 vectors (Dot Product vs Cross Product)

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_6.png)

- Dot Product:
    - The result is always a scalar.
    - Also called the scalar product.
    - Applications:
        - Finding the angle between vectors
        - Projecting one vector onto another
- Cross Product:
    - The result is always a vector.
    - The resulting vector is perpendicular to both input vectors.
    - It is defined for vectors in 3D space.

> The two vectors must be the same size for the operations above.

In [41]:
a = [4,5,6]
b = [8,9,2]

In [42]:
np.dot(a,b)

89

In [43]:
np.cross(a,b)

array([-44,  40,  -4])
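
The applications listed above (angle, projection, perpendicularity) can be checked with the same two vectors. This is a minimal sketch; the names `cos_theta`, `theta_deg`, `proj_a_on_b`, and `c` are illustrative and do not come from the original notebook.

```python
import numpy as np

a = np.array([4, 5, 6])
b = np.array([8, 9, 2])

# Angle between a and b: cos(theta) = (a . b) / (|a| * |b|)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta_deg = np.degrees(np.arccos(cos_theta))
print(theta_deg)

# Projection of a onto b: ((a . b) / (b . b)) * b
proj_a_on_b = (np.dot(a, b) / np.dot(b, b)) * b
print(proj_a_on_b)

# The cross product is perpendicular to both inputs,
# so its dot product with each of them is zero
c = np.cross(a, b)
print(np.dot(c, a), np.dot(c, b))  # 0 0
```

The projection is a multiple of `b`, and the remainder `a - proj_a_on_b` is perpendicular to `b`.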

### Other Operations

In [44]:
np.add(a,b) # element-wise vector addition (the linear algebra operation)

array([12, 14,  8])

In [45]:
# `+` on plain Python lists concatenates them; it does not add element-wise
a + b

[4, 5, 6, 8, 9, 2]

In [46]:
np.subtract(a,b)

array([-4, -4,  4])
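
The difference between list concatenation and vector arithmetic disappears once the data is stored in NumPy arrays. The sketch below uses illustrative names (`a_list`, `a_arr`, and so on) that are not part of the original cells.

```python
import numpy as np

a_list, b_list = [4, 5, 6], [8, 9, 2]
a_arr, b_arr = np.array(a_list), np.array(b_list)

# On Python lists, + concatenates
print(a_list + b_list)   # [4, 5, 6, 8, 9, 2]

# On NumPy arrays, the arithmetic operators work element-wise
print(a_arr + b_arr)     # values 12, 14, 8 (same as np.add)
print(a_arr - b_arr)     # values -4, -4, 4 (same as np.subtract)

# Multiplying by a scalar scales every component
print(2 * a_arr)         # values 8, 10, 12
```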

### Norm of A Vector
- It measures the vector's magnitude (its length).
- It is never negative, and it is zero only for the zero vector.
- A numerical example of the norm of a vector:


![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_8.png)

In [47]:
a = [4,5,6]
# np.linalg.norm returns the Euclidean (L2) norm of a vector by default
np.linalg.norm(a)

8.774964387392123
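
As a check on the result above, the Euclidean norm can also be computed directly from its definition, the square root of the sum of squared components. This is a small sketch; `manual_norm` and `unit_v` are illustrative names.

```python
import numpy as np

v = np.array([4, 5, 6])

# Euclidean (L2) norm from the definition: sqrt(16 + 25 + 36) = sqrt(77)
manual_norm = np.sqrt(np.sum(v ** 2))
print(manual_norm)

# Matches the library function
print(np.isclose(manual_norm, np.linalg.norm(v)))  # True

# Dividing a vector by its norm gives a unit vector in the same direction
unit_v = v / np.linalg.norm(v)
print(np.isclose(np.linalg.norm(unit_v), 1.0))     # True
```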

## Matrices

- A matrix is a rectangular array of numbers or expressions, arranged in columns and rows.
- Usage differs on the number of dimensions. In data science, a 2D array is usually called a matrix, and an array with more than two dimensions is called a tensor.

![sh](https://res.cloudinary.com/practicaldev/image/fetch/s--8pw60d5S--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/adhiraiyan/DeepLearningWithTF2.0/master/notebooks/figures/fig0201a.png)

In [48]:
m = np.array([[5,6,7],[7,4,8]])
m

array([[5, 6, 7],
       [7, 4, 8]])

In [49]:
m.ndim

2

In [50]:
m.shape

(2, 3)
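
For comparison, here is a small array with more than two dimensions, built by stacking two 2×3 matrices. The array `t` and its values are made up for illustration.

```python
import numpy as np

# Two 2x3 matrices stacked along a new leading axis
t = np.array([[[5, 6, 7], [7, 4, 8]],
              [[1, 2, 3], [4, 5, 6]]])

print(t.ndim)    # 3
print(t.shape)   # (2, 2, 3)
```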

### Matrix Operations

![m](https://i.ytimg.com/vi/d6lIaqQI0UE/sddefault.jpg)

![matrizmulti1.PNG](https://s3.us-east-1.amazonaws.com/static2.simplilearn.com/lms/testpaper_images/ADSP/Advanced_Statistics/LinearRegression/matrizmulti1.PNG)

In [51]:
a = np.array([[5,6,7],[7,4,8],[4,3,2]])
b = np.array([[7,2,1],[6,6,3],[7,5,2]])

A square matrix has the same number of rows and columns. Both `a` and `b` above are 3×3 square matrices.

#### Matrix Multiplication (Square Matrices)

In [52]:
np.matmul(a,b)

array([[120,  81,  37],
       [129,  78,  35],
       [ 60,  36,  17]])

In [53]:
a @ b # the @ operator is shorthand for np.matmul

array([[120,  81,  37],
       [129,  78,  35],
       [ 60,  36,  17]])

#### Element-Wise Matrix Multiplication 

In [54]:
a * b

array([[35, 12,  7],
       [42, 24, 24],
       [28, 15,  4]])
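
Two properties separate matrix multiplication from element-wise multiplication: the order of the operands matters, and the inner dimensions must match. The sketch below reuses `a` and `b` from above; the non-square matrices `m` and `n` are illustrative.

```python
import numpy as np

a = np.array([[5, 6, 7], [7, 4, 8], [4, 3, 2]])
b = np.array([[7, 2, 1], [6, 6, 3], [7, 5, 2]])

# Matrix multiplication is generally not commutative
print(np.array_equal(a @ b, b @ a))   # False

# Element-wise multiplication is commutative
print(np.array_equal(a * b, b * a))   # True

# Non-square case: (2, 3) @ (3, 2) -> (2, 2).
# The number of columns of the left matrix must equal
# the number of rows of the right matrix.
m = np.array([[5, 6, 7], [7, 4, 8]])      # shape (2, 3)
n = np.array([[1, 2], [3, 4], [5, 6]])    # shape (3, 2)
print((m @ n).shape)                      # (2, 2)
```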

## Rank of A Matrix

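**Note:** The rank of a matrix is the number of linearly independent rows (equivalently, columns). To find it, first convert the matrix into row echelon form; the rank is then the number of non-zero rows.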

For a matrix to be in its echelon form, it must follow these three rules:

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_11.png)

#### __Example__

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_12.png)

The output, after using elementary transformations, is shown below:

**R2 → R2 – 2R1**

**R3 → R3 – 3R1**

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_13.png)

**R3 → R3 – 2R2**

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_14.png)

The above matrix is in row-echelon form.

Since the number of non-zero rows is 2, **the rank of the matrix is 2.**

In [55]:
a = np.array([[1,2,3],
              [2,1,4],
              [3,0,5]
              ])
a

array([[1, 2, 3],
       [2, 1, 4],
       [3, 0, 5]])

In [56]:
print('Rank of matrix a:', np.linalg.matrix_rank(a))

Rank of matrix a: 2
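
The elementary row operations listed earlier can be replayed in NumPy. This sketch assumes the matrix in the figures is the same one defined in the code cell above; `m` and `nonzero_rows` are illustrative names.

```python
import numpy as np

# Use floats so row operations are not truncated to integers
m = np.array([[1, 2, 3],
              [2, 1, 4],
              [3, 0, 5]], dtype=float)

m[1] = m[1] - 2 * m[0]   # R2 -> R2 - 2R1
m[2] = m[2] - 3 * m[0]   # R3 -> R3 - 3R1
m[2] = m[2] - 2 * m[1]   # R3 -> R3 - 2R2
print(m)                 # the last row is now all zeros

# Rank = number of rows that are not entirely zero
nonzero_rows = int(np.sum(np.any(~np.isclose(m, 0), axis=1)))
print(nonzero_rows)      # 2
```

The count agrees with `np.linalg.matrix_rank`. For general matrices the library function is the safer choice, since hand-written elimination is sensitive to rounding error.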


## Determinant and Trace of a Matrix

The determinant of a matrix is a scalar quantity that is a function of the elements of the matrix.
- Determinants are defined only for square matrices.
- These are useful in determining the solution of a system of linear equations.

Let $ X = [a_{ij}] $ be an $ n\times n $ matrix, where $ n \geq 2 $.

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_15.png)


**Note:** The determinant of a non-square matrix is not defined. The determinant of a matrix X is denoted by det X or |X|.


**Consider $ 2\times 2 $ and $ 3\times 3 $ matrices:**

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_16.png)

   
Substituting the expression for the determinant of a $ 2\times 2 $ matrix into the above equation gives:

![det2.PNG](https://s3.us-east-1.amazonaws.com/static2.simplilearn.com/lms/testpaper_images/ADSP/Advanced_Statistics/LinearRegression/det2.PNG)

In [57]:
print("Determinant of a:", np.linalg.det(a))

Determinant of a: 6.66133814775094e-16
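
The value printed above is floating-point round-off. The exact determinant of `a` is 0, which matches its rank of 2 (a 3×3 matrix with rank below 3 is singular). The sketch below checks this and also verifies the 2×2 formula on a made-up matrix `x`.

```python
import numpy as np

# 2x2 case: det([[p, q], [r, s]]) = p*s - q*r
x = np.array([[3, 8], [4, 6]])
manual_det = x[0, 0] * x[1, 1] - x[0, 1] * x[1, 0]    # 3*6 - 8*4 = -14
print(manual_det)
print(np.isclose(manual_det, np.linalg.det(x)))       # True

# The rank-2 matrix from the previous section: det is 0 up to round-off
a = np.array([[1, 2, 3], [2, 1, 4], [3, 0, 5]])
print(np.isclose(np.linalg.det(a), 0.0))              # True
```

Comparing with `np.isclose` rather than `== 0` is the usual way to test a floating-point determinant against zero.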


The trace of a square matrix is the sum of its diagonal entries.

![tr](https://media.geeksforgeeks.org/wp-content/uploads/20221119211255/tm2.jpg)

In [58]:
# Trace of matrix A
print("Trace of a:", np.trace(a))

Trace of a: 7
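
The same value can be obtained by extracting the diagonal and summing it, which mirrors the definition directly. A short sketch, with `diagonal` as an illustrative name:

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 1, 4], [3, 0, 5]])

diagonal = np.diag(a)                   # diagonal entries 1, 1, 5
print(diagonal.sum())                   # 7
print(diagonal.sum() == np.trace(a))    # True
```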


## Identity Matrix or Operator

An identity matrix (I) is a square matrix that leaves any compatible matrix X unchanged under multiplication: the product is X itself.

![det3.PNG](https://s3.us-east-1.amazonaws.com/static2.simplilearn.com/lms/testpaper_images/ADSP/Advanced_Statistics/LinearRegression/det3.PNG)

**Hint**: The identity matrix plays the same role in matrix multiplication as the number 1 does in ordinary multiplication.


The diagonal elements of I are all 1, and all its non-diagonal elements are 0.

#### Example:
![det4.PNG](https://s3.us-east-1.amazonaws.com/static2.simplilearn.com/lms/testpaper_images/ADSP/Advanced_Statistics/LinearRegression/det4.PNG)



In [59]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
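
The defining property can be verified directly. In this sketch `x` is an arbitrary 2×3 matrix chosen for illustration; because it is not square, the identity on the left must be 2×2 and the one on the right 3×3.

```python
import numpy as np

x = np.array([[5, 6, 7],
              [7, 4, 8]])     # shape (2, 3)

i2 = np.identity(2)           # multiplies x from the left
i3 = np.identity(3)           # multiplies x from the right

print(np.array_equal(i2 @ x, x))   # True
print(np.array_equal(x @ i3, x))   # True
```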

## Eigenvalues and Eigenvectors

Eigenvalues are a special set of scalar values associated with the linear equations in matrix operations. Imagine you have a big box. You're asked to stretch or shrink everything inside the box, but only in certain directions. Eigenvalues tell you how much everything stretches or shrinks along those special directions.



Eigenvectors represent the directions along which the linear transformation only stretches or compresses. Imagine you have arrows inside the box. These arrows represent different directions. Eigenvectors are the special arrows that don't change their direction when the transformation is applied. They might get longer or shorter (or flip to the exact opposite direction when the eigenvalue is negative), but they stay on the same line.



Eigenvalues and eigenvectors are used in the following areas:

- Linear transformations: Eigenvalues and eigenvectors help you understand and analyze the behavior of linear transformations. They provide insights into how the transformation affects different directions in the vector space.

- Differential equations: Eigenvalues and eigenvectors are used to solve systems of ordinary and partial differential equations. They help find solutions that have exponential growth or decay behavior.

- Structural analysis: In structural engineering, eigenvalues and eigenvectors are used to analyze the stability and modes of vibration of structures.



Let X be an $ n\times n $ matrix. A scalar $ \lambda $ is called an eigenvalue of X if there is a nonzero vector A such that $ XA = \lambda A $. In this context, the vector A is called an eigenvector of X corresponding to $ \lambda $.

![link text](https://labcontent.simplicdn.net/data-content/content-assets/Data_and_AI/ADSP_Images/Lesson_06_Maths_and_Stats/Linear_Algebra/Image_19.png)

Suppose X is an $ n\times n $ matrix. When you multiply X by a vector A, it generally does two things to A:

1.	It scales the vector.

2.	It rotates the vector.

For a certain set of vectors, however, X only scales the vector without changing its direction.
- These specific vectors, which you do not rotate but may stretch or compress, are called eigenvectors.
- The amount by which these vectors stretch or compress is called the corresponding eigenvalue.
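
The definition can be checked numerically with `np.linalg.eig`, which returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors. The 2×2 matrix `x` below is a made-up example with eigenvalues 1 and 3; the order in which NumPy returns them is not guaranteed.

```python
import numpy as np

x = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(x)
print(eigenvalues)

# Column k of `eigenvectors` pairs with eigenvalues[k].
# For each pair, X v should equal lambda v.
for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    v = eigenvectors[:, k]
    print(lam, np.allclose(x @ v, lam * v))   # second value is True
```

Multiplying `x` by an eigenvector only rescales it by the eigenvalue, which is the "scaled but not rotated" behavior described above.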

## Summary and Applications

Linear algebra concepts are foundational to many machine learning algorithms. Here are some key concepts and their applications:

1. Vectors and Matrices:
    - Used to represent data, parameters, and operations in ML models
    - Enable efficient computation and storage of large datasets

2. Matrix Operations:
    - Matrix Multiplication:
        - Neural network layer processing and transformations (e.g. RNNs and CNNs)
        - Recommendation Systems
        - Dimensionality Reduction
    - Element-wise multiplication is used in:
        - The attention mechanism, which is one of the key principles of transformers in the GPT architecture
        - Feature scaling and normalization
        - Activation Functions for neural networks
    - Transpose:
        - Backpropagation
        - Data reshaping
    - Inverse: 
        - Linear regression
        - Principal Component Analysis (PCA)

3. Eigenvalues and Eigenvectors:
    - Principal Component Analysis (PCA): Dimensionality reduction
    - Spectral clustering: Unsupervised learning technique

4. Vector Spaces and Subspaces:
    - Feature spaces: Represent data in high-dimensional spaces
    - Kernel methods: Transform data into higher-dimensional spaces

5. Linear Transformations:
    - Neural network layers: Apply linear transformations to inputs
    - Data preprocessing: Normalize or standardize input features
    
6. Singular Value Decomposition (SVD):
    - Matrix factorization: Used in recommender systems
    - Dimensionality reduction: Alternative to PCA

7. Least Squares:
    - Linear regression: Fitting models to data
    - Optimization: Minimizing error in various ML algorithms

8. Gradient Descent:
    - Optimization: Minimizing loss functions in various ML algorithms

9. Norm of a Matrix:
    - Loss functions: Measuring model performance
    - Regularization: Preventing overfitting


- Matrix factorization (dot product, rank, etc.):
    - Recommendation Systems
- Norm of a matrix:
    - Optimization: Loss function and regularization
    - Feature selection
    - Gradient Descent 
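
To make the link between the inverse, least squares, and linear regression listed above more concrete, here is a minimal sketch that fits a straight line in two ways. The toy data (generated from y = 1 + 2x with no noise) and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data lying exactly on the line y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones (intercept) next to the feature column
A = np.column_stack([np.ones_like(x), x])

# Least-squares fit with NumPy's solver
coef_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# Same fit from the normal equations: (A^T A)^-1 A^T y
coef_normal = np.linalg.inv(A.T @ A) @ A.T @ y

print(coef_lstsq)                             # approximately [1. 2.]
print(np.allclose(coef_lstsq, coef_normal))   # True
```

The explicit inverse is shown only to connect the formula to the concepts in this notebook. In practice a solver such as `np.linalg.lstsq` is usually preferred because it is more numerically stable.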