# Linear Algebra in Machine Learning

To understand the theory behind the machine learning algorithms that are prevalent in data science, one needs a working knowledge of linear algebra.

Linear algebra is a branch of mathematics applied in many fields of study, since it lets us model data and the relationships between quantities compactly, as vectors and matrices.

In this notebook, we explore basic linear algebra terms and operations through the eyes of a computer scientist.

## Index

1. [**Scalars and Vectors**](#ScalVec)
2. [**Matrices and Tensors**](#MatTens)
3. [**Transformation Matrix**](#MatTransf)
4. [**Applications in Machine Learning**](#MLApp)

## Scalars and Vectors <a id="ScalVec"></a>

A scalar is a single number, often representing a quantity or measurement (for example, 24). Scalars are the simplest form of data in linear algebra and are used extensively in machine learning to represent weights, biases, and other parameters.

A vector is an ordered list of numbers, which can be thought of as a point in a multi-dimensional space. Vectors are fundamental in machine learning as they are used to represent data points, feature sets, and gradients. 

In Python, the NumPy library makes it simple to define vectors and to perform operations between a vector and a scalar, or between two vectors:

In [1]:
import numpy as np

a1 = np.ones(10)
print("Vector of 10 ones:\n", a1)

a2 = np.linspace(3, 15, 10)
print("\n Vector of 10 evenly spaced numbers, starting with 3 and ending with 15:\n", a2)

a3 = np.random.rand(10)
print("\n Vector of 10 random numbers in [0, 1):\n", a3)

a4 = np.random.randint(1, 10, size=10)
print("\n Vector of 10 random integers in [1, 10):\n", a4)

Vector of 10 ones:
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

 Vector of 10 evenly spaced numbers, starting with 3 and ending with 15:
 [ 3.          4.33333333  5.66666667  7.          8.33333333  9.66666667
 11.         12.33333333 13.66666667 15.        ]

 Vector of 10 random numbers in [0, 1):
 [0.8604645  0.02176455 0.69184775 0.70485581 0.61359383 0.51915531
 0.19527979 0.77190631 0.13059284 0.38450186]

 Vector of 10 random integers in [1, 10):
 [2 8 4 7 7 1 7 7 6 8]


In [9]:
a1 = np.array([1, 2, 3, 4])

print("Vector a1:\n", a1)

print("\nAdding the scalar 2 to the vector a1:\n", a1 + 2)

print("\nSubtracting the scalar 1 from the vector a1:\n", a1 - 1)

print("\nMultiply the vector with a scalar of value 10:\n", a1 * 10)

print("\nDivide the vector by a scalar of value 5:\n", a1 / 5)

Vector a1:
 [1 2 3 4]

Adding the scalar 2 to the vector a1:
 [3 4 5 6]

Subtracting the scalar 1 from the vector a1:
 [0 1 2 3]

Multiply the vector with a scalar of value 10:
 [10 20 30 40]

Divide the vector by a scalar of value 5:
 [0.2 0.4 0.6 0.8]
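
The cell above only combines a vector with a scalar. As a minimal sketch of operations between two vectors, the example below uses two illustrative vectors, `u` and `v` (names and values are assumptions, not part of the original notebook), of the same length:

```python
import numpy as np

# Two example vectors of the same length (values chosen for illustration).
u = np.array([1, 2, 3, 4])
v = np.array([10, 20, 30, 40])

# Addition and multiplication with + and * act element by element.
vec_sum = u + v
vec_prod = u * v

# The dot product multiplies matching elements and sums the results.
dot = np.dot(u, v)

print("u + v:\n", vec_sum)
print("\nu * v (element-wise):\n", vec_prod)
print("\nDot product of u and v:\n", dot)
```

Note that `*` is element-wise, while `np.dot` reduces the two vectors to a single scalar.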


## Matrices and Tensors <a id="MatTens"></a>

Matrices and tensors are higher-dimensional generalizations of vectors and are crucial in the field of machine learning.

### Matrices

A matrix is a two-dimensional array of numbers, which can be thought of as a collection of vectors. Matrices are used extensively in machine learning to represent datasets, transformation operations, and more. For example, in supervised learning, a dataset is often represented as a matrix where each row corresponds to a data point and each column corresponds to a feature.

In Python, the NumPy library makes it simple to define matrices and to perform operations that combine them with other matrices, vectors, and scalars:

In [2]:
# Define matrices
matrix1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix2 = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

print("Matrix 1:\n", matrix1)
print("\nMatrix 2:\n", matrix2)

# Matrix addition
matrix_sum = matrix1 + matrix2
print("\nSum of Matrix 1 and Matrix 2:\n", matrix_sum)

# Matrix subtraction
matrix_diff = matrix1 - matrix2
print("\nDifference of Matrix 1 and Matrix 2:\n", matrix_diff)

Matrix 1:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Matrix 2:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]

Sum of Matrix 1 and Matrix 2:
 [[10 10 10]
 [10 10 10]
 [10 10 10]]

Difference of Matrix 1 and Matrix 2:
 [[-8 -6 -4]
 [-2  0  2]
 [ 4  6  8]]


In [3]:
# Element-wise multiplication
matrix_product = matrix1 * matrix2
print("\nElement-wise multiplication of Matrix 1 and Matrix 2:\n", matrix_product)

# Matrix product (for 2-D arrays, np.dot performs matrix multiplication)
matrix_dot_product = np.dot(matrix1, matrix2)
print("\nDot product of Matrix 1 and Matrix 2:\n", matrix_dot_product)

# Scalar multiplication
scalar = 2
matrix_scalar_product = matrix1 * scalar
print("\nMatrix 1 multiplied by scalar 2:\n", matrix_scalar_product)

# Matrix-vector multiplication
vector = np.array([1, 2, 3])
matrix_vector_product = np.dot(matrix1, vector)
print("\nMatrix 1 multiplied by vector [1, 2, 3]:\n", matrix_vector_product)


Element-wise multiplication of Matrix 1 and Matrix 2:
 [[ 9 16 21]
 [24 25 24]
 [21 16  9]]

Dot product of Matrix 1 and Matrix 2:
 [[ 30  24  18]
 [ 84  69  54]
 [138 114  90]]

Matrix 1 multiplied by scalar 2:
 [[ 2  4  6]
 [ 8 10 12]
 [14 16 18]]

Matrix 1 multiplied by vector [1, 2, 3]:
 [14 32 50]
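
To make the difference between the element-wise product and the matrix product concrete, the sketch below recomputes a single entry of the matrix product by hand: the first row of `matrix1` combined with the first column of `matrix2`. It also uses the `@` operator, which for 2-D arrays computes the same matrix product as `np.dot`. The names `entry_00` and `product` are introduced here for illustration only.

```python
import numpy as np

matrix1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix2 = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

# Entry (0, 0) of the matrix product:
# first row of matrix1 times first column of matrix2, i.e. 1*9 + 2*6 + 3*3.
entry_00 = np.dot(matrix1[0, :], matrix2[:, 0])
print("Entry (0, 0) computed by hand:", entry_00)

# For 2-D arrays, the @ operator gives the same result as np.dot.
product = matrix1 @ matrix2
print("Same as np.dot:", np.array_equal(product, np.dot(matrix1, matrix2)))
```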


#### Matrix Multiplication Properties

Matrix multiplication satisfies the following properties, provided the shapes of the matrices are compatible:

1. **Associativity**:
    $$
    (AB)C = A(BC)
    $$
    Matrix multiplication is associative, meaning the way the products are grouped does not affect the result, as long as the left-to-right order of the matrices is kept.

2. **Distributivity**:
    $$
    A(B + C) = AB + AC
    $$
    $$
    (A + B)C = AC + BC
    $$
    Matrix multiplication is distributive over matrix addition.

3. **Non-Commutativity**:
    $$
    AB \neq BA
    $$
    In general, matrix multiplication is not commutative, meaning the order of multiplication matters.

4. **Identity Matrix**:
    $$
    AI = IA = A
    $$
    Multiplying any matrix \(A\) by an identity matrix \(I\) of compatible size results in the original matrix \(A\).

    * **Inverse**
        $$
        A A^{-1} = A^{-1} A = I
        $$
        If a square matrix \(A\) is multiplied by its inverse matrix \(A^{-1}\), the result is the identity matrix. Note that not every matrix has an inverse: only square matrices with a non-zero determinant do.

5. **Zero Matrix**:
    $$
    A0 = 0A = 0
    $$
    Multiplying any matrix \(A\) by the zero matrix \(0\) results in the zero matrix.

6. **Transpose**:
    $$
    (AB)^T = B^T A^T
    $$
    The transpose of a product of two matrices is the product of their transposes in reverse order.

7. **Scalar Multiplication**:
    $$
    c(AB) = (cA)B = A(cB)
    $$
    A scalar factor can be applied to the product or to either matrix before multiplying; the result is the same.

Understanding these properties is essential for performing and simplifying matrix operations in various applications, including machine learning and computer graphics.
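
The properties above can be checked numerically. The following is a minimal sketch, not a proof: it uses three small integer matrices `A`, `B`, and `C` chosen purely for illustration, so that most comparisons are exact. The inverse check uses `np.allclose` because `np.linalg.inv` returns floating-point values.

```python
import numpy as np

# Small integer matrices chosen for illustration.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 3]])
I = np.eye(2, dtype=int)

associative = np.array_equal((A @ B) @ C, A @ (B @ C))
distributive = np.array_equal(A @ (B + C), A @ B + A @ C)
commutative = np.array_equal(A @ B, B @ A)  # expected to be False for this pair
identity = np.array_equal(A @ I, A) and np.array_equal(I @ A, A)
transpose = np.array_equal((A @ B).T, B.T @ A.T)
scalar_rule = np.array_equal(3 * (A @ B), (3 * A) @ B)

# A is invertible because its determinant (1*4 - 2*3 = -2) is non-zero.
A_inv = np.linalg.inv(A)
inverse = np.allclose(A @ A_inv, np.eye(2))

print("Associative:", associative)
print("Distributive:", distributive)
print("AB == BA:", commutative)
print("Identity:", identity)
print("Transpose rule:", transpose)
print("Scalar rule:", scalar_rule)
print("A times its inverse is I:", inverse)
```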

### Tensors

A tensor is a multi-dimensional array of numbers. The number of indexes needed to address one element is the order of the tensor: a vector needs 1 index, a matrix needs 2 (row and column), and a third-order tensor needs 3 (row, column, and depth). Tensors with more indexes are called higher-order tensors. Tensors are therefore a generalization of the previous concepts. This is important for machine learning purposes because:

1. **Data Representation**: Tensors are used to represent data in machine learning. For example, a color image can be represented as a 3-dimensional tensor with dimensions corresponding to height, width, and color channels (RGB).

2. **Model Parameters**: In neural networks, tensors are used to represent the weights and biases of the model. These parameters are updated during the training process to minimize the loss function.

3. **Operations and Transformations**: Tensors allow for efficient mathematical operations and transformations. Libraries like TensorFlow and PyTorch are built around tensor operations, enabling efficient computation on GPUs and other hardware accelerators.

4. **Batch Processing**: Tensors enable batch processing of data, which is essential for training machine learning models. By processing multiple data points simultaneously, tensors help in speeding up the training process.

5. **Flexibility**: Tensors provide a flexible way to handle different types of data, including structured and unstructured data. This flexibility is crucial for building complex machine learning models that can handle diverse data sources.

In summary, tensors are fundamental to the field of machine learning, especially deep learning, as they provide a powerful and flexible way to represent and manipulate data and model parameters.
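
As a small illustration of tensor orders, the sketch below builds NumPy arrays with 0 to 4 indexes and prints their `ndim` and `shape`. The image size (32 by 32 pixels) and the batch size (16) are hypothetical values picked for the example.

```python
import numpy as np

scalar = np.array(5.0)              # 0 indexes
vector = np.zeros(3)                # 1 index
matrix = np.zeros((2, 3))           # 2 indexes: row, column
image = np.zeros((32, 32, 3))       # 3 indexes: height, width, RGB channel
batch = np.zeros((16, 32, 32, 3))   # 4 indexes: a batch of 16 such images

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("image", image), ("batch", batch)]:
    print(f"{name}: ndim={tensor.ndim}, shape={tensor.shape}")
```

The leading batch dimension is what allows many data points to be processed in a single operation.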

## Transformation Matrix <a id="MatTransf"></a>

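A transformation matrix is a matrix that, when multiplied by a vector, applies a geometric transformation to it. Scaling, rotation, and shearing are linear transformations. Translation is not linear, but it can also be written as a matrix product by using homogeneous coordinates, in which a 2D point \((x, y)\) is written as \((x, y, 1)\). The 3x3 matrices below follow that convention. Transformation matrices are widely used in computer graphics, robotics, and machine learning.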

### Types of Transformations

1. **Translation**: This moves every point of an object by the same distance in a given direction. In 2D, the translation matrix is:
    $$
    \begin{bmatrix}
    1 & 0 & t_x \\
    0 & 1 & t_y \\
    0 & 0 & 1
    \end{bmatrix}
    $$
    where \(t_x\) and \(t_y\) are the translation distances along the x and y axes, respectively.

2. **Scaling**: This changes the size of an object. In 2D, the scaling matrix is:
    $$
    \begin{bmatrix}
    s_x & 0 & 0 \\
    0 & s_y & 0 \\
    0 & 0 & 1
    \end{bmatrix}
    $$
    where \(s_x\) and \(s_y\) are the scaling factors along the x and y axes, respectively.

3. **Rotation**: This rotates an object around the origin. In 2D, the rotation matrix is:
    $$
    \begin{bmatrix}
    \cos(\theta) & -\sin(\theta) & 0 \\
    \sin(\theta) & \cos(\theta) & 0 \\
    0 & 0 & 1
    \end{bmatrix}
    $$
    where \(\theta\) is the angle of rotation, counterclockwise for positive values.

4. **Shearing**: This slants the shape of an object. In 2D, the shearing matrix is:
    $$
    \begin{bmatrix}
    1 & sh_x & 0 \\
    sh_y & 1 & 0 \\
    0 & 0 & 1
    \end{bmatrix}
    $$
    where \(sh_x\) and \(sh_y\) are the shearing factors along the x and y axes, respectively.
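
The four matrices above can be written directly in NumPy. The sketch below is one possible way to do it: the helper names (`translation`, `scaling`, `rotation`, `shearing`) are hypothetical and introduced only for this example. Each matrix is applied to the point (1, 1), written in homogeneous coordinates as (1, 1, 1).

```python
import numpy as np

def translation(tx, ty):
    """2D translation matrix in homogeneous coordinates."""
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def scaling(sx, sy):
    """2D scaling matrix in homogeneous coordinates."""
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def rotation(theta):
    """2D rotation about the origin by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def shearing(shx, shy):
    """2D shearing matrix in homogeneous coordinates."""
    return np.array([[1, shx, 0], [shy, 1, 0], [0, 0, 1]], dtype=float)

# The point (1, 1) in homogeneous coordinates.
p = np.array([1.0, 1.0, 1.0])

print("Translated by (2, 3):", translation(2, 3) @ p)
print("Scaled by (2, 2):", scaling(2, 2) @ p)
print("Rotated by 90 degrees:", np.round(rotation(np.pi / 2) @ p, 6))
print("Sheared with sh_x = 0.5:", shearing(0.5, 0) @ p)
```

The first two components of each result are the transformed x and y coordinates; the third component stays equal to 1.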

#### Combining Transformations

Multiple transformations can be combined into a single transformation matrix by matrix multiplication. The order of multiplication is important, as matrix multiplication is not commutative.
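
A short sketch of why the order matters, assuming points are column vectors multiplied on the right, so that the rightmost matrix is applied first. The rotation angle (90 degrees), the translation (2 units along x), and the variable names are illustrative choices.

```python
import numpy as np

theta = np.pi / 2  # 90 degrees
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
T = np.array([[1, 0, 2],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# The point (1, 0) in homogeneous coordinates.
p = np.array([1.0, 0.0, 1.0])

# Rotate first, then translate: (1, 0) -> (0, 1) -> (2, 1).
rotate_then_translate = (T @ R) @ p

# Translate first, then rotate: (1, 0) -> (3, 0) -> (0, 3).
translate_then_rotate = (R @ T) @ p

print("Rotate, then translate:", np.round(rotate_then_translate, 6))
print("Translate, then rotate:", np.round(translate_then_rotate, 6))
```

Both `T @ R` and `R @ T` are single 3x3 matrices, but they send the same point to different places.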

## Applications in Machine Learning <a id="MLApp"></a>

In machine learning, transformation matrices are used in various applications, such as:

- **Data Augmentation**: Applying transformations to training data to increase the diversity of the dataset.
- **Feature Engineering**: Transforming features to improve the performance of machine learning models.
- **Neural Networks**: Applying linear transformations in the layers of neural networks.

Understanding transformation matrices is crucial for effectively manipulating and interpreting data in machine learning and other fields.
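
As an illustration of the neural network case, the sketch below applies the linear part of one fully connected layer to a small batch of data. The sizes (4 data points, 3 features, 2 output units), the random values, and the names `X`, `W`, `b`, and `Z` are all assumptions made for the example; a real layer would usually follow this step with a non-linear activation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical sizes: 4 data points, 3 input features, 2 output units.
X = rng.random((4, 3))   # one data point per row
W = rng.random((3, 2))   # weight matrix of the layer
b = np.zeros(2)          # bias vector, added to every row

# Linear transformation of the whole batch in a single matrix product.
Z = X @ W + b

print("Input shape:", X.shape)
print("Output shape:", Z.shape)
```

Each row of `Z` is the transformed version of the matching row of `X`, which ties together the matrix product, scalar-vector operations, and batch processing discussed earlier.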

## Resources

- https://builtin.com/data-science/basic-linear-algebra-deep-learning
- https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
