# Linear Algebra for Neural Networks
Linear algebra plays a crucial role in understanding and implementing neural networks. Here are some key concepts:

# 1. Vectors

> In neural networks, vectors are often used to represent inputs, weights, biases, and outputs of individual neurons or layers. Vector operations, such as addition, subtraction, and scalar multiplication, are fundamental operations in linear algebra and are commonly used in neural network computations.
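
Scalar multiplication is mentioned above but not demonstrated in the sections that follow. As a minimal sketch, with values chosen only for illustration, it scales every element of a vector by the same factor:

```python
import numpy as np

# Hypothetical example values
vector = np.array([1, 2, 3])
scalar = 2

# Scalar multiplication scales every element by the same factor
scaled = scalar * vector

print("Scaled:", scaled)
```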

## 1.2. Addition & Subtraction

These operations play a crucial role in various aspects of neural network computations, such as forward propagation, backpropagation, and weight updates. They enable the network to learn and adapt to different patterns and relationships in the data.
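
To make the weight-update case concrete, here is a rough sketch of a single gradient-descent step. The names `weights`, `gradient`, and `learning_rate` and all of their values are assumptions made up for illustration; in a real network the gradient would come from backpropagation.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
weights = np.array([0.5, -0.2, 0.1])
gradient = np.array([0.1, -0.3, 0.2])
learning_rate = 0.1

# Step against the gradient: scalar multiplication followed by vector subtraction
updated_weights = weights - learning_rate * gradient

print("Updated weights:", updated_weights)
```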

- #### *1.2.1. Addition*

	Vector addition is used to combine the inputs from multiple neurons or layers. It allows us to aggregate information and create more complex representations.

In [None]:
# Define two vectors
vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# Perform vector addition
result = [x + y for x, y in zip(vector1, vector2)]

print("Result:", result)


In [None]:
# Same example using NumPy
import numpy as np

# Define two vectors
vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])

# Perform vector addition
result = np.add(vector1, vector2)

print("Result:", result)


- #### *1.2.2. Subtraction*

	Vector subtraction is used to calculate the difference between two vectors. It can be used to measure the distance or dissimilarity between two sets of features or to perform element-wise comparisons.

In [None]:
# Define two vectors
vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# Perform vector subtraction
result = [x - y for x, y in zip(vector1, vector2)]

print("Result:", result)


In [None]:
# Same example using NumPy
import numpy as np

# Define two vectors
vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])

# Perform vector subtraction
result = np.subtract(vector1, vector2)

print("Result:", result)
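
The distance use mentioned above can be sketched by taking the length of the difference vector. This example assumes Euclidean distance, which is only one of several possible dissimilarity measures:

```python
import numpy as np

vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])

# The difference vector points from vector2 to vector1
difference = vector1 - vector2

# Its length (Euclidean norm) is the distance between the two vectors
distance = np.linalg.norm(difference)

print("Difference:", difference)
print("Euclidean distance:", distance)
```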


## 1.3. Vector Dot Product

The dot product is used to calculate the similarity between two vectors. In neural networks, it is often used to calculate the weighted sum of inputs and weights.

	* Note that `np.dot` computes the dot product only when both arguments are 1-D arrays (vectors); when both are 2-D arrays, it performs matrix multiplication under the hood.

In [None]:
import numpy as np

# Define two vectors
vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])

# Perform dot product
dot_product = np.dot(vector1, vector2)

print("Dot Product:", dot_product)
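
As a sketch of the weighted-sum use described above, the pre-activation value of a single neuron can be written as one dot product plus a bias. The `inputs`, `weights`, and `bias` values here are hypothetical:

```python
import numpy as np

# Hypothetical inputs and parameters for a single neuron
inputs = np.array([1.0, 2.0, 3.0])
weights = np.array([0.2, -0.5, 0.1])
bias = 0.4

# Weighted sum: multiply each input by its weight, add them up, then add the bias
weighted_sum = np.dot(inputs, weights) + bias

print("Weighted sum:", weighted_sum)
```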


# 2. Matrices

> Matrices are used to represent the connections between layers in a neural network. Each element in the matrix represents the weight of the connection between two neurons.

In [None]:
import numpy as np

# Define a matrix
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(matrix)


## 2.1. Subtraction and Addition
This works much like vector addition and subtraction: you add or subtract the corresponding elements, provided both matrices have the same shape.

In [None]:
import numpy as np

# Define two matrices
matrix1 = np.array([[1, 2, 3], [4, 5, 6]])
matrix2 = np.array([[7, 8, 9], [10, 11, 12]])

# Perform matrix subtraction
subtraction_result = np.subtract(matrix1, matrix2)

# Perform matrix addition
addition_result = np.add(matrix1, matrix2)

print("Subtraction Result:")
print(subtraction_result)

print("Addition Result:")
print(addition_result)


## 2.2. Matrix Multiplication

Matrix multiplication is used to propagate inputs through the layers of a neural network. It involves multiplying the input vector (or a batch of input vectors stacked as a matrix) by the weight matrix to produce the layer's output.

In [None]:
import numpy as np

# Define two matrices
matrix1 = np.array([[1, 2, 3], [4, 5, 6]])
matrix2 = np.array([[7, 8], [9, 10], [11, 12]])

# Perform matrix multiplication
result = np.matmul(matrix1, matrix2)

# Another way to perform matrix multiplication
result = matrix1 @ matrix2

# Another way to perform matrix multiplication
result = np.dot(matrix1, matrix2) # the np.dot here DOES NOT PERFORM DOT PRODUCT, it performs matrix multiplication

print("Result:", result)

# 3. Vector Dot Product vs Matrix Multiplication

> Both dot product and matrix multiplication are mathematical operations used in linear algebra and have different purposes and applications. Understanding the difference between them is crucial for effectively using them in various scenarios.

## 3.1. Vector Dot Product

The dot product, also known as the scalar product or inner product, is an operation between two vectors that results in a scalar value. It calculates the similarity or projection of one vector onto another. The dot product is calculated by multiplying the corresponding elements of the vectors and summing them up.

The dot product is useful in various applications, including:

- `Calculating the similarity between two vectors`: The dot product can be used to measure how closely two vectors align. If the dot product is zero, the vectors are orthogonal (perpendicular). If it is positive, the angle between them is less than 90°, so they point in broadly the same direction; if it is negative, the angle is greater than 90°, so they point in broadly opposite directions. Because the value also scales with the lengths of the vectors, it is often normalized by both magnitudes (cosine similarity) when only direction matters.

- `Calculating the projection of a vector onto another`: The dot product can be used to calculate the projection of one vector onto another. The projection represents the component of one vector that lies in the direction of the other vector.

- `Calculating the magnitude of a vector`: The dot product of a vector with itself gives the square of its magnitude or length.
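
The three uses listed above can be sketched in a few lines. The vectors `a` and `b` are hypothetical, and the similarity shown is cosine similarity, which is one common way of normalizing the dot product:

```python
import numpy as np

# Hypothetical example vectors
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# Magnitude: the square root of a vector's dot product with itself
magnitude_a = np.sqrt(np.dot(a, a))
magnitude_b = np.sqrt(np.dot(b, b))

# Cosine similarity: the dot product normalized by both magnitudes
cosine_similarity = np.dot(a, b) / (magnitude_a * magnitude_b)

# Projection of a onto b: the component of a that lies along b
projection = (np.dot(a, b) / np.dot(b, b)) * b

print("Magnitude of a:", magnitude_a)
print("Cosine similarity:", cosine_similarity)
print("Projection of a onto b:", projection)
```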

## 3.2. Matrix Multiplication

Matrix multiplication is an operation between two matrices that results in a new matrix. Each entry of the result is obtained by multiplying the elements of a row of the first matrix by the corresponding elements of a column of the second matrix and summing the products. Multiplying an m × n matrix by an n × p matrix yields an m × p matrix.

Matrix multiplication is useful in various applications, including:

- `Transforming vectors and coordinates`: Matrix multiplication can be used to transform vectors and coordinates between coordinate systems. It is commonly used in computer graphics, computer vision, and robotics to perform transformations such as rotation, scaling, and shearing, as well as translation when homogeneous coordinates are used.

- `Propagating inputs through neural networks`: In neural networks, matrix multiplication is used to propagate inputs through the layers. The weights of each layer can be represented as a matrix, and matrix multiplication is used to calculate the outputs of each layer from its inputs and weights.

- `Solving systems of linear equations`: A system of linear equations can be written as the matrix equation Ax = b. When A is invertible, the solution is x = A⁻¹b, although in practice it is usually computed by elimination or factorization rather than by forming the inverse explicitly.
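
Two of the applications above can be sketched briefly: rotating a 2-D point and solving a small linear system. The matrices and vectors are hypothetical examples, and `np.linalg.solve` is used here as one convenient way to obtain the solution:

```python
import numpy as np

# Rotation by 90 degrees counter-clockwise, written as a 2x2 matrix
theta = np.pi / 2
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

point = np.array([1.0, 0.0])
rotated_point = rotation @ point

# A small linear system A x = b:
#   2x +  y = 3
#    x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

print("Rotated point:", rotated_point)
print("Solution x:", x)
print("Check A @ x:", A @ x)
```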

## 3.3. When to Use Dot Product and Matrix Multiplication

- Use `dot product` when you need to measure similarity, calculate projections, or find the magnitude of vectors.

- Use `matrix multiplication` when you need to transform vectors and coordinates, propagate inputs through neural networks, or solve systems of linear equations.


## 3.4. When Not to Use Dot Product and Matrix Multiplication

- `Dot product` cannot be used when the dimensions of the vectors are not compatible. The dot product is only defined for vectors of the same length.

- `Matrix multiplication` cannot be used when the dimensions of the matrices are not compatible. The number of columns in the first matrix must be equal to the number of rows in the second matrix.

- `Dot product and matrix multiplication` are linear operations, so on their own they cannot model non-linear relationships in data. This is why neural networks apply a non-linear activation function after each weighted sum.
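
The shape rules above can be checked before multiplying. In this sketch, `can_multiply` is a hypothetical helper written for illustration, not a NumPy function; the `try`/`except` shows that NumPy raises a `ValueError` when the shapes are incompatible:

```python
import numpy as np

def can_multiply(a, b):
    """Hypothetical helper: True if the 2-D arrays a and b can be multiplied as a @ b."""
    return a.ndim == 2 and b.ndim == 2 and a.shape[1] == b.shape[0]

matrix1 = np.ones((2, 3))
matrix2 = np.ones((2, 3))

print(can_multiply(matrix1, matrix2))    # columns of matrix1 (3) vs rows of matrix2 (2)
print(can_multiply(matrix1, matrix2.T))  # the transpose has 3 rows, so this is compatible

try:
    matrix1 @ matrix2
    mismatch_detected = False
except ValueError as error:
    mismatch_detected = True
    print("Incompatible shapes:", error)
```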

## 3.5. In Conclusion

Dot Product and Matrix Multiplication are different operations between different objects.

	* Dot product is defined between two vectors.

	* Matrix product is defined between two matrices.

The connection between the two operations is the following: To calculate the c<sub>i,j</sub> entry of the matrix C:=AB, one takes the dot product of the `i'th row of the matrix A` with the `j'th column of the matrix B`.
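
This row-by-column relationship can be verified numerically. The sketch below reuses the two matrices from the matrix multiplication example and rebuilds the product one entry at a time; `C_by_hand` is a name introduced here for illustration:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8], [9, 10], [11, 12]])

C = A @ B

# Rebuild C entry by entry from row-column dot products
C_by_hand = np.zeros_like(C)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C_by_hand[i, j] = np.dot(A[i, :], B[:, j])

print(C)
print("Entries match:", np.array_equal(C, C_by_hand))
```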

Understanding the differences between dot product and matrix multiplication and knowing when to use them is essential for effectively applying linear algebra concepts in various fields, including machine learning, computer science, and engineering.

In [None]:
import numpy as np

# Define the input vector
input_vector = np.array([1, 2, 3])

# Define the weight matrix (Imagine these are the weights which are learned during training)
weight_matrix = np.array([[0.1, 0.2, 0.3],
                          [0.4, 0.5, 0.6],
                          [0.7, 0.8, 0.9]])

# Define the bias vector
bias_vector = np.array([0.1, 0.2, 0.3])


In [None]:
# calculate the weighted sum using np.dot
weighted_sum = np.dot(input_vector, weight_matrix) + bias_vector # the np.dot here DOES NOT PERFORM DOT PRODUCT, it performs matrix multiplication
activation = 1 / (1 + np.exp(-weighted_sum))

print("Weighted Sum:", weighted_sum)
print("Activation:", activation)

In [None]:
# calculate the weighted sum using matrix multiplication
weighted_sum = np.matmul(input_vector, weight_matrix) + bias_vector
activation = 1 / (1 + np.exp(-weighted_sum))

print("Weighted Sum:", weighted_sum)
print("Activation:", activation)

# 4. Tensors

> Tensors are multi-dimensional arrays that generalize scalars, vectors, and matrices. They are used to represent and manipulate data in various fields, including mathematics, physics, and computer science. Tensors enable efficient computation, parallel processing, and automatic differentiation, making them essential for working with neural networks.

## 4.1. Mathematical Concept of Tensors

For the purposes of machine learning, a tensor can be treated as a multi-dimensional array that extends scalars, vectors, and matrices to any number of dimensions.

### 4.1.1. Rank and Shape of Tensors

Tensors have two important properties: rank and shape.

- `Rank`: The rank of a tensor refers to the number of dimensions it has. For example, a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and so on.

- `Shape`: The shape of a tensor describes the size of each dimension. For example, a 2D tensor with shape (3, 4) has 3 rows and 4 columns.
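
Rank and shape can be inspected directly. This sketch uses NumPy arrays as stand-ins for tensors; PyTorch tensors expose the same `.ndim` and `.shape` attributes. The example values are arbitrary:

```python
import numpy as np

scalar = np.array(5.0)            # rank 0
vector = np.array([1, 2, 3])      # rank 1
matrix = np.zeros((3, 4))         # rank 2: 3 rows, 4 columns
cube = np.zeros((2, 3, 4))        # rank 3

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("cube", cube)]:
    # ndim is the rank (number of dimensions); shape is the size of each dimension
    print(name, "rank:", tensor.ndim, "shape:", tensor.shape)
```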

### 4.1.2. Tensor Operations

Tensors support various mathematical operations, such as addition, subtraction, multiplication, and division. These operations can be performed element-wise or using matrix operations, depending on the rank and shape of the tensors involved.

In [None]:
import torch

# Create tensors
tensor1 = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor2 = torch.tensor([[7, 8, 9], [10, 11, 12]])

# Perform tensor addition
addition_result = torch.add(tensor1, tensor2)

# Perform tensor subtraction
subtraction_result = torch.sub(tensor1, tensor2)

# Perform tensor multiplication
multiplication_result = torch.mul(tensor1, tensor2)

# Perform tensor division
division_result = torch.div(tensor1, tensor2)

print("Addition Result:")
print(addition_result)

print("Subtraction Result:")
print(subtraction_result)

print("Multiplication Result:")
print(multiplication_result)

print("Division Result:")
print(division_result)
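
All four operations in the cell above are element-wise. To show how that differs from a matrix operation, here is a small sketch, written with NumPy to keep it lightweight; `torch.mul` corresponds to `*` and `torch.matmul` to `@` for shapes like these. The values are hypothetical:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Element-wise product: multiplies entries in matching positions
elementwise_product = a * b

# Matrix product: dot products of the rows of a with the columns of b
matrix_product = a @ b

print("Element-wise product:")
print(elementwise_product)

print("Matrix product:")
print(matrix_product)
```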


### 4.1.3. Tensor Types

There are different types of tensors, including:

- `Scalar`: A scalar is a tensor of rank 0, representing a single value.

- `Vector`: A vector is a tensor of rank 1, representing a list of values arranged in a specific order.

- `Matrix`: A matrix is a tensor of rank 2, representing a 2D array of values.

- `Higher-Rank Tensors`: Tensors of rank 3 or higher are called higher-rank tensors. They represent multi-dimensional arrays of values.

## 4.2. Importance of Tensors in Neural Networks

Tensors play a crucial role in neural networks for several reasons:

1. **Data Representation**: Tensors provide a flexible and efficient way to represent and store data. They can handle complex data structures, such as images, audio, and text, which are commonly used in machine learning tasks.

2. **Computation**: Tensors enable efficient computation in neural networks. They support various mathematical operations, such as element-wise operations, matrix multiplication, and convolution, which are essential for performing forward and backward propagation in neural networks.

3. **Parallel Processing**: Tensors can be easily parallelized and processed on specialized hardware, such as GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units). This allows for faster training and inference in neural networks, especially for large-scale models and datasets.

4. **Automatic Differentiation**: Tensors are used to store intermediate values during the forward and backward passes in neural networks. They enable automatic differentiation, which is essential for calculating gradients and updating the model parameters during the training process.
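
The data representation and computation points can be illustrated with a batch of images. This is only a sketch: the batch size, image size, and number of output units are hypothetical, the pixel values and weights are random, and NumPy arrays stand in for tensors:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical batch of 8 RGB images, 28x28 pixels each: a rank-4 tensor
image_batch = rng.random((8, 28, 28, 3))

# Flatten each image into one row so the batch becomes a rank-2 tensor
flat_batch = image_batch.reshape(8, -1)

# Hypothetical layer with 10 output units
weights = rng.standard_normal((flat_batch.shape[1], 10))
bias = np.zeros(10)

# One matrix multiplication processes all 8 images at once
outputs = flat_batch @ weights + bias

print("Image batch rank and shape:", image_batch.ndim, image_batch.shape)
print("Flattened batch shape:", flat_batch.shape)
print("Output shape:", outputs.shape)
```

Handling the whole batch in a single matrix multiplication, rather than looping over images, is what lets hardware such as GPUs parallelize the work.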


## 4.3. Weighted sum example with tensors

In [None]:
import torch

# Define the input vector
input_vector = torch.tensor([1.0, 2.0, 3.0])  # floats, so the dtype matches weight_matrix in torch.matmul

# Define the weight matrix (Imagine these are the weights which are learned during training)
weight_matrix = torch.tensor([[0.1, 0.2, 0.3],
                              [0.4, 0.5, 0.6],
                              [0.7, 0.8, 0.9]])

# Define the bias vector
bias_vector = torch.tensor([0.1, 0.2, 0.3])

# calculate the weighted sum using torch.matmul
weighted_sum = torch.matmul(input_vector, weight_matrix) + bias_vector
activation = 1 / (1 + torch.exp(-weighted_sum))

print("Weighted Sum:", weighted_sum)
print("Activation:", activation)


## 4.4. Conclusion

Tensors provide a powerful framework for representing and manipulating multi-dimensional data, which makes them useful across mathematics, physics, and machine learning.

In neural networks, tensors are the fundamental data structure: inputs, weights, intermediate values, and gradients are all stored and processed as tensors. Understanding their rank, shape, and operations is therefore essential for building and training models efficiently.


Understanding these linear algebra concepts is essential for building and training neural networks effectively.