# Linear Algebra

The Uniform Code of Military Justice specifies court martial for any officer who sends a soldier into battle without a weapon. In a similar fashion, students should be well armed with an understanding of linear algebra before they are sent out to learn neural networks.


Linear algebra is the branch of mathematics that deals with vector spaces. A vector space is a collection of objects called vectors, which may be added together and multiplied by numbers, called scalars. Vectors are often used to represent data points, such as the features of a data set. Vectors are also used to represent model parameters, such as the weights and biases of a neural network.

In order to better understand linear algebra, we will first review the basic concepts of scalars, vectors, matrices, and tensors.

## Scalars

Put simply, a scalar is just a number. For now, this is all we need to know about scalars; we will revisit them when we discuss tensors. Scalars are usually denoted by lower case letters, such as $s$ or $n$. You will also often see the expression "Let $s \in \mathbb{R}$", which means that $s$ is a scalar belonging to the set of real numbers, or "Let $n \in \mathbb{N}$", which means that $n$ is a scalar belonging to the set of natural numbers.

## Vectors

Vectors are one step above scalars. A vector is an ordered collection of numbers arranged in an array. Here is an example of a vector:

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

You can think of these numbers as the coordinates of a point in space or as the features of one example in a dataset. The numbers in the vector are called its components, and the number of components is called the dimension of the vector. In the example above, the vector $\mathbf{x}$ has three components and is a three-dimensional vector. Vectors are usually denoted by lower case bold letters, such as $\mathbf{x}$ or $\mathbf{y}$. The components of a vector are usually denoted by lower case letters with subscripts, such as $x_1$ or $y_2$. In code, the components are accessed by their index, such as $x[0]$ or $y[1]$. Python indices start at zero, so the component $x_1$ is stored at $x[0]$.
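As a minimal sketch of these ideas in NumPy, the snippet below builds a three-dimensional vector, reads its dimension, and accesses its first and last components by index. The variable names and values are illustrative and are not taken from the text.

```python
import numpy as np

# Illustrative three-dimensional vector (values chosen arbitrarily)
x = np.array([10, 20, 30])

# The dimension is the number of components
dimension = x.shape[0]

# The math notation x_1 is the first component, stored at index 0
first_component = x[0]
# The math notation x_3 is the third component, stored at index 2
last_component = x[2]

print("Dimension:", dimension)
print("x_1 =", first_component, "and x_3 =", last_component)
```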

To get a better understanding of how we will be using vectors in Python, let's write a few simple vector operations.


In [5]:
import numpy as np

# Function to demonstrate the use of vectors
def vector_operations():
    # Create a vector
    vector = np.array([1, 2, 3])
    print("Vector 1:", vector)
    
    # Create a second vector
    vector2 = np.array([4, 5, 6])
    print("Vector 2:", vector2)

    # Perform vector addition
    print("Vector Addition:", np.add(vector, vector2))
    
    # Perform element-wise multiplication (not scalar multiplication)
    print("Vector Multiplication :", np.multiply(vector, vector2))

print("--- Vector Operations ---")
vector_operations()


--- Vector Operations ---
Vector 1: [1 2 3]
Vector 2: [4 5 6]
Vector Addition: [5 7 9]
Vector Multiplication : [ 4 10 18]


Above we can see what vector operations look like in Python. It is important to build a mental model of what these operations represent. They can be written mathematically as follows:

$\mathbf{z} = \mathbf{x} + \mathbf{y} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{bmatrix}$

and

$\mathbf{z} = \mathbf{x} \odot \mathbf{y} = \begin{bmatrix} x_1 y_1 \\ x_2 y_2 \\ x_3 y_3 \end{bmatrix}$

where $\mathbf{z}$ represents the resulting vector, and $\mathbf{x}$ and $\mathbf{y}$ are the vectors we are operating on. In the first example we add the vectors component by component, and in the second we multiply them component by component. This element-wise product, written here with $\odot$, is what NumPy's multiply function computes. It is not the same as the dot product, which returns a single number. We will use these operations often, so it is important to understand what they represent.
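Scalar multiplication, mentioned in the definition of a vector space, is a different operation from the element-wise product shown above: it scales every component of a vector by the same number. A brief sketch of the difference, with the scalar `s` and both vectors chosen here purely for illustration:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
s = 2  # illustrative scalar

# Scalar multiplication: every component of x is multiplied by s
scaled = s * x

# Element-wise product: components are multiplied pair by pair
elementwise = x * y

print("s * x =", scaled)
print("x * y =", elementwise)
```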

## Matrices

One step above vectors are matrices. A matrix is a collection of vectors arranged as a two-dimensional array. Matrices are usually denoted by upper case bold letters, such as $\mathbf{X}$ or $\mathbf{Y}$. The components of a matrix are usually denoted by lower case letters with two subscripts, such as $x_{11}$ or $y_{22}$, where the first subscript is the row and the second is the column. In code, the components are accessed by their two indices, such as $x[0][0]$ or $y[1][2]$.
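The following sketch shows how the subscript notation lines up with NumPy indexing. The matrix `X` and its values are made up for illustration.

```python
import numpy as np

# Illustrative 2 x 3 matrix: two rows, three columns
X = np.array([[1, 2, 3],
              [4, 5, 6]])

rows, cols = X.shape

# The math notation x_{12} (row 1, column 2) is X[0][1] in Python
x_12 = X[0][1]

# NumPy also accepts a single pair of indices, which is equivalent
x_23 = X[1, 2]

# Each row of the matrix is itself a vector
first_row = X[0]

print("Shape:", (rows, cols))
print("x_12 =", x_12, "and x_23 =", x_23)
print("First row:", first_row)
```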

One of the most important matrix operations is the transpose, which is defined as follows. Let $A$ be an $m \times n$ matrix. The transpose of $A$ is the $n \times m$ matrix, denoted $A^T$, whose columns are formed from the corresponding rows of $A$. In other words, if $A = [a_{ij}]$, then $A^T = [b_{ij}]$, where $b_{ij} = a_{ji}$.
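To make the definition concrete, here is a small sketch that transposes a non-square matrix and checks the rule $b_{ij} = a_{ji}$ entry by entry. The matrix values are arbitrary, and the names `B` and `matches` are introduced only for this example.

```python
import numpy as np

# Illustrative 2 x 3 matrix, so m = 2 and n = 3
A = np.array([[1, 2, 3],
              [4, 5, 6]])

B = A.T  # same result as np.transpose(A)

# Verify b_ij = a_ji for every entry of the n x m transpose
m, n = A.shape
matches = all(B[i, j] == A[j, i] for i in range(n) for j in range(m))

print("Shape of A:", A.shape)
print("Shape of A^T:", B.shape)
print("b_ij == a_ji everywhere:", matches)
```

Using a non-square matrix makes the swap of the shape from $m \times n$ to $n \times m$ visible, which a square example would hide.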

To better understand matrix operations, we can print the output of NumPy operations such as multiplication and transpose.

In [7]:

# Function to demonstrate the use of matrices
def matrix_operations():
    # Create a matrix
    matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print("Matrix 1: \n", matrix)
    
    # Perform matrix multiplication
    matrix2 = np.array([[10, 11, 12], [13, 14, 15], [16, 17, 18]])
    print("Matrix 2: \n", matrix2)
    print("Matrix Multiplication: \n", np.dot(matrix, matrix2))
    
    # Perform matrix transposition
    print("Matrix 1 Transposition: \n", np.transpose(matrix))

print("--- Matrix Operations ---")
matrix_operations()


--- Matrix Operations ---
Matrix 1: 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Matrix 2: 
 [[10 11 12]
 [13 14 15]
 [16 17 18]]
Matrix Multiplication: 
 [[ 84  90  96]
 [201 216 231]
 [318 342 366]]
Matrix 1 Transposition: 
 [[1 4 7]
 [2 5 8]
 [3 6 9]]


It is important to note that matrix multiplication is distributive and associative, but it is not commutative. This means that the order in which you multiply matrices matters. For two matrices $A$ and $B$, $A \cdot B = B \cdot A$ does not hold in general. The associative and distributive properties do hold: for three matrices $A$, $B$, and $C$ of compatible shapes, $A \cdot (B \cdot C) = (A \cdot B) \cdot C$ and $A \cdot (B + C) = A \cdot B + A \cdot C$.
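These properties can be checked numerically on a small example. The sketch below uses three arbitrary $2 \times 2$ integer matrices and Python's `@` operator, which performs matrix multiplication on NumPy arrays. A single example illustrates the properties but does not prove them.

```python
import numpy as np

# Small illustrative matrices (values chosen arbitrarily)
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[0, 1], [1, 0]])

# Associativity: A(BC) equals (AB)C
associative = np.array_equal(A @ (B @ C), (A @ B) @ C)

# Distributivity: A(B + C) equals AB + AC
distributive = np.array_equal(A @ (B + C), A @ B + A @ C)

# Commutativity fails for this pair: AB differs from BA
commutative = np.array_equal(A @ B, B @ A)

print("Associative:", associative)
print("Distributive:", distributive)
print("Commutative:", commutative)
```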

However, the dot product of two vectors is commutative. This means that if you have two vectors $\mathbf{x}$ and $\mathbf{y}$, then $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$. We will not dive deep into this, but it is important to see how these operations are performed.
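For reference, the dot product of two $n$-dimensional vectors is the sum of the products of their corresponding components. Commutativity follows because each term $x_i y_i$ equals $y_i x_i$. Worked out for the vectors $[1, 2, 3]$ and $[4, 5, 6]$ used throughout this notebook:

$\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} x_i y_i = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32$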


In [11]:

# Function to demonstrate that matrix multiplication is not commutative
def commutative_property():
    # Create a matrix
    matrix = np.array([[1, 2], [3, 4]])
    print("Matrix 1: \n", matrix)
    
    # Create another matrix
    matrix2 = np.array([[5, 6], [7, 8]])
    print("Matrix 2: \n", matrix2)
    
    # Perform matrix multiplication in both orders; the results differ
    print("Matrix Multiplication: \n", np.dot(matrix, matrix2))
    print("Matrix Multiplication: \n", np.dot(matrix2, matrix))

print("--- Commutative Property ---")
commutative_property()

# Function to demonstrate the use of the dot product
def dot_product():
    # Create a vector
    vector = np.array([1, 2, 3])
    print("Vector 1: \n", vector)
    
    # Create another vector
    vector2 = np.array([4, 5, 6])
    print("Vector 2: \n", vector2)
    
    # Perform the dot product
    print("Dot Product: \n", np.dot(vector, vector2))
    print("Dot Product: \n", np.dot(vector2, vector))

print("--- Dot Product ---")
dot_product()


--- Commutative Property ---
Matrix 1: 
 [[1 2]
 [3 4]]
Matrix 2: 
 [[5 6]
 [7 8]]
Matrix Multiplication: 
 [[19 22]
 [43 50]]
Matrix Multiplication: 
 [[23 34]
 [31 46]]
--- Dot Product ---
Vector 1: 
 [1 2 3]
Vector 2: 
 [4 5 6]
Dot Product: 
 32
Dot Product: 
 32


## Tensors

Tensors are a generalization of vectors and matrices and are represented as multi-dimensional arrays. Tensors are usually denoted by upper case bold letters, such as $\mathbf{X}$ or $\mathbf{Y}$. The number of subscripts matches the number of dimensions, so the components of a three-dimensional tensor are denoted by lower case letters with three subscripts, such as $x_{111}$ or $y_{222}$. In code, these components are accessed by their three indices, such as $x[0][0][0]$ or $y[1][2][3]$.

Tensors are used to represent multi-dimensional data, such as images, which are usually represented as three-dimensional tensors. The first dimension represents the height of the image, the second dimension represents the width of the image, and the third dimension represents the color channels of the image. For example, a color image with a height of 256 pixels, a width of 256 pixels, and three color channels (red, green, and blue) can be represented as a three-dimensional tensor with a shape of (256, 256, 3).
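The image example can be sketched with a placeholder array. The all-zero pixel values, the `uint8` data type, and the red, green, blue channel order are assumptions made for illustration; real image libraries may store images differently.

```python
import numpy as np

# Placeholder image: 256 x 256 pixels, 3 color channels, all zeros (black)
image = np.zeros((256, 256, 3), dtype=np.uint8)

height, width, channels = image.shape

# Set the pixel at row 0, column 0 to pure red (channel order assumed R, G, B)
image[0, 0] = [255, 0, 0]

# Slicing out one channel leaves a 256 x 256 matrix
red_channel = image[:, :, 0]

print("Image shape:", image.shape)
print("Red channel shape:", red_channel.shape)
print("Top-left pixel:", image[0, 0])
```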

To better understand tensors, we can print the output of NumPy operations such as addition and reshaping.

In [9]:

# Function to demonstrate the use of tensors
def tensor_operations():
    # Create a tensor
    tensor = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[10, 11, 12], [13, 14, 15], [16, 17, 18]]])
    print("Tensor:", tensor)
    
    # Perform tensor addition
    tensor2 = np.array([[[19, 20, 21], [22, 23, 24], [25, 26, 27]], [[28, 29, 30], [31, 32, 33], [34, 35, 36]]])
    print("Tensor Addition:", np.add(tensor, tensor2))
    
    # Perform tensor reshaping
    print("Tensor Reshaping:", np.reshape(tensor, (3, 2, 3)))

print("--- Tensor Operations ---")
tensor_operations()


Tensor operations are similar to matrix operations. Element-wise operations such as addition work on tensors exactly as they do on matrices. Products that generalize matrix multiplication, such as multiplying stacks of matrices, inherit its properties: the order matters, so for two tensors $A$ and $B$, $A \cdot B = B \cdot A$ does not hold in general. The associative and distributive properties do hold when the shapes are compatible: for three tensors $A$, $B$, and $C$, $A \cdot (B \cdot C) = (A \cdot B) \cdot C$ and $A \cdot (B + C) = A \cdot B + A \cdot C$.
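The text does not say which tensor product it has in mind, so the sketch below assumes one common interpretation: batched matrix multiplication, where `np.matmul` treats a three-dimensional array as a stack of matrices and multiplies them pair by pair. The array values are arbitrary integers, which lets the results be compared exactly.

```python
import numpy as np

# Three stacks of 2 x 2 matrices, each with shape (2, 2, 2); values are arbitrary
A = np.arange(8).reshape(2, 2, 2)
B = np.arange(8, 16).reshape(2, 2, 2)
C = np.arange(16, 24).reshape(2, 2, 2)

# np.matmul multiplies the matrices in each stack pair by pair
AB = np.matmul(A, B)
BA = np.matmul(B, A)

# Order matters: AB and BA differ for these stacks
order_matters = not np.array_equal(AB, BA)

# Associativity and distributivity still hold
associative = np.array_equal(np.matmul(A, np.matmul(B, C)),
                             np.matmul(np.matmul(A, B), C))
distributive = np.array_equal(np.matmul(A, B + C),
                              np.matmul(A, B) + np.matmul(A, C))

print("Order matters:", order_matters)
print("Associative:", associative)
print("Distributive:", distributive)
```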

## Conclusion

In this notebook, we reviewed the basic concepts of scalars, vectors, matrices, and tensors. We also reviewed the basic operations of addition, multiplication, transpose, and reshaping as they apply to vectors, matrices, and tensors. We will use these concepts and operations later to help build our deep learning model from scratch.