# CS 316 : Introduction to Deep Learning
## Lab 02 : Linear Algebra
### Dr. Abdul Samad

# Overview

In this lab, we cover the fundamentals of linear algebra using NumPy. We start by learning how to represent scalars, vectors, and matrices in NumPy, and then how to operate on them.

In [2]:
# Import numpy using the alias np
import numpy as np
np.__version__

'1.26.4'

# Scalars



In this course, we use mathematical notation in which scalar variables are denoted by lower-case letters (e.g., $x$, $y$, and $z$).
The space of all (continuous) *real-valued* scalars is denoted by $ \mathbb{R} $.
For the sake of simplicity, we will avoid rigorous definitions of what exactly *space* is, but remember that the expression $x \in \mathbb{R}$ is a formal way of saying that $x$ is a real-valued scalar. The symbol $\in$, which is pronounced "in," denotes membership in a set.
Similarly, we could write $x, y \in \{0, 1\}$ to indicate that $x$ and $y$ are numbers that can take only the values $0$ or $1$.

**A tensor with only one element represents a scalar.** In the following snippet, we instantiate two scalars and use them to perform common arithmetic operations such as addition, multiplication, division, and exponentiation.

In [3]:
x = np.array(3.0)
y = np.array(2.0)

x + y, x * y, x / y, x**y

(5.0, 6.0, 1.5, 9.0)
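
To make the statement about one-element tensors concrete, the following minimal sketch inspects a scalar created with `np.array(3.0)`. It assumes nothing beyond standard NumPy attributes (`ndim`, `shape`, `size`) and the `item()` method.

```python
import numpy as np

# A scalar stored as a zero-dimensional NumPy array
x = np.array(3.0)

print(x.ndim)    # number of axes: 0
print(x.shape)   # shape is the empty tuple: ()
print(x.size)    # total number of elements: 1
print(x.item())  # the same value as a plain Python float: 3.0
```

A scalar array has no axes at all, which is why its shape is the empty tuple.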

# Vectors



**A vector is simply a list of scalar values.**
These values are known as the vector's *elements* (or *entries* or *components*).
When our vectors represent examples from our dataset, their values have some real-world significance. For example, if we were training a model to predict the risk of a loan default, we might assign each applicant to a vector whose components correspond to their income, length of employment, number of previous defaults, and other factors. If we were researching the risk of heart attacks that hospital patients might face, we might represent each patient with a vector whose components capture their most recent vital signs, cholesterol levels, minutes of exercise per day, and so on. Vectors are typically denoted in mathematical notation as bold-faced, lower-cased letters (e.g., $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$).

We work with vectors using one-dimensional tensors. Tensors can have arbitrary lengths in general, subject to the memory constraints of your machine.

In [4]:
x = np.arange(4)
x

array([0, 1, 2, 3])

We can use a subscript to refer to any element of a vector. For example, we can refer to the $i^\mathrm{th}$ element of $\mathbf{x}$ as $x_i$.
Because the element $x_i$ is a scalar, we do not bold-face the font when referring to it. This lab, like much of the literature, considers column vectors to be the default orientation of vectors. A vector $\mathbf{x}$ can be written as

$$\mathbf{x} =\begin{bmatrix}x_{1}  \\x_{2}  \\ \vdots  \\x_{n}\end{bmatrix},$$

where $x_1, \ldots, x_n$ are elements of the vector.
In code, we **access any element by indexing into the tensor.** Note that NumPy indices start at 0, so `x[3]` is the fourth element of `x`.


In [5]:
x[3]

3

## Length, Dimensionality, and Shape



A vector is simply an array of numbers, and like every array, every vector has a length.
If we want to express in math notation that a vector $\mathbf{x}$ consists of $n$ real-valued scalars, we write $\mathbf{x} \in \mathbb{R}^n$.
The length of a vector is commonly referred to as the vector's *dimension*. As with an ordinary Python list, we can access the length of a tensor by using Python's built-in `len()` function.





In [6]:
len(x)


4

When a tensor represents a vector (with exactly one axis), its length can be accessed via the `.shape` attribute.
The shape is a tuple that lists the length (dimensionality) of the tensor along each axis.
**The shape has only one element for tensors with only one axis.**


In [7]:
x.shape

(4,)

It should be noted that the word "dimension" is frequently overused in these contexts, which leads to confusion.
To be clear, we refer to the dimensionality of a *vector* or an *axis* as its length, i.e., the number of elements of a vector or an axis.
However, we refer to a tensor's dimensionality as the number of axes that it has.
In this sense, the dimensionality of a tensor axis is the length of that axis.
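
The distinction between the two meanings of "dimension" can be sketched in code. The example below is only an illustration with made-up arrays `v` and `M`: `ndim` counts the axes, `shape` lists the length of each axis, and `len()` returns the length of the first axis only.

```python
import numpy as np

v = np.arange(4)                 # one axis of length 4
M = np.arange(12).reshape(3, 4)  # two axes of lengths 3 and 4

print(v.ndim, v.shape, len(v))   # 1 (4,) 4
print(M.ndim, M.shape, len(M))   # 2 (3, 4) 3
print(M.size)                    # total number of elements: 12
```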

## Vector - Vector Multiplication

### Hadamard Product

The Hadamard product is a binary operation that takes two vectors of the same dimension and produces another vector of that dimension, in which element $i$ is the product of the $i^\mathrm{th}$ elements of the two operands: $(\mathbf{x} \odot \mathbf{y})_i = x_i y_i$.

In [8]:
x = np.arange(4)
y = np.ones(4)
print(x.shape)
print(y.shape)
np.multiply(x,y)

(4,)
(4,)


array([0., 1., 2., 3.])
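
As a small sketch of the same idea, the `*` operator on NumPy arrays is also elementwise, so it gives the same result as `np.multiply`. The vector `y` below is chosen only for illustration, so that the product differs visibly from `x`.

```python
import numpy as np

x = np.arange(4)                     # [0, 1, 2, 3]
y = np.array([2.0, 2.0, 2.0, 2.0])   # illustrative values

hadamard_func = np.multiply(x, y)    # function form
hadamard_op = x * y                  # operator form, also elementwise

# Explicit loop over indices, following the definition element by element
manual = np.array([x[i] * y[i] for i in range(len(x))])

print(hadamard_op)                   # [0. 2. 4. 6.]
```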

### Dot Product

So far, we have only performed elementwise operations. If this were all we had to offer, linear algebra might not even merit its own section. However, the dot product is one of the most fundamental operations.

Given two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^d$, their *dot product* $\mathbf{x}^\top \mathbf{y}$ or $\langle \mathbf{x}, \mathbf{y} \rangle$ is a sum of the products of the elements at the same index: $\mathbf{x}^\top \mathbf{y} = \sum_{i=1}^{d} x_i y_i$

In [9]:
x = np.arange(4)
y = np.ones(4)
print(x.shape)
print(y.shape)
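print(np.dot(x, y))    # .T is a no-op on a 1-D array, so it is omitted
print(x @ y)           # the @ operator computes the same dot product
print(np.sum(x * y))   # sum of the elementwise products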

(4,)
(4,)
6.0
6.0
6.0
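
To connect the summation formula to the code, here is a minimal sketch that accumulates $\sum_i x_i y_i$ with an explicit loop and compares it with `np.dot`. It also checks that transposing a one-dimensional array leaves its shape unchanged.

```python
import numpy as np

x = np.arange(4)   # [0, 1, 2, 3]
y = np.ones(4)     # [1., 1., 1., 1.]

# Accumulate the sum of products term by term
total = 0.0
for x_i, y_i in zip(x, y):
    total += x_i * y_i

print(total)                  # 6.0
print(x.T.shape == x.shape)   # True: .T does nothing to a 1-D array
```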


### Outer product

In [10]:
u = np.arange(4).reshape((4,1))
v = np.ones(3).reshape((3,1))
print(u.shape)
print(v.shape)
print(u @ v.T)

(4, 1)
(3, 1)
[[0. 0. 0.]
 [1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]
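
The outer product $\mathbf{u}\mathbf{v}^\top$ of $\mathbf{u} \in \mathbb{R}^m$ and $\mathbf{v} \in \mathbb{R}^n$ is the $m \times n$ matrix whose entry in row $i$ and column $j$ is $u_i v_j$. As a sketch, NumPy's `np.outer` computes the same matrix directly from one-dimensional arrays, without reshaping them into columns first.

```python
import numpy as np

u = np.arange(4)   # [0, 1, 2, 3]
v = np.ones(3)     # [1., 1., 1.]

outer = np.outer(u, v)                            # shape (4, 3)
via_matmul = u.reshape(4, 1) @ v.reshape(1, 3)    # column times row

print(outer.shape)                         # (4, 3)
print(np.array_equal(outer, via_matmul))   # True
print(outer[2, 1] == u[2] * v[1])          # True: entry (i, j) is u_i * v_j
```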


# Matrices

Just as vectors generalize scalars from order zero to order one,
matrices generalize vectors from order one to order two.
Matrices, which we will typically denote with bold-faced, capital letters
(e.g., $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$),
are represented in code as tensors with two axes.

In math notation, we use $\mathbf{A} \in \mathbb{R}^{m \times n}$
to express that the matrix $\mathbf{A}$ consists of $m$ rows and $n$ columns of real-valued scalars.
Visually, we can illustrate any matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ as a table,
where each element $a_{ij}$ belongs to the $i^{\mathrm{th}}$ row and $j^{\mathrm{th}}$ column:

$$\mathbf{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix}.$$


For any $\mathbf{A} \in \mathbb{R}^{m \times n}$, the shape of $\mathbf{A}$
is ($m$, $n$) or $m \times n$.
Specifically, when a matrix has the same number of rows and columns,
its shape becomes a square; thus, it is called a *square matrix*.

We can **create an $m \times n$ matrix**
by specifying a shape with two components $m$ and $n$
when calling any of our favorite functions for instantiating a tensor.


In [11]:
A = np.arange(20).reshape(5, 4)
A

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [14]:
# Number of rows (length of axis 0); len(A) returns the same value
print(A.shape[0])

5


In [None]:
# Total number of elements: flatten A to one axis and read its length (same as A.size)
A.reshape(-1).shape[0]

20

In [None]:
I = np.eye(4)
I

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [None]:
D = np.diag([1,4,8,16])
D

array([[ 1,  0,  0,  0],
       [ 0,  4,  0,  0],
       [ 0,  0,  8,  0],
       [ 0,  0,  0, 16]])
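
The two cells above create an identity matrix and a diagonal matrix. As a brief sketch of why these are useful, multiplying by the identity leaves a matrix or vector unchanged, while multiplying a vector by a diagonal matrix scales each element by the corresponding diagonal entry. The vector `x` here is an illustrative choice.

```python
import numpy as np

I = np.eye(4)                # 4 x 4 identity matrix
D = np.diag([1, 4, 8, 16])   # diagonal matrix built from a list
x = np.arange(4)             # [0, 1, 2, 3]

print(np.array_equal(I @ D, D))   # True: the identity leaves D unchanged
print(I @ x)                      # [0. 1. 2. 3.]
print(D @ x)                      # [ 0  4 16 48]: each x_i scaled by d_ii
print(np.diag(D))                 # [ 1  4  8 16]: np.diag also extracts a diagonal
```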

## Matrix-Vector Products



Now that we know how to calculate dot products,
we can begin to understand *matrix-vector products*.
Recall the matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$
and the vector $\mathbf{x} \in \mathbb{R}^n$
defined earlier.
Let us start off by visualizing the matrix $\mathbf{A}$ in terms of its row vectors

$$\mathbf{A}=
\begin{bmatrix}
\mathbf{a}^\top_{1} \\
\mathbf{a}^\top_{2} \\
\vdots \\
\mathbf{a}^\top_m \\
\end{bmatrix},$$

where each $\mathbf{a}^\top_{i} \in \mathbb{R}^n$
is a row vector representing the $i^\mathrm{th}$ row of the matrix $\mathbf{A}$.

**The matrix-vector product $\mathbf{A}\mathbf{x}$
is simply a column vector of length $m$,
whose $i^\mathrm{th}$ element is the dot product $\mathbf{a}^\top_i \mathbf{x}$:**

$$
\mathbf{A}\mathbf{x}
= \begin{bmatrix}
\mathbf{a}^\top_{1} \\
\mathbf{a}^\top_{2} \\
\vdots \\
\mathbf{a}^\top_m \\
\end{bmatrix}\mathbf{x}
= \begin{bmatrix}
 \mathbf{a}^\top_{1} \mathbf{x}  \\
 \mathbf{a}^\top_{2} \mathbf{x} \\
\vdots\\
 \mathbf{a}^\top_{m} \mathbf{x}\\
\end{bmatrix}.
$$

We can think of multiplication by a matrix $\mathbf{A}\in \mathbb{R}^{m \times n}$
as a transformation that maps vectors
from $\mathbb{R}^{n}$ to $\mathbb{R}^{m}$.
These transformations turn out to be remarkably useful.
For example, we can represent rotations
as multiplications by a square matrix.
As we will see in subsequent chapters,
we can also use matrix-vector products
to describe the most intensive calculations
required when computing each layer in a neural network
given the values of the previous layer.
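
As an illustration of the rotation example, the sketch below builds the standard $2 \times 2$ rotation matrix for an angle `theta` and applies it to a vector. The angle and the vector are illustrative choices, not values taken from this lab.

```python
import numpy as np

theta = np.pi / 2   # rotate by 90 degrees counter-clockwise

# Standard 2-D rotation matrix for angle theta
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])   # unit vector along the first axis
rotated = R @ v            # matrix-vector product

print(np.round(rotated, 6))                             # [0. 1.]
print(np.isclose(np.linalg.norm(rotated),
                 np.linalg.norm(v)))                    # True: length is preserved
```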

Expressing matrix-vector products in code with tensors,
we use the same `dot` function as for dot products.
When we call `np.dot(A, x)` with a matrix `A` and a vector `x`,
the matrix-vector product is performed.
Note that the column dimension of `A` (its length along axis 1)
must be the same as the dimension of `x` (its length).


In [None]:
A = np.arange(20).reshape(5, 4)
x = np.arange(4)
A.shape, x.shape, np.dot(A, x)

((5, 4), (4,), array([ 14,  38,  62,  86, 110]))
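
The row-wise definition given above can be checked directly. This sketch computes one dot product per row of `A` and compares the result with `np.dot(A, x)`.

```python
import numpy as np

A = np.arange(20).reshape(5, 4)
x = np.arange(4)

# One dot product per row: the i-th entry is a_i^T x
row_wise = np.array([np.dot(A[i], x) for i in range(A.shape[0])])

print(row_wise)                                # [ 14  38  62  86 110]
print(np.array_equal(row_wise, np.dot(A, x)))  # True
```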

## Matrix-Matrix Multiplication




Say that we have two matrices $\mathbf{A} \in \mathbb{R}^{n \times k}$ and $\mathbf{B} \in \mathbb{R}^{k \times m}$:

$$\mathbf{A}=\begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1k} \\
 a_{21} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{n1} & a_{n2} & \cdots & a_{nk} \\
\end{bmatrix},\quad
\mathbf{B}=\begin{bmatrix}
 b_{11} & b_{12} & \cdots & b_{1m} \\
 b_{21} & b_{22} & \cdots & b_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
 b_{k1} & b_{k2} & \cdots & b_{km} \\
\end{bmatrix}.$$


Denote by $\mathbf{a}^\top_{i} \in \mathbb{R}^k$
the row vector representing the $i^\mathrm{th}$ row of the matrix $\mathbf{A}$,
and let $\mathbf{b}_{j} \in \mathbb{R}^k$
be the column vector from the $j^\mathrm{th}$ column of the matrix $\mathbf{B}$.
To produce the matrix product $\mathbf{C} = \mathbf{A}\mathbf{B}$, it is easiest to think of $\mathbf{A}$ in terms of its row vectors and $\mathbf{B}$ in terms of its column vectors:

$$\mathbf{A}=
\begin{bmatrix}
\mathbf{a}^\top_{1} \\
\mathbf{a}^\top_{2} \\
\vdots \\
\mathbf{a}^\top_n \\
\end{bmatrix},
\quad \mathbf{B}=\begin{bmatrix}
 \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{m} \\
\end{bmatrix}.
$$


Then the matrix product $\mathbf{C} \in \mathbb{R}^{n \times m}$ is produced as we simply compute each element $c_{ij}$ as the dot product $\mathbf{a}^\top_i \mathbf{b}_j$:

$$\mathbf{C} = \mathbf{AB} = \begin{bmatrix}
\mathbf{a}^\top_{1} \\
\mathbf{a}^\top_{2} \\
\vdots \\
\mathbf{a}^\top_n \\
\end{bmatrix}
\begin{bmatrix}
 \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{m} \\
\end{bmatrix}
= \begin{bmatrix}
\mathbf{a}^\top_{1} \mathbf{b}_1 & \mathbf{a}^\top_{1}\mathbf{b}_2& \cdots & \mathbf{a}^\top_{1} \mathbf{b}_m \\
 \mathbf{a}^\top_{2}\mathbf{b}_1 & \mathbf{a}^\top_{2} \mathbf{b}_2 & \cdots & \mathbf{a}^\top_{2} \mathbf{b}_m \\
 \vdots & \vdots & \ddots &\vdots\\
\mathbf{a}^\top_{n} \mathbf{b}_1 & \mathbf{a}^\top_{n}\mathbf{b}_2& \cdots& \mathbf{a}^\top_{n} \mathbf{b}_m
\end{bmatrix}.
$$

**We can think of the matrix-matrix multiplication $\mathbf{AB}$ as simply performing $m$ matrix-vector products and stitching the results together to form an $n \times m$ matrix.**
In the following snippet, we perform matrix multiplication on `A` and `B`.
Here, `A` is a matrix with 5 rows and 4 columns,
and `B` is a matrix with 4 rows and 3 columns.
After multiplication, we obtain a matrix with 5 rows and 3 columns.


In [None]:
A = np.arange(20).reshape(5, 4)
B = np.ones(shape=(4, 3))


In [None]:
np.dot(A, B)

array([[ 6.,  6.,  6.],
       [22., 22., 22.],
       [38., 38., 38.],
       [54., 54., 54.],
       [70., 70., 70.]])

In [None]:
np.matmul(A,B)

array([[ 6.,  6.,  6.],
       [22., 22., 22.],
       [38., 38., 38.],
       [54., 54., 54.],
       [70., 70., 70.]])

In [None]:
A @ B

array([[ 6.,  6.,  6.],
       [22., 22., 22.],
       [38., 38., 38.],
       [54., 54., 54.],
       [70., 70., 70.]])
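
The "stitching" view of matrix-matrix multiplication can be sketched as follows, assuming `B` has 4 rows and 3 columns as described in the text. Each column of `B` gives one matrix-vector product, and the results are stacked as the columns of the output.

```python
import numpy as np

A = np.arange(20).reshape(5, 4)
B = np.ones((4, 3))

# One matrix-vector product per column of B
columns = [A @ B[:, j] for j in range(B.shape[1])]

# Stack the resulting vectors side by side as columns
C = np.stack(columns, axis=1)

print(C.shape)                 # (5, 3)
print(np.allclose(C, A @ B))   # True
```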

# Linear Algebra Module

The NumPy library contains a submodule, `linalg`, which provides efficient low-level implementations of standard linear algebra algorithms. In this lab, we will only cover a few of its functions, such as finding the inverse or rank of a matrix. More details can be found in the <a href="https://numpy.org/doc/stable/reference/routines.linalg.html">documentation</a>.
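
As a short sketch of the rank computation mentioned above, `np.linalg.matrix_rank` returns the number of linearly independent rows (or columns) of a matrix. The two matrices below are illustrative: in the second, one row is a multiple of the other.

```python
import numpy as np

full = np.array([[1., 2.], [3., 4.]])        # rows are linearly independent
deficient = np.array([[1., 2.], [2., 4.]])   # second row = 2 * first row

print(np.linalg.matrix_rank(full))        # 2
print(np.linalg.matrix_rank(deficient))   # 1
```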

## Inverse

In [17]:
A = np.array([[1., 2.], [3., 4.]])
invA = np.linalg.inv(A)  # inverse of A as a regular ndarray (np.matrix is not needed)
result = invA @ A        # should be (numerically) the identity matrix
print(f'det(A) = {np.linalg.det(A)}')
print(f'A = {A}')
print(f'A^-1 = {invA}')
print(f'A^-1 A = {np.round(result,2)}')

det(A) = -2.0000000000000004
A = [[1. 2.]
 [3. 4.]]
A^-1 = [[-2.   1. ]
 [ 1.5 -0.5]]
A^-1 A = [[1. 0.]
 [0. 1.]]
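
A common use of the inverse is solving a linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$. The sketch below uses an illustrative right-hand side `b` and compares multiplying by the inverse with `np.linalg.solve`, which solves the system without forming the inverse explicitly and is usually preferred in practice.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
b = np.array([5., 6.])   # illustrative right-hand side

x_solve = np.linalg.solve(A, b)   # solve A x = b directly
x_inv = np.linalg.inv(A) @ b      # same solution via the explicit inverse

print(x_solve)                       # approximately [-4.   4.5]
print(np.allclose(x_solve, x_inv))   # True
print(np.allclose(A @ x_solve, b))   # True: the solution satisfies the system
```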
