Linear Algebra and Calculus

General notations

Vector: a vector is a list of numbers, usually written vertically as a column or horizontally as a row. The numbers that make up the list are called the entries of the vector.

Given a vector x with n entries, where each entry xi ∈ R is the i^th component of the vector, it can be represented as:

x = [x1,x2,x3,...,xn] ∈ R^n

Matrix: a matrix is a rectangular array of numbers arranged in rows and columns. The numbers that make up the array are called the entries of the matrix.

For a matrix A with m rows and n columns, we denote A ∈ R^m×n. Each entry Ai,j ∈ R in the matrix represents the element located in the i^th row and j^th column. The matrix A can be expressed as:
A = [A1,1,A1,2,...,A1,n
     A2,1,A2,2,...,A2,n
     ...
     Am,1,Am,2,...,Am,n] ∈ R^m×n

The vector x defined above can be considered an n×1 matrix. In that form it is called a column vector and is written with one entry per row:

x = [x1
     x2
     ...
     xn] ∈ R^n×1
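
The notation above can be mirrored in code. The following is a minimal sketch, assuming plain Python lists stand in for vectors and matrices; the names `x`, `A`, `x_col` and `shape` are illustrative only. Python indexes from 0, so the entry Ai,j corresponds to `A[i-1][j-1]`.

```python
# A vector in R^3 stored as a flat list of entries.
x = [1.0, 2.0, 3.0]

# A matrix in R^(2x3) stored as a list of rows.
A = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

def shape(M):
    """Return (rows, columns) of a list-of-rows matrix."""
    return len(M), len(M[0])

# The same vector written as a column vector, i.e. a 3x1 matrix.
x_col = [[xi] for xi in x]

print(shape(A))      # (2, 3)
print(A[0][1])       # entry A1,2 -> 2.0
print(shape(x_col))  # (3, 1)
```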

Identity matrix

The identity matrix I ∈ R^n×n is a square matrix with ones on its main diagonal and zeroes everywhere else.

The matrix I is represented as:

I = [1,0,0,...,0
     0,1,0,...,0
     ...
     0,0,0,...,1] ∈ R^n×n

Remark: For all matrices A ∈ R^n×n, we have:

A × I = I × A = A
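
The remark can be checked numerically. This is a small sketch using list-of-rows matrices; the helpers `identity` and `matmul` are hypothetical names introduced for the example.

```python
def identity(n):
    """Build the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Multiply two list-of-rows matrices with compatible shapes."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, -1],
     [0, 3]]
I = identity(2)

print(matmul(A, I) == A)  # True
print(matmul(I, A) == A)  # True
```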

Matrix operations

Vector-vector multiplication: there are two types of vector-vector products, the inner product and the outer product.
For the inner product of vectors x and y:
Given x,y ∈ R^n the inner product is given by:

x.y = x1y1 + x2y2 + x3y3 + ... + xnyn

This is the sum of the products of the corresponding entries of x and y.
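
A minimal sketch of the inner product, assuming vectors are plain Python lists of equal length (the function name `inner` is illustrative):

```python
def inner(x, y):
    """Inner (dot) product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of entries")
    return sum(xi * yi for xi, yi in zip(x, y))

x = [1, 2, 3]
y = [4, 5, 6]
print(inner(x, y))  # 1*4 + 2*5 + 3*6 = 32
```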

 

For the outer product of vectors x and y:
Given x ∈ R^m and y ∈ R^n, the outer product is given by:

x ⊗ y = xy^T

Where:
      x ⊗ y = [x1y1,x1y2,...,x1yn
               x2y1,x2y2,...,x2yn
               ...
               xmy1,xmy2,...,xmyn]
This matrix is of size m×n and is formed by multiplying each element of x with each element of y.
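
As a sketch of the outer product under the same list-based assumption (the name `outer` is illustrative), each row of the result is y scaled by one entry of x:

```python
def outer(x, y):
    """Outer product x y^T: an m x n matrix with entries xi * yj."""
    return [[xi * yj for yj in y] for xi in x]

x = [1, 2]       # m = 2
y = [3, 4, 5]    # n = 3
for row in outer(x, y):
    print(row)
# [3, 4, 5]
# [6, 8, 10]
```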

Matrix-vector multiplication:

Given a matrix A ∈ R^m×n and a vector x ∈ R^n, their product is a vector in R^m. The multiplication is defined as follows:

y = A . x 
Here, y is the resulting vector y ∈ R^m and its i^th element can be computed using:

yi = sum(Ai,j.xj) for j = 1 to n
Where yi is the i^th element of vector y, Ai,j is the element of the matrix A located in the i^th row and j^th column, and xj is the j^th element of vector x.
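
The element-wise formula translates almost directly into code. A minimal sketch, assuming a list-of-rows matrix and a list vector (the name `matvec` is illustrative): each yi is the inner product of row i of A with x.

```python
def matvec(A, x):
    """Matrix-vector product: y_i = sum over j of A[i][j] * x[j]."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]      # A is 2 x 3
x = [1, 0, -1]       # x has 3 entries
print(matvec(A, x))  # [-2, -2], a vector with 2 entries
```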



Matrix-matrix multiplication:

Given matrices A ∈ R^m×n and B ∈ R^n×p, their product results in a matrix C ∈ R^m×p:
C = A . B
The element Ci,j ∈ R is given by:
Ci,j = sum(Ai,k.Bk,j) for k = 1 to n
Where Ci,j is the element of matrix C located in the i^th row and j^th column, Ai,k is the element of matrix A located in the i^th row and k^th column, and Bk,j is the element of matrix B located in the k^th row and j^th column.
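
A small sketch of the same summation with list-of-rows matrices (the name `matmul` is illustrative). The inner dimension n must match: A has n columns and B has n rows.

```python
def matmul(A, B):
    """Matrix product: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4],
     [5, 6]]          # 3 x 2
B = [[1, 0, 2],
     [0, 1, 3]]       # 2 x 3
for row in matmul(A, B):
    print(row)
# [1, 2, 8]
# [3, 4, 18]
# [5, 6, 28]
```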

Transpose:

Given a matrix A, its transpose, denoted A^T, is obtained by flipping the matrix over its diagonal. This switches its row and column indices.

(A^T)j,i = Ai,j for all i and j

If A is of size m×n, then A^T is of size n×m.
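
As a brief sketch with a list-of-rows matrix (the name `transpose` is illustrative), the built-in `zip` regroups the rows into columns:

```python
def transpose(A):
    """Return A^T: entry (j, i) of the result is entry (i, j) of A."""
    return [list(column) for column in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]         # 2 x 3
print(transpose(A))     # [[1, 4], [2, 5], [3, 6]], which is 3 x 2
```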

Inverse:

Given an invertible square matrix A, its inverse is denoted A^−1. It is the unique matrix that satisfies:

A.A^−1 = A^−1.A = I

Where I is the identity matrix of the same size as A.
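
The defining property can be verified on a small case. The sketch below is restricted to 2×2 matrices and uses the standard closed-form 2×2 inverse (swap the diagonal, negate the off-diagonal, divide by the determinant), which is an assumption brought in for illustration and not a general method. Exact fractions are used so the check is not affected by rounding; the names `inverse_2x2` and `matmul` are illustrative.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2 x 2 integer matrix via the closed-form formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is 0)")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def matmul(A, B):
    """Multiply two list-of-rows matrices with compatible shapes."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[4, 7],
     [2, 6]]
A_inv = inverse_2x2(A)
I = [[1, 0],
     [0, 1]]

print(matmul(A, A_inv) == I)  # True
print(matmul(A_inv, A) == I)  # True
```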

Matrix calculus

Gradient

Given a function f: R^m×n → R and a matrix A ∈ R^m×n, the gradient of f with respect to A is a matrix of size m×n, denoted ∇_A f(A). Each entry (i,j) of ∇_A f(A) is the partial derivative of f with respect to the (i,j)-th entry of A:

[∇_A f(A)]i,j = ∂f(A) / ∂Ai,j

This means the entry at the i-th row and j-th column of the gradient matrix represents how f changes as Ai,j changes, keeping all other entries of A constant.
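
To make the entry-wise definition concrete, here is a hedged sketch that approximates each partial derivative with a central finite difference. The example function f(A) = sum of squared entries, the step size `h`, and the name `numerical_gradient` are all assumptions chosen for illustration; for this f the exact gradient is 2A, which gives something to compare against.

```python
def f(A):
    """Example scalar function: the sum of the squared entries of A."""
    return sum(a * a for row in A for a in row)

def numerical_gradient(f, A, h=1e-6):
    """Approximate the gradient of f at A, one entry at a time."""
    grad = []
    for i, row in enumerate(A):
        grad_row = []
        for j in range(len(row)):
            # Perturb only entry (i, j); all other entries stay fixed.
            plus = [r[:] for r in A]
            minus = [r[:] for r in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad_row.append((f(plus) - f(minus)) / (2 * h))
        grad.append(grad_row)
    return grad

A = [[1.0, 2.0],
     [3.0, 4.0]]
G = numerical_gradient(f, A)
for row in G:
    print(row)  # each row is close to the matching row of 2*A
```

The result has the same shape as A, as the definition requires.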


Matrix properties

Norm

Given a vector space V, a norm is a function N:V→[0,+∞) that satisfies the following properties for all vectors x,y in V and all scalars α:

1- Non-negativity: N(x)≥0 and N(x)=0 if and only if x=0.
2- Absolute homogeneity (scalar multiplication): N(αx)=|α|N(x).
3- Triangle inequality: N(x+y)≤N(x)+N(y).

These conditions ensure that the function N behaves like a measure of "length" or "size" for vectors in V.

Types of norms for vectors:

1- L¹ norm (Manhattan):
||x||₁ = |x1| + |x2| + ... + |xn|

2- L² norm (Euclidean):
||x||₂ = √(|x1|² + |x2|² + ... + |xn|²)

3- L∞ norm (Maximum):
||x||∞ = max(|x1|,|x2|,...,|xn|)
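
The three vector norms can be sketched as follows, assuming a vector is a plain Python list (the function names are illustrative):

```python
import math

def norm_l1(x):
    """L1 (Manhattan) norm: sum of absolute values."""
    return sum(abs(xi) for xi in x)

def norm_l2(x):
    """L2 (Euclidean) norm: square root of the sum of squares."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_linf(x):
    """L-infinity (maximum) norm: largest absolute value."""
    return max(abs(xi) for xi in x)

x = [3, -4]
print(norm_l1(x))    # 7
print(norm_l2(x))    # 5.0
print(norm_linf(x))  # 4
```

For any vector, the maximum norm is at most the Euclidean norm, which is at most the Manhattan norm, as the values 4, 5 and 7 illustrate here.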

Types of norms for matrices:

1- Frobenius norm:
||A||F = √(sum(Ai,j²)) for i = 1 to m and j = 1 to n
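
A minimal sketch of the Frobenius norm for a list-of-rows matrix (the name `norm_frobenius` is illustrative). It is the Euclidean norm of the matrix entries taken as one long vector.

```python
import math

def norm_frobenius(A):
    """Frobenius norm: square root of the sum of all squared entries."""
    return math.sqrt(sum(a_ij * a_ij for row in A for a_ij in row))

A = [[1, 2],
     [2, 4]]
print(norm_frobenius(A))  # sqrt(1 + 4 + 4 + 16) = 5.0
```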