# Linear Algebra Fundamentals

## Scalars

### Operations with Python built-in functions

- Addition:
    - $c = a + b$

In [1]:
a = 6
b = 3
c = a + b

In [2]:
print(c)

9


- Subtraction:
    - $c = a - b$

In [3]:
c = a - b

In [4]:
print(c)

3


- Multiplication:
    - $c = a * b$

In [5]:
c = a * b

In [6]:
print(c)

18


- Division:
    - $c = a / b$

In [7]:
c = a / b

In [8]:
print(c)

2.0


### Scalar norm/magnitude:

- Absolute scalar value:
    - $|c| = \begin{cases} c \text{ if }c>0, \\ -c \text{ otherwise.} \end{cases}$

In [9]:
c = -1
c_new = abs(c)

In [10]:
print(c)
print(c_new)

-1
1


In [11]:
c = 1 
c_new = abs(c)

In [12]:
print(c)
print(c_new)

1
1


#### Dimensionality:
- A real-valued scalar variable $x$ consists of one element. It therefore has dimensionality one, denoted in the literature as $x\in\mathbb{R}$.

## Vector operations using Numpy package

In [13]:
import numpy as np

- vector representation:
    - $\vec{a}=\Bigl[\begin{matrix}2 \\4\end{matrix}\Bigl]$
    
![Vector](./resources/vec.png)

In [39]:
a1 = np.array([2, 4])      # 1-D array, shape (2,): neither a row nor a column vector
a2 = np.array([[2], [4]])  # 2-D array, shape (2, 1): column vector

In [40]:
print(a1)
print(a1.shape)

[2 4]
(2,)


In [41]:
print(a2)
print(a2.shape)

[[2]
 [4]]
(2, 1)


ref. [Comparing two NumPy arrays for equality, element-wise](https://stackoverflow.com/questions/10580676/comparing-two-numpy-arrays-for-equality-element-wise)

In [47]:
a1

array([2, 4])

In [49]:
a2.T  # transpose the column vector

array([[2, 4]])

In [58]:
np.array_equal(a2.T, a1)

False

In [59]:
a2.T[0]

array([2, 4])

In [84]:
print(np.array_equal(a2.T[0], a1))  # a2 is 2-D and a1 is 1-D, so compare the first row of a2.T
print(a2.T.shape)
print(a2[0].shape)

True
(1, 2)
(1,)


In [85]:
a1.T  # its shape is (2,), i.e. 1-D, so transposing has no effect

array([2, 4])

In [86]:
a3 = a1.reshape((1,2))
print(a3)
print(a3.shape)

[[2 4]]
(1, 2)


In [87]:
a3.T #transpose the row vector

array([[2],
       [4]])

In [88]:
np.array_equal(a3.T, a2)

True
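
The shape differences above come up often, so here is a minimal sketch of how a 1-D array can be turned into an explicit column or row vector and back. The names `v`, `col`, and `row` are illustrative and not part of the notebook.

```python
import numpy as np

v = np.array([2, 4])          # 1-D array, shape (2,)

col = v.reshape(-1, 1)        # column vector, shape (2, 1)
row = v[np.newaxis, :]        # row vector, shape (1, 2)

print(col.shape, row.shape)            # (2, 1) (1, 2)
print(np.array_equal(col.T, row))      # True: transposing the column gives the row
print(np.array_equal(col.ravel(), v))  # True: ravel() flattens back to 1-D
```

Using `reshape(-1, 1)` lets NumPy infer the number of rows, so the same line should work for vectors of any length.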

#### Dimensionality of vectors
- A vector $\vec{x}$ that consists of $n$ real-valued entries has dimensionality $n$, denoted as $\vec{x}\in\mathbb{R}^n$.
- We call $x_{i}$ the entry at the $i$-th position of $\vec{x}$.

In [61]:
# get the 1st and 2nd entries of vector a1 (remember: indexing in Python starts at 0)
print(a1[0], a1[1])

2 4


- Addition:
    - $\vec{c} = \vec{a} + \vec{b} = \Bigl[\begin{matrix}a_x + b_x \\ a_y + b_y\end{matrix}\Bigl]$
    
![VectorAddition](./resources/vadd.png)

In [62]:
a = np.array([2, 4])
b = np.array([3, 1])
c = a + b

In [63]:
print(c)

[5 5]


- Subtraction:
    - $\vec{c} = \vec{a} - \vec{b}=\Bigl[\begin{matrix}a_x - b_x \\ a_y - b_y\end{matrix}\Bigl]$
    
    
![VectorSubtraction](resources/vsub.png)

In [64]:
a = np.array([5, 3])
b = np.array([1, 2])
c = a - b

In [65]:
print(c)

[4 1]


- Vector scalar multiplication:
    - $\vec{c} = \beta \cdot \vec{a}=\Bigl[\begin{matrix}\beta\cdot a_x \\ \beta\cdot a_y\end{matrix}\Bigl]$
    
    
![vs](resources/vs.png)

In [66]:
a = np.array([2, 1])
beta = 2
c = beta * a

In [67]:
print(c)

[4 2]


### Vector magnitude/length:
- $||\vec{a}||=\sqrt{\sum_{i=1}^{2}a_{i}^{2}} = \sqrt{a_{x}^2 + a_{y}^2} = \sqrt{a_{1}^2 + a_{2}^2}$
- $||\vec{b}||=\sqrt{\sum_{i=1}^{3}b_{i}^{2}} = \sqrt{b_{x}^2 + b_{y}^2 + b_{z}^2} = \sqrt{b_{1}^2 + b_{2}^2 + b_{3}^2}$

![VectorNorm](./resources/vnorm.png)

[numpy.linalg.norm documentation](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html)

In [68]:
a = np.array([4, 2])
b = np.array([6, 1, 3])
a_length = np.linalg.norm(a,ord=2) #lin(ear) alg(ebra) library
b_length = np.linalg.norm(b,ord=2)

In [69]:
print(a_length)
print(b_length)

4.47213595499958
6.782329983125268


- normalizing vector length:
    - $\vec{a}_{normalized}= \frac{\vec{a}}{||\vec{a}||}$

In [70]:
a_normalized =  a / a_length
a_normalized_length = np.linalg.norm(a_normalized, 2)

In [71]:
print(a_normalized_length) #unit vector

0.9999999999999999
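
The printed length is not exactly 1 because of floating-point rounding. A small sketch, assuming a tolerance-based comparison is acceptable for the use case, shows how such a result can be checked with `np.isclose` instead of `==`:

```python
import numpy as np

a = np.array([4, 2])
a_unit = a / np.linalg.norm(a)   # normalize to unit length

length = np.linalg.norm(a_unit)
print(length == 1.0)             # may be False because of rounding
print(np.isclose(length, 1.0))   # True: equal within floating-point tolerance
```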


### Vector transpose
- $\vec{a}=\Bigl[\begin{matrix}a_x \\ a_y\end{matrix}\Bigl]\rightarrow \vec{a}^T=[a_x \ a_y ]$

In [89]:
a = np.array( [ [3] , [3] ] ) # 2 rows. each has just 1 component.

In [90]:
print(a)
print(a.shape)

[[3]
 [3]]
(2, 1)


In [91]:
print(a.T)
print(a.T.shape)

[[3 3]]
(1, 2)


In [92]:
a = np.array([3, 3])
print(a)
print(a.T)

[3 3]
[3 3]


### Dot product and vector projection

- dot product:
    - $a_b = \vec{a}^T \vec{b} = a_x \cdot b_x + a_y \cdot b_y$
    
    
![DotProduct](resources/dp.png)

In [93]:
a = np.array([[2], [1]]) # (2,1) #caution: NOT np.array([2,1])
b = np.array([[3], [1]]) # (2,1)

ab = np.dot(a.T, b) # (1,2)dot(2,1)

In [94]:
print(ab)

[[7]]


- Dot product of orthogonal vectors:
    - The dot product of two orthogonal vectors (90° angle between them) equals zero, e.g.
    - $\vec{a}^T\vec{b}=\Bigl[\begin{matrix}2 \\ 2\end{matrix}\Bigl]^T\Bigl[\begin{matrix}1 \\ -1\end{matrix}\Bigl]=0$
    
![orthogonal](resources/orth.png)

In [95]:
a = np.array([[2], [2]])
b = np.array([[1], [-1]])
c = np.dot(a.T,b)
print(c)

[[0]]
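
The link between a zero dot product and a 90° angle can be made concrete with the standard relation $\cos\theta = \frac{\vec{a}^T\vec{b}}{||\vec{a}||\,||\vec{b}||}$. The sketch below uses 1-D arrays and a hypothetical helper `angle_deg`, which is not part of the notebook:

```python
import numpy as np

def angle_deg(u, v):
    """Angle between two 1-D vectors in degrees (hypothetical helper)."""
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # clip guards against values slightly outside [-1, 1] caused by rounding
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

a = np.array([2, 2])
b = np.array([1, -1])
print(np.dot(a, b))      # 0: the vectors are orthogonal
print(angle_deg(a, b))   # approximately 90
```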


- Vector projection
    - unnormalized (scales $\vec{b}$ by the dot product; equals the projection only when $||\vec{b}||=1$): $\vec{a_{bu}}=\vec{a}^T\vec{b}\cdot\vec{b}$
    - normalized: $\vec{a_b}=\frac{\vec{a}^T\vec{b}}{||\vec{b}||}\cdot\frac{\vec{b}}{||\vec{b}||}$
![Projection](resources/dotext.png)

In [96]:
# unnormalized
a = np.array([5, 7])
b = np.array([11, 8])
abu = np.dot(a.T,b) * b
print(abu)

[1221  888]


In [97]:
# normalized
a = np.array([5, 7])
b = np.array([11, 8])
ab = (np.dot(a.T,b) / np.linalg.norm(b, ord=2)) * (b / np.linalg.norm(b, ord=2))
print(ab)

[6.6 4.8]


In [98]:
abu/ab # np.linalg.norm(b, ord=2)**2 == 185

array([185., 185.])

In [99]:
ab*185

array([1221.,  888.])

In [100]:
b/ab  

array([1.66666667, 1.66666667])

In [101]:
ab*1.67 # b is [11, 8]

array([11.022,  8.016])
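
As a cross-check of the normalized formula, the sketch below wraps it in a hypothetical helper `project` (not part of the notebook) and verifies that what remains of $\vec{a}$ after subtracting the projection is orthogonal to $\vec{b}$. It uses $||\vec{b}||^2 = \vec{b}^T\vec{b}$ to avoid computing the square root twice.

```python
import numpy as np

def project(a, b):
    """Project 1-D vector a onto 1-D vector b (hypothetical helper)."""
    return (np.dot(a, b) / np.dot(b, b)) * b

a = np.array([5, 7])
b = np.array([11, 8])

a_b = project(a, b)
residual = a - a_b            # the part of a that is perpendicular to b

print(a_b)                                    # close to [6.6 4.8]
print(np.isclose(np.dot(residual, b), 0.0))   # True: the residual is orthogonal to b
```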

## Matrices

- Matrix representation:
    - $A=\Bigl[\begin{matrix}2 & 1\\ 3 & 4\end{matrix}\Bigl]$

In [104]:
A = np.array([ [2, 1], [3, 4] ])
print(A)
print("\n")
print(A.shape)

[[2 1]
 [3 4]]


(2, 2)


### Dimensionality of matrices
- A matrix $A$ that consists of $n$ rows, each with $m$ real-valued entries, has dimensionality $n \times m$, denoted as $A\in\mathbb{R}^{n \times m}$.
- We call $A_{ij}$ the matrix entry in the $i$-th row and $j$-th column of the matrix $A$.

In [105]:
# get the entry in the 1st row and 2nd column (remember: indexing in Python starts at 0)
A[0,1]

1

- Matrix-Vector multiplication:
    - $\vec{b}=A\vec{a}=\Bigl[\begin{matrix}2 & 1\\ 3 & 4\end{matrix}\Bigl]\Bigl[\begin{matrix}a_x\\ a_y\end{matrix}\Bigl]=\Bigl[\begin{matrix}2\cdot a_x + 1\cdot a_y\\ 3\cdot a_x + 4\cdot a_y\end{matrix}\Bigl]$ 

- General multiplication scheme:
![mm](./resources/mm.png)

- Geometric interpretation of Matrix-Vector multiplication:
    - A matrix can be interpreted as scaling and rotating the vector that it gets multiplied with, e.g.
    - $\vec{b}=A\vec{a}=\Bigl[\begin{matrix}2 & 1\\ 1 & 2\end{matrix}\Bigl]\Bigl[\begin{matrix}2\\ 1\end{matrix}\Bigl]=\Bigl[\begin{matrix}5\\ 4\end{matrix}\Bigl]$

In [106]:
A = np.array([[2, 1], [1, 2]])
a = np.array([2, 1])
b = np.dot(A, a)
print(b)

[5 4]


- Visually

![scalerotate](resources/scale_rotate.png)
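
To connect the multiplication scheme with the geometric picture, here is a minimal sketch that builds the result row by row and then measures how much the vector was stretched and turned. The names `b_rows`, `scale`, and `angle` are illustrative only.

```python
import numpy as np

A = np.array([[2, 1], [1, 2]])
a = np.array([2, 1])

# each entry of the result is the dot product of one row of A with a
b_rows = np.array([np.dot(row, a) for row in A])
b = A @ a                      # the @ operator gives the same result as np.dot here

print(b_rows, b)               # [5 4] [5 4]

# how much the vector was stretched and turned
scale = np.linalg.norm(b) / np.linalg.norm(a)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
angle = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
print(scale, angle)            # scale factor and rotation angle in degrees
```

For this example the vector becomes longer (scale above 1) and turns by roughly 12 degrees.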

- Matrix-Matrix multiplication:
    - $C=AB=\Bigl[\begin{matrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{matrix}\Bigl]\Bigl[\begin{matrix}b_{11} & b_{12}\\ b_{21} & b_{22}\end{matrix}\Bigl]=\Bigl[\begin{matrix}a_{11}b_{11}+a_{12}b_{21} & a_{11}b_{12}+a_{12}b_{22}\\ a_{21}b_{11}+a_{22}b_{21} & a_{21}b_{12}+a_{22}b_{22}\end{matrix}\Bigl]$
- General multiplication scheme:
![m](resources/m.png)

In [107]:
A = np.array([[1, 1],[2, 2]])
B = np.array([[2, 3],[1, 2]])
C = np.dot(A,B)
print(C)

[[ 3  5]
 [ 6 10]]
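
The formula above says that each entry of $C$ is the dot product of a row of $A$ with a column of $B$. A short sketch with explicit loops (the name `C_loops` is illustrative) confirms this, and also shows that the order of the factors matters:

```python
import numpy as np

A = np.array([[1, 1], [2, 2]])
B = np.array([[2, 3], [1, 2]])

# C[i, j] is the dot product of row i of A with column j of B
C_loops = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        C_loops[i, j] = np.dot(A[i, :], B[:, j])

print(np.array_equal(C_loops, A @ B))   # True
print(np.array_equal(A @ B, B @ A))     # False: matrix multiplication is not commutative
```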


- Another way of multiplying matrices is element-wise multiplication, also called the Hadamard product (covered in the NumPy lecture):

In [108]:
C = A*B 
print(C)

[[2 3]
 [2 4]]


- Matrix transposition:
    - $A=\Bigl[\begin{matrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{matrix}\Bigl]\rightarrow A^T=\Bigl[\begin{matrix}a_{11} & a_{21}\\ a_{12} & a_{22}\end{matrix}\Bigl]$

In [109]:
A = np.array([[1, 2],[1, 1]])
print(A)
print("\n")
print(A.T)

[[1 2]
 [1 1]]


[[1 1]
 [2 1]]


- Matrix norms:
    - Frobenius norm:
    - $A=\Bigl[\begin{matrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{matrix}\Bigl]$
   $||A||_{Frobenius}=\sqrt{\sum_{i=1}^{2}\sum_{j=1}^{2}A_{ij}^2}=\sqrt{A_{11}^2+A_{12}^2+A_{21}^2+A_{22}^2}$

In [110]:
A = np.array([[2, 2],[2, 2]])
A_magnitude = np.linalg.norm(A) # default for a matrix is the Frobenius norm (same as ord='fro'); ord=2 would give the spectral norm
print(A_magnitude)

4.0
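
For the matrix above, the Frobenius norm and the spectral norm (`ord=2`) happen to coincide. The sketch below uses a different, arbitrarily chosen matrix to compute the Frobenius norm by hand from the formula and to show that the two norms generally differ:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # arbitrary example matrix

fro_manual = np.sqrt(np.sum(A.astype(float) ** 2))  # square root of the sum of squared entries
fro_numpy = np.linalg.norm(A)                       # default for 2-D input is Frobenius
fro_named = np.linalg.norm(A, ord='fro')
spectral = np.linalg.norm(A, ord=2)                 # largest singular value

print(np.isclose(fro_manual, fro_numpy), np.isclose(fro_numpy, fro_named))  # True True
print(fro_numpy, spectral)       # the two norms differ for this matrix
```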


- Eigenvalues/Eigenvectors:
    - Given a matrix A, the Eigenvalues $\lambda$ and Eigenvectors $\vec{e_v}$ of A solve the following equation:
    - $A\vec{e_v}=\lambda\vec{e_v}$

- Geometric interpretation of eigenvectors:
    - A matrix multiplied with its eigenvector only scales the eigenvector (a negative eigenvalue also flips its direction) and doesn't rotate it, unlike in the matrix-vector example above. Multiplying the eigenvector by the matrix therefore gives the same result as multiplying it by a scalar, which is called the eigenvalue. Each eigenvector of the matrix has its own corresponding eigenvalue.
    - Since eigenvalues scale the eigenvectors, the eigenvector of the largest eigenvalue can be interpreted as the direction of highest variance when the matrix is a covariance matrix.

![eigenvalues](resources/eigenvalues.png)

In [111]:
A = np.array([[1, 2], [3, -4]])
eigenvalues, eigenvectors = np.linalg.eig(A)

In [117]:
print(eigenvalues)
print("\n")
print(eigenvectors)
print("\n")
print(eigenvectors[:, 0])
print(eigenvectors[:, 1])

[ 2. -5.]


[[ 0.89442719 -0.31622777]
 [ 0.4472136   0.9486833 ]]


[0.89442719 0.4472136 ]
[-0.31622777  0.9486833 ]


In [116]:
# eigenvectors are normalized to unit length
print(np.linalg.norm(eigenvectors[:, 0], ord=2)) # [2,1] -> [0.89442719 0.4472136 ]
print(np.linalg.norm(eigenvectors[:, 1], ord=2))

0.9999999999999999
0.9999999999999999
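
The defining equation $A\vec{e_v}=\lambda\vec{e_v}$ can be checked numerically for each pair returned by `np.linalg.eig`. This is a minimal sketch; the loop variables `lam` and `v` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2], [3, -4]])
eigenvalues, eigenvectors = np.linalg.eig(A)

for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    v = eigenvectors[:, k]          # eigenvectors are stored as columns
    # A @ v should equal lam * v up to rounding
    print(lam, np.allclose(A @ v, lam * v))
```

Note that the k-th eigenvector is the k-th column, not the k-th row, of the returned array.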


- Special Matrices:
    - Unit matrix:
        - $I_n=\Bigl[\begin{matrix}1 & 0\\ 0 & 1\end{matrix}\Bigl]$ (shown for $n=2$), where $n$ denotes the number of rows/columns of $I_n$.
    - Orthogonal matrices:
        - Fulfill the equation: $A^TA=I_n$
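
As an illustration of the orthogonality condition, a 2-D rotation matrix is a common example of an orthogonal matrix. The sketch below assumes an arbitrarily chosen angle of 45° and checks $Q^TQ=I_2$; it also shows that multiplying by such a matrix leaves the vector length unchanged.

```python
import numpy as np

theta = np.pi / 4   # 45 degrees, an arbitrary choice
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))           # True: Q^T Q = I

v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(Q @ v))  # both lengths are 5 up to rounding
```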

- Note on GPUs:
    - GPUs have many processors
    - Matrix multiplications can be parallelized: each processor handles one part of the matrix multiplication
    - Deep learning can be sped up significantly by using GPUs