# The Geometry of Data

**Linear Algebra in Practice**

![title image](images/title.png)

**Dr. Kristian Rother**

# Goal of this Workshop

1. introduce fundamentals
2. execute Python code for linear algebra
3. create plots and pictures

## Your host: Dr. Kristian Rother

- PhD on 3D structures of molecules (HU Berlin)
- using Python since 1999
- freelance trainer since 2011

# Exercises

- Python exercises on www.academis.eu/linear_algebra
- best executed with Jupyter or Google Colab
- used libraries: `numpy`, `pandas`, `seaborn`, `matplotlib`

# Overview

![Linear Algebra Mind Map](images/mind_map.png)

# Outline

1. Vectors
2. Matrices
3. Linear Transformations
4. Distances and Norms
5. Recommender Systems
6. Graph Analysis
7. Linear Equation Systems
8. Vectorizing images and text

# 1. Vectors

## What is a vector?

* **a geometric entity** with a direction and length, in an *n*-dimensional coordinate system.
* **a feature vector**: a data point consisting of *n* numerical features.
* in both cases, vectors are arrays of *n* real numbers

## Vector notation

Example: we went fruit shopping and stored the amount of each type:

$$\vec{fruit} = \begin{pmatrix} a_1 🍎 \\ a_2🍌 \\ a_3 🍒 \end{pmatrix}$$

$$a_i \in \mathbb{R}$$

For instance:

$$\vec{fruit} = \begin{pmatrix} 3 🍎 \\ 2🍌 \\ 1 🍒 \end{pmatrix}$$

### Conventions

- lowercase letters
- often written with an arrow on top (not always)
- can have any number of dimensions

## Adding and subtracting vectors

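$$\vec{c} = \vec{a} + \vec{b} = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ a_3 + b_3 \end{pmatrix} = \begin{pmatrix} 3🍎 + 4 🍎 \\ 2🍌 + 5🍌 \\ 1🍒+6 🍒 \end{pmatrix} = \begin{pmatrix} 7🍎 \\ 7🍌 \\ 7🍒 \end{pmatrix}$$

Subtraction works the same way, component by component: $\vec{a} - \vec{b}$ has the components $a_i - b_i$.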

![addition and difference](images/add_subtract.png)
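
The addition above can be reproduced in numpy. This is a minimal sketch; the variable names are illustrative and the second vector (4, 5, 6) is read off the example.

```python
import numpy as np

a = np.array([3, 2, 1])  # apples, bananas, cherries
b = np.array([4, 5, 6])  # a second shopping trip

total = a + b        # component-wise sum: (7, 7, 7)
difference = a - b   # component-wise difference: (-1, -3, -5)

print(total)
print(difference)
```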

## There are 4 ways to multiply vectors

1. scalar multiplication
2. component-wise product
3. dot product
4. cross product

*(even more types of products exist, but not in this workshop)*

## Scalar multiplication

$$3\vec{a} = \begin{pmatrix} 3a_1 \\ 3a_2 \\ 3a_3 \end{pmatrix}$$

- result is a vector of the same size
- multiply each item with the same number
- makes the vector longer or shorter (scaling)

## Component-wise Product

$$\vec{a} \circ \vec{b} = \begin{pmatrix} a_1 b_1 \\ a_2 b_2 \\ a_3 b_3 \end{pmatrix}$$

Example:

$$bill = \vec{fruit} \circ \vec{prices} = \begin{pmatrix} 3 🍎 \\ 2🍌 \\ 1 🍒 \end{pmatrix} \circ \begin{pmatrix} 2 € \\ 1 € \\ 3 € \end{pmatrix} = \begin{pmatrix} 3🍎 \cdot 2 € \\ 2🍌 \cdot 1 € \\ 1🍒 \cdot 3 € \end{pmatrix} = \begin{pmatrix} 6 🍎€ \\ 2 🍌€ \\ 3 🍒€ \end{pmatrix}$$

- result is a vector of the same size
- multiply each item with the item at the corresponding position
- also called Hadamard or Schur product
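
In numpy, both scalar multiplication and the component-wise product use the `*` operator. A short sketch with the fruit and price vectors from the example (variable names are illustrative):

```python
import numpy as np

fruit = np.array([3, 2, 1])    # apples, bananas, cherries
prices = np.array([2, 1, 3])   # price per item in EUR

tripled = 3 * fruit              # scalar multiplication: (9, 6, 3)
cost_per_type = fruit * prices   # component-wise product: (6, 2, 3)

print(tripled)
print(cost_per_type)
```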

## Dot Product (inner product)

$$\vec{a} \cdot \vec{b} = \sum_i a_i b_i$$

Example:

$$bill = \vec{fruit} \cdot \vec{prices} = \begin{pmatrix} 3 🍎 \\ 2🍌 \\ 1 🍒 \end{pmatrix} \cdot \begin{pmatrix} 2 € \\ 1 € \\ 3 € \end{pmatrix} = 3🍎 \cdot 2 € + 2🍌 \cdot 1 € + 1🍒 \cdot 3 € = 11€$$


- result is a scalar
- sum of the component-wise product
- geometrically a projection
- 0 if the vectors are *orthogonal*
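
A minimal numpy sketch of the dot product, using the `@` operator (equivalent to `np.dot` for 1-D arrays). The two axis vectors are an illustrative choice to show the orthogonal case.

```python
import numpy as np

fruit = np.array([3, 2, 1])
prices = np.array([2, 1, 3])

bill = fruit @ prices   # 3*2 + 2*1 + 1*3 = 11
print(bill)

# orthogonal vectors have a dot product of zero
x_axis = np.array([1, 0])
y_axis = np.array([0, 1])
orthogonal = x_axis @ y_axis   # 0
print(orthogonal)
```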

## Cross Product (vector product)

$$\vec{a} \times \vec{b} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}$$

- calculates a vector that is orthogonal to both $\vec{a}$ and $\vec{b}$
- anti-commutative: $\vec{a} \times \vec{b} = -(\vec{b} \times \vec{a})$
- null vector if vectors a and b are collinear
- defined for 3 dimensions; generalizations to more dimensions exist but are more complicated
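
The properties listed above can be checked with `np.cross`. This is a small sketch; the two input vectors are an arbitrary illustrative choice.

```python
import numpy as np

a = np.array([1, 0, 0])
b = np.array([0, 1, 0])

c = np.cross(a, b)               # (0, 0, 1)
reversed_c = np.cross(b, a)      # (0, 0, -1): swapping the order flips the sign
collinear = np.cross(a, 2 * a)   # (0, 0, 0): collinear inputs give the null vector

# the result is orthogonal to both inputs, so both dot products are zero
print(c @ a, c @ b)
```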


## Special vectors

### Null vector

$$\vec{0} =  \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

### Unit vector

$$\vec{u} =  \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

- length 1, here aligned with an axis
- the unit vectors along the axes define a coordinate system

![unit](images/unit_vecs.png)

## Summary

- **Vectors are arrays of numbers that represent geometrical entities or just data.**
- **there are four ways to multiply vectors**.

## Exercises:

![](exercises/sinwaves.png)

- Go to **www.academis.eu/linear_algebra**
- download and open the first Python notebook
- run the code examples on **Vectors**.

# 2. Matrices

## What is a matrix?

$$D = \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \\ d_{31} & d_{32}\end{pmatrix} = \begin{pmatrix} 3🍎 & 6🍎 \\ 2🍌 & 0🍌 \\ 1🍒 & 10🍒 \end{pmatrix}$$

- a two-dimensional array of real numbers
- usually denoted by capital letters
- dimension n x m (rows x columns) ⚠️
- a table with data for many practical matters
- addition, subtraction and scalar multiplication are very similar to vectors

## Matrices in Numpy

⚠️ `.shape` is the most important attribute for debugging vectors and matrices!

```python
import numpy as np

D = np.array([
    [3, 6],
    [2, 0],
    [1, 10],
])
D.shape  # -> (3, 2): (rows, columns)
```

## Transpose operation

$$D^T = \begin{pmatrix} d_{11} & d_{21} & d_{31} \\ d_{12} & d_{22} & d_{32}\end{pmatrix} = \begin{pmatrix} 3🍎 & 2🍌 & 1🍒 \\ 6🍎 & 0🍌 & 10🍒 \end{pmatrix}$$

- Swaps the axes of a matrix
- converts **column vectors** to **row vectors** and vice versa
- a 1-D numpy array has no second axis, so `.T` leaves it unchanged; to transpose vectors in numpy, use `.reshape()`
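
A short sketch of the transpose in numpy, showing why `.reshape()` is needed for vectors (variable names are illustrative):

```python
import numpy as np

D = np.array([[3, 6], [2, 0], [1, 10]])
print(D.T.shape)   # (2, 3)

v = np.array([2, 1, 3])
print(v.shape)     # (3,)
print(v.T.shape)   # (3,) because .T has no effect on a 1-D array

column = v.reshape(-1, 1)   # (3, 1) column vector
row = v.reshape(1, -1)      # (1, 3) row vector
print(column.shape, row.shape)
```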

## Matrix-vector product (1)

Calculate the total cost for each shopper:

$$D^T \vec{prices} = \begin{pmatrix} 3🍎 & 2🍌 & 1🍒 \\ 6🍎 & 0🍌 & 10🍒 \end{pmatrix} \begin{pmatrix} 2 € \\ 1 € \\ 3 € \end{pmatrix} = \begin{pmatrix} 3🍎\cdot 2€ + 2🍌 \cdot 1€ + 1🍒 \cdot 3€ \\ 6🍎\cdot 2€ + 0🍌\cdot 1€ + 10🍒\cdot 3€ \end{pmatrix} = \begin{pmatrix} 11€ \\ 42€ \end{pmatrix}$$

- like the dot product, but with each row or column
- multiplies a **(2, 3)** matrix with a **(3, 1)** column vector
- result is a vector of size **2**
- non-commutative!
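
The same calculation as a numpy sketch, with the shapes written out in the comments (variable names are illustrative):

```python
import numpy as np

D = np.array([[3, 6], [2, 0], [1, 10]])   # (3, 2): fruits x shoppers
prices = np.array([[2], [1], [3]])        # (3, 1) column vector

totals = D.T @ prices    # (2, 3) @ (3, 1) -> (2, 1)
print(totals.shape)      # (2, 1)
print(totals.ravel())    # the two bills: 11 and 42
```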

## Matrix-vector product (2)

Calculate the total cost for each shopper, reversed:

$$\vec{prices}^T D =  \begin{pmatrix} 2 € & 1 € & 3 € \end{pmatrix} \begin{pmatrix} 3🍎 & 6🍎 \\ 2🍌 & 0🍌 \\ 1🍒 & 10🍒 \end{pmatrix} = \begin{pmatrix} 3🍎\cdot 2€ + 2🍌 \cdot 1€ + 1🍒 \cdot 3€ & 6🍎\cdot 2€ + 0🍌\cdot 1€ + 10🍒\cdot 3€ \end{pmatrix} = \begin{pmatrix} 11€ & 42€ \end{pmatrix}$$


- multiplies a **(1, 3)** row vector with a **(3, 2)** matrix
- result is a **(1, 2)** row vector
- the inner dimensions have to be the same, watch the indices!
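
A sketch of the reversed product in numpy. It also shows what happens when the inner dimensions do not match: numpy raises a `ValueError`. The flag `mismatch_detected` is a hypothetical name used only for this illustration.

```python
import numpy as np

D = np.array([[3, 6], [2, 0], [1, 10]])   # (3, 2): fruits x shoppers
prices_row = np.array([[2, 1, 3]])        # (1, 3) row vector

totals = prices_row @ D   # (1, 3) @ (3, 2) -> (1, 2)
print(totals.shape)       # (1, 2)

# swapping the operands makes the inner dimensions disagree (2 vs. 1)
try:
    D @ prices_row
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```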

## Matrix-vector product (3)

Calculate the total amount for each fruit type:

$$D \vec{1} = \begin{pmatrix} 3🍎 & 6🍎 \\ 2🍌 & 0🍌 \\ 1🍒 & 10🍒 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3🍎 \cdot 1 + 6🍎 \cdot 1 \\ 2🍌 \cdot 1 + 0🍌 \cdot 1 \\ 1🍒 \cdot 1 + 10🍒 \cdot 1 \end{pmatrix} = \begin{pmatrix} 9🍎 \\ 2🍌 \\ 11🍒 \end{pmatrix}$$

- multiplies a **(3, 2)** matrix with a **(2, 1)** column vector of ones
- result is a **(3, 1)** column vector
- non-commutative!

## Matrix-vector product (general form)

$$\vec{a}^T D = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \\ d_{31} & d_{32} \end{pmatrix} = \begin{pmatrix} a_1 d_{11} + a_2 d_{21} + a_3 d_{31} & a_1 d_{12} + a_2 d_{22} + a_3 d_{32} \end{pmatrix}$$

- $\vec{a}^T$ is a **row vector**, a (1, n) matrix, denoted by the transpose
- the inner dimensions have to be the same, watch the indices!

## Matrix Multiplication

Generally, in matrix multiplication the corresponding values in the **inner** dimension are subject to a dot product:

$$C = AB$$

$$C_{ij} = \sum_{k=1}^{n} A_{ik} \cdot B_{kj}$$

### Properties

- non-commutative: in general $AB \neq BA$
- associative: $(AB)C = A(BC)$
- distributive: $A(B+C) = AB + AC$
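
The sum formula and the three properties can be checked numerically. This is a sketch with three small integer matrices chosen arbitrarily for illustration; the explicit loops only spell out the formula, in practice `A @ B` does the same.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 3]])

# C_ij = sum_k A_ik * B_kj, written out with explicit loops
manual = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            manual[i, j] += A[i, k] * B[k, j]

same_as_operator = np.array_equal(manual, A @ B)                 # True
commutative = np.array_equal(A @ B, B @ A)                       # False for these matrices
associative = np.array_equal((A @ B) @ C, A @ (B @ C))           # True
distributive = np.array_equal(A @ (B + C), A @ B + A @ C)        # True
print(same_as_operator, commutative, associative, distributive)
```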


## Square Matrices

Square matrices have two identical dimensions *(n, n)*.

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

**⚠️ With square matrices, you can mix up the dimensions without causing an error!**



## Example: Linear Regression

- the dot product is the foundation of Machine Learning and Neural Networks
- GPUs from NVIDIA and other vendors are hardware optimized for calculating dot products and similar operations on matrices

$$\vec{y} = X \vec{w} + \vec{\epsilon}, \qquad \hat{y} = X \vec{w}$$
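
A minimal sketch of how the prediction is one matrix-vector product. The data is hypothetical (generated as y = 1 + 2 * feature, without noise), and fitting the weights with `np.linalg.lstsq` is an assumption made for this illustration, not necessarily the method used in the workshop notebooks.

```python
import numpy as np

# hypothetical data: a constant column for the intercept plus one feature
X = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [1.0, 2.0],
    [1.0, 3.0],
])
y = np.array([1.0, 3.0, 5.0, 7.0])

# least-squares estimate of the weights
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# prediction: one dot product per row of X
y_hat = X @ w
print(w)       # approximately (1, 2)
print(y_hat)   # approximately equal to y
```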

## Summary

- Vectors are arrays of numbers that represent geometrical entities or just data.
- there are four ways to multiply vectors.
- **matrices are tables of numbers with *n x m* (rows x columns).**
- **vectors and matrices are multiplied similar to a dot product.**

# 3. Linear Transformations

![rotation](images/rotation.png)

- Matrices represent linear transformations of coordinate systems
- multiplication with a matrix transforms a vector
- if it is not a square matrix, the number of dimensions changes
- applications: 3D vector graphics, CAD, Unreal engine, 3D molecules

## Rotation by 90° (clockwise)

$$R_{90} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

#### Example:

$$R_{90} \vec{a} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \cdot 3 + 1 \cdot 1 \\ -1 \cdot 3 + 0 \cdot 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -3 \end{pmatrix}$$

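## Rotation by other angles

$$R(\theta) = \begin{pmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}$$

With this sign convention, the rotation is clockwise for positive $\theta$; swapping the signs of the two sine terms rotates counterclockwise.

### In a 3D coordinate system:

$$R_x(\theta) = \begin{pmatrix}
1 & 0 & 0 \\
0 & \cos \theta & -\sin \theta \\
0 & \sin \theta & \cos \theta \end{pmatrix}$$

Note that $R_x$ is written in the counterclockwise convention (rotation about the x-axis).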
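
A small numpy sketch of the 2D rotation. The helper `rotation_matrix` is a hypothetical name; it uses the sign convention of the 2D formula above, which rotates clockwise for positive angles.

```python
import numpy as np

def rotation_matrix(theta):
    """2D rotation matrix (clockwise for positive theta, angle in radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]])

a = np.array([3, 1])
rotated = rotation_matrix(np.radians(90)) @ a
print(np.round(rotated, 6))   # close to (1, -3), up to floating-point error

# a rotation does not change the length of the vector
print(np.linalg.norm(a), np.linalg.norm(rotated))
```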

## Other linear transformations

![](images/lintrans.png)

- scale, flip, shear
- proportions on each axis stay the same
- angle between axes may change

## Special Transformation Matrices

### Identity Matrix

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Multiplication of a vector with an identity matrix results in the same vector.

$$I\vec{v} = \vec{v}$$

### Null Matrix

$$A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

Multiplication of a vector with a null matrix results in a null vector.

## Determinants
**a key characteristic to describe linear transformation matrices**

![](images/determinant.png)

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

$$\det A = ad - bc$$

- the determinant $\det A$ specifies by what factor the area spanned by the unit vectors changes during a linear transformation
- if $\det A$ is 1, the area does not change (e.g. rotation)
- if $\det A$ is negative, the orientation of the coordinate system reverses
- if $\det A$ is 0, some axes fall together during the transformation and the area collapses to zero
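
The cases above can be checked with `np.linalg.det`. The four matrices are illustrative choices (the rotation is the 90° matrix from the previous section); the results are floats, so tiny rounding errors are possible.

```python
import numpy as np

matrices = {
    "rotation": np.array([[0, 1], [-1, 0]]),   # area unchanged
    "scaling": np.array([[2, 0], [0, 3]]),     # area grows by a factor of 6
    "flip": np.array([[-1, 0], [0, 1]]),       # orientation reverses
    "collapse": np.array([[1, 2], [2, 4]]),    # second row is 2 * first row
}

dets = {name: np.linalg.det(M) for name, M in matrices.items()}
for name, value in dets.items():
    print(name, value)
```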


## Summary

- Vectors are arrays of numbers that represent geometrical entities or just data.
- there are four ways to multiply vectors.
- matrices are tables of numbers with *n x m* (rows x columns).
- vectors and matrices are multiplied similar to a dot product.
- **matrix multiplication rotates, scales, flips, shears vectors to a new coordinate system.**

# 4. Distances and Norms

**A distance matrix of 333 penguins:**

![](images/dist_matrix.png)

We have seen how we can combine vectors and matrices in different ways.
Now, we will compare two vectors.

## Comparing vectors

**How can we measure the distance between 2 data points?**

![](images/penguin_table.png)

- **norms** measure the length of the difference vector.
- **cosine similarity** measures the angle between two vectors.

## L1 Norm: Manhattan distance

![](images/manhattan.png)

The Manhattan distance sums up the absolute value of each component of a vector (or of the difference of two vectors).

$$||\vec{a}||_1 = \sum_i |a_i|$$

**Application:** L1 regularization (Lasso) in Machine Learning.

## L2 Norm: Euclidean distance

![](images/euclidean.png)

Euclidean distance does the same calculation as in the Pythagorean theorem: it sums up the squared differences and then takes the square root.

$$||\vec{a}||_2 = \sqrt{\sum_i a_i^2}$$

**Application:** L2 regularization (Ridge) in Machine Learning.
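
Both norms are available through `np.linalg.norm`. A short sketch with two arbitrary illustrative points; the distance between them is the norm of the difference vector.

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])
diff = a - b   # (-3, -4)

manhattan = np.linalg.norm(diff, ord=1)   # |-3| + |-4| = 7
euclidean = np.linalg.norm(diff)          # sqrt(9 + 16) = 5 (L2 is the default)

print(manhattan, euclidean)
```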

## Cosine similarity

![](images/cosine.png)

- Cosine similarity measures the angle between two vectors
- ranges from 1 (same direction) through 0 (orthogonal) to -1 (opposite direction)
- the cosine formula explains why the dot product is zero if the angle is 90°: $\cos(90^\circ) = 0$

$$\cos(\theta) = \frac{\vec{a} \cdot \vec{b}}{||\vec{a}||_2 \cdot ||\vec{b}||_2}$$

**Application:** search in retrieval-augmented generation (RAG), clustering, anomaly detection and recommender systems
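
A minimal sketch of the formula in numpy. The function name `cosine_similarity` and the test vectors are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the two L2 norms."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 0.0])

same_direction = cosine_similarity(a, np.array([2.0, 0.0]))   # 1.0
orthogonal = cosine_similarity(a, np.array([0.0, 3.0]))       # 0.0
opposite = cosine_similarity(a, np.array([-1.0, 0.0]))        # -1.0

print(same_direction, orthogonal, opposite)
```

Note that the length of the vectors does not matter: (1, 0) and (2, 0) point in the same direction and therefore have a similarity of 1.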


## Example: Clustering

- PENGUINS: distance matrix
- PENGUINS: min-max scaling
- PENGUINS: clustering
- distance matrix (for graph with weights)
- similarity matrix
- PENGUINS: MSE, MAE
- PENGUINS: regularization (?)
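
A sketch of how min-max scaling and a distance matrix could be computed with numpy broadcasting. The three rows are hypothetical measurements (bill length in mm, body mass in g), not values from the penguin dataset.

```python
import numpy as np

# hypothetical measurements for 3 penguins: bill length (mm), body mass (g)
X = np.array([
    [39.0, 3750.0],
    [46.0, 5000.0],
    [50.0, 3800.0],
])

# min-max scaling: bring every column into the range 0..1,
# otherwise the body mass would dominate the distances
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# pairwise differences via broadcasting: (3, 1, 2) - (1, 3, 2) -> (3, 3, 2)
diff = X_scaled[:, None, :] - X_scaled[None, :, :]

# Euclidean (L2) distance for every pair of rows
dist = np.sqrt((diff ** 2).sum(axis=-1))
print(dist.shape)   # (3, 3)
print(dist)
```

The resulting matrix is symmetric with zeros on the diagonal, because every point has distance 0 to itself.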

## Summary

- Vectors are arrays of numbers that represent geometrical entities or just data.
- there are four ways to multiply vectors.
- matrices are tables of numbers with *n x m* (rows x columns).
- vectors and matrices are multiplied similar to a dot product.
- matrix multiplication rotates, scales, flips, shears vectors to a new coordinate system.
- **The length of vectors can be measured by L1 and L2 norms.**
- **Cosine similarity measures the angle between vectors.**

# 5. Recommender Systems

Example: recommender system

- Exercise: find most similar user
  - find closest vector
  - RAG
  - which distance to use?

- matrix-matrix multiplication
- matrix factorization
- norm of matrix

Exercise: recommend a movie

PENGUINS: similarity search
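
A sketch of the "find most similar user" exercise using cosine similarity. The rating matrix is hypothetical, and cosine similarity is only one possible answer to the question of which distance to use.

```python
import numpy as np

# hypothetical ratings: rows are users, columns are movies
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 2.0, 5.0],
])

# scale every row to length 1, then one matrix-matrix product
# yields the cosine similarity for every pair of users
unit = ratings / np.linalg.norm(ratings, axis=1, keepdims=True)
similarity = unit @ unit.T

# ignore the diagonal, every user is perfectly similar to themselves
np.fill_diagonal(similarity, -np.inf)
most_similar = similarity.argmax(axis=1)
print(most_similar)   # index of the closest other user, per user
```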

# 6. Graph Analysis

Example: Markov chain

IMAGE: map of Europe

- adjacency matrix
- distance matrix (for graph with weights)
- similarity matrix
- symmetric matrix: `A = A.T`
- asymmetric matrix: `A != A.T`

- centrality
  - we want to find out the important nodes
  - random movement
  - Markov chain
    - transition probability matrix
    - Markov rule A(t+1) = A(t) M
  - eigenvectors
    - pagerank
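
The Markov rule above can be sketched as repeated vector-matrix multiplication. The transition matrix `M` is a hypothetical 3-node example; the long-run probabilities it converges to are the idea behind centrality measures such as PageRank, which adds further details not shown here.

```python
import numpy as np

# hypothetical transition probabilities between 3 nodes; each row sums to 1
M = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
])

# random movement: start at node 0 and apply A(t+1) = A(t) M repeatedly
state = np.array([1.0, 0.0, 0.0])
for _ in range(100):
    state = state @ M
print(state)   # approximately (4/9, 2/9, 3/9)

# the same distribution is the eigenvector of M.T with eigenvalue 1
eigenvalues, eigenvectors = np.linalg.eig(M.T)
index = np.argmin(np.abs(eigenvalues - 1))
stationary = np.real(eigenvectors[:, index])
stationary = stationary / stationary.sum()   # normalize to probabilities
print(stationary)
```

Node 0 ends up with the highest probability, so in this toy graph it is the most central node.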

Example: number images

Application - PCA


# 7. Linear Equation Systems

PENGUINS: Linear Independence

- inverse matrix (not the same as transpose!)
  - not all matrices are invertible
  - a singular matrix (determinant 0) has no inverse
- determinant: area + orientation
  - the determinant of a 2-D array `[[a, b], [c, d]]` is `ad - bc`
- use the determinant to check for linear independence: the columns of a square matrix are linearly independent if and only if the determinant is non-zero

Add property weight in kg
- correlation matrix

- Gaussian elimination: a unique solution requires N independent equations with N unknowns (square matrix)
- example: fare = base fare + price per km * km
- robust: works with transposed matrices
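
A sketch of the fare example as a linear equation system, solved with `np.linalg.solve`. The two trips (10 km for 23 and 20 km for 43) are hypothetical numbers, and the flag `singular_detected` is a name used only for this illustration.

```python
import numpy as np

# hypothetical taxi trips: fare = base_fare * 1 + price_per_km * km
A = np.array([
    [1.0, 10.0],   # trip 1: 10 km
    [1.0, 20.0],   # trip 2: 20 km
])
fares = np.array([23.0, 43.0])

print(np.linalg.det(A))   # non-zero (about 10), so there is a unique solution
base_fare, price_per_km = np.linalg.solve(A, fares)
print(base_fare, price_per_km)   # approximately 3 and 2

# a singular matrix (linearly dependent rows) cannot be solved
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.solve(singular, fares)
    singular_detected = False
except np.linalg.LinAlgError:
    singular_detected = True
print(singular_detected)   # True
```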

Example applications:
- linear solvers
- traffic flow through network
- timetable problem, burrito


# 8. Vectorizing images and text

## Binary vectors (consist of 0 and 1)

convert penguin species to one-hot encoding
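
One possible way to build one-hot vectors with plain numpy. The short species list is illustrative; libraries such as pandas or scikit-learn offer their own helpers, which are not shown here.

```python
import numpy as np

species = np.array(["Adelie", "Gentoo", "Chinstrap", "Adelie"])

# np.unique returns the sorted categories and an integer code per entry
categories, codes = np.unique(species, return_inverse=True)

# row i of the identity matrix is the one-hot vector for category i
one_hot = np.eye(len(categories), dtype=int)[codes]

print(categories)   # Adelie, Chinstrap, Gentoo
print(one_hot)      # one binary vector per penguin
```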

## 2D Convolution

IMAGE: CNN

- Convolution
- converts pixel information into features of images
- technical basis for image recognition
- interactive MNIST tool
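
A minimal sketch of a 2D convolution written with explicit loops. The function `convolve2d`, the tiny image and the kernel are all hypothetical; like the layers in a CNN, the function slides the kernel without flipping it (strictly speaking a cross-correlation).

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the component-wise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# hypothetical 4x4 image: dark left half, bright right half
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# kernel that responds to a dark-to-bright step from left to right
kernel = np.array([[-1.0, 1.0]])

edges = convolve2d(image, kernel)
print(edges)   # 1 in the middle column where the brightness changes, 0 elsewhere
```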

## Word Vectors

- Word vectors
- coffee, espresso, cappuccino
- hand + glove = foot + shoe
- document vectors
- word embeddings
- Spacy example
- enumerate embedding dimensions
  -> applications: Neural

https://projector.tensorflow.org/

Enter words like cat, shirt, apple. Note that the website projects 200-dimensional vectors into a 3D space using a linear transformation (PCA); therefore, the distances between the points may not accurately reflect the real ones.