# CSS 201 / 202 - CSS Bootcamp

## Week 04 - Lecture 03

### Umberto Mignozzetti

## Matrices

**Applications**:

**Centered vector:** Use the following formula to compute the mean-centered vector:

$$\mathbf{x} - \dfrac{1}{n}\mathbf{i}\mathbf{i}^T\mathbf{x}$$

Where $\mathbf{i}$ is a vector of ones.

Hint 1: A vector of ones can be created with `np.ones([rows, cols])`.

Hint 2: To transpose a numpy vector, do `vec.T`.
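
As a minimal sketch of the formula above, the snippet below centers a small made-up column vector (the values are illustrative only) and checks the result against subtracting the mean directly.

```python
import numpy as np

# Hypothetical column vector with n = 3 entries (values chosen for illustration)
x = np.array([[2.0], [4.0], [9.0]])
n = x.shape[0]
i = np.ones([n, 1])                      # column vector of ones

# x - (1/n) i i^T x
x_centered = x - (1 / n) * i @ i.T @ x

print(x_centered.ravel())
print(np.allclose(x_centered, x - x.mean()))
```

The product $\mathbf{i}^T\mathbf{x}$ is the sum of the entries, so $\frac{1}{n}\mathbf{i}\mathbf{i}^T\mathbf{x}$ is a vector that repeats the mean $n$ times.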

In [None]:
# Importing libraries
import numpy as np
import pandas as pd
import scipy
import scipy.linalg
import matplotlib.pyplot as plt

## Matrices

**Applications**:

1. **Centered vector**: Prove that the matrix $M_0 = I - \dfrac{1}{n}\mathbf{i}\mathbf{i}^T$ is idempotent, that is, $M_0M_0 = M_0$.

2. **Sum of squares**: Compute the product $\mathbf{x}^TM_0\mathbf{x}$. It is equal to the sum of squared deviations from the mean, $\sum_i (x_i - \bar{x})^2$.

3. **Covariance matrix**: Suppose you have a dataset $\mathbf{X}$ with $n$ rows (observations). The covariance matrix is defined as $\dfrac{\mathbf{X}^TM_0\mathbf{X}}{n-1}$. Compute the covariance matrix of the data below.

4. **Covariance matrix**: A power move here is to plot the matrix. Do that. Did you find it interesting?

In [None]:
# Your answers here
dat = pd.read_csv('https://raw.githubusercontent.com/umbertomig/POLI175public/main/data/anes2020.csv').values
dat
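
One possible way to check the three computations numerically is sketched below. It uses a small synthetic matrix as a stand-in for the course dataset (the random data and the names `ss` and `S` are assumptions for illustration), and compares the result with `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))             # synthetic stand-in for the dataset
n = X.shape[0]
i = np.ones([n, 1])
M0 = np.eye(n) - (1 / n) * i @ i.T       # centering matrix

# 1. Numerical check of idempotency (not a proof)
idempotent = np.allclose(M0 @ M0, M0)

# 2. x^T M0 x versus the sum of squared deviations from the mean
x = X[:, [0]]
ss = (x.T @ M0 @ x).item()
ss_direct = ((x - x.mean()) ** 2).sum()

# 3. Covariance matrix, compared with numpy's estimator
S = X.T @ M0 @ X / (n - 1)
matches_numpy = np.allclose(S, np.cov(X, rowvar=False))

print(idempotent, np.isclose(ss, ss_direct), matches_numpy)
```

For the plot, something like `plt.imshow(S)` followed by `plt.colorbar()` is one option.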

## Matrices

**Matrix Inverse** (cont'd): Solving a system with the inverse matrix is straightforward:

$$
\begin{eqnarray*}
A\mathbf{x} & = \mathbf{b} \\
A^{-1}A\mathbf{x} & = A^{-1}\mathbf{b} \\
I\mathbf{x} & = A^{-1}\mathbf{b} \\
\mathbf{x} & = A^{-1}\mathbf{b}
\end{eqnarray*}
$$

In [None]:
A = np.random.randn(3, 3)
np.linalg.inv(A)
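
A minimal sketch of the derivation above on a made-up $2 \times 2$ system (the numbers are illustrative). It also shows `np.linalg.solve`, which solves the system without forming the inverse explicitly.

```python
import numpy as np

# Hypothetical system: 2x + y = 3 and x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x_inv = np.linalg.inv(A) @ b             # x = A^{-1} b
x_solve = np.linalg.solve(A, b)          # same answer, no explicit inverse

print(x_inv)
print(np.allclose(x_inv, x_solve))
```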

## Matrices

**Matrix Inverse**:

- Left Inverse: A matrix $L$ such that $LX = I$, even though $XL = I$ does not hold.

- Right Inverse: A matrix $R$ such that $XR = I$, even though $RX = I$ does not hold.

- One efficient way to find a one-sided inverse is to make the matrix square. But how to do that? Consider the following ideas:

    1. $X^TX$
    2. $XX^T$
    
- Let us see the derivations.

In [None]:
A = np.random.randn(10, 3)
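
A sketch of the left-inverse idea for a tall matrix, assuming $X$ has full column rank so that $X^TX$ is invertible. Multiplying $(X^TX)^{-1}X^T$ by $X$ from the right gives $(X^TX)^{-1}X^TX = I$, so $L = (X^TX)^{-1}X^T$ is a left inverse.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))             # tall matrix, full column rank (assumed)

L = np.linalg.inv(X.T @ X) @ X.T         # candidate left inverse, shape 3 x 10

left_ok = np.allclose(L @ X, np.eye(3))      # L X = I (3 x 3)
right_ok = np.allclose(X @ L, np.eye(10))    # X L is 10 x 10 but not the identity

print(left_ok, right_ok)
```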

## Matrices

**Matrix Inverse**:

**Check-in**: Derive the matrix for the right inverse.

**Proposition:** If the inverse exists, then it is unique. Prove that.

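**Pseudoinverse**: Gives a singular matrix something that behaves as close as possible (but not equal) to an inverse. Generalized inverses are not unique in general, but the Moore-Penrose pseudoinverse is!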
    
- *Moore-Penrose Pseudoinverse*:

In [None]:
A = np.array([[1, 2], [2, 4]])
print('Pseudoinverse: ')
print(np.linalg.pinv(A), end = '\n\n')

print('Pseudoinverse in action: ')
print(A @ np.linalg.pinv(A), end = '\n\n')
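
As a hedged illustration of what "close to an inverse" means, the sketch below checks the four Moore-Penrose conditions on the same singular matrix. The variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])               # singular: second row is twice the first
A_pinv = np.linalg.pinv(A)

# The four Moore-Penrose conditions
c1 = np.allclose(A @ A_pinv @ A, A)
c2 = np.allclose(A_pinv @ A @ A_pinv, A_pinv)
c3 = np.allclose((A @ A_pinv).T, A @ A_pinv)
c4 = np.allclose((A_pinv @ A).T, A_pinv @ A)

# A @ A_pinv is a projection, not the identity
is_identity = np.allclose(A @ A_pinv, np.eye(2))

print(c1, c2, c3, c4, is_identity)
```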

## Matrices

**Matrix Inverse**:

Inverting a matrix is a numerically demanding operation.

With floating-point arithmetic, rounding errors can build up, and for ill-conditioned matrices the computed inverse may be badly wrong.

This is a major source of discrepancy in solutions to the same problem.

Example: *Hilbert matrices*

In [None]:
A = scipy.linalg.hilbert(3)
A
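
To see the problem in numbers, one option is to compare the floating-point inverse with the closed-form inverse from `scipy.linalg.invhilbert`. The helper `inverse_error` and the sizes tried below are assumptions made for this sketch.

```python
import numpy as np
import scipy.linalg

def inverse_error(n):
    """Relative error of the floating-point inverse of the n x n Hilbert matrix."""
    H = scipy.linalg.hilbert(n)
    H_inv_exact = scipy.linalg.invhilbert(n)     # closed-form inverse
    H_inv_float = np.linalg.inv(H)               # numerical inverse
    return np.linalg.norm(H_inv_float - H_inv_exact) / np.linalg.norm(H_inv_exact)

for n in (3, 6, 10):
    cond = np.linalg.cond(scipy.linalg.hilbert(n))
    print(n, cond, inverse_error(n))
```

The condition number grows very quickly with $n$, and the error of the computed inverse grows with it.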

## Matrices

**Decompositions**: 

The main reason to study matrices so much is to create new representations of data.

We achieve this by doing decompositions into matrices that have desirable properties.

What is a decomposition? It is when we rewrite an object as a combination of others, provided that these others have desirable properties.

Example: For numbers, if you get the number $3.13$, a useful decomposition is

$$3.13 = 3 + 0.13$$

## Matrices

**Orthogonal Matrices**: A very useful type of matrix. We will learn how to decompose a matrix into a product that contains an orthogonal matrix.

**Definition**: A matrix $Q$ is said to be **orthogonal** if:

1. All columns are pair-wise orthogonal
2. The norm of each column is 1 (unit column vectors)

Meaning:
$$
\langle \mathbf{q}_i, \mathbf{q}_j \rangle = \begin{cases}
    0 & \text{if } i \neq j \\
    1 & \text{otherwise}
  \end{cases}
$$

## Matrices

**Orthogonal Matrices**: Why do we care about those? Simple:

$$Q^TQ = I$$

Awesome! For a square $Q$, this means the inverse is just the transpose: $Q^{-1} = Q^T$. How do we compute an orthogonal matrix from a non-orthogonal one?

## Matrices

**Gram-Schmidt** procedure:

For all column vectors in $V$, starting from the first (leftmost) and moving systematically to the last (rightmost):

1. Orthogonalize $\mathbf{v}_k$ to all previous columns in matrix $Q$ using orthogonal vector decomposition.
    - Compute the component of $\mathbf{v}_k$ that is perpendicular to $\mathbf{q}_{k-1}$, $\mathbf{q}_{k-2}$, and so on down to $\mathbf{q}_{1}$. The orthogonalized vector is called $\mathbf{v}^*_k$

2. Normalize $\mathbf{v}^*_k$ to unit length. This is now $\mathbf{q}_k$, the kth column in matrix $Q$.
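
The two steps can be sketched as follows. This is a plain (classical) Gram-Schmidt written for readability, assuming the columns of the input are linearly independent. The function name `gram_schmidt` and the test matrix are made up for illustration, and in practice `np.linalg.qr` is the more numerically stable choice.

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V, left to right (classical Gram-Schmidt)."""
    V = np.asarray(V, dtype=float)
    Q = np.zeros_like(V)
    for k in range(V.shape[1]):
        v_star = V[:, k].copy()
        # Step 1: remove the components along all previous columns of Q
        for j in range(k):
            v_star -= (Q[:, j] @ V[:, k]) * Q[:, j]
        # Step 2: normalize to unit length
        Q[:, k] = v_star / np.linalg.norm(v_star)
    return Q

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
print(np.round(Q.T @ Q, 10))             # should print the identity matrix
```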

## Matrices

**QR Decomposition**: We decompose a matrix $A$ into its orthogonal component and the rest.

$$A = QR$$

How do we find $R$?

In [None]:
A = np.random.randn(3,3)
Q, R = np.linalg.qr(A)
print(Q)
print(R)
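
One way to recover $R$: since $Q^TQ = I$, multiplying $A = QR$ by $Q^T$ on the left gives $Q^TA = R$. The sketch below checks this on a random matrix (the seed is arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
Q, R = np.linalg.qr(A)

R_from_Q = Q.T @ A                       # R = Q^T A

print(np.allclose(R_from_Q, R))
print(np.allclose(Q @ R, A))
```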

## Matrices

**QR Decomposition**:

1. Note that the matrix $R$ is always upper-triangular.
2. How much is $A^{-1}$ in QRese?
3. Inverting via QR is often more numerically stable than inverting directly! Check that for a 30 x 30 random matrix.

In [None]:
# Your code here
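
A possible starting point, not the only way to do it: from $A = QR$ we get $A^{-1} = R^{-1}Q^T$, and because $R$ is triangular we can solve $RX = Q^T$ by back substitution instead of inverting $R$. The error measure used below (largest entry of $A^{-1}A - I$) is an assumption of this sketch.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(4)
A = rng.normal(size=(30, 30))
I = np.eye(30)

Q, R = np.linalg.qr(A)
A_inv_qr = scipy.linalg.solve_triangular(R, Q.T)   # solves R X = Q^T (R is upper triangular)
A_inv_np = np.linalg.inv(A)

err_qr = np.abs(A_inv_qr @ A - I).max()
err_np = np.abs(A_inv_np @ A - I).max()
print(err_qr, err_np)
```

For a well-conditioned random matrix both errors are tiny, so the difference tends to show up only for harder matrices.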

## Matrices

**LU Decomposition**: Another useful decomposition.

We factor a matrix into the product of a lower-triangular matrix and an upper-triangular matrix.

How to do that? It comes out of the row operations we already perform when finding the inverse with Gauss-Jordan elimination.

In [None]:
A = np.array([ [2, 2, 4], [1, 0, 3], [2, 1, 2] ])
_, L, U = scipy.linalg.lu(A)  # returns P, L, U with A = P @ L @ U; the permutation P is discarded here
print(L, end = '\n\n')
print(U)
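
As a quick check of what `scipy.linalg.lu` returns, the sketch below keeps the permutation matrix and verifies $A = PLU$. It also solves a system with `lu_factor` and `lu_solve`, using a made-up right-hand side `b`.

```python
import numpy as np
import scipy.linalg

A = np.array([[2.0, 2.0, 4.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 2.0]])
P, L, U = scipy.linalg.lu(A)             # A = P @ L @ U

print(np.allclose(P @ L @ U, A))

# Reusing the factorization to solve A x = b (b is illustrative)
b = np.array([1.0, 2.0, 3.0])
x = scipy.linalg.lu_solve(scipy.linalg.lu_factor(A), b)
print(np.allclose(A @ x, b))
```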

## Matrices

**Applications**:

**Least squares method**: The least squares method is a projection.

<div>
<img src="./imgl3/projreg.png" width="500"/>
</div>

1. Derive the least squares method (meaning: derive $\mathbf{\beta}$).

2. Compute the least squares estimate and apply it to the dataset, where the first column is the $\mathbf{y}$ and the rest is $X$.

3. Compute the least-square method using QR decomposition.

In [None]:
# Your answers here
y = dat[:,0]
X = np.append(np.ones([dat.shape[0], 1]), dat[:,1:], axis = 1)
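
A hedged sketch of both routes on synthetic data (the coefficients in `beta_true` and the noise level are made up). The projection picture says the residual must be orthogonal to the columns of $X$, so $X^T(\mathbf{y} - X\beta) = \mathbf{0}$, which gives $\beta = (X^TX)^{-1}X^T\mathbf{y}$. With $X = QR$ this simplifies to $R\beta = Q^T\mathbf{y}$.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept plus 2 predictors
beta_true = np.array([1.0, 2.0, -0.5])                       # hypothetical coefficients
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Normal equations: (X^T X) beta = X^T y
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# QR route: R beta = Q^T y, solved by back substitution
Q, R = np.linalg.qr(X)                   # reduced QR: Q is n x 3, R is 3 x 3
beta_qr = scipy.linalg.solve_triangular(R, Q.T @ y)

# Reference answer from numpy's least squares solver
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_normal, beta_qr, beta_lstsq)
```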

## Matrices

**Eigendecomposition:** Probably the most useful decomposition in data science.

We want to find nonzero vectors $\mathbf{v}$ (eigenvectors) and constants $\lambda$ (eigenvalues) such that

$$A\mathbf{v} = \lambda\mathbf{v}$$

*The effect of the matrix on the vector is the same as the effect of the scalar on the same vector!*

## Matrices

**Eigendecomposition:** Geometry

<div>
<img src="./imgl3/eigen1.png" width="500"/>
</div>

## Matrices

**Eigendecomposition:** Principal Components Analysis

<div>
<img src="./imgl3/pcaill.svg" width="500"/>
</div>

## Matrices

**Eigendecomposition:** It is straightforward to compute eigenvalues and eigenvectors in Python.

In [None]:
A = np.array([[1, 2],
              [3, 4]])
np.linalg.eig(A)

## Matrices

**Eigendecomposition:** But what are we doing?

$$
\begin{eqnarray*}
A\mathbf{v} &=& \lambda \mathbf{v} \\
A\mathbf{v} - \lambda \mathbf{v} &=& \mathbf{0} \\
(A - \lambda I)\mathbf{v} &=&\mathbf{0}
\end{eqnarray*}
$$

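And so, for a nonzero $\mathbf{v}$ to exist, the matrix $A - \lambda I$ must be singular. For this to be true:

$$\det(A - \lambda I) = 0$$

Expanding this determinant in $\lambda$ gives the *characteristic polynomial*, whose roots are the eigenvalues. To compute determinants numerically in numpy, use `np.linalg.det`.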
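
As an illustration on a small example matrix, `np.poly` returns the coefficients of the characteristic polynomial of a square matrix and `np.roots` finds its roots. This is a sketch for small matrices. Library routines such as `np.linalg.eig` do not work this way internally.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

coeffs = np.poly(A)                      # lambda^2 - 5 lambda - 2
lams = np.roots(coeffs)                  # roots of the characteristic polynomial

# At each root, A - lambda I should be (numerically) singular
dets = [np.linalg.det(A - lam * np.eye(2)) for lam in lams]

print(coeffs)
print(np.sort(lams), np.sort(np.linalg.eigvals(A)))
print(dets)
```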

## Matrices

**Eigendecomposition:** The decomposition leads to:

$$
\begin{eqnarray*}
A\mathbf{v}_1 &=& \lambda_1 \mathbf{v}_1 \\
&\vdots& \\
A\mathbf{v}_n &=& \lambda_n \mathbf{v}_n
\end{eqnarray*}
$$

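And thus, stacking the eigenvectors as the columns of $V$ and placing the eigenvalues on the diagonal of $\Lambda$, the decomposition is: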

$$AV = V\Lambda$$
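
A short numerical check of this identity on an example matrix. When $V$ is invertible, it also gives $A = V\Lambda V^{-1}$.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
eigvals, V = np.linalg.eig(A)            # eigenvectors are the columns of V
Lam = np.diag(eigvals)                   # eigenvalues on the diagonal

print(np.allclose(A @ V, V @ Lam))                       # A V = V Lambda
print(np.allclose(V @ Lam @ np.linalg.inv(V), A))        # A = V Lambda V^{-1}
```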

## Matrices

**Eigendecomposition:**

***Symmetric matrices have orthogonal eigenvectors!***

In [None]:
# just some random symmetric matrix
A = np.random.randint(-3, 4, (3, 3))
A = A.T @ A

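# its eigendecomposition (eigh is meant for symmetric matrices and
# returns orthonormal eigenvectors, even with repeated eigenvalues)
L, V = np.linalg.eigh(A)

# all pairwise dot products (should be zero up to rounding error)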
print( np.dot(V[:,0], V[:,1]) )
print( np.dot(V[:,0], V[:,2]) )
print( np.dot(V[:,1], V[:,2]) )

## Matrices

**Quadratic Forms**: One application is in quadratic forms. These forms are:

$$\mathbf{w}^TA\mathbf{w} = \alpha$$

Important:

$$
\begin{eqnarray*}
A\mathbf{v} &=& \lambda\mathbf{v} \\
\mathbf{v}^TA\mathbf{v} &=& \lambda\mathbf{v}^T\mathbf{v} \\
\mathbf{v}^TA\mathbf{v} &=& \lambda \lVert \mathbf{v} \rVert^2
\end{eqnarray*}
$$



## Matrices

And this is useful to classify matrices:

<div>
<img src="./imgl3/deff1.png" width="500"/>
</div>

## Matrices

And this is useful to classify matrices:

<div>
<img src="./imgl3/deff2.png" width="500"/>
</div>
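
A sketch of how the eigenvalue signs can be used in code, assuming a symmetric matrix. It first checks $\mathbf{v}^TA\mathbf{v} = \lambda \lVert \mathbf{v} \rVert^2$ for one eigenvector, then labels a matrix by the signs of its eigenvalues. The helper `classify` and its tolerance are hypothetical choices for this example.

```python
import numpy as np

def classify(A, tol=1e-12):
    """Label a symmetric matrix by the signs of its eigenvalues."""
    eigvals = np.linalg.eigvalsh(A)
    if np.all(eigvals > tol):
        return "positive definite"
    if np.all(eigvals >= -tol):
        return "positive semidefinite"
    if np.all(eigvals < -tol):
        return "negative definite"
    if np.all(eigvals <= tol):
        return "negative semidefinite"
    return "indefinite"

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, V = np.linalg.eigh(A)
v = V[:, 1]

lhs = v @ A @ v                          # v^T A v
rhs = eigvals[1] * (v @ v)               # lambda * ||v||^2
print(np.isclose(lhs, rhs))

print(classify(A))
print(classify(np.array([[1.0, 2.0], [2.0, 1.0]])))
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))
```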

Let us do some calculus now :)

# Great job!!!