
## Pseudo-inverse and over-determined system

Let's consider the over-determined system corresponding to cat-brain.

There are 15 training examples, each with an input and a desired output specified.

Our goal is to determine 3 unknowns ($w_0$, $w_1$, $b$).

This can be cast as an over-determined system of equations

$$
A\vec{w} = \vec{y}
$$
where
$$ 
A =
\begin{bmatrix}
        0.11 & 0.09 & 1.00 \\
        0.01 & 0.02 & 1.00 \\
        0.98 & 0.91 & 1.00 \\
        0.12 & 0.21 & 1.00 \\
        0.98 & 0.99 & 1.00 \\
        0.85 & 0.87 & 1.00 \\
        0.03 & 0.14 & 1.00 \\
        0.55 & 0.45 & 1.00 \\
        0.49 & 0.51 & 1.00 \\
        0.99 & 0.01 & 1.00 \\
        0.02 & 0.89 & 1.00 \\
        0.31 & 0.47 & 1.00 \\
        0.55 & 0.29 & 1.00 \\
        0.87 & 0.76 & 1.00 \\
        0.63 & 0.24 & 1.00
\end{bmatrix}
\;\;\;\;\;\;\;
\vec{y} = 
\begin{bmatrix}
        -0.8 \\
        -0.97 \\
        0.89 \\ 
        -0.67 \\ 
        0.97 \\ 
        0.72 \\ 
        -0.83 \\ 
        0.00 \\
        0.00 \\
        0.00 \\
        -0.09 \\
        -0.22 \\ 
        -0.16 \\
        0.63 \\
        0.37
\end{bmatrix}
\;\;\;\;\;\;\;
\vec{w} = \begin{bmatrix} w_{0}\\w_{1}\\b\end{bmatrix}
$$

We solve for $\vec{w}$ using the pseudo-inverse formula $\vec{w} = (A^TA)^{-1}A^T\vec{y}$.

Note that this is not a classic system of equations: it has more equations (15) than unknowns (3).

Since $A$ is a $15 \times 3$ matrix and not square, it has no ordinary inverse, so we cannot solve the system via matrix inversion. We can, however, use the pseudo-inverse to solve it.

The resulting solution is the "best fit" or "best effort" solution, which minimizes the total squared error over all the training examples.
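The pseudo-inverse formula can be recovered from this minimization. The sketch below assumes that the error being minimized is the sum of squared differences between $A\vec{w}$ and $\vec{y}$, and that the columns of $A$ are linearly independent so that $A^TA$ is invertible. The symbol $E$ is introduced here for illustration only.

$$
E(\vec{w}) = \|A\vec{w} - \vec{y}\|^2 = (A\vec{w} - \vec{y})^T(A\vec{w} - \vec{y})
$$

Setting the gradient with respect to $\vec{w}$ to zero gives

$$
\nabla_{\vec{w}} E = 2A^T(A\vec{w} - \vec{y}) = 0
\;\;\implies\;\;
A^TA\vec{w} = A^T\vec{y}
\;\;\implies\;\;
\vec{w} = (A^TA)^{-1}A^T\vec{y}
$$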

In [1]:
import numpy as np
import torch

In [2]:
torch.manual_seed(42)

<torch._C.Generator at 0x7fd655359a70>

Let us revisit our cat brain data set.

Notice that there are 15 training examples, with 3 unknowns ($w_0$, $w_1$, $b$).
This is an over-determined system.

It can be easily seen that the solution is roughly
$w_{0} = 1, w_{1} = 1, b = -1$.

The data has been deliberately chosen as such.
But the equations are not fully consistent (i.e., there is
no solution that satisfies all the equations).

We want to find the values of $\vec{w}$ that minimize the squared error $\|A\vec{w} - \vec{y}\|^2$.
This is what the pseudo-inverse does.

In [3]:
def pseudo_inverse(A):
  # Left pseudo-inverse (A^T A)^{-1} A^T; assumes the columns of A are linearly independent
  return torch.matmul(torch.linalg.inv(torch.matmul(A.T, A)), A.T)
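As a quick sanity check, the formula can be tried on a tiny matrix. The sketch below uses a hypothetical 4x2 matrix `A_small` (not part of the cat-brain data) and compares the result with `torch.linalg.pinv`, PyTorch's built-in Moore-Penrose pseudo-inverse. It assumes the columns of the matrix are linearly independent, which is what makes $A^TA$ invertible.

```python
import torch

def pseudo_inverse(A):
    # Same formula as in the notebook: (A^T A)^{-1} A^T
    return torch.linalg.inv(A.T @ A) @ A.T

# Hypothetical 4x2 matrix with linearly independent columns (illustration only)
A_small = torch.tensor([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0],
                        [2.0, 1.0]])

P = pseudo_inverse(A_small)

# The pseudo-inverse acts as a left inverse: P @ A_small should be the 2x2 identity
left_identity = P @ A_small

# Compare against PyTorch's built-in pseudo-inverse
matches_builtin = torch.allclose(P, torch.linalg.pinv(A_small), atol=1e-5)

print(P.shape)
print(left_identity)
print(matches_builtin)
```

Note that `P @ A_small` is the identity, but `A_small @ P` in general is not: the pseudo-inverse of a tall matrix is only a left inverse.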

In [5]:
# The usual cat-brain input dataset
X = torch.tensor([
  [0.11, 0.09], [0.01, 0.02], [0.98, 0.91], [0.12, 0.21],
  [0.98, 0.99], [0.85, 0.87], [0.03, 0.14], [0.55, 0.45],
  [0.49, 0.51], [0.99, 0.01], [0.02, 0.89], [0.31, 0.47],
  [0.55, 0.29], [0.87, 0.76], [0.63, 0.24]
])

# Output threat score modeled as a vector
y = torch.tensor([-0.8, -0.97, 0.89, -0.67, 0.97, 0.72, -0.83, 0.00, 0.00, 0.00, -0.09, -0.22, -0.16, 0.63, 0.37])
# Column stack will add an additional column of 1s to the training dataset to represent the coefficient of the bias
A = torch.column_stack((X, torch.ones(15)))  # A is the augmented data matrix

# Pseudo-inverse finds the "best fit" solution - minimizes total error for all the equations
w = torch.matmul(pseudo_inverse(A), y)
# Expect the solution to be close to [1, 1, -1]
print(f"The solution is {w}\nNote that this is almost equal to [1.0, 1.0, -1.0]")

The solution is tensor([ 1.0766,  0.8976, -0.9582])
Note that this is almost equal to [1.0, 1.0, -1.0]
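One way to check that this really is the best-fit solution is to cross-check it against PyTorch's built-in routines and to inspect the error directly. The sketch below is not part of the original notebook: it rebuilds the same data, solves the system with `torch.linalg.pinv` and `torch.linalg.lstsq`, and compares the squared error with that of the hand-picked guess $[1, 1, -1]$. The names `w_pinv`, `w_lstsq`, `w_guess` and `gradient` are introduced here for illustration.

```python
import torch

# Same cat-brain inputs and threat scores as above
X = torch.tensor([
    [0.11, 0.09], [0.01, 0.02], [0.98, 0.91], [0.12, 0.21],
    [0.98, 0.99], [0.85, 0.87], [0.03, 0.14], [0.55, 0.45],
    [0.49, 0.51], [0.99, 0.01], [0.02, 0.89], [0.31, 0.47],
    [0.55, 0.29], [0.87, 0.76], [0.63, 0.24]
])
y = torch.tensor([-0.8, -0.97, 0.89, -0.67, 0.97, 0.72, -0.83, 0.00,
                  0.00, 0.00, -0.09, -0.22, -0.16, 0.63, 0.37])
A = torch.column_stack((X, torch.ones(15)))

# Built-in Moore-Penrose pseudo-inverse (computed via SVD)
w_pinv = torch.linalg.pinv(A) @ y

# Built-in least-squares solver; it expects a 2-D right-hand side
w_lstsq = torch.linalg.lstsq(A, y.unsqueeze(1)).solution.squeeze(1)

# Total squared error of the best-fit solution
residual = A @ w_pinv - y
squared_error = torch.sum(residual ** 2)

# At the minimum, A^T (A w - y) should be (numerically) zero
gradient = A.T @ residual

# Total squared error of the hand-picked guess, for comparison
w_guess = torch.tensor([1.0, 1.0, -1.0])
guess_error = torch.sum((A @ w_guess - y) ** 2)

print(f"pinv solution:  {w_pinv}")
print(f"lstsq solution: {w_lstsq}")
print(f"squared error (best fit): {squared_error.item():.4f}")
print(f"squared error (guess):    {guess_error.item():.4f}")
```

The residual is not zero because the equations are inconsistent, but the best-fit error should be no larger than the error of any other choice of $\vec{w}$, including the guess. For larger or poorly conditioned problems, `torch.linalg.lstsq` or `torch.linalg.pinv` is usually a safer choice than explicitly inverting $A^TA$.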
