+ This notebook is part of lecture 32 *Left-, right-, and pseudoinverses* in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
+ Created by me, Dr Juan H Klopper
    + Head of Acute Care Surgery
    + Groote Schuur Hospital
    + University of Cape Town
    + <a href="mailto:juan.klopper@uct.ac.za">Email me with your thoughts, comments, suggestions and corrections</a> 
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" href="http://purl.org/dc/dcmitype/InteractiveResource" property="dct:title" rel="dct:type">Linear Algebra OCW MIT18.06</span> <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">IPython notebook [2] study notes by Dr Juan H Klopper</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.

+ [1] <a href="http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm">OCW MIT 18.06</a>
+ [2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org

In [1]:
from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())

In [2]:
from sympy import init_printing, Matrix, symbols, sqrt, Rational
from numpy import matrix, transpose, sqrt # This sqrt replaces the sympy sqrt imported above
from numpy.linalg import pinv, inv, det, svd, norm
from scipy.linalg import pinv2 # Older SciPy only; pinv2 was removed in SciPy 1.9
from warnings import filterwarnings

In [3]:
init_printing(use_latex = 'mathjax')
filterwarnings('ignore')

# Left- and right-sided inverses and pseudoinverses

## The inverse

+ Recall the four fundamental subspaces
    + The rowspace (with **x**) and nullspace in &#8477;<sup>n</sup>
    + The columnspace (with A**x**) and the nullspace of A<sup>T</sup> in &#8477;<sup>m</sup>

+ The two-sided inverse gives us the following
$$ {A}{A}^{-1}=I={A}^{-1}{A} $$
    + For this we need *r* = *m* = *n* (i.e. full rank)

+ For a left-inverse we have the following
    + Full column rank, with *r* = *n* (but possibly more rows)
    + The nullspace contains just the zero vector (columns are independent)
    + The rows might not all be independent
    + We thus have either no solution or exactly one solution to A**x**=**b**
    + A<sup>T</sup>A is now an invertible *n* &times; *n* matrix (it also has full rank)
    + From (A<sup>T</sup>A)<sup>-1</sup>A<sup>T</sup>A = I follows the fact that (A<sup>T</sup>A)<sup>-1</sup>A<sup>T</sup> is a left-sided inverse (written A<sup>-1</sup> here)
    + Note, though, that (A<sup>T</sup>A)<sup>-1</sup>A<sup>T</sup> is an *n* &times; *m* matrix and A is of size *m* &times; *n*, resulting in an *n* &times; *n* identity matrix
    + Putting the left-inverse on the right, AA<sup>-1</sup>, does not give an identity matrix; it gives instead the *m* &times; *m* projection matrix onto the columnspace
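
+ The left-inverse statements above can be checked numerically. What follows is a minimal sketch, assuming a hypothetical 3 &times; 2 matrix `A_tall` with independent columns (it is not a matrix from the lecture) and plain NumPy arrays instead of the `matrix` class

```python
import numpy as np

# Hypothetical 3 x 2 matrix with full column rank (r = n = 2, m = 3)
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])

# Left-inverse (A^T A)^-1 A^T, of size n x m
A_left = np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T

# Left-inverse times A gives the n x n identity
left_product = A_left @ A_tall

# A times the left-inverse is m x m: a projection onto the columnspace
P_col = A_tall @ A_left

print(left_product)
print(P_col)
```

+ A projection matrix is symmetric and satisfies P<sup>2</sup> = P, which is one way to recognise `P_col` as a projection rather than an identity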

+ For a right-inverse we have the following
    + Full row rank, with *r* = *m* < *n*
    + The nullspace of A<sup>T</sup> contains only the zero vector (rows are independent)
    + Elimination will result in many solutions to A**x**=**b** (*n* - *m* free variables)
    + Now there will be an A<sup>-1</sup> to the right of A to give I
    + AA<sup>T</sup>(AA<sup>T</sup>)<sup>-1</sup> = I
    + A<sup>-1</sup> is now A<sup>T</sup>(AA<sup>T</sup>)<sup>-1</sup>
    + Putting the right-inverse on the left is also a projection (onto the rowspace)
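
+ The right-inverse can be checked in the same way. This is a minimal sketch, assuming a hypothetical 2 &times; 3 matrix `A_wide` with independent rows (again not a matrix from the lecture)

```python
import numpy as np

# Hypothetical 2 x 3 matrix with full row rank (r = m = 2, n = 3)
A_wide = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])

# Right-inverse A^T (A A^T)^-1, of size n x m
A_right = A_wide.T @ np.linalg.inv(A_wide @ A_wide.T)

# A times the right-inverse gives the m x m identity
right_product = A_wide @ A_right

# Right-inverse times A is n x n: a projection onto the rowspace
P_row = A_right @ A_wide

print(right_product)
print(P_row)
```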

## The pseudoinverse

+ Consider a matrix where *r* is less than both *m* and *n*
+ Remember that the rowspace is an *r*-dimensional subspace of &#8477;<sup>n</sup> and the columnspace is an *r*-dimensional subspace of &#8477;<sup>m</sup>
+ The nullspace of A has dimension *n* - *r* (in &#8477;<sup>n</sup>) and the nullspace of A<sup>T</sup> has dimension *m* - *r* (in &#8477;<sup>m</sup>)
+ The rowspace and columnspace have the same dimension, and A maps every vector **x** in the rowspace to a vector A**x** in the columnspace (one-to-one)
    + If **y** is another vector in the rowspace (not the same as **x**) then A**x** &ne; A**y**

+ The pseudoinverse A<sup>+</sup>, then, maps A**x** (or A**y**) in the columnspace back to **x** (or **y**) in the rowspace
$$ \mathbf{y}={A}^{+}{A}\mathbf{y} $$

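+ To see why the mapping is one-to-one, suppose A**x** = A**y** for **x** and **y** in the rowspace, so that A(**x**-**y**) = **0**
    + Now (**x**-**y**) is in the nullspace *and* in the rowspace, i.e. it has to be the zero vector, so **x** = **y**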
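
+ The one-to-one behaviour on the rowspace can be illustrated with a small numerical sketch. The rank-1 matrix `A_sing` and the vectors below are hypothetical choices made for illustration only

```python
import numpy as np

# Hypothetical rank-1 matrix; its rowspace is spanned by [1, 2]
A_sing = np.array([[1.0, 2.0],
                   [2.0, 4.0]])

x = np.array([1.0, 2.0])          # in the rowspace
y = np.array([2.0, 4.0])          # another vector in the rowspace
null_vec = np.array([-2.0, 1.0])  # in the nullspace: A_sing @ null_vec = 0

# Different rowspace vectors give different images in the columnspace
print(A_sing @ x, A_sing @ y)

# The pseudoinverse maps A x back to x when x is in the rowspace
x_back = np.linalg.pinv(A_sing) @ (A_sing @ x)
print(x_back)

# Adding a nullspace component does not change the image under A,
# so the mapping is only one-to-one when restricted to the rowspace
print(A_sing @ (x + null_vec))
```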

### Finding the pseudoinverse A<sup>+</sup>

+ One way is to start from the singular value decomposition
$$  {A}={U}{\Sigma}{V}^{T} $$
+ &Sigma; has the singular values (the square roots of the eigenvalues of A<sup>T</sup>A) along the main diagonal; only *r* of them are non-zero, but &Sigma; has *m* rows and *n* columns, both of which can be more than *r*
+ &Sigma;<sup>+</sup> has the reciprocals of the non-zero singular values along the main diagonal and zeros everywhere else, and is of size *n* &times; *m*
+ &Sigma;&Sigma;<sup>+</sup> will have *r* 1's along the main diagonal, and then 0's (if *m* is larger than *r*)
    + It will be of size *m* &times; *m*
    + It is a projection onto the columnspace of &Sigma; (correspondingly, AA<sup>+</sup> projects onto the columnspace of A)
+ &Sigma;<sup>+</sup>&Sigma; will also have *r* 1's along the main diagonal, but be of size *n* &times; *n*
    + It is a projection onto the rowspace of &Sigma; (correspondingly, A<sup>+</sup>A projects onto the rowspace of A)
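
+ The shapes and diagonal patterns described above can be made concrete with a small sketch. The 3 &times; 4 matrix `Sigma` with two non-zero singular values (3 and 2) is a hypothetical example, not taken from the lecture

```python
import numpy as np

# Hypothetical Sigma with m = 3 rows, n = 4 columns and rank r = 2
Sigma = np.zeros((3, 4))
Sigma[0, 0] = 3.0
Sigma[1, 1] = 2.0

# Sigma plus is n x m, with reciprocals of the non-zero singular values
Sigma_plus = np.zeros((4, 3))
Sigma_plus[0, 0] = 1.0 / 3.0
Sigma_plus[1, 1] = 1.0 / 2.0

# m x m, with r ones followed by zeros on the diagonal
print(Sigma @ Sigma_plus)

# n x n, with r ones followed by zeros on the diagonal
print(Sigma_plus @ Sigma)
```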

+ We now have the following
$$ {A}^{+}={V}{\Sigma}^{+}{U}^{T} $$

+ Let's see how easy this is in python™

In [4]:
A = matrix([[3, 6], [2, 4]]) # Not sympy
A, det(A) # The det is zero, so no inverse exists

(matrix([[3, 6],
         [2, 4]]), 0.0)

In [5]:
# The numpy pinv() function uses SVD
Aplus = pinv(A)
Aplus

matrix([[ 0.04615385,  0.03076923],
        [ 0.09230769,  0.06153846]])

In [6]:
# The scipy pinv2() function also uses SVD
# In the SciPy version used here, the scipy pinv() function uses
# least squares to approximate the pseudoinverse and, as matrices
# get BIG, this becomes computationally expensive
Aplus_sp = pinv2(A)
Aplus_sp

array([[ 0.04615385,  0.03076923],
       [ 0.09230769,  0.06153846]])
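
+ The same pseudoinverse can also be assembled by hand from the singular value decomposition. This is a minimal sketch rather than a description of how `pinv()` is implemented internally; the cutoff `tol` for treating a singular value as zero is an assumption chosen for illustration

```python
import numpy as np

# The same singular matrix as above, as a plain NumPy array
A_np = np.array([[3.0, 6.0],
                 [2.0, 4.0]])

# s holds the singular values, largest first
U, s, VT = np.linalg.svd(A_np)

# Assumed cutoff: singular values below this are treated as zero
tol = 1e-12 * s.max()

# Build Sigma plus (n x m) with reciprocals of the non-zero singular values
Sigma_plus = np.zeros((A_np.shape[1], A_np.shape[0]))
for i, sigma in enumerate(s):
    if sigma > tol:
        Sigma_plus[i, i] = 1.0 / sigma

# A+ = V Sigma+ U^T
A_plus = VT.T @ Sigma_plus @ U.T
print(A_plus)
```

+ Only the reciprocals of the non-zero singular values enter &Sigma;<sup>+</sup>; inverting a singular value that is zero up to rounding error would produce enormous, meaningless entries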

## Example problem

### Example problem 1

+ Calculate the pseudoinverse of A=[1,2]
+ Calculate AA<sup>+</sup>
+ Calculate A<sup>+</sup>A
+ If **x** is in the nullspace of A what is the effect of A<sup>+</sup>A on **x** (i.e. A<sup>+</sup>A**x**)?
+ If **x** is in the columnspace of A<sup>T</sup> what is A<sup>+</sup>A**x**?

#### Solution

In [7]:
A = matrix([1, 2])
A

matrix([[1, 2]])

+ Let's use singular value decomposition

In [8]:
U, S, VT = svd(A)

In [9]:
U

matrix([[-1.]])

In [10]:
S

array([ 2.23606798])

In [11]:
VT

matrix([[-0.4472136 , -0.89442719],
        [-0.89442719,  0.4472136 ]])

+ Remember,
$$ {A}^{+}={V}{\Sigma}^{+}{U}^{T} $$
+ &Sigma; is of size 1 &times; 2, so &Sigma;<sup>+</sup> must be of size 2 &times; 1, with the reciprocal of the singular value, 1/&radic;5, as its first entry

In [12]:
S_plus = matrix([[1 / sqrt(5)], [0]]) # Sigma plus, not Sigma

In [13]:
Aplus = transpose(VT) * S_plus * transpose(U)
Aplus

matrix([[ 0.2],
        [ 0.4]])

+ No normalization is needed once the reciprocal of the singular value is used; numpy's pinv() gives the same result

In [16]:
Aplus = pinv(A)
Aplus

matrix([[ 0.2],
        [ 0.4]])

In [17]:
A * Aplus

matrix([[ 1.]])

In [18]:
Aplus * A

matrix([[ 0.2,  0.4],
        [ 0.4,  0.8]])

+ Let's create a vector in the nullspace of A
    + It will be any vector of the form
    $$ c\begin{bmatrix}-2\\1\end{bmatrix} $$
+ Let's choose the constant *c* = 1

In [19]:
x_vect_null_A = matrix([[-2], [1]])
Aplus * A * x_vect_null_A

matrix([[ 0.],
        [ 0.]])

+ This is no surprise as A<sup>+</sup>A projects a vector onto the rowspace of A
    + We chose **x** in the nullspace of A, so A**x** must be **0** and A<sup>+</sup>A**x** = **0**

+ The columnspace of A<sup>T</sup> (the rowspace of A) consists of all vectors of the form
$$ c\begin{bmatrix}1\\2\end{bmatrix} $$
+ We'll choose *c* = 1 again

In [20]:
x_vect_col_AT = matrix([[1], [2]]) # In the columnspace of A transpose
Aplus * A * x_vect_col_AT

matrix([[ 1.],
        [ 2.]])

+ We recover **x** again
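
+ The two results above can be combined: A<sup>+</sup>A keeps the rowspace part of any vector and removes the nullspace part. Here is a minimal sketch, assuming a hypothetical test vector `x_any` = (3, 1) that lies in neither subspace

```python
import numpy as np

A_row = np.array([[1.0, 2.0]])   # the matrix A = [1, 2] from this example
P_rowspace = np.linalg.pinv(A_row) @ A_row

# Hypothetical vector with both a rowspace and a nullspace component
x_any = np.array([3.0, 1.0])

x_row_part = P_rowspace @ x_any    # component in the rowspace
x_null_part = x_any - x_row_part   # what is left over

print(x_row_part)
print(x_null_part)
print(A_row @ x_null_part)         # the leftover is sent to zero by A
```

+ Here `x_any` splits into (1, 2) in the rowspace plus (2, -1) in the nullspace, and only the first part survives multiplication by A<sup>+</sup>A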

+ For fun, let's just check what A<sup>+</sup> is when A is invertible

In [21]:
A = matrix([[1, 2], [3, 4]])

In [22]:
pinv(A)

matrix([[-2. ,  1. ],
        [ 1.5, -0.5]])

In [23]:
inv(A)

matrix([[-2. ,  1. ],
        [ 1.5, -0.5]])