## The Generalized Inverse of Matrices Revisited

The $\mathbf{P}\mathbf{L}\mathbf{U}$ decomposition is an algorithmic
method for solving systems of simultaneous equations,
$\mathbf{A}\mathbf{x}=\mathbf{b}$, without explicitly calculating the
inverse of matrix $\mathbf{A}$, whenever it exists. For $\mathbf{A}$
invertible, $\mathbf{A}^{-1}$ is unique and so is the solution of the
system of equations, yet we do not need to calculate
$\mathbf{A}^{-1}$ in order to solve for $\mathbf{x}$. However, when
the system admits an infinite number of solutions, we may wonder how to
proceed.
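
As a small illustration of solving without forming the inverse, the sketch below uses NumPy's `np.linalg.solve`, which relies on a LAPACK routine based on LU factorization with partial pivoting, and compares the result with the explicit-inverse route. The matrix and right-hand side are made-up values chosen only for illustration.

```python
import numpy as np

# Hypothetical invertible 3x3 system A x = b (values chosen for illustration).
A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])

# Solve with an LU-based routine; A^{-1} is never formed explicitly.
x_lu = np.linalg.solve(A, b)

# Same solution through the explicit inverse, shown only for comparison.
x_inv = np.linalg.inv(A) @ b

print(x_lu)
print(np.allclose(x_lu, x_inv))
```

Both routes agree here because $\mathbf{A}$ is invertible; the rest of this section deals with the case in which it is not.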

Let $\mathbf{A}:\mathbb{R}^m\rightarrow\mathbb{R}^n$ be a linear
transformation induced by matrix $\mathbf{A}$, with $n\le m$ and
$\text{rank}(\mathbf{A})=r\le n$. The case in which $n>m$ will be
treated afterwards. We can always decompose
$\mathbf{A}=\mathbf{P}\mathbf{L}\mathbf{U}$, with $\mathbf{P}$ and
$\mathbf{L}$ both square invertible matrices, and $\mathbf{U}$ upper
triangular as follows: $$\mathbf{U}=\left[\begin{matrix}
      \mathbf{u}_1^{\rm T}\\
      \mathbf{u}_2^{\rm T}\\
      \vdots\\
      \mathbf{u}_r^{\rm T}\\
      \mathbf{0^{\rm T}}\\
      \vdots\\
      \mathbf{0^{\rm T}}
    \end{matrix}\right]
    =
    \left[\begin{matrix}
      \mathbf{U}_1^{\rm T}\\
      \mathbf{0}
    \end{matrix}\right]$$ in which $\mathbf{u}_i\in\mathbb{R}^m$,
$i=1,\ 2,\ \ldots,\ r$, are linearly independent vectors. If we wish to
find a matrix $\mathbf{B}$ that satisfies the first two Penrose
equations defined in [2.6](https://martinamj.github.io/Online-Linear-Algebra/lab/index.html?path=2.6+-+GENERALIZED+INVERSE+AND+PENROSE+EQUATIONS.ipynb), rewritten
here for convenience,
$$\begin{split}
    &\mathbf{A}\mathbf{B}\mathbf{A}=\mathbf{A}\\
    &\mathbf{B}\mathbf{A}\mathbf{B}=\mathbf{B}
  \end{split}$$ 
then it suffices to write, for
$\mathbf{A}=\mathbf{P}\mathbf{L}\mathbf{U}$,
$$\mathbf{B}= \mathbf{U}^\#\mathbf{L}^{-1}\mathbf{P}^{-1}$$ where
$\mathbf{U}^\#$ is a matrix with $m$ rows and $n$ columns, whose
first $r$ columns form a right inverse of $\mathbf{U}_1^{\rm T}$ and whose
remaining $n-r$ columns are zero, i.e., $$\mathbf{U}^\# = 
    \left[\begin{matrix}\mathbf{u}_1^\# & \mathbf{u}_2^\# & \cdots & \mathbf{u}_r^\# & \mathbf{0} & \cdots & \mathbf{0}
    \end{matrix}\right]$$ and
$$\mathbf{U}_1^{\rm T}\mathbf{U}^\#=\left[\begin{matrix}\mathbf{I}& \mathbf{0}\end{matrix}\right]$$
with
$$\langle\mathbf{u}_i^\#,\mathbf{u}_j\rangle=\begin{cases}1, \quad &i=j\\0,& \text{otherwise}\end{cases}$$
The reader is invited to verify that the first Penrose equation holds
whatever the last $n-r$ columns of $\mathbf{U}^\#$ are, whereas setting
these columns to zero ensures that the second Penrose equation is
satisfied as well.
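
The construction can be checked numerically. The sketch below does not call a library LU routine; instead it assumes hand-picked factors $\mathbf{P}$, $\mathbf{L}$ and $\mathbf{U}$ (hypothetical values, with $n=3$, $m=4$, $r=2$), so that $\mathbf{U}$ has exactly the block form shown above. The right inverse used for $\mathbf{U}_1^{\rm T}$, namely $\mathbf{U}_1(\mathbf{U}_1^{\rm T}\mathbf{U}_1)^{-1}$, is one possible choice among many.

```python
import numpy as np

# Hypothetical factors for a 3x4 matrix (n = 3, m = 4) of rank r = 2.
P = np.eye(3)[[1, 0, 2]]                 # permutation matrix
L = np.array([[1.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [-1.0, 3.0, 1.0]])         # unit lower triangular, invertible
U1T = np.array([[1.0, 2.0, 0.0, 1.0],
                [0.0, 1.0, 1.0, 3.0]])   # r linearly independent rows u_i^T
U = np.vstack([U1T, np.zeros((1, 4))])   # n - r zero rows at the bottom
A = P @ L @ U

# One right inverse of U_1^T: X = U_1 (U_1^T U_1)^{-1}, so that U_1^T X = I_r.
X = U1T.T @ np.linalg.inv(U1T @ U1T.T)
U_sharp = np.hstack([X, np.zeros((4, 1))])   # m x n, last n - r columns zero

B = U_sharp @ np.linalg.inv(L) @ P.T         # P^{-1} = P^T for a permutation

print(np.allclose(A @ B @ A, A))             # first Penrose equation
print(np.allclose(B @ A @ B, B))             # second Penrose equation
print(np.allclose(A @ B, (A @ B).T))         # symmetry of AB is not guaranteed
```

The last line is included only to show that nothing in this construction forces $\mathbf{A}\mathbf{B}$ to be symmetric.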

However, if we wish to find the generalized inverse,
$\mathbf{A}^\dagger$, also called the Moore-Penrose inverse, then all
four Penrose equations need to be satisfied, i.e., $$\begin{split}
    &\mathbf{A}\mathbf{A}^\dagger\mathbf{A}=\mathbf{A}\\
    &\mathbf{A}^\dagger\mathbf{A}\mathbf{A}^\dagger=\mathbf{A}^\dagger\\
    &(\mathbf{A}\mathbf{A}^\dagger)^{\rm T}=\mathbf{A}\mathbf{A}^\dagger\\
    &(\mathbf{A}^\dagger\mathbf{A})^{\rm T}=\mathbf{A}^\dagger\mathbf{A}
  \end{split}$$ In this case, we need a full rank decomposition of
$\mathbf{A}$. Let $\mathbf{A}:\mathbb{R}^m\rightarrow\mathbb{R}^n$ be a
linear transformation induced by matrix $\mathbf{A}$, with $n\le m$ and
$\text{rank}(\mathbf{A})=r\le n$. Let
$\mathbf{Q}:\mathbb{R}^r\rightarrow\mathbb{R}^n$ be a matrix whose $r$
columns are linearly independent and span the range of $\mathbf{A}$,
i.e., $\mathcal{R}(\mathbf{Q})=\mathcal{R}(\mathbf{A})$. Therefore each
column $\mathbf{a}_i$, $i=1,\ 2,\
  \ldots,\ m$, of $\mathbf{A}$ can be written as a linear combination of
the columns $\{\mathbf{q}_i\}$, $i=1,\ 2,\
  \ldots,\ r$, of $\mathbf{Q}$,
$$\mathbf{a}_i = \sum_{j=1}^r r_{ji}\mathbf{q}_j$$ and
$\mathbf{A}=\mathbf{Q}\mathbf{R}$, where
$\mathbf{R}:\mathbb{R}^m\rightarrow\mathbb{R}^r$ is the matrix with the
elements $r_{ji}$ above. Both matrices $\mathbf{Q}$ and $\mathbf{R}$
have rank $r$. $\mathbf{Q}$ has $r$ linearly independent columns,
therefore $\text{rank}(\mathbf{Q})=r$. $\mathbf{R}$ has $r$ rows,
therefore $\text{rank}(\mathbf{R})\le r$, but
$r=\text{rank}(\mathbf{A})=\text{rank}(\mathbf{Q}\mathbf{R})\le\min(\text{rank}(\mathbf{Q}),\text{rank}(\mathbf{R}))\le\text{rank}(\mathbf{R})$,
so $\text{rank}(\mathbf{R})=r$ and $\mathbf{R}$ has $r$ linearly
independent rows. For such a decomposition, which exists for every
nonzero matrix $\mathbf{A}$, we can
calculate the generalized inverse of matrix $\mathbf{A}$ as
$$\mathbf{A}^\dagger = \mathbf{R}^{\rm T}\left(\mathbf{Q}^{\rm T}\mathbf{A}\mathbf{R}^{\rm T}\right)^{-1}\mathbf{Q}^{\rm T}$$
The inverse
$\left(\mathbf{Q}^{\rm T}\mathbf{A}\mathbf{R}^{\rm T}\right)^{-1}$ is
guaranteed to exist, for
$\mathbf{Q}^{\rm T}\mathbf{A}\mathbf{R}^{\rm T}=(\mathbf{Q}^{\rm T}\mathbf{Q})(\mathbf{R}\mathbf{R}^{\rm T})$
is the product of two invertible $r\times r$ matrices and therefore
$\text{rank}(\mathbf{Q}^{\rm T}\mathbf{A}\mathbf{R}^{\rm T})=r$.
Verification that $\mathbf{A}^\dagger$ satisfies all four Penrose
equations is straightforward.
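
The formula can be tried on a small rank-deficient matrix. In the sketch below, the helper name `pinv_full_rank` and the tolerance `tol` are hypothetical, and the full rank factors $\mathbf{Q}$ and $\mathbf{R}$ are taken from the singular value decomposition, which is only one of many valid ways to obtain them. The result is compared with NumPy's `np.linalg.pinv`.

```python
import numpy as np

def pinv_full_rank(A, tol=1e-10):
    """Moore-Penrose inverse via A = Q R and R^T (Q^T A R^T)^{-1} Q^T."""
    # Full rank factors from the SVD (an assumed choice): keep the r singular
    # triplets whose singular values exceed tol.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol))
    Q = U[:, :r] * s[:r]      # n x r, columns span the range of A
    R = Vt[:r, :]             # r x m, r linearly independent rows
    core = Q.T @ A @ R.T      # r x r and invertible
    return R.T @ np.linalg.inv(core) @ Q.T

# Hypothetical 3x4 matrix of rank 2 (second row is twice the first).
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],
              [0.0, 1.0, 1.0, 0.0]])
A_dag = pinv_full_rank(A)

# The four Penrose equations, followed by a comparison with NumPy.
print(np.allclose(A @ A_dag @ A, A))
print(np.allclose(A_dag @ A @ A_dag, A_dag))
print(np.allclose((A @ A_dag).T, A @ A_dag))
print(np.allclose((A_dag @ A).T, A_dag @ A))
print(np.allclose(A_dag, np.linalg.pinv(A)))
```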

The case where $\mathbf{A}:\mathbb{R}^m\rightarrow\mathbb{R}^n$ with
$m<n$ and $\text{rank}(\mathbf{A})=r\le m$ is similar, and the
expression for the generalized inverse is the same as above.
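
A special case makes the tall situation concrete. When $\mathbf{A}$ has full column rank ($r=m$), one admissible choice is $\mathbf{Q}=\mathbf{A}$ and $\mathbf{R}=\mathbf{I}$, and the expression reduces to $(\mathbf{A}^{\rm T}\mathbf{A})^{-1}\mathbf{A}^{\rm T}$. The sketch below checks this on a made-up $4\times 3$ matrix.

```python
import numpy as np

# Hypothetical tall matrix: n = 4 rows, m = 3 columns, full column rank (r = m).
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [2.0, 0.0, 1.0]])
Q = A            # n x r, its columns already form a basis of the range of A
R = np.eye(3)    # r x m, so that A = Q R

# General expression; with these factors it equals (A^T A)^{-1} A^T.
A_dag = R.T @ np.linalg.inv(Q.T @ A @ R.T) @ Q.T

print(np.allclose(A_dag, np.linalg.pinv(A)))
print(np.allclose(A_dag @ A, np.eye(3)))   # A_dag is a left inverse here
```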

### Example

This chapter gave us a way to build a generalized inverse of a matrix from its PLU decomposition, and the Moore-Penrose inverse from a full rank decomposition.

We know that non-square matrices may not have a unique inverse matrix. In this case, we can search for a matrix $G$ such that $x = Gy$ solves $Ax = y$, i.e., $AGy = y$. Equivalently, we can require $AGA = A$. Such a matrix $G$ may not be unique, and different choices of $G$ may return different solutions $x$ for the same $y$. In such cases, since it is impossible to find a unique matrix that gives $x$, given $y$, we may look for a unique matrix $\tilde{G}$ that gives the nearest possible $\tilde{x}$ to $x$ given $y$. Notice that $\tilde{G}$ is not an inverse, because in general $\tilde{x} \neq x$; it is a pseudo-inverse. In more mathematical terms, $\tilde{G}$ is such that $\tilde{G}y=\tilde{x}$, with $\tilde{x} = \arg\min_{x}\|Ax - y\|$.

The pseudo-inverse that solves the problem above is called a least squares pseudo-inverse. 
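
A quick numerical check of this least-squares property, under made-up data: the sketch below takes a vector $y$ outside the range of $A$, computes $\tilde{x}$ with `np.linalg.pinv`, and compares it with NumPy's least-squares solver `np.linalg.lstsq`. The random perturbations are only a sanity check, not a proof of optimality.

```python
import numpy as np

# Hypothetical overdetermined system: y is not in the range of A.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 0.0, 2.0])

x_til = np.linalg.pinv(A) @ y                   # pseudo-inverse solution
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)    # NumPy's least-squares solver

best = np.linalg.norm(A @ x_til - y)

# Residuals of a few perturbed candidates; none should be smaller than `best`.
rng = np.random.default_rng(0)
others = [np.linalg.norm(A @ (x_til + 0.1 * rng.standard_normal(2)) - y)
          for _ in range(5)]

print(np.allclose(x_til, x_ls))
print(best, min(others))
```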

Every Moore-Penrose generalized inverse is a least-squares generalized inverse. Can you prove that?

In the example below, we provide a representation of how the pseudo-inverse minimizes $\|A\tilde{x} - y\|$. Enter a $4\times 3$ matrix $A$ and a vector $x$ in the fields below. As a result, you will see the resulting vector $y$, the pseudo-inverse $\tilde{G}$ and $\tilde{x}$. You will observe that vector $\tilde{x}$ will be very close, if not identical, to the vector you input whenever the columns of $A$ are linearly independent.

In [1]:
%pip install -q ipywidgets==8.0.7 

Note: you may need to restart the kernel to use updated packages.


In [2]:
import ipywidgets as widgets
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display  # explicit import; Jupyter also provides it by default

In [3]:
a11 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a12 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a13 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a21 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a22 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a23 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a31 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a32 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a33 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a41 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a42 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
a43 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))

b1 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
b2 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))
b3 = widgets.FloatText(value=0, description='', step=1, layout=widgets.Layout(width='50px'))

matrix_inputs = widgets.GridBox([a11, a12, a13, a21, a22, a23, a31, a32, a33, a41, a42, a43], layout=widgets.Layout(grid_template_columns="60px 60px 60px"))        
vector_inputs = widgets.GridBox([b1, b2, b3], layout=widgets.Layout(grid_template_columns="60px"))

matrix_description = widgets.Label("A = ")
vector_description = widgets.Label(", x = ")

matrix_hbox = widgets.HBox([matrix_description, matrix_inputs])
vector_hbox = widgets.HBox([vector_description, vector_inputs])

general_hbox = widgets.HBox([matrix_hbox, vector_hbox])

display(general_hbox)


In [4]:
# Assemble the 4x3 matrix A and the vector x from the widget values
matrixA = np.array([[a11.value, a12.value, a13.value],
                    [a21.value, a22.value, a23.value],
                    [a31.value, a32.value, a33.value],
                    [a41.value, a42.value, a43.value]])
vectorx = np.array([b1.value, b2.value, b3.value])

y = matrixA @ vectorx                        # y = A x
pseudo_inv = np.linalg.pinv(matrixA)         # Moore-Penrose pseudo-inverse of A
x_til = pseudo_inv @ y                       # x~ recovered from y
norm = np.linalg.norm(matrixA @ x_til - y)   # residual ||A x~ - y||

print(f"Vector y: {y} \nPseudo-inverse: \n {pseudo_inv} \nVector x~: {x_til} \nResidual ||A x~ - y||: {norm}")

Vector y: [0. 0. 0. 0.] 
Pseudo-inverse: 
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]] 
Vector x~: [0. 0. 0.] 
Residual ||A x~ - y||: 0.0
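
For readers without the interactive widgets, the same computation can be run on fixed inputs. The sketch below uses two made-up $4\times 3$ matrices and a hypothetical helper `recover`: with linearly independent columns $\tilde{x}$ coincides with $x$, while in the rank-deficient case `np.linalg.pinv` returns the minimum-norm solution, which differs from $x$ but still reproduces $y$.

```python
import numpy as np

def recover(A, x):
    """Return y = A x, x~ = pinv(A) y and the residual ||A x~ - y||."""
    y = A @ x
    x_til = np.linalg.pinv(A) @ y
    return y, x_til, np.linalg.norm(A @ x_til - y)

x = np.array([1.0, 2.0, 0.0])

# Full column rank (r = 3): x~ coincides with x.
A_full = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0]])

# Rank deficient (third column = first + second): x~ is the minimum-norm
# solution of A x = y, which need not equal the x that generated y.
A_def = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 1.0, 2.0],
                  [2.0, 0.0, 2.0]])

for name, A in [("full rank", A_full), ("rank deficient", A_def)]:
    y, x_til, res = recover(A, x)
    print(name, x_til, np.allclose(x_til, x), res)
```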


### References

Rao C. Radhakrishna, Mitra Sujit Kumar. Generalized Inverse of Matrices and Its Applications. [place unknown]: John Wiley and Sons; 1971. ISBN: 0-471-70821-6.