# Topic 14: Matrix Inverse and Pseudo Inverses
Author: Evan Chrisney, echrisney@gmail.com

# Introduction
 
A matrix inverse is a matrix that, when multiplied by the original matrix, gives the identity matrix. Only square matrices are invertible, but non-square matrices may have a left or right pseudo inverse. A matrix $\mathbf{A}$ has a left inverse if there exists a matrix $\mathbf{B}$ such that $\mathbf{BA} = I$, and a right inverse if there exists a matrix $\mathbf{C}$ such that $\mathbf{AC} = I$ (Moon and Stirling, 2000). A matrix is invertible if it has both a left and a right inverse. Square matrices that are not invertible are called singular matrices. Left and right inverses are types of pseudo inverses.

Matrix inverses are important because they provide solutions to the ubiquitous equation $\mathbf{A}x = b$. Many equivalent conditions characterize the invertibility of a matrix; they are listed in detail below. If $\mathbf{A}$ is not square, and therefore has no true inverse, a left or right pseudo inverse (when one exists) can still be used to solve $\mathbf{A}x = b$ in the least squares or minimum norm sense.

# Explanation of the theory

$\textbf{Matrix Inverse (nxn):}$

A square (n x n) matrix $\mathbf{A}$ is invertible if there exists a square (n x n) matrix $\mathbf{B}$ such that
$$\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = I,$$
where $I$ is the identity matrix of size n x n. When it exists, $\mathbf{B}$ is unique and is denoted $\mathbf{A}^{-1}$.
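
The definition can be checked numerically. The following is a minimal sketch, assuming a hypothetical 2 x 2 matrix that is not taken from the text, and using numpy's built-in `np.linalg.inv` as the candidate inverse.

```python
import numpy as np

# Hypothetical invertible 2x2 matrix (det = 4*6 - 7*2 = 10), chosen for illustration
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

# Candidate inverse from numpy
B = np.linalg.inv(A)

# Both products should equal the identity, up to floating point round-off
left_ok = np.allclose(B @ A, np.eye(2))
right_ok = np.allclose(A @ B, np.eye(2))
print(left_ok, right_ok)
```

The comparison uses `np.allclose` rather than exact equality because the computed inverse is only accurate to round-off.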

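A square (n x n) matrix $\mathbf{A}$ is invertible if and only if any one of the equivalent conditions 1 through 11 below holds. Items 12 and 13 are identities that then follow for the inverse: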

1. The null space of $\mathbf{A}$ contains only the zero vector, $\{0\}$. 
2. $\mathbf{A}$ is full rank, meaning the rank of $\mathbf{A}$ is n. 
3. $\mathbf{A}x = 0$ implies that $x = 0$. 
4. The rows and columns of $\mathbf{A}$ are linearly independent.  
5. The determinant of $\mathbf{A}$ is non zero. 
6. $\mathbf{A}$ has no zero eigenvalues. 
7. $\mathbf{A^HA}$ is positive definite
8. $\mathbf{A}$ is nonsingular. 
9. $\mathbf{A}$ has n pivots. 
10. The transpose of $\mathbf{A}$, $\mathbf{A}^T$, is also invertible. 
11. $\mathbf{A}$ has both a left inverse $\mathbf{B}$ and a right inverse $\mathbf{C}$, in which case $\mathbf{B}$ = $\mathbf{C}$ = $\mathbf{A}^{-1}$. 
12. $(\mathbf{A}^{-1})^{-1} = \mathbf{A}$
13. $(\mathbf{A}^{T})^{-1} = (\mathbf{A}^{-1})^{T}$
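
Several of the conditions in the list above can be tested numerically. The sketch below assumes two hypothetical 2 x 2 matrices, a hypothetical helper named `invertibility_checks`, and an arbitrary tolerance `tol` for deciding what counts as zero; none of these come from the text.

```python
import numpy as np

def invertibility_checks(A, tol=1e-10):
    """Run a few numerical invertibility tests on a square matrix A."""
    n = A.shape[0]
    return {
        # Item 2: rank equals n
        "full_rank": np.linalg.matrix_rank(A) == n,
        # Item 5: determinant is non zero
        "nonzero_det": abs(np.linalg.det(A)) > tol,
        # Item 6: no eigenvalue is zero
        "no_zero_eigenvalue": np.all(np.abs(np.linalg.eigvals(A)) > tol),
        # Item 7: A^H A is positive definite (all eigenvalues positive)
        "gram_positive_definite": np.all(np.linalg.eigvalsh(A.conj().T @ A) > tol),
    }

# Hypothetical invertible matrix
A_good = np.array([[2.0, 1.0], [1.0, 3.0]])
# Hypothetical singular matrix: the second row is twice the first
A_bad = np.array([[1.0, 2.0], [2.0, 4.0]])

print(invertibility_checks(A_good))
print(invertibility_checks(A_bad))

# Items 12 and 13 are identities that hold for the invertible matrix
A_inv = np.linalg.inv(A_good)
print(np.allclose(np.linalg.inv(A_inv), A_good))
print(np.allclose(np.linalg.inv(A_good.T), A_inv.T))
```

Because the conditions are equivalent, every test passes for the invertible matrix and every test fails for the singular one.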

The adjugate of $\mathbf{A}$ can be used to determine $\mathbf{A}^{-1}$ as
$$ \mathbf{A}^{-1} = \frac{1}{det(\mathbf{A})}adj(\mathbf{A}), $$
where $adj$ denotes the adjugate (the transpose of the matrix of cofactors) and $det$ denotes the determinant. 
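
As a small worked case, consider a general 2 x 2 matrix; the numerical values in the second line are a hypothetical example chosen for illustration. For
$$ \mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad adj(\mathbf{A}) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, \quad \mathbf{A}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, $$
so that, for instance,
$$ \mathbf{A} = \begin{bmatrix} 4 & 7 \\ 2 & 6 \end{bmatrix}, \quad det(\mathbf{A}) = 10, \quad \mathbf{A}^{-1} = \frac{1}{10}\begin{bmatrix} 6 & -7 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 0.6 & -0.7 \\ -0.2 & 0.4 \end{bmatrix}. $$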

$\textbf{Matrix Pseudo Inverses (m x n):}$
A non-square (m x n) matrix, with m rows and n columns, may have a left or right pseudo inverse, as explained above. 

$\textit{Left Inverse} (m \times n), m \geq n:\\$
For a system $\mathbf{A}x = b$, there is at most one solution for every $b$ if and only if the columns of $\mathbf{A}$ are linearly independent, or $rank(\mathbf{A}) = n$, which requires $m \geq n$, i.e. $\mathbf{A}$ is a tall matrix. In this case, $\mathbf{A}$ has a left inverse. 

One such left inverse, which is the Moore-Penrose pseudo inverse of a matrix with linearly independent columns, is 
$\mathbf{B} = (\mathbf{A}^{H}\mathbf{A})^{-1}\mathbf{A}^H$. 
Applying it to $b$ gives the least squares solution for $\mathbf{A}x = b$, 
$x = (\mathbf{A}^{H}\mathbf{A})^{-1}\mathbf{A}^{H}b$.
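
The left inverse formula can be sketched for a tall matrix as follows. The matrix `A` and vector `b` are hypothetical values chosen for illustration, and `np.linalg.lstsq` and `np.linalg.pinv` are used only as independent references.

```python
import numpy as np

# Hypothetical tall matrix: 3 rows, 2 linearly independent columns
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
# Hypothetical right-hand side
b = np.array([1.0, 2.0, 2.0])

# Left inverse B = (A^H A)^{-1} A^H
A_H = A.conj().T
B = np.linalg.inv(A_H @ A) @ A_H

# Least squares solution from the left inverse and from numpy's solver
x_left = B @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(B @ A, np.eye(2)))      # BA is the 2x2 identity
print(np.allclose(A @ B, np.eye(3)))      # AB is a projection, not the identity
print(np.allclose(B, np.linalg.pinv(A)))  # agrees with numpy's pseudo inverse
print(np.allclose(x_left, x_lstsq))       # agrees with numpy's least squares
```

Forming $(\mathbf{A}^{H}\mathbf{A})^{-1}$ explicitly mirrors the formula, but for poorly conditioned problems a solver such as `np.linalg.lstsq` is generally the safer numerical choice.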

$\textit{Right Inverse} (m \times n), n \geq m:\\$
For a system $\mathbf{A}x = b$, where $\mathbf{A}$ has m rows and n columns, there exists at least one solution for any $b$ if and only if the rows of $\mathbf{A}$ are linearly independent, or $rank(\mathbf{A}) = m$, which requires $n \geq m$, i.e. $\mathbf{A}$ is a fat matrix. In this case, $\mathbf{A}$ has a right inverse, and applying it to $b$ yields a solution. 

One such right inverse is 
$\mathbf{C} = \mathbf{A}^H(\mathbf{A}\mathbf{A}^{H})^{-1}$. 
Applying it to $b$ gives the minimum norm solution for $\mathbf{A}x = b$, 
$x = \mathbf{A}^H(\mathbf{A}\mathbf{A}^{H})^{-1}b$.
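
The minimum norm property can be illustrated with a small fat matrix. In this sketch the matrix `A`, the vector `b`, and the null space vector `null_vec` are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical fat matrix: 2 linearly independent rows, 3 columns
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
# Hypothetical right-hand side
b = np.array([2.0, 3.0])

# Right inverse C = A^H (A A^H)^{-1}
A_H = A.conj().T
C = A_H @ np.linalg.inv(A @ A_H)

# Minimum norm solution
x_min = C @ b

# Any other solution differs by a null space vector of A
null_vec = np.array([1.0, 1.0, -1.0])   # A @ null_vec = 0
x_other = x_min + null_vec

print(np.allclose(A @ C, np.eye(2)))     # AC is the 2x2 identity
print(np.allclose(A @ x_min, b))         # x_min solves the system
print(np.allclose(A @ x_other, b))       # so does x_other
print(np.linalg.norm(x_min) < np.linalg.norm(x_other))
```

Both vectors solve the system exactly, but the one produced by the right inverse has the smaller Euclidean norm.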

# Simple Numerical Example

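Here is some simple Python code that demonstrates matrix inversion for a square matrix using the adjugate, followed by computation of the left and right inverses. All three results should be equal for this example because, for an invertible square $\mathbf{A}$, $(\mathbf{A}^{H}\mathbf{A})^{-1}\mathbf{A}^H = \mathbf{A}^{-1}(\mathbf{A}^{H})^{-1}\mathbf{A}^H = \mathbf{A}^{-1}$, and similarly for the right inverse. 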

In [1]:
import numpy as np

# numpy includes its own inverse (np.linalg.inv), but this script drives home the topics above. 
######################################################
# Compute the matrix inverse of a square matrix using the adjugate method


# Declare a matrix that we know is invertible
A = np.matrix([[-3, 2, -5], [-1, 0, 2], [3, -4, 1]])

# get the determinant
m = np.linalg.det(A)

# Grab length of A which we know is nxn
l = len(A) 

# Initialize the adjugate matrix to 0's
C = np.zeros((l,l))

# Loop through A and compute the adjugate
for i in range(l):
        for j in range(l):
                # Single out the rows/cols to compute minor determinants
                #
                # I realize i and j are swapped, this saves the step of
                # transposing C later. 
                temp = np.delete(A, (j), axis = 0)
                temp = np.delete(temp, (i), axis = 1)
                # Compute minor determinant
                M = np.linalg.det(temp)
                # Signed minor (cofactor), stored transposed so that C is the adjugate
                C[i][j] = (-1)**(i+j)*M

# Finish by dividing the adjugate by the determinant to get the inverse
A_inv = 1/m*C
# Print out matrix inverse
print(A_inv)

#####################################################
# Now find the left inverse, which should be the same
# (Grammian_l holds the inverse of the Gramian A^H A)
A_tran = A.getH()
Grammian_l = np.linalg.inv(A_tran*A)
A_dagger_l = Grammian_l*A_tran
print(A_dagger_l)


#####################################################
# Now find the right inverse which is the same
Grammian_r = np.linalg.inv(A*A_tran)
A_dagger_r = A_tran*Grammian_r
print(A_dagger_r)

[[-0.26666667 -0.6        -0.13333333]
 [-0.23333333 -0.4        -0.36666667]
 [-0.13333333  0.2        -0.06666667]]
[[-0.26666667 -0.6        -0.13333333]
 [-0.23333333 -0.4        -0.36666667]
 [-0.13333333  0.2        -0.06666667]]
[[-0.26666667 -0.6        -0.13333333]
 [-0.23333333 -0.4        -0.36666667]
 [-0.13333333  0.2        -0.06666667]]
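
As a cross-check on the three printouts above, the same matrix can be passed to numpy's built-in routines. This is a minimal sketch that uses `np.array` and the `@` operator instead of `np.matrix`, since the NumPy documentation no longer recommends the `np.matrix` class; the variable names are illustrative.

```python
import numpy as np

# Same matrix as in the example above
A = np.array([[-3, 2, -5], [-1, 0, 2], [3, -4, 1]], dtype=float)

# Built-in inverse and pseudo inverse
A_inv = np.linalg.inv(A)
A_pinv = np.linalg.pinv(A)

# For an invertible square matrix the pseudo inverse equals the inverse
print(np.allclose(A_inv, A_pinv))
# The inverse times A gives the identity, up to round-off
print(np.allclose(A @ A_inv, np.eye(3)))
print(A_inv)
```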


# Engineering Application: Remote Sensing

As noted above, matrix inverses provide a solution to the ubiquitous equation $\mathbf{A}x = b$. An engineering example that I use frequently in my research is least squares (LS) line fitting, where we solve for $s$ in $\mathbf{A}s = x$ using a left pseudo inverse. 

As a brief introduction to this application, I will fit normalized radar cross section (RCS) $\sigma^0$ data versus incidence angle from a C band satellite radar known as the Advanced Scatterometer, or ASCAT. ASCAT is a fan beam scatterometer that observes the Earth's surface over a range of incidence angles, from about 30 to 60 degrees. The purpose of this application is to determine the $\sigma^0$ incidence angle dependence ($s$) at C band in order to normalize ASCAT measurements to one incidence angle for cross calibration purposes.  

Previous studies have shown that $\sigma^0$ exhibits a log-linear dependence on incidence angle over tropical rainforests in the mid incidence angle range, from about 30 to 60 degrees (N. Madsen, BYU Masters Thesis, 2015). Because the dependence is log-linear, it is easily estimated as the slope of the first order polynomial 

$s_1\theta + s_2 = \sigma^0, $

where $s_1$ and $s_2$ are the elements of $s$, $s_1$ is the slope we are solving for, $s_2$ is the intercept, $\theta$ is the ASCAT incidence angle data, and $\sigma^0$ is the ASCAT RCS data. In matrix form, this equation is

$ [\mathbf{\theta}, \mathbf{1}] [s_1, s_2]^{T} = \sigma^0$,

which is easily seen to be of the form $\mathbf{A}s = x$. 
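
Written out row by row for $N$ measurements (the index $N$ and the subscripts on $\theta$ and $\sigma^0$ are notation introduced here for illustration), the system is
$$ \begin{bmatrix} \theta_1 & 1 \\ \theta_2 & 1 \\ \vdots & \vdots \\ \theta_N & 1 \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = \begin{bmatrix} \sigma^0_1 \\ \sigma^0_2 \\ \vdots \\ \sigma^0_N \end{bmatrix}, $$
so $\mathbf{A}$ is a tall $N \times 2$ matrix whose columns are linearly independent whenever the incidence angles are not all equal, which is the condition for the left pseudo inverse to exist.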

To determine the ASCAT $\sigma^0$ incidence angle dependence, we form the LS solution

$ s = \mathbf{A}^{\dagger}\mathbf{x}, $

where $s$ contains the dB/degree dependence, $\mathbf{A}^{\dagger}$ is the left pseudo inverse of the matrix $\mathbf{A}$ built from the ASCAT incidence angle data, and $x$ is the $\sigma^0$ RCS data. Again, only the first coefficient of $s$ is of interest, as it gives the slope of the line.

In our application, we will only use a small portion of the actual ASCAT data, since there are millions of measurements in a given day. To keep the example simple, I reduced one day's measurements to 21 samples, which is enough to give a feel for the approximate incidence angle dependence $s$. 

In [2]:
import numpy as np

##############################################################################################
# Evan Chrisney
# 671 Application code for Pseudo Inverses
# This script will use a left inverse to solve a 1st order polynomial equation as described above

# Create A matrix as described above using actual ASCAT incidence angle values
A = np.matrix([[59, 48.63, 50.14, 54.22, 39.23, 52.36, 52.6, 44.27, 55.07, 59.06\
,58.32, 58.5, 37.49, 45.61, 52.13, 42.04, 54.74, 58.53, 38.19, 45.74, 59.31], \
[1,1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

# Create sigma_0 row vector using actual ASCAT RCS values
sigma_0 = np.array([-9.5103, -8.2828, -8.0002, -8.2415, -8.4277, -9.0573, \
-8.6784, -8.5909, -9.0481, -9.6975, -8.8720, -9.7875, -8.3998, -10.0404, \
-10.0735, -6.8197, -9.2742, -9.5769, -7.5610, -8.9871, -10.3787])

# Get length of sigma_0 for use in reshaping the row vector into a column vector
l = len(sigma_0)

# Transpose A into a 21x2 matrix instead of 2x21
A = A.transpose()
# Transpose sigma_0 into a column vector
sigma_0 = np.reshape(sigma_0, (l, 1))

# Get A hermitian, although transpose would also work since these are reals
A_tran = A.getH()
# Invert the Gramian A^H A as above (Grammian_l holds the inverse)
Grammian_l = np.linalg.inv(A_tran*A)
# Compute the left inverse of A as above
A_dagger_l = Grammian_l*A_tran
# Solve for s using the left inverse
s = A_dagger_l*sigma_0
# Print out s, where the first coefficient is the approximate sigma^0 incidence angle dependence for ASCAT, 
# which is -0.0758 dB/degree 
print(s)

[[-0.07582702]
 [-5.07314663]]
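
The fit can be cross-checked against numpy's built-in least squares routines. The sketch below reuses the 21 incidence angle and $\sigma^0$ values from the script above; the variable names `theta`, `s_normal`, `s_lstsq`, and `s_polyfit` are introduced here for illustration.

```python
import numpy as np

# ASCAT incidence angles (degrees), copied from the script above
theta = np.array([59, 48.63, 50.14, 54.22, 39.23, 52.36, 52.6, 44.27, 55.07,
                  59.06, 58.32, 58.5, 37.49, 45.61, 52.13, 42.04, 54.74, 58.53,
                  38.19, 45.74, 59.31])

# ASCAT sigma^0 values (dB), copied from the script above
sigma_0 = np.array([-9.5103, -8.2828, -8.0002, -8.2415, -8.4277, -9.0573,
                    -8.6784, -8.5909, -9.0481, -9.6975, -8.8720, -9.7875,
                    -8.3998, -10.0404, -10.0735, -6.8197, -9.2742, -9.5769,
                    -7.5610, -8.9871, -10.3787])

# Build the tall 21x2 matrix [theta, 1]
A = np.column_stack([theta, np.ones_like(theta)])

# Left pseudo inverse (normal equations), as in the script above
s_normal = np.linalg.inv(A.T @ A) @ A.T @ sigma_0

# Independent references: numpy's least squares solver and polynomial fit
s_lstsq, *_ = np.linalg.lstsq(A, sigma_0, rcond=None)
s_polyfit = np.polyfit(theta, sigma_0, 1)

print(s_normal)
print(np.allclose(s_normal, s_lstsq), np.allclose(s_normal, s_polyfit))
```

All three approaches solve the same least squares problem, so the slope and intercept should agree to within round-off.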
