# DSCI 6001 - 1.4: Inversion and Classical Problems

### By The End of This Lecture You Will Be Able To:
1. Describe in your own words what Gauss-Jordan elimination is
2. Use Gauss-Jordan elimination to obtain the inverse of a matrix
3. Use the General Inverse rule to compute the inverse of a matrix
4. Set up matrices to describe information and data
5. Use matrices to solve basic problems
6. Use matrices to describe networks
7. Use matrices to solve Markov chain problems

**Assigned Reading: Kreyszig 7.8**

This lecture is intended primarily as a review of the material covered this week. 

Recall from the previous lecture our interest in matrix invertibility and in finding the inverse ${\bf A}^{-1}$ of a given matrix ${\bf A}$. 

In practice, there are several ways to find ${\bf A}^{-1}$, both computationally and by hand. We will introduce the first of these today: Gauss-Jordan elimination. (Additional methods will be introduced later.)

The basic principle of Gauss-Jordan emerges from the properties of matrices and the inverse itself:

If $\textbf{A} x = b$

Then:

$\textbf{A}^{-1}\textbf{A} x = \textbf{A}^{-1}b$

$\textbf{I} x = \textbf{A}^{-1}b$

This may seem trivial, but the intuition becomes clearer if one remembers the properties of the identity.
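As a small numerical illustration of $\textbf{I} x = \textbf{A}^{-1}b$, the sketch below solves one system two ways. The matrix `A` and right-hand side `b` are made-up values chosen only for illustration, and the variable names are not part of the lecture.

```python
import numpy as np

# Hypothetical invertible matrix and right-hand side, chosen for illustration.
A = np.array([[1., 2., 3.],
              [2., 5., 3.],
              [1., 0., 8.]])
b = np.array([1., 2., 3.])

# x = A^{-1} b, exactly as in the derivation above
x_via_inverse = np.linalg.inv(A).dot(b)

# np.linalg.solve finds the same x without forming A^{-1} explicitly
x_via_solve = np.linalg.solve(A, b)

print(x_via_inverse)
print(np.allclose(x_via_inverse, x_via_solve))  # both routes agree
print(np.allclose(A.dot(x_via_inverse), b))     # and x really solves A x = b
```

In numerical work `np.linalg.solve` is usually preferred when only `x` is needed; the explicit inverse is shown here because it mirrors the algebra.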

Recall that matrix multiplication is not commutative in general.

So if $\textbf{A}^{-1}\textbf{A} = \textbf{I}$

Then we can write $\textbf{A}^{-1} = \textbf{I}\textbf{A}^{-1}$

In practice, we obtain the inverse by applying row operations that carry out the transformation:

$$ \left[\, \textbf{A} \mid \textbf{I} \,\right] \rightarrow \left[\, \textbf{I} \mid \textbf{A}^{-1} \,\right] $$

That is, we augment the original matrix ${\bf A}$ with the identity matrix on the right and apply elimination steps to the whole augmented matrix until the identity appears on the left. The combined row operations that turn $\textbf{A}$ into $\textbf{I}$ amount to multiplying by $\textbf{A}^{-1}$ on the left, so the same operations turn $\textbf{I}$ into $\textbf{A}^{-1}\textbf{I} = \textbf{A}^{-1}$ on the right. 

**Example:**

$$ \begin{array}{c} L_1 \\ L_2 \\ L_3 \end{array}\hspace{20pt} \left[\begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0\\
2 & 5 & 3 & 0 & 1 & 0\\
1 & 0 & 8 & 0 & 0 & 1\\
\end{array} \right]$$

$$ \begin{array}{c} L_1 \\ L_2 - 2L_1 \\ L_3-L_1 \end{array}\hspace{20pt} \left[\begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0\\
0 & 1 & -3 & -2 & 1 & 0\\
0 & -2 & 5 & -1 & 0 & 1\\
\end{array} \right]$$

$$ \begin{array}{c} L_1 \\ L_2 - 2L_1 \\ -(L_3-L_1+2(L_2-2L_1)) \end{array}\hspace{20pt} \left[\begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0\\
0 & 1 & -3 & -2 & 1 & 0\\
0 & 0 & 1 & 5 & -2 & -1\\
\end{array} \right]$$

$$ \begin{array}{c} L_1+3(L_3-L_1+2(L_2-2L_1))-2(L_2 - 2L_1 - 3(L_3-L_1+2(L_2-2L_1))) \\ L_2 - 2L_1 - 3(L_3-L_1+2(L_2-2L_1)) \\ -(L_3-L_1+2(L_2-2L_1)) \end{array}\hspace{20pt} \left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -40 & 16 & 9\\
0 & 1 & 0 & 13 & -5 & -3\\
0 & 0 & 1 & 5 & -2 & -1\\
\end{array} \right]$$

Therefore:

$\textbf{A}^{-1} = \left[\begin{array}{ccc}
-40 & 16 & 9\\
13 & -5 & -3\\
 5 & -2 & -1\\
\end{array} \right]$

Verify this in numpy by computing $\textbf{A}^{-1}\textbf{A}$. Also note that $\textbf{A}\textbf{A}^{-1}$ gives the same result. A matrix and its inverse always commute, even though matrix multiplication is not commutative in general. 


In [10]:
import numpy as np
A = np.asarray([[1,2,3],[2,5,3],[1,0,8]])
Ainv = np.asarray([[-40,16,9],[13,-5,-3],[5,-2,-1]])
# both products should print the 3 x 3 identity
print(Ainv.dot(A))
print(A.dot(Ainv))

[[1 0 0]
 [0 1 0]
 [0 0 1]]
[[1 0 0]
 [0 1 0]
 [0 0 1]]


Amazing! We can also use NumPy's built-in `np.linalg.inv` to validate this finding:

In [7]:
np.linalg.inv(A)

array([[-40.,  16.,   9.],
       [ 13.,  -5.,  -3.],
       [  5.,  -2.,  -1.]])

**Note:** Numerical routines like this one will still return a result for matrices that are nearly singular. Such matrices typically have elements of widely varying magnitudes and determinants very close to zero, and the computed inverse may be inaccurate. We shall discuss this matter in laboratory exercises.


### QUIZ:
Use G-J elimination to find the inverse of 

$${\bf A} = \begin{bmatrix}3 & 7\\10 & 3\end{bmatrix}$$

### Notes on G-J inversion:

1. If elimination cannot produce ${\bf I}$ on the left (that is, ${\bf A}$ does not have full rank $n$), no inverse exists. 
2. Equivalently, G-J inversion succeeds only when $\det {\bf A} \neq 0$.
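The augment-and-eliminate procedure can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production routine: the function name `gauss_jordan_inverse` and the tolerance `tol` are assumptions made here, and the sketch adds row swaps (partial pivoting), which the hand-worked example did not need.

```python
import numpy as np

def gauss_jordan_inverse(A, tol=1e-12):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1].

    Hypothetical helper for illustration; `tol` is an assumed threshold
    below which a pivot is treated as zero.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if A.shape != (n, n):
        raise ValueError("A must be square")
    aug = np.hstack([A, np.eye(n)])            # build [A | I]
    for col in range(n):
        # choose the row (at or below the diagonal) with the largest pivot
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if abs(aug[pivot, col]) < tol:
            # no usable pivot: I cannot be formed on the left
            raise ValueError("matrix is singular (no inverse exists)")
        aug[[col, pivot]] = aug[[pivot, col]]  # swap rows if needed
        aug[col] /= aug[col, col]              # scale pivot row so pivot = 1
        for row in range(n):
            if row != col:
                # clear the rest of the column, above and below the pivot
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]                          # right block is A^-1

A = [[1, 2, 3], [2, 5, 3], [1, 0, 8]]
print(gauss_jordan_inverse(A))
```

Calling the function on a singular matrix such as `[[1, 2], [2, 4]]` raises the `ValueError`, which corresponds to the case where no inverse is possible.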


## The General Inverse of an n x n Nonsingular Matrix:

It's worth mentioning that matrix inversion has a shortcut based on the cofactor matrix.

The General Inverse for a nonsingular $n \times n$ matrix (the formula applies only to square matrices) is written using its cofactor matrix, where $C_{jk}$ is the cofactor of the entry $a_{jk}$ in $\det {\bf A}$:

$${\bf A}^{-1} = \frac{1}{\det {\bf A}}\left[C_{jk}\right]^{T}$$

$${\bf A}^{-1} = \frac{1}{\det {\bf A}}\left[ \begin{array}{cccc}
C_{11} & C_{21} & \cdots & C_{n1}\\
C_{12} & C_{22} & \cdots & C_{n2}\\
\vdots & \vdots & \ddots & \vdots \\
C_{1n} & C_{2n} & \cdots & C_{nn}\\
\end{array}\right]$$

For 2 x 2 matrices, this is particularly easy. If

$${\bf A} = \left[ \begin{array}{cc} a_{11} & a_{12}\\
a_{21} & a_{22}\end{array} \right]$$

then

$${\bf A}^{-1} = \frac{1}{\det {\bf A}} \left[ \begin{array}{cc} a_{22} & -a_{12}\\
-a_{21} & a_{11}\end{array} \right], \qquad \det {\bf A} = a_{11}a_{22} - a_{12}a_{21}$$

**Example 1:**

$${\bf{A}} =  \left[ \begin{array}{cc} 3 & 1\\
2 & 4\end{array} \right]$$

$${\bf{A}^{-1}} =  \frac{1}{10} \left[ \begin{array}{cc} 4 & -1\\
-2 & 3\end{array} \right]$$

Verify this with numpy:

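In [5]:
import numpy as np
# adjA holds the adjugate, i.e. the inverse before scaling by 1/det A = 1/10
adjA = np.asarray([[4, -1],[-2, 3]])
A = np.asarray([[3,1],[2,4]])

# A times (1/10) * adjA should give the identity
(1./10.)*A.dot(adjA)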

array([[ 1.,  0.],
       [ 0.,  1.]])

**Example 2:**

$$ \textbf{A} = \begin{bmatrix} 7 & 2 & 1 \\ 0 & 3 & -1 \\ -3 & 4 & -2 \end{bmatrix} $$

Computing the cofactor matrix:

$$ \textbf{C} = \begin{bmatrix} -2 & 3 & 9 \\ 8 & -11 & -34 \\ -5 & 7 & 21 \end{bmatrix} $$

Taking the transpose of the cofactor matrix (adjugate):

$$ {\bf{C}^{T}} = \begin{bmatrix} -2 & 8 & -5 \\ 3 & -11 & 7 \\ 9 & -34 & 21 \end{bmatrix} $$

The determinant of $\bf{A}$ is given by:

$$7(-6+4)-2(-3)+1(9) = 1$$

Therefore ${\bf{A}^{-1}} = \frac{1}{1}{\bf{C^{T}}} = \begin{bmatrix} -2 & 8 & -5 \\ 3 & -11 & 7 \\ 9 & -34 & 21 \end{bmatrix} $

You can check this with numpy:

In [6]:
import numpy as np
A = np.asarray([[7,2,1],[0,3,-1],[-3,4,-2]])
Ainv = np.asarray([[-2,8,-5],[3,-11,7],[9,-34,21]])

A.dot(Ainv)

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
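The cofactor computation in Example 2 can also be reproduced programmatically. The sketch below is one possible way to do it; the helper names `cofactor_matrix` and `cofactor_inverse` are made up for this illustration, and each minor's determinant is delegated to `np.linalg.det`.

```python
import numpy as np

def cofactor_matrix(A):
    """Return the matrix of cofactors C_jk = (-1)^(j+k) * M_jk (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.zeros((n, n))
    for j in range(n):
        for k in range(n):
            # minor: delete row j and column k, then take the determinant
            minor = np.delete(np.delete(A, j, axis=0), k, axis=1)
            C[j, k] = (-1) ** (j + k) * np.linalg.det(minor)
    return C

def cofactor_inverse(A):
    """Inverse via (1 / det A) * C^T; only valid for nonsingular square A."""
    det = np.linalg.det(A)
    if np.isclose(det, 0.0):
        raise ValueError("matrix is singular (det A = 0)")
    return cofactor_matrix(A).T / det

A = np.array([[7, 2, 1], [0, 3, -1], [-3, 4, -2]])
print(np.round(cofactor_matrix(A)))   # compare with C in Example 2
print(np.round(cofactor_inverse(A)))  # compare with the transpose of C
```

This approach computes $n^2$ determinants, so it is best thought of as a check for small matrices rather than a general-purpose method.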

### QUIZ:

Compute the inverse of $${\bf A} = \begin{bmatrix}3 & 7\\10 & 3\end{bmatrix}$$ using the General Inverse (cofactor) formula.

## Classic Problems in Basic Linear Algebra

As review for this week of instruction, we will cover several problems posed to the reader in the text:

1. Solving basic problems
2. Modeling networks with matrices
3. Modeling Markov processes with matrices 
4. Solving circuit diagrams with linear algebra

The student should become as comfortable as possible with the basic techniques of linear algebra that solve these problems.


## Setting up matrices to solve basic problems:

### Weight Loss

Weight loss can be modeled (for most people) as a process of (caloric intake)-(caloric output). Ambient levels of certain hormones, body weight set point and basal metabolic rate play into this equation, but most people's body weight is reflective of this basic paradigm. Weight is lost by adjusting caloric intake to be slightly below that which is needed for maintenance and then applying regular exercise.

Suppose we know the mean caloric cost per hour of each type of exercise and want to build an exercise schedule for me.

Someone my size burns about 650 calories an hour cycling steadily, about 900 calories an hour running hard, about 700 calories an hour swimming slowly, about 400 calories an hour doing yoga, and about 300 calories an hour lifting weights.

Let's plan a workout schedule for me, as I'm training for an obstacle race. Suppose I follow this schedule:

1. Weights for 30 minutes M, W, F.
2. Yoga class for 90 minutes T, Th, S.
3. Cycling 90 minutes T, Su
4. Running 30 minutes W, F, S
5. Swimming 30 minutes M, W, Th, Su

(To be fair, this schedule is pretty tough, and I might need some days off once in a while in order to keep weight loss consistent. I'll also have less time to do other things I like.)

Since our calorie figures are per hour, we should convert minutes to fractions of an hour. Now we can write the schedule as a matrix, with each day as a row, each exercise as a column, and the hours spent on that exercise as the entry.

The matrix should be:

$$\textbf{A} = \begin{bmatrix} 0.5 & 0. & 0. & 0. & 0.5 \\ 0. & 1.5 & 1.5 & 0. & 0. \\ 0.5 & 0. & 0. & 0.5 & 0.5 \\ 0. & 1.5 & 0. & 0. & 0.5 \\ 0.5 & 0. & 0. & 0.5 & 0. \\ 0. & 1.5 & 0. & 0.5 & 0. \\ 0. & 0. & 1.5 & 0. & 0.5 \end{bmatrix} $$

We might code all of this in Python as follows:

`row_labels = ['M', 'T', 'W', 'Th', 'F', 'S', 'Su']`

`column_labels = ['Weights', 'Yoga', 'Cycling', 'Running', 'Swimming']`

`A= np.asarray([[0.5, 0., 0., 0., 0.5],[0., 1.5, 1.5, 0., 0.],[0.5, 0., 0., 0.5, 0.5],[0., 1.5, 0., 0., 0.5],[0.5, 0., 0., 0.5, 0.],[0., 1.5, 0., 0.5, 0.],[0., 0., 1.5, 0., 0.5 ]])`

`calories_burnt = [300,400,650, 900, 700]`

#### Burning Calories:

So we can calculate the calories I burn through exercise each week by taking the matrix-vector product of my schedule matrix with the `calories_burnt` vector. Recall that we do this by dotting each row of $\bf A$ with the `calories_burnt` vector. This is a $ [7 \times 5] \times [5 \times 1] = [7 \times 1]$ product, with one entry per day:


In [2]:
import numpy as np
row_labels = ['M', 'T', 'W', 'Th', 'F', 'S', 'Su']
column_labels = ['Weights', 'Yoga', 'Cycling', 'Running', 'Swimming']
A= np.asarray([[0.5, 0., 0., 0., 0.5],[0., 1.5, 1.5, 0., 0.],[0.5, 0., 0., 0.5, 0.5],[0., 1.5, 0., 0., 0.5],[0.5, 0., 0., 0.5, 0.],[0., 1.5, 0., 0.5, 0.],[0., 0., 1.5, 0., 0.5 ]])
calories_burnt = np.asarray([300,400,650, 900, 700])
print(A.dot(calories_burnt))        # calories burnt through exercise on each day
print(A.dot(calories_burnt).sum())  # weekly total

[  500.  1575.   950.   950.   600.  1050.  1325.]
6950.0


These are the calories I burn through exercise on each day of this schedule. Over a whole week they amount to 6950 calories. To this we add my basal metabolic rate, which I happen to know is 2200 calories per day, or 15400 calories per week. Therefore each week I burn 22350 calories. To lose 1 pound a week, I would need to take in 3500 fewer calories than I burn per week. This means I need to eat 18850 calories a week, or about 2690 calories a day on average.
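The weekly budget arithmetic can be checked in code. The sketch below reuses the schedule matrix and calorie figures from above; the variable names (`bmr_per_day`, `deficit_per_week`, and so on) are chosen here for illustration only.

```python
import numpy as np

days = ['M', 'T', 'W', 'Th', 'F', 'S', 'Su']
# columns: Weights, Yoga, Cycling, Running, Swimming (hours per day)
A = np.array([[0.5, 0.0, 0.0, 0.0, 0.5],
              [0.0, 1.5, 1.5, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.5, 0.5],
              [0.0, 1.5, 0.0, 0.0, 0.5],
              [0.5, 0.0, 0.0, 0.5, 0.0],
              [0.0, 1.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 1.5, 0.0, 0.5]])
calories_per_hour = np.array([300, 400, 650, 900, 700])

daily_exercise = A.dot(calories_per_hour)   # 7 x 1: calories burnt per day
bmr_per_day = 2200                          # basal metabolic rate from the text
deficit_per_week = 3500                     # deficit for about 1 pound per week

weekly_burn = daily_exercise.sum() + 7 * bmr_per_day
weekly_intake = weekly_burn - deficit_per_week

for day, calories in zip(days, daily_exercise):
    print(day, calories)
print('weekly burn:', weekly_burn)
print('weekly intake target:', weekly_intake)
print('daily intake target:', weekly_intake / 7)
```

Pairing `days` with the result keeps each number attached to its row label, which is what the `row_labels` list is for.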

### QUIZ:

If I were to add a new exercise called 'Rowing', how would that change the above setup?

## Modeling networks with matrices

It is very common to model probabilistic models with matrices. Probabilistic models are normally used to describe processes where the probability of achieving a given state is dependent on the model arriving at one or more previous states. 

Example:

We will use the following (simple) three-state model of a stock market:

![Simple Markov](./Markov_A.png)

This somewhat intimidating diagram can be described in terms of its **nodes** and **edges** as labeled above (Kreyszig calls edges "branches"). 

We call this reformatting of a diagram a *nodal incidence matrix*, where the nodes (in this case A-C) are marked as the rows, or variables, and the edges are marked as the columns. If edge $j$ is inbound to node $i$, the entry $a_{ij}$ of the nodal incidence matrix is -1. If edge $j$ is outbound from node $i$, the entry $a_{ij}$ of the nodal incidence matrix is 1. Otherwise it is 0. You will often see rules like these summarized in this format:

$$    a_{ij}= 
\begin{cases}
    1 ,& \textit{if branch j leaves node i}\\
    -1,              & \textit{if branch j enters node i}\\
    0,              & \textit{otherwise}
\end{cases}$$

Returning to our above diagram, we can write the nodal incidence matrix $\bf{A}$ as follows (we will set the loops to 0 in this case):

$$\textbf{A} = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & -1 & 0 & 1 & -1\\ 0 & 0 & 0 & -1 & 1 & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 & 1 & -1 & 1 & 0 \end{bmatrix} $$
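The sign rule can be turned into a short routine that builds an incidence matrix from a list of directed edges. The sketch below uses a small made-up edge list, not the edges of the figure, and the function name `incidence_matrix` is an assumption for illustration. Self-loops are left as zero columns, matching the convention chosen above.

```python
import numpy as np

def incidence_matrix(nodes, edges):
    """Build a nodal incidence matrix: +1 where an edge leaves a node,
    -1 where it enters, 0 otherwise. Hypothetical helper for illustration."""
    index = {name: i for i, name in enumerate(nodes)}
    M = np.zeros((len(nodes), len(edges)), dtype=int)
    for j, (tail, head) in enumerate(edges):
        if tail == head:
            continue                 # self-loop: column stays all zeros
        M[index[tail], j] = 1        # edge j leaves node `tail`
        M[index[head], j] = -1       # edge j enters node `head`
    return M

# Made-up example: one loop on A, then a cycle A -> B -> C -> A
nodes = ['A', 'B', 'C']
edges = [('A', 'A'), ('A', 'B'), ('B', 'C'), ('C', 'A')]
M = incidence_matrix(nodes, edges)
print(M)
print(M.sum(axis=0))  # each column of this sketch sums to zero
```

In this construction every non-loop edge contributes exactly one +1 and one -1, so summing a column is a quick sanity check on the bookkeeping.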


### QUIZ:
What type of rectangular matrix is this?

There are several other types of descriptive matrices. One of these we will discuss next.

### Modeling Markov Processes with Matrices

Let us take the above system into our analysis using 'transition probabilities.' In this case, we will rebuild the matrix $\bf A$ into a **stochastic matrix** describing the probability of entry into one market state from another. We'll use the below diagram describing the probability that a market might change from one state to another within a given year. We can now write the stochastic matrix $\bf B$:

![Simple Markov 2](./Markov_B.png)

We write $\bf B$ in terms of transitions from and to a certain state, where the rows are the state (market) that we are transitioning from and the columns are the state we transition to. Each row therefore sums to 1:

$$\textbf{B} = \begin{bmatrix} 0.3 & 0.1 & 0.6\\ 0.1 & 0.3 & 0.6  \\ 0.15 & 0.15 & 0.7  \end{bmatrix} $$

`row_labels = ['A','B','C']`

`column_labels = ['A','B','C']`

`B = [[.3,.1,.6],[.1,.3,.6],[.15,.15,.7]]`

Assuming these probabilities do not change from year to year, we can easily estimate the future state of the system over a given period of time by taking the product of a vector describing previous states with $\bf B$. This vector is commonly called the **prior**.

Suppose that we have a history of previous years showing that the market was bull 50% of the time, bear 30% of the time, and sideways-moving 20% of the time. We want to know the probability of each market state in 3, 5, 10, and 50 years.

We can simply create a prior (row) vector:

`prior = [[0.5,0.3,0.2]]`

And multiply by powers of $\bf B$ to determine the prediction. Because the rows of $\bf B$ are the "from" states, the row vector multiplies $\bf B$ from the left:

$$prior_1 = prior \cdot {\bf B}$$
$$prior_2 = prior_1 \cdot {\bf B}$$
$$prior_3 = prior_2 \cdot {\bf B}$$

Therefore,

$$prior_3 = prior \cdot {\bf B}^{3}$$

More generally, we can calculate the prior any $n$ units forward in time by multiplying the original prior by $\bf B$ raised to the power of $n$.

In [16]:
import numpy as np
# prior is a 1 x 3 row vector; rows of B are the "from" states, so each row sums to 1
prior = np.asarray([[.5, .3, .2]])
B = np.asarray([[.3,.1,.6],[.1,.3,.6],[.15,.15,.70]])

for i in [3, 5, 10, 50]:
    print('In ', i, 'years, we have:')
    # row vector times B^i advances the distribution i years
    print(prior.dot(np.linalg.matrix_power(B,i)))



In  3 years, we have:
[[ 0.1677  0.1661  0.6662]]
In  5 years, we have:
[[ 0.166701  0.166637  0.666662]]
In  10 years, we have:
[[ 0.16666668  0.16666666  0.66666667]]
In  50 years, we have:
[[ 0.16666667  0.16666667  0.66666667]]
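One way to see where repeated multiplication by $\bf B$ is heading is to compute the long-run (stationary) distribution directly. The sketch below assumes each row of $\bf B$ holds the transition probabilities out of one state (the rows sum to 1), so a stationary row vector $\pi$ satisfies $\pi {\bf B} = \pi$. The use of `np.linalg.eig` here is one possible approach, and the name `pi` is chosen for illustration.

```python
import numpy as np

B = np.array([[0.30, 0.10, 0.60],
              [0.10, 0.30, 0.60],
              [0.15, 0.15, 0.70]])

# Sanity check on the assumption: every row should sum to 1
print(B.sum(axis=1))

# pi B = pi is the same as B^T pi^T = pi^T, so pi is an eigenvector
# of B^T belonging to the eigenvalue 1.
eigenvalues, eigenvectors = np.linalg.eig(B.T)
k = np.argmin(np.abs(eigenvalues - 1.0))   # locate the eigenvalue closest to 1
pi = np.real(eigenvectors[:, k])
pi = pi / pi.sum()                         # rescale so the probabilities sum to 1

print(pi)
print(np.allclose(pi.dot(B), pi))          # one more step leaves pi unchanged
```

Whatever valid prior we start from, `prior.dot(np.linalg.matrix_power(B, n))` approaches this vector as `n` grows.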


## Assigned Problems:

Chapter 7 Review Questions:

11, 13, 17, 20, 24, 25, 26, 30, 33, 35 