## Unit 2 - Linear Systems of Equations

### Linear Systems

A linear system of $m$ equations in $n$ unknowns has the form:

$a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n = b_{1}$\
$a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n = b_{2}$\
$.$\
$.$\
$.$\
$a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n = b_{m}$

The $x$'s are unknowns; the $a$'s and $b$'s are known. There are $m$ equations.

Solving systems of these equations is the focus of this unit. 

This system can also be succinctly expressed as a matrix equation: \
$Ax = b$ \
where $A$ is the matrix of coefficients, $x$ is a vector of unknowns, $b$ is a vector of constants. More specifically:\
$A = [a_{ij}]_{i=1..m,j=1..n}$\
$x = [x_1, x_2, ... x_n]$\
$b = [b_1, b_2, ... b_m]$

We can also express this linear system using an augmented matrix:\
$[A|b]$ (this is just shorthand that allows you to skip writing the $x$s)

Examples:

1) $x_1 + x_2 = 4$ \
   $x_1 - x_2 = 2$ 

2) $ \left[\begin{matrix} 1 & 1 \\ 1 & -1 \end{matrix}\right]
\left(\begin{matrix} x_1 \\ x_2 \end{matrix}\right)
=
\left(\begin{matrix} 4 \\ 2 \end{matrix}\right)
$ (This shows that the matrix-vector product of the coefficient matrix and the vector of unknowns gives the right-hand side vector [[4],[2]].)

3) $ \left[\begin{matrix} 1 & 1 & | & 4 \\ 1 & -1 & | & 2 \end{matrix}\right]$

All three ways represent the same linear system.
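The three representations can be sketched in NumPy as follows. This is a minimal illustration, not part of the original notes: the variable names are arbitrary, and `np.linalg.solve` is used only as a convenient check for a square, nonsingular system.

```python
import numpy as np

# Coefficient matrix A and right-hand side b for
#   x_1 + x_2 = 4
#   x_1 - x_2 = 2
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([4.0, 2.0])

# Augmented matrix [A|b]: b is appended to A as an extra column.
augmented = np.column_stack((A, b))
print(augmented)

# For a square, nonsingular A, NumPy can solve Ax = b directly.
x = np.linalg.solve(A, b)
print(x)

# The matrix-vector product A @ x reproduces b.
print(np.allclose(A @ x, b))
```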

### Types of Linear Systems

#### Underconstrained Linear System
$ \left[\begin{matrix}  &  &  &  & | \\  &  &  &  & | \end{matrix}\right]$

Typically the number of rows is less than the number of columns ($m<n$): there are more variables than equations. When the equations are consistent, there are infinitely many solutions.

Example:

$ \left[\begin{matrix} 1 & 2 & | & 3 \\ 0 & 0 & | & 0 \end{matrix}\right]$

$x_1 + 2x_2 = 3$\
$0x_1 + 0x_2 = 0$

There is an infinite number of solutions that can be expressed as \
$\left(\begin{matrix} x_1 \\ x_2 \end{matrix}\right)$ such that $x_1 + 2x_2 = 3$
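To make "infinitely many solutions" concrete, the sketch below treats $x_2$ as a free parameter $t$ and sets $x_1 = 3 - 2t$. The helper name `solution_for` and the sample values of $t$ are illustrative choices, not part of the original notes.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 0.0]])
b = np.array([3.0, 0.0])

def solution_for(t):
    """Return the solution with x_2 = t, so that x_1 + 2*x_2 = 3."""
    return np.array([3.0 - 2.0 * t, t])

# Every choice of t gives a different vector that satisfies Ax = b.
for t in (-1.0, 0.0, 2.5):
    x = solution_for(t)
    print(t, x, np.allclose(A @ x, b))
```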

An underconstrained system with $m<n$ can be augmented by rows of 0's to make it square (perfect).

#### Perfectly Constrained
$ \left[\begin{matrix}  &  &  &  & | \\  &  &  &  & | \\  &  &  &  & |\end{matrix}\right]$

Typically, $m=n$. One unique solution.

Example:

$ \left[\begin{matrix} 1 & 0 & | & 3 \\ 0 & 2 & | & 0 \end{matrix}\right]$

$x_1 + 0x_2 = 3$\
$0x_1 + 2x_2 = 0$

There is a unique solution that can be expressed as \
$\left(\begin{matrix} x_1 \\ x_2 \end{matrix}\right)$ = $\left(\begin{matrix} 3 \\ 0 \end{matrix}\right)$

Our attention to linear systems will focus on Perfectly Constrained systems where $m=n$.

#### Over Constrained
$ \left[\begin{matrix}  &  &  &  & | \\  &  &  &  & | \\  &  &  &  & | \\  &  &  &  & | \\  &  &  &  & | \end{matrix}\right]$

Typically, $m>n$: there are more equations than unknowns. There is usually no exact solution.

Example:

$ \left[\begin{matrix} 1 & 0 & | & 3 \\ 2 & 0 & | & 0 \end{matrix}\right]$

$1x_1 + 0x_2 = 3$\
$2x_1 + 0x_2 = 0$

There is no solution since these equations are inconsistent.

$x_1 = 3$\
$2x_1 = 0$

This is a very common case in data science. The number of features is usually much smaller than the number of data points, so there is no exact solution. In this case you want to find a weighting of the features that gives a best fit. You can get a good approximate solution to an over constrained system by solving a related system with a square matrix.
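One common choice for the "related system with a square matrix" is the normal equations $A^TAx = A^Tb$, whose solution is the least-squares best fit. The sketch below assumes that choice (the notes do not say which related system is meant) and uses made-up data: three points fitted by a line $w_0 + w_1 t$.

```python
import numpy as np

# Hypothetical data: fit y ~ w_0 + w_1 * t to the points (0, 1), (1, 2), (2, 2).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])   # 3 equations, 2 unknowns (m > n)
b = np.array([1.0, 2.0, 2.0])

# Related square system (normal equations): (A^T A) w = A^T b
w = np.linalg.solve(A.T @ A, A.T @ b)

# NumPy's built-in least-squares solver, for comparison.
w_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(w)
print(w_lstsq)
```

The normal equations only work when $A^TA$ is nonsingular, which is why the sketch does not reuse the example above (its second column is all zeros).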

### Solving Linear Systems by Direct Methods

Three operations on linear systems that do not change the solution

1) Multiply an equation by a non-zero constant
2) Exchange the order of two equations
3) Replace an equation with the sum of itself and a constant multiple of another equation

Example

$x_1 + x_2 = 4$\
$x_1 - x_2 = 2$

Same solution as the following equations:

By #1\
$2x_1 + 2x_2 = 8$ (multiplied this equation by 2)\
$x_1-x_2=2$

By #2 (just swap the order of the equations)\
$x_1 - x_2 = 2$\
$2x_1 + 2x_2 = 8$

By #3\
$x_1 - x_2 = 2$\
$2x_1 + 2x_2 = 8$ (replace equation 2 with itself plus a constant multiple of equation 1; in this case $eq_2 \leftarrow eq_2 - 2eq_1$)\
This gives the following new equations:\
$x_1 - x_2 = 2$\
$0 + 4x_2 = 4$

These operations don't need to be applied in order - this was just an example.
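As a quick numerical check, the sketch below solves the original system and each transformed system from the example with `np.linalg.solve` and prints the results, which should all agree. The dictionary keys are just labels chosen for this illustration.

```python
import numpy as np

# Each entry is (coefficient matrix, right-hand side) from the example above.
systems = {
    "original": ([[1, 1], [1, -1]], [4, 2]),
    "after #1": ([[2, 2], [1, -1]], [8, 2]),
    "after #2": ([[1, -1], [2, 2]], [2, 8]),
    "after #3": ([[1, -1], [0, 4]], [2, 4]),
}

solutions = {}
for name, (A, b) in systems.items():
    solutions[name] = np.linalg.solve(np.array(A, dtype=float),
                                      np.array(b, dtype=float))
    print(name, solutions[name])
```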

#### Elementary Row Operations on Augmented Matrices

1) $R_i \leftarrow cR_i$ (Replace $Row_i$ with a nonzero constant $c$ times $Row_i$)

2) $R_i \leftrightarrow R_j$ (Swap rows $i$ and $j$)

3) $R_i \leftarrow R_i + aR_j$ (Replace row $i$ with row $i$ plus $a$ times row $j$)
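The three row operations can be sketched as small NumPy helpers acting on an augmented matrix. The function names are hypothetical, and row indices are zero-based (so row 2 in the notes is index 1 here). Each helper returns a new array rather than modifying its input.

```python
import numpy as np

def scale_row(M, i, c):
    """Type 1: R_i <- c * R_i (c should be nonzero)."""
    M = M.astype(float)          # astype returns a copy
    M[i] = c * M[i]
    return M

def swap_rows(M, i, j):
    """Type 2: R_i <-> R_j."""
    M = M.astype(float)
    M[[i, j]] = M[[j, i]]
    return M

def add_multiple(M, i, j, a):
    """Type 3: R_i <- R_i + a * R_j."""
    M = M.astype(float)
    M[i] = M[i] + a * M[j]
    return M

# Augmented matrix for x_1 + x_2 = 4, x_1 - x_2 = 2
aug = np.array([[1, 1, 4],
                [1, -1, 2]])

# R_2 <- R_2 - R_1, written with zero-based indices
reduced = add_multiple(aug, 1, 0, -1.0)
print(reduced)
```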

Example (same problem from above example)

$x_1 + x_2 = 4$\
$x_1 - x_2 = 2$

$ \left[\begin{matrix} 1 & 1 & | & 4 \\ 1 & -1 & | & 2 \end{matrix}\right] \rightarrow \left[\begin{matrix} 1 & 1 & | & 4 \\ 0 & -2 & | & -2 \end{matrix}\right]$

Using operation #3, we replace the second row with the sum of itself and a constant multiple of the first row (here the constant is $-1$). Multiplying equation 1 by $-1$ gives $-x_1 - x_2 = -4$.

Adding this to equation 2 gives $(x_1 - x_1) + (-x_2 - x_2) = 2 - 4$, that is, $-2x_2 = -2$.

$R_2 \leftarrow R_2 - R_1$ (this is the notation used for the operation above. It means "replace row 2 with row 2 - row 1")

The second equation can now be rewritten as

$-2x_2 = -2 $ \
So $x_2 = 1$

Now, going back to the first equation:

$x_1 + 1 = 4$\
$x_1 = 3$

$\left(\begin{matrix} x_1 \\ x_2 \end{matrix}\right) = \left(\begin{matrix} 3 \\ 1\end{matrix}\right)$

Solving this system of equations amounts to finding the vector of unknowns such that the matrix-vector product of the given matrix and the vector of unknowns gives the right-hand side ([[4],[2]]).

In [50]:
import numpy as np

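# In Python: check that the solution found above satisfies Ax = b

given_matrix = [[1,1],[1,-1]]      # coefficient matrix A
vector_of_unknowns = [[3],[1]]     # solution x found by hand
right_hand_side = [[4],[2]]        # constants b

result = np.dot(given_matrix, vector_of_unknowns)  # matrix-vector product
print(result)
assert np.array_equal(result, right_hand_side)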


[[4]
 [2]]


### Gaussian Elimination with Backward Substitution

A specific method that can be applied when the coefficient matrix is square.

Given an augmented matrix $[A|b]$ with $A \in \mathbb{R}^{n \times n}$

Step 1 (Gaussian Elimination) - Row reduce $[A|b]$ to get $[U|b']$ where $U$ is upper triangular with nonzero diagonal entries.

$U$ = 
$ \left[\begin{matrix} x & x & x & x \\ 0 & x & x & x \\ 0 & 0 & x & x \\ 0 & 0 & 0 & x \end{matrix}\right]$ (all the information in $U$ is in the upper right triangular area; each $x$ marks an entry that may be nonzero)

Step 2 (Backward Substitution) - Iteratively solve for $x$ in the order $x_n, x_{n-1}, ... x_1$

Example:

$ \left[\begin{matrix} 1 & 1 & 1 & | & 6 \\ -1 & 1 & 1 & | & 4 \\ 2 & -1 & 1 & | & 3\end{matrix}\right]$

**Step 1 Gaussian Elimination** - Turn the bottom left 3 entries into $0$s. Use the top left element ($a_{11}$, the very top left 1) as a pivot against row 2 and row 3.

$Row_2 \leftarrow Row_2 + Row_1$ (Replace row2 with row2+row1)\
$Row_3 \leftarrow Row_3 - 2*Row_1$ (Replace row3 with row3 - 2 times row1)

That gives:

$\left[\begin{matrix} 1 & 1 & 1 & | & 6 \\ 0 & 2 & 2 & | & 10 \\ 0 & -3 & -1 & | & -9\end{matrix}\right]$

Gaussian elimination always uses the diagonal entry of the current column as the pivot, so here we pivot on $a_{22}$ in $Row_2$

$Row_3 \leftarrow Row_3 + 3/2*Row_2$

That gives:

$\left[\begin{matrix} 1 & 1 & 1 & | & 6 \\ 0 & 2 & 2 & | & 10 \\ 0 & 0 & 2 & | & 6\end{matrix}\right]$

That completes the Gaussian Elimination piece.

**Step 2 Backward Substitution**: 

Solve for $x_3$ (the final row in the matrix): \
$2x_3 = 6$\
$x_3 = 3$

Solve for $x_2$:\
$2x_2 + 2x_3 = 10 $\
$2x_2 + 2*3 = 10$ (replace $x_3$ with 3, solved in the previous step)\
$2x_2 = 4$\
$x_2 = 2$

Solve for $x_1$:\
$x_1 + x_2 + x_3 = 6$\
$x_1 + 2 + 3 = 6$\
$x_1 = 1$

So, the solution vector is:

$\vec{x} = \left(\begin{matrix} x_1 \\ x_2 \\ x_3\end{matrix}\right)$ = $\left(\begin{matrix} 1 \\ 2 \\ 3\end{matrix}\right)$
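The worked example can be replayed in NumPy, one row operation at a time. This is only a sketch of this specific $3 \times 3$ example (zero-based row indices, floating-point entries), not a general implementation.

```python
import numpy as np

# Augmented matrix [A|b] from the example
M = np.array([[ 1.0,  1.0, 1.0, 6.0],
              [-1.0,  1.0, 1.0, 4.0],
              [ 2.0, -1.0, 1.0, 3.0]])

# Step 1: Gaussian elimination
M[1] = M[1] + M[0]          # Row_2 <- Row_2 + Row_1
M[2] = M[2] - 2 * M[0]      # Row_3 <- Row_3 - 2*Row_1
M[2] = M[2] + 1.5 * M[1]    # Row_3 <- Row_3 + 3/2*Row_2
print(M)                    # this is [U|b']

# Step 2: backward substitution, last row first
x3 = M[2, 3] / M[2, 2]
x2 = (M[1, 3] - M[1, 2] * x3) / M[1, 1]
x1 = (M[0, 3] - M[0, 1] * x2 - M[0, 2] * x3) / M[0, 0]
print(x1, x2, x3)
```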

The pseudocode for these algorithms is very simple. 

**Gaussian Elimination Algorithm**:

Input $[A|b]$, $A \in \mathbb{R}^{n \times n}$, $b \in \mathbb{R}^n$\
Output $[U|b']$, $U \in \mathbb{R}^{n \times n}$, $b' \in \mathbb{R}^n$ and $U$ is upper triangular: $U_{ij} = 0$ for $i > j$, with nonzero diagonal: $U_{jj} \neq 0$ for all $j$

For columns $j = 1$ to $n$:\
&nbsp;&nbsp;If $a_{jj} = 0$, Output FAIL\
&nbsp;&nbsp;For rows $i > j$:\
&nbsp;&nbsp;&nbsp;&nbsp;$Row_i \leftarrow Row_i - (a_{ij} / a_{jj}) * Row_j$\
Output the resulting $[U|b']$

Here $a_{ij}$ refers to the entries of the current, partially reduced augmented matrix.

**Backward Substitution Algorithm**:

Input $[U|b']$\
Output $x \in \mathbb{R}^n$

For $i = n$ to $1$\
&nbsp;&nbsp;$x_i = (b'_i - \sum_{j=i+1}^{n} u_{ij}x_j) / u_{ii}$\
Output $x$
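A minimal Python sketch of the two algorithms, assuming a square coefficient matrix and no row swaps. The function names are hypothetical. The sketch subtracts the multiple of the pivot row (which is what zeroes out the entries below the pivot), raises an error where the pseudocode outputs FAIL, and uses a naive exact test for a zero pivot, which is not robust in floating point.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Row reduce [A|b] to [U|b'] without row swaps; fail on a zero pivot."""
    U = np.array(A, dtype=float)   # work on copies
    c = np.array(b, dtype=float)
    n = U.shape[0]
    for j in range(n):                       # columns
        if U[j, j] == 0:
            raise ValueError(f"zero pivot in column {j}")
        for i in range(j + 1, n):            # rows below the pivot
            factor = U[i, j] / U[j, j]
            U[i, j:] -= factor * U[j, j:]    # Row_i <- Row_i - factor * Row_j
            c[i] -= factor * c[j]
    return U, c

def back_substitution(U, c):
    """Solve Ux = c for upper triangular U with nonzero diagonal."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):           # i = n, ..., 1 in the notes
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Usage on the worked example
A = [[1, 1, 1], [-1, 1, 1], [2, -1, 1]]
b = [6, 4, 3]
U, c = gaussian_elimination(A, b)
x = back_substitution(U, c)
print(U, c, x)
```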

**Run Times:**

The Gaussian Elimination algorithm works in $O(n^3)$ time\
The Backward Substitution algorithm works in $O(n^2)$ time
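These growth rates can be checked by counting operations instead of timing code. The sketch below assumes one particular counting convention (one unit of work per matrix or right-hand-side entry updated during elimination, and one per multiplication or division during substitution); other conventions change the constants but not the growth rate. Doubling $n$ should multiply the first count by roughly 8 and the second by roughly 4.

```python
def elimination_ops(n):
    """Count entry updates performed by Gaussian elimination on [A|b]."""
    count = 0
    for j in range(n):                 # pivot column
        for i in range(j + 1, n):      # each row below the pivot
            count += (n - j) + 1       # entries j..n-1 of the row, plus b_i
    return count

def back_substitution_ops(n):
    """Count multiplications and divisions in backward substitution."""
    count = 0
    for i in range(n - 1, -1, -1):
        count += (n - 1 - i) + 1       # one product per known x_j, one division
    return count

for n in (100, 200):
    print(n, elimination_ops(n), back_substitution_ops(n))

print(elimination_ops(200) / elimination_ops(100))
print(back_substitution_ops(200) / back_substitution_ops(100))
```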

Gaussian Elimination is very straightforward, but we need faster, and possibly less straightforward, methods to solve this problem.

In [51]:
# TODO - Implement the algorithms described above

### Elementary Matrices

The elementary row operations previously introduced that do not change the solution to a linear system can be accomplished by matrix multiplication.

**Type 1 Operations:** $Row_i \leftarrow cRow_i$


$\left[\begin{matrix} & & & &\\ x & x & x & x \\ & & & & \end{matrix}\right]$
$\rightarrow$
$\left[\begin{matrix} &  & & &\\ cx & cx & cx & cx \\ & & & & \end{matrix}\right]$

Take the $i$th row, and multiply the constant $c$ by the whole row.

We claim there's some elementary matrix that accomplishes the same thing as the row operation described above.

What is that matrix? It is the identity matrix with the $i$th diagonal entry replaced by $c$:

$\left[\begin{matrix} 1& 0 & 0  \\ 0 & c & 0  \\ 0 & 0 & 1 \end{matrix}\right]$
$\left[\begin{matrix} & & & \\ x & x & x & x \\ & & & \end{matrix}\right]$
$=$
$\left[\begin{matrix} & & & \\ cx & cx & cx & cx \\ & & & \end{matrix}\right]$

Explanation: If the $c$ weren't in the leftmost matrix, it would just be the identity matrix and the product would be the original matrix. Putting $c$ in the $i$th diagonal position multiplies the $i$th row of the original matrix by $c$ and leaves every other row unchanged.

**Type 2 Operations:** $Row_i \leftrightarrow Row_j$

Start from the identity matrix, replace the $ii$th entry and the $jj$th entry with 0's, and place a 1 in the $ij$th position and the $ji$th position.

$\left[\begin{matrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0  \\ 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{matrix}\right]$
$\left[\begin{matrix} & & & & \\ i & i & i & i & i \\ & & & & \\ j & j & j & j & j \\ & & & & \end{matrix}\right]$
$\rightarrow$
$\left[\begin{matrix} & & & & \\ j & j & j & j & j \\ & & & & \\ i & i & i & i & i \\ & & & & \end{matrix}\right]$

**Type 3 Operations:** $Row_i \leftarrow Row_i + cRow_j$

Below, $c$ is placed in the $ij$th position of the identity matrix ($i$ is the row, $j$ is the column). This has the effect of multiplying the $j$th row by $c$ and adding that result to the $i$th row of the resulting matrix. Every other row, including row $j$, is unchanged.

$\left[\begin{matrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & c & 0  \\ 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{matrix}\right]$
$\left[\begin{matrix} & & & & \\ i & i & i & i & i \\ & & & & \\ j & j & j & j & j \\ & & & & \end{matrix}\right]$
$\rightarrow$
$\left[\begin{matrix} & & & & \\ i + cj & i + cj & i + cj & i + cj & i + cj \\ & & & & \\ j & j & j & j & j \\ & & & & \end{matrix}\right]$
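The three kinds of elementary matrix can be built for any size $n$ by modifying an identity matrix. The sketch below uses hypothetical helper names and zero-based indices, so rows 2 and 4 of the $5 \times 5$ pictures above are indices 1 and 3. The test matrix `A` is arbitrary.

```python
import numpy as np

def scaling_matrix(n, i, c):
    """Type 1: identity with the (i, i) entry replaced by c."""
    E = np.eye(n)
    E[i, i] = c
    return E

def swap_matrix(n, i, j):
    """Type 2: identity with rows i and j exchanged."""
    E = np.eye(n)
    E[[i, j]] = E[[j, i]]
    return E

def row_add_matrix(n, i, j, c):
    """Type 3: identity with c placed in position (i, j), where i != j."""
    E = np.eye(n)
    E[i, j] = c
    return E

A = np.arange(1.0, 26.0).reshape(5, 5)    # arbitrary 5x5 test matrix

scaled = scaling_matrix(5, 1, 3.0) @ A    # Row_2 <- 3 * Row_2
swapped = swap_matrix(5, 1, 3) @ A        # Row_2 <-> Row_4
added = row_add_matrix(5, 1, 3, 2.0) @ A  # Row_2 <- Row_2 + 2 * Row_4
print(added)
```

Left-multiplying by the elementary matrix acts on rows; multiplying on the right would act on columns instead.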

Examples:

$E_1$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & 0 & 2 \end{matrix}\right]$ (Type 1 Elementary Matrix)

$E_2$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 0 & 1  \\ 0 & 1 & 0 \end{matrix}\right]$ (Type 2 Elementary Matrix)

$E_3$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & 2 & 1 \end{matrix}\right]$
(Type 3 Elementary Matrix)

$A$ = $\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 7 &8 & 9 \end{matrix}\right]$ 

By Matrix Multiplication:

$E_1A$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & 0 & 2 \end{matrix}\right]$ $\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 7 &8 & 9 \end{matrix}\right]$ = $\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 14 & 16 & 18 \end{matrix}\right]$ $Row_3 \leftarrow cRow_3$ where $c=2$

$E_2A$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 0 & 1  \\ 0 & 1 & 0 \end{matrix}\right]$ $\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 7 &8 & 9 \end{matrix}\right]$ = 
$\left[\begin{matrix} 1& 2 & 3  \\ 7 & 8 & 9  \\ 4 & 5 & 6 \end{matrix}\right]$ $Row_3 \leftrightarrow Row_2$

$E_3A$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & 2 & 1 \end{matrix}\right]$ $\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 7 &8 & 9 \end{matrix}\right]$ = 
$\left[\begin{matrix} 1& 2 & 3  \\ 4 & 5 & 6  \\ 15 & 18 & 21 \end{matrix}\right]$ $Row_3 \leftarrow Row_3 + 2Row_2$

To undo an elementary operation, multiply by the inverse of its elementary matrix:

$E_1^{-1}$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & 0 & 1/2 \end{matrix}\right]$

$E_2^{-1}$ = $E_2$ (This will swap the rows back)

$E_3^{-1}$ = $\left[\begin{matrix} 1& 0 & 0  \\ 0 & 1 & 0  \\ 0 & -2 & 1 \end{matrix}\right]$


In [71]:
import numpy as np

e_1 = [[1,0,0],[0,1,0],[0,0,2]]
e_2 = [[1,0,0],[0,0,1],[0,1,0]]
e_3 = [[1,0,0],[0,1,0],[0,2,1]]
A = [[1,2,3],[4,5,6],[7,8,9]]

e_1A = np.dot(e_1, A)
e_2A = np.dot(e_2, A)
e_3A = np.dot(e_3, A)

print(f'E_1A = \n {e_1A}\n')
print(f'E_2A = \n {e_2A}\n')
print(f'E_3A = \n {e_3A}\n')

# Undo the operations

e_1_undo = [[1,0,0],[0,1,0],[0,0,1/2]]
e_2_undo = e_2
e_3_undo = [[1,0,0],[0,1,0],[0,-2,1]]

e_1A_undo = np.dot(e_1_undo, e_1A)
e_2A_undo = np.dot(e_2_undo, e_2A)
e_3A_undo = np.dot(e_3_undo, e_3A)

print(f'E_1A_undo = \n {e_1A_undo}\n')
print(f'E_2A_undo = \n {e_2A_undo}\n')
print(f'E_3A_undo = \n {e_3A_undo}\n')


E_1A = 
 [[ 1  2  3]
 [ 4  5  6]
 [14 16 18]]

E_2A = 
 [[1 2 3]
 [7 8 9]
 [4 5 6]]

E_3A = 
 [[ 1  2  3]
 [ 4  5  6]
 [15 18 21]]

E_1A_undo = 
 [[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]

E_2A_undo = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

E_3A_undo = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]



### LU Factorization