$$
\newcommand{\theorem}{\textbf{Theorem: }}
\newcommand{\proof}{\textbf{Proof: }}
\newcommand{\lemma}{\textbf{Lemma: }}
\newcommand{\corollary}{\textbf{Corollary: }}
\newcommand{\prop}{\textbf{Proposition: }}
$$

In [1]:
import numpy as np
from module.utility import print_arr, frac_arr
from common.utility import show_implementation

# Introduction

A linear system is a set of linear equations in the same variables.

The general form of a linear system of $m$ equations in $n$ variables $x_1, \dots, x_n$ is:
$$
a_{11} x_1 + a_{12} x_2 + \dots + a_{1n}x_n = b_1\\
a_{21} x_1 + a_{22} x_2 + \dots + a_{2n}x_n = b_2\\
\vdots\\
a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn}x_n = b_m\\
$$

## Augmented Matrix
Now suppose that we have the following system of linear equations:

$$
x + y + z = 4 \\
x + 2y + z = 5 \\
4x + y + 2z = 2 \\
$$

It is more convenient to view this as a matrix, since every equation involves the same variables in the same order and only the coefficients and constants differ.

Thus, we get


$$
\left(
  \begin{matrix}
    1 & 1 & 1 \\
    1 & 2 & 1 \\
    4 & 1 & 2 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      4  \\
      5  \\
      2  \\
    \end{matrix}
  \right.
\right)
$$

where the numbers to the left of the bar represent the coefficients of the variables, while those to the right represent the constants.

We call this the augmented matrix because the constants are "stitched together" with the coefficients and separated by a bar.
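
As a small illustration of the "stitching", the sketch below builds the augmented matrix for the system above from its coefficients and constants. The helper name `augment` is an assumption made for this example and is not part of the module used in this notebook.

```python
def augment(coefficients, constants):
    """Return the augmented matrix [A | b] as a list of rows."""
    if len(coefficients) != len(constants):
        raise ValueError("need exactly one constant per equation")
    # Append each constant to the end of its row of coefficients.
    return [list(row) + [b_i] for row, b_i in zip(coefficients, constants)]


# x + y + z = 4, x + 2y + z = 5, 4x + y + 2z = 2
A = [[1, 1, 1], [1, 2, 1], [4, 1, 2]]
b = [4, 5, 2]

augmented = augment(A, b)
for row in augmented:
    # Print the coefficients, a bar, then the constant.
    print(*row[:-1], "|", row[-1])
```

Storing the constant as the last entry of each row keeps every row operation a single list operation, which is the main practical benefit of the augmented form.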

## Elementary Row Operations
Suppose we are given the system

$$
x + y = 4 \\
x - y = 2 \\
$$

### Multiplying by a scalar
Notice that every solution of $x + y = 4$ is also a solution of $2x + 2y = 8$, and vice versa.
Indeed, for any $k \neq 0$, the equation $kx + ky = 4k$ has exactly the same solutions.

Thus, we obtain our first elementary row operation: for any row $R_i$, we can replace it with $kR_i$, where $k \neq 0$, and still retain the same solution set.

### Swapping rows
Swapping rows does not change the solution set, since it only changes the order in which the equations are listed.

$$
\begin{align}
x + y = 4 & \quad &x - y = 2 \\
x - y = 2 & \quad &x + y = 4 \\
\end{align}
$$

The two linear systems above have the same solution set.

Thus, our second elementary row operation is: for any two rows $R_i, R_j$, we can swap their positions in the matrix, written $R_i \leftrightarrow R_j$.

### Adding/Subtracting Row
Suppose that we know that $x+y=4$ and $x-y=2$.

Then we can add the two equations together, side by side, getting $x+y+x-y=4+2 \Rightarrow 2x + 0y = 6$.

The solution set will not change because we are simply adding equal quantities to both sides.

The same holds for subtraction.

Thus, our last elementary row operation is that we can replace any row $R_i$ with $R_i \pm R_j$ for some other row $R_j$. Combined with scaling, this lets us replace $R_i$ with $R_i + kR_j$ for any scalar $k$.
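
The three operations can be sketched on the augmented rows of the example system, checking that the known solution $(x, y) = (3, 1)$ survives each one. The function names below are assumptions made for this illustration; the multiplier `k` in `add_multiple` is a convenience, with `k=1` and `k=-1` giving the addition and subtraction cases described above.

```python
def scale(rows, i, k):
    """R_i <- k * R_i, with k != 0."""
    if k == 0:
        raise ValueError("k must be non-zero")
    out = [row[:] for row in rows]
    out[i] = [k * entry for entry in out[i]]
    return out


def swap(rows, i, j):
    """R_i <-> R_j."""
    out = [row[:] for row in rows]
    out[i], out[j] = out[j], out[i]
    return out


def add_multiple(rows, i, j, k=1):
    """R_i <- R_i + k * R_j, with i != j."""
    if i == j:
        raise ValueError("rows must be distinct")
    out = [row[:] for row in rows]
    out[i] = [a + k * c for a, c in zip(out[i], out[j])]
    return out


def satisfies(rows, solution):
    """Check that `solution` satisfies every equation in the augmented rows."""
    return all(
        sum(a * x for a, x in zip(row[:-1], solution)) == row[-1] for row in rows
    )


system = [[1, 1, 4], [1, -1, 2]]  # x + y = 4, x - y = 2
solution = (3, 1)

scaled = scale(system, 0, 2)          # 2x + 2y = 8, x - y = 2
swapped = swap(system, 0, 1)          # x - y = 2, x + y = 4
summed = add_multiple(system, 0, 1)   # 2x + 0y = 6, x - y = 2

print(all(satisfies(s, solution) for s in (system, scaled, swapped, summed)))
```

Each helper returns a new list instead of modifying its input, so the original system stays available for comparison.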

## Row Equivalence
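Since elementary row operations preserve the solution set, if we can derive the augmented matrix $\vec B$ from $\vec A$ via a sequence of elementary row operations, then the two systems must have the same solution set. In that case we say that $\vec A$ and $\vec B$ are **row equivalent**.

$\theorem$ Two consistent linear systems with the same number of equations and variables have the same solution set if and only if their augmented matrices are row equivalent.

We will keep adding to this list of equivalent statements as we progress through this module.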

## Row-echelon form

However, it is not always easy to find a sequence of operations that transforms $\vec A$ into $\vec B$.

Thus, to check whether two systems are row equivalent, we first bring each system into a standard shape, starting with its **row-echelon form**.

A **zero row** is a row whose entries are all 0.

The **leading entry** of a non-zero row is its left-most non-zero entry.

An augmented matrix is in row-echelon form when:
* All zero rows are at the bottom
* The leading entry of each non-zero row lies strictly to the right of the leading entry of the row above it

Hence, it will have the following shape

$$
\left(
  \begin{matrix}
    * & * & * & * & * & * & * \\
    0 & \dots & 0 & * & * & * & * \\
    0 & \dots & \dots & \dots & 0 & * & * \\
    0 & \dots & \dots & \dots & \dots & \dots & 0 \\
    \vdots & & & & & & \vdots \\
    0 & \dots & \dots & \dots & \dots & \dots & 0 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      *  \\
      *  \\
      *  \\
      0  \\
      \vdots  \\
      0  \\
    \end{matrix}
  \right.
\right)
$$

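where $*$ can be any number (except that the first $*$ in each row, its leading entry, must be non-zero), and every $\dots$ between two $0$'s stands for entries that are all $0$.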
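
The two conditions can be checked mechanically. The following is a minimal sketch, assuming each row is given as a list that includes the constant as its last entry; the names `leading_index` and `is_row_echelon` are made up for this example.

```python
def leading_index(row):
    """Column of the left-most non-zero entry, or None for a zero row."""
    for col, value in enumerate(row):
        if value != 0:
            return col
    return None


def is_row_echelon(rows):
    """Check the two row-echelon conditions on a list of rows."""
    previous_lead = -1
    seen_zero_row = False
    for row in rows:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        # A non-zero row below a zero row violates the first condition;
        # a leading entry that does not move right violates the second.
        if seen_zero_row or lead <= previous_lead:
            return False
        previous_lead = lead
    return True


echelon = [[1, 1, 2, 4], [0, 3, 1, 5], [0, 0, 1, 20]]
not_echelon = [[1, 1, 1, 4], [1, 2, 1, 5], [4, 1, 2, 2]]
zero_row_on_top = [[0, 0, 0, 0], [1, 0, 0, 1]]

print(is_row_echelon(echelon))          # True
print(is_row_echelon(not_echelon))      # False
print(is_row_echelon(zero_row_on_top))  # False
```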

### Obtaining solution

We can read off the corresponding equations from the row-echelon form to obtain the solutions to our system.

For example, given this augmented matrix:


$$
\left(
  \begin{matrix}
    1 & 1 & 2 \\
    0 & 3 & 1 \\
    0 & 0 & 1 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      4  \\
      5  \\
      20  \\
    \end{matrix}
  \right.
\right)
$$

We can read off these equations:
$$
\begin{matrix}
x_1 + & x_2 + & 2x_3 & = 4\\
& 3x_2 +& x_3 &= 5\\
&& x_3 &= 20
\end{matrix}
$$

From this, back-substitution (solving from the last equation upwards) gives the following solution:
$$
x_1 = -31\\
x_2 = -5\\
x_3 = 20
$$
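
The substitution can be sketched in code as follows. This is a minimal version that assumes a square system whose row-echelon form has a non-zero entry on every diagonal position; the function name `back_substitute` is chosen for this example only.

```python
from fractions import Fraction


def back_substitute(rows):
    """Solve an augmented upper-triangular system with a non-zero diagonal."""
    n = len(rows)
    x = [Fraction(0)] * n
    # Work from the last equation upwards, reusing the values already found.
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(rows[i][n]) - known) / rows[i][i]
    return x


rows = [[1, 1, 2, 4], [0, 3, 1, 5], [0, 0, 1, 20]]
solution = back_substitute(rows)
print(*solution)  # prints: -31 -5 20
```

Using `Fraction` avoids floating-point rounding, so the result can be compared exactly with the hand computation.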

## Reduced row-echelon form

For further standardization, a matrix is in reduced row-echelon form when it is in row-echelon form and, in addition:
* All leading entries are 1
* In every pivot column (a column containing a leading entry), all entries other than the leading entry are 0

$$
\left(
  \begin{matrix}
    1 & \dots & 0 & \dots & 0 \\
    0 & \dots & 1 & \dots & 0 \\
    0 & \dots & \dots & \dots & 1 \\
    0 & \dots & \dots & \dots & 0 \\
    \vdots & & & & \vdots \\
    0 & \dots & \dots & \dots & 0 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      *  \\
      *  \\
      *  \\
      0  \\
      \vdots  \\
      0  \\
    \end{matrix}
  \right.
\right)
$$

**The row-echelon form of a linear system is not unique, but the reduced row-echelon form is.**

$\theorem$ Two linear systems of the same size are row equivalent if and only if their reduced row-echelon forms are the same.

### Obtaining solution

Obtaining the solution from the reduced row-echelon form is straightforward: when every variable has its own leading entry, each row directly gives the value of one variable.
The reduced row-echelon form of the previous matrix is:

$$
\left(
  \begin{matrix}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 1 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      -31 \\
      -5  \\
      20  \\
    \end{matrix}
  \right.
\right)
$$

Hence, we just read off the rows to obtain our previous solution.
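
For reference, one possible sequence of row operations that takes the earlier row-echelon form to this reduced form is shown below, listing only the row that changes at each step. Other sequences are equally valid; this one clears the third column first.

$$
\begin{align}
R_1 \leftarrow R_1 - 2R_3 &: \quad (1 \;\; 1 \;\; 0 \mid 4 - 2 \cdot 20) = (1 \;\; 1 \;\; 0 \mid -36) \\
R_2 \leftarrow R_2 - R_3 &: \quad (0 \;\; 3 \;\; 0 \mid 5 - 20) = (0 \;\; 3 \;\; 0 \mid -15) \\
R_2 \leftarrow \tfrac{1}{3} R_2 &: \quad (0 \;\; 1 \;\; 0 \mid -5) \\
R_1 \leftarrow R_1 - R_2 &: \quad (1 \;\; 0 \;\; 0 \mid -36 - (-5)) = (1 \;\; 0 \;\; 0 \mid -31)
\end{align}
$$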

#### Inconsistency

We can detect an **inconsistent** linear system (_ie_ one with no solutions) from its row-echelon form: the system is inconsistent if the rightmost column ($b$) is a pivot column, that is, if some row has its leading entry in that column.

Such a row requires $0x_1+0x_2+0x_3=b_i$ with $b_i \neq 0$, which no choice of the variables can satisfy. For example, the last row below demands $0 = 20$:

$$
\left(
  \begin{matrix}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 0 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      -31 \\
      -5  \\
      20  \\
    \end{matrix}
  \right.
\right)
$$
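
This test is easy to automate. The sketch below assumes the rows are already in row-echelon form and that the constant is the last entry of each row; `is_inconsistent` is a name invented for this example.

```python
def is_inconsistent(rows):
    """True if some row reads 0 = b with b != 0."""
    return any(
        all(a == 0 for a in row[:-1]) and row[-1] != 0 for row in rows
    )


inconsistent = [[1, 0, 0, -31], [0, 1, 0, -5], [0, 0, 0, 20]]
consistent = [[1, 0, 0, -31], [0, 1, 0, -5], [0, 0, 1, 20]]

print(is_inconsistent(inconsistent))  # True
print(is_inconsistent(consistent))    # False
```

A zero row with a zero constant (the equation $0 = 0$) does not trigger the check, since it places no restriction on the variables.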

#### Multiple solutions

Suppose that we have the system below:

$$
\left(
  \begin{matrix}
    1 & 0 & 0 \\
    0 & 1 & 1 \\
    0 & 0 & 0 \\
  \end{matrix}
  \left|
    \,
    \begin{matrix}
      -1 \\
      -5  \\
      0  \\
    \end{matrix}
  \right.
\right)
$$

Then the corresponding equations are:
$$
x_1 = -1\\
x_2 + x_3 = -5
$$

This means that the solution is not unique: there are **infinitely many solutions** to the system, of the form $x_1 = -1, x_2 = z, x_3 = -5 - z$ for any $z \in \mathbb R$.
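
As a quick sanity check, the sketch below substitutes a few values of the free parameter $z$ into the system and confirms that each resulting triple satisfies every row. The helper names are assumptions for this illustration, and the sample values of $z$ are arbitrary.

```python
from fractions import Fraction

# Augmented rows of the system above; the last entry is the constant.
rows = [[1, 0, 0, -1], [0, 1, 1, -5], [0, 0, 0, 0]]


def solution_for(z):
    """The solution (x1, x2, x3) obtained by choosing x2 = z."""
    return (-1, z, -5 - z)


def satisfies(rows, x):
    """Check that x satisfies every equation in the augmented rows."""
    return all(
        sum(a * v for a, v in zip(row[:-1], x)) == row[-1] for row in rows
    )


samples = [Fraction(0), Fraction(7, 2), Fraction(-3)]
checks = [satisfies(rows, solution_for(z)) for z in samples]
print(checks)  # [True, True, True]
```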

## Gaussian Elimination

Since we are interested in the row-echelon form, we need a procedure to compute it from any augmented matrix.


A simple method is **Gaussian elimination**, which proceeds as follows:
1. Select the left-most non-zero column.
2. Swap rows if needed so that the top entry of the selected column is non-zero.
3. For each row below the top row, add a suitable multiple of the top row to it so that its entry in the selected column becomes 0.
4. If the matrix is in row-echelon form, stop.
5. Otherwise, repeat steps 1-4 on the sub-matrix obtained by ignoring the top row.

In [2]:
from module.elimination import gaussian_elim

show_implementation(gaussian_elim)

def gaussian_elim(arr: arr_type, b: Optional[arr_type] = None) -> arr_type|tuple[arr_type, arr_type]:
    _b:arr_type = np.zeros((arr.shape[0], 1)).astype(arr.dtype) if b is None else b.copy()
    arr = arr.copy()

    if arr.dtype == np.dtype('O'):
        _b = _b + Fraction()

    if 0 in arr.shape:
        return (arr, _b) if b is not None else arr

    r = 0
    for c in range(arr.shape[1]):
        if np.all(arr[:,c] == 0):
            continue
        if arr[r][c] == 0:
            indices = np.flatnonzero(arr[:, c])
            indices = indices[indices > r]
            if len(indices) == 0:
                continue
            i = indices[0]

            arr[i], arr[r] = arr[r].copy(), arr[i].copy()
            _b[i], _b[r] = _b[r].copy(), _b[i].copy()

        #arr[r] /= arr[r][c]
        #factors = arr[:, c].copy() 
        factors = arr[:, c].copy() / arr[r][c]
        factors[:r+1] = 0
        dx = np.tile(arr[r], (arr.shape[0], 1)) * factors[:,None]
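        arr -= dx
        # NOTE: the listing is cut off after this point. The remaining lines are
        # an assumed completion, chosen to reproduce the outputs shown below.
        _b -= _b[r] * factors[:, None]  # apply the same row operations to b
        r += 1                          # the next pivot goes one row down
        if r == arr.shape[0]:           # every row has a pivot, so stop
            break

    return (arr, _b) if b is not None else arr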

In [3]:
arr = frac_arr([[1, 1, 2], [-1, 2, -1], [2, 0, 3]])
b = frac_arr([4, 1, -2]).reshape((3, 1))

print_arr(*gaussian_elim(arr, b))

1	1	2 	 | 4
0	3	1 	 | 5
0	0	-1/3 	 | -20/3
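
The output above can be traced by hand. Starting from the augmented matrix with rows $(1 \;\; 1 \;\; 2 \mid 4)$, $(-1 \;\; 2 \;\; {-1} \mid 1)$ and $(2 \;\; 0 \;\; 3 \mid -2)$, the following row operations (one possible reading of the steps, listing only the row that changes) give the same result:

$$
\begin{align}
R_2 \leftarrow R_2 + R_1 &: \quad (-1 + 1 \;\;\; 2 + 1 \;\;\; -1 + 2 \mid 1 + 4) = (0 \;\; 3 \;\; 1 \mid 5) \\
R_3 \leftarrow R_3 - 2R_1 &: \quad (2 - 2 \;\;\; 0 - 2 \;\;\; 3 - 4 \mid -2 - 8) = (0 \;\; {-2} \;\; {-1} \mid -10) \\
R_3 \leftarrow R_3 + \tfrac{2}{3} R_2 &: \quad (0 \;\;\; -2 + 2 \;\;\; -1 + \tfrac{2}{3} \mid -10 + \tfrac{10}{3}) = (0 \;\; 0 \;\; {-\tfrac{1}{3}} \mid -\tfrac{20}{3})
\end{align}
$$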


### Gauss-Jordan elimination

Once we have a row-echelon form, we can process it further to obtain the reduced row-echelon form:

6. Multiply each non-zero row by a scalar so that its leading entry is 1.
7. Add suitable multiples of each row to the rows above it so that all entries above its leading entry are 0.

In [4]:
from module.elimination import gauss_jordan_elim

show_implementation(gauss_jordan_elim)

def gauss_jordan_elim(arr: arr_type, b: Optional[arr_type] = None) -> arr_type|tuple[arr_type, arr_type]:
    arr = arr.copy()
    _b = np.zeros((arr.shape[0], 1)).astype('int') + Fraction() if b is None else b.copy()

    if arr.dtype == np.dtype('O'):
        _b = _b + Fraction()

    arr, _b = gaussian_elim(arr, _b)
    
    for i in range(arr.shape[0]):
        cols = np.argwhere(arr[i] != 0).ravel()
        if not cols.size:
            break

        col = cols[0]
        _b[i] /= arr[i][col]
        arr[i] /= arr[i][col]


        for j in range(i):
            factor = arr[j][col]
            arr[j] -= arr[i] * factor
            _b[j] -= _b[i] * factor

    return (arr, _b) if b is not None else arr


In [5]:
print_arr(*gauss_jordan_elim(arr, b))

1	0	0 	 | -31
0	1	0 	 | -5
0	0	1 	 | 20


In [6]:
arr = frac_arr(
    [
        [1, 3, -2, 0, 2, 0],
        [2, 6, -5, -2, 4, -3],
        [0, 0, 5, 10, 0, 15],
        [2, 6, 0, 8, 4, 18],
    ]
)
b = frac_arr([0, -1, 5, 6]).reshape((4, 1))
print_arr(*gauss_jordan_elim(arr, b))

1	3	0	4	2	0 	 | 0
0	0	1	2	0	0 	 | 0
0	0	0	0	0	1 	 | 1/3
0	0	0	0	0	0 	 | 0
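
Reading the rows of this output as equations gives $x_6 = \tfrac{1}{3}$, $x_3 = -2x_4$ and $x_1 = -3x_2 - 4x_4 - 2x_5$, with $x_2, x_4, x_5$ free. The sketch below checks this reading by substituting a few arbitrary choices of the free variables back into the original system; the parameter names `r`, `s`, `t` and the helper names are assumptions made for this example.

```python
from fractions import Fraction

# Coefficients and constants of the original system from the cell above.
A = [
    [1, 3, -2, 0, 2, 0],
    [2, 6, -5, -2, 4, -3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, 8, 4, 18],
]
b = [0, -1, 5, 6]


def general_solution(r, s, t):
    """Solution with the free variables set to x2 = r, x4 = s, x5 = t."""
    x6 = Fraction(1, 3)
    x3 = -2 * s
    x1 = -3 * r - 4 * s - 2 * t
    return [x1, r, x3, s, t, x6]


def residuals(x):
    """Left-hand side minus right-hand side for every equation."""
    return [sum(a * v for a, v in zip(row, x)) - b_i for row, b_i in zip(A, b)]


samples = [(0, 0, 0), (1, 2, 3), (Fraction(-1, 2), 5, Fraction(7, 3))]
all_zero = all(
    all(res == 0 for res in residuals(general_solution(*p))) for p in samples
)
print(all_zero)  # True
```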
