<p hidden>$
\newcommand{\phm}{\phantom{-}}
\newcommand{\vb}{\underline{\mathbf{b}}}
\newcommand{\vf}{\underline{\mathbf{f}}}
\newcommand{\vk}{\underline{\mathbf{k}}}
\newcommand{\vx}{\underline{\mathbf{x}}}
\newcommand{\vy}{\underline{\mathbf{y}}}
\newcommand{\deriv}[3][]{\frac{\mathrm{d}^{#1}#2}{\mathrm{d}#3^{#1}}}
\newcommand{\partderiv}[3][]{\frac{\partial^{#1}#2}{\partial#3^{#1}}}
\newcommand{\intd}{\,\mathrm{d}}
\newcommand{\rmd}{\mathrm{d}}
\DeclareMathOperator{\Uniform}{Uniform}
\DeclareMathOperator{\Poisson}{Poisson}
\DeclareMathOperator{\Normal}{Normal}
\DeclareMathOperator{\Exponential}{Exponential}
\DeclareMathOperator{\GammaDist}{Gamma}
\DeclareMathOperator{\Prob}{P}
\DeclareMathOperator{\Exp}{E}
\DeclareMathOperator{\Var}{Var}
$</p>

# Lab 2: Linear Systems

### Topics

- **Mathematics:** Gaussian elimination; limitations of this basic method.
- **Python:** NumPy array indexing; writing code for row operations, back substitution and basic partial pivoting.

In [2]:
import numpy as np

## Basic Gaussian elimination

Gaussian elimination is the basic method for solving linear systems by row operations.  For example, consider the linear system $A\vx = \vb$ where

$$
  A = \begin{bmatrix}
        2 & \phm4 & -1 \\
        2 & \phm2 & \phm8 \\
        1 & \phm1 & \phm9 \\
      \end{bmatrix}, \qquad
  \vb = \begin{bmatrix} -2 \\ \phm1 \\ \phm0 \end{bmatrix}.
$$

**Maths usually numbers the rows 1, 2, 3, but Python numbers them 0, 1, 2.  In this lab we will number the rows 0, 1, 2 in the maths as well, to be consistent.**  The first row operation in the Gaussian elimination process for this system eliminates the 2 from the second row (i.e. row 1): $R_1 \leftarrow R_1 - R_0$.

Enter the matrix $A$ and column vector $\vb$ as defined above into Python below.  **Be careful about the data type of the arrays**: `np.array([[1, 2], [3, 4]])` will make an array of `int`s, which will cause problems later.  To get an array of `float`s, you need either to have one or more `float`s in the list you pass in, e.g. `np.array([[1., 2.], [3., 4.]])`, or to specify the array data type: `np.array([[1, 2], [3, 4]], dtype=float)`.
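
To see why the data type matters, here is a minimal sketch (the names `A_int` and `A_float` are only illustrative). Assigning non-integer values into an `int` array silently drops the fractional part, so a row operation can give the wrong answer without any error message.

```python
import numpy as np

# Same entries, different data types (illustrative names).
A_int = np.array([[1, 2], [3, 4]])
A_float = np.array([[1, 2], [3, 4]], dtype=float)

# R1 <- R1 - 0.5*R0 should give [2.5, 3.0].
A_int[1] = A_int[1] - 0.5 * A_int[0]        # stored in an int array: fractions are dropped
A_float[1] = A_float[1] - 0.5 * A_float[0]  # stored in a float array: fractions are kept

print(A_int[1])    # integer row, fractional part lost
print(A_float[1])  # float row, as expected
```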

It's a bit simpler to have $\vb$ as a 1D array `np.array([a, b, c])` rather than a 2D single-column matrix `np.array([[a], [b], [c]])`, but 2D single-column matrices print more neatly.  Either will work as long as you are consistent.

Try typing in commands that do the row operation $R_1 \leftarrow R_1 - R_0$.  Remember to do the row operation on the right-hand side vector $\vb$ as well as on $A$.

You should use indexing to do the operations.  E.g. `A[0, :]` refers to the whole first row of `A`, as does `A[0]`, so `A[1, :] = A[1, :] + 3*A[0, :]` would do the row operation $R_1 \leftarrow R_1 + 3R_0$, as would `A[1] = A[1] + 3*A[0]`.
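
As a small illustration of this indexing, the sketch below uses a made-up $2 \times 2$ matrix `M` (not the lab's `A`) to confirm that the two spellings select the same row, and then applies $R_1 \leftarrow R_1 + 3R_0$.

```python
import numpy as np

M = np.array([[1., 2.], [3., 4.]])  # illustrative matrix, not the lab's A

# Both spellings select the whole of row 0.
same_row = np.array_equal(M[0, :], M[0])

# R1 <- R1 + 3*R0, written with the shorter spelling.
M[1] = M[1] + 3 * M[0]
print(M)
```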

In [45]:
A = np.array([[2, 4, -1], [2, 2, 8], [1, 1, 9]], dtype=float)
b = np.array([-2, 1, 0], dtype=float)
print(A)
print(b)

[[ 2.  4. -1.]
 [ 2.  2.  8.]
 [ 1.  1.  9.]]
[-2.  1.  0.]


Write a function `rowop` that performs a specified row operation on a given matrix and column vector:
```
def rowop(A, b, i, j, r):
    A = A.copy()
    b = b.copy()
    ...
    return A, b
```
Note: like lists and dictionaries, Numpy arrays are passed into functions by reference, not by value.  This means if you modify `A` inside `rowop` without making a copy first, you are actually modifying the original array.  Modifying things passed into functions is usually a bad idea, as it makes the code harder to reason about, and may trip you up.
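
The difference can be sketched with two hypothetical helper functions (`set_first_in_place` and `set_first_on_copy` are made up for this illustration): one writes into the array it was given, the other copies first.

```python
import numpy as np

def set_first_in_place(v):
    # Hypothetical helper: writes into the caller's array.
    v[0] = 99.0
    return v

def set_first_on_copy(v):
    # Hypothetical helper: works on a private copy instead.
    v = v.copy()
    v[0] = 99.0
    return v

original = np.array([1.0, 2.0, 3.0])

set_first_on_copy(original)
snapshot = original.copy()    # the caller's array has not changed

set_first_in_place(original)  # now the caller's array has changed
print(snapshot, original)
```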

The function `rowop` should take as arguments:
- `A`: an $n \times n$ matrix.
- `b`: an $n \times 1$ column vector.
- `i`, `j`: row numbers between $0$ and $n-1$.
- `r`: a real number.

The function should perform the row operation

$$
  R_i \leftarrow R_i + r R_j
$$

on the matrix `A` and vector `b`.  The new matrix and vector are then returned by the function.

**Hint:** You should use indexing to operate on the whole row of `A` with a single command.

Check that your `rowop` function works by defining `A` and `b` as above and running
```
  new_A, new_b = rowop(A, b, 1, 0, -1)
```
which should give

$$
  \mathtt{new}\_\mathtt{A} = \begin{bmatrix} 2 & \phm4 & -1 \\ 0 & -2 & \phm9 \\ 1 & \phm1 & \phm9 \end{bmatrix}
  \!\qquad \text{and} \qquad
  \mathtt{new}\_\mathtt{b} = \begin{bmatrix} -2 \\ \phm3 \\ \phm0 \end{bmatrix}.
$$


In [5]:
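def rowop(A, b, i, j, r):
    # Work on copies so the caller's arrays are left untouched.
    A = A.copy()
    b = b.copy()
    # R_i <- R_i + r*R_j, applied to both the matrix and the right-hand side.
    A[i] = A[i] + r*A[j]
    b[i] = b[i] + r*b[j]
    return A, b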



`rowop` is now a useful building block, and we won't modify it again.  Below, write code that defines `A` and `b` and uses your `rowop` function three times, to perform the above row operation and two more to reduce $A$ to an upper-triangular matrix (i.e. all entries below the main diagonal are zero).
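
One way to test whether a matrix is upper-triangular, sketched here with made-up matrices `U` and `L`, is to compare it with `np.triu`, which zeroes every entry below the main diagonal.

```python
import numpy as np

U = np.array([[1., 2., 3.],
              [0., 4., 5.],
              [0., 0., 6.]])   # illustrative upper-triangular matrix
L = np.array([[1., 0., 0.],
              [2., 3., 0.],
              [4., 5., 6.]])   # illustrative matrix that is not upper-triangular

# An upper-triangular matrix is unchanged by np.triu.
U_is_upper = np.allclose(U, np.triu(U))
L_is_upper = np.allclose(L, np.triu(L))
print(U_is_upper, L_is_upper)
```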

In [20]:
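A = np.array([[2, 4, -1], [2, 2, 8], [1, 1, 9]], dtype=float)
b = np.array([-2, 1, 0], dtype=float)
A, b = rowop(A, b, 1, 0, -1)
A, b = rowop(A, b, 2, 0, -1/2)
A, b = rowop(A, b, 2, 1, -1/2)
print(A, b)

[[ 2.  4. -1.]
 [ 0. -2.  9.]
 [ 0.  0.  5.]] [-2.   3.  -0.5]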


Now write a function `gauss_elim3` that takes in a matrix `A` and right-hand side vector `b` and calls `rowop` to perform the three row operations required to reduce a general $3 \times 3$ matrix to upper-triangular form.

In [17]:
A1 = np.array([[2, 4, -1], [2, 2, 8], [1, 1, 9]], dtype=float)
b1 = np.array([-2, 1, 0], dtype=float)
def gauss_elim3(A, b):
    A = A.copy()
    b = b.copy()
    # Clear column 0 below the diagonal, then column 1.
    A, b = rowop(A, b, 1, 0, -A[1][0]/A[0][0])
    A, b = rowop(A, b, 2, 0, -A[2][0]/A[0][0])
    A, b = rowop(A, b, 2, 1, -A[2][1]/A[1][1])
    print(A, b)
    return A, b
A1, b1 = gauss_elim3(A1, b1)
print(A1, b1)

[[ 2.  4. -1.]
 [ 0. -2.  9.]
 [ 0.  0.  5.]] [-2.   3.  -0.5]
[[ 2.  4. -1.]
 [ 0. -2.  9.]
 [ 0.  0.  5.]] [-2.   3.  -0.5]


## Back substitution

The next step in solving the linear system is to find $\vx$ by back substitution.  Below, write a `back_sub3` function that takes in a row-reduced `A` and `b` and returns the solution $\vx$.
```
def back_sub3(A, b):
    ...
    return x
```

To calculate $\vx$ by back substitution, start by defining `x` as a $3 \times 1$ vector of zeros, and then calculating the last element of $\vx$, `x[2]`:
```
  x = np.zeros((3, 1))
  x[2] = b[2]/A[2, 2]
```
Now calculate `x[1]` and then `x[0]`.  **Hint:** The command for `x[1]` is NOT: `x[1] = b[1]/A[1, 1]`.  If you can't remember how to do back substitution, **try doing it on paper**, or look at your EMTH118 lecture notes from last year!

Test your `back_sub3` function on the results of your `gauss_elim3` function above.
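
For reference, here is a sketch of what back substitution computes for an upper-triangular $3 \times 3$ system in the lab's 0-based numbering. The symbols $a_{ij}$ and $b_i$ are introduced here only for illustration and denote the entries of the row-reduced $A$ and $\vb$:

$$
  x_2 = \frac{b_2}{a_{22}}, \qquad
  x_i = \frac{1}{a_{ii}} \Bigl( b_i - \sum_{j > i} a_{ij} x_j \Bigr) \quad \text{for } i = 1, 0.
$$

Each $x_i$ uses only values of $x_j$ with $j > i$, which is why the calculation has to run from the last row upwards.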

In [19]:
def back_sub3(A, b):
    x = np.zeros((3,1), dtype=complex)
    x[2] = b[2]/A[2, 2]
    x[1] = (b[1]-x[2]*A[1,2])/A[1,1]
    x[0] = (b[0]-x[2]*A[0,2]-x[1]*A[0,1])/A[0,0]
    return x
print(back_sub3(A1, b1))

[[ 2.85+0.j]
 [-1.95-0.j]
 [-0.1 +0.j]]


Below, write a `gauss3_basic` function that uses your `gauss_elim3` and `back_sub3` functions and returns the solution $\vx$:
```
def gauss3_basic(A, b):
    ...
    return x
```

Your `gauss3_basic` function should call `gauss_elim3` to reduce the system and then `back_sub3` to find $\vx$ by back substitution.  Your functions should not display anything.

<div class="alert alert-warning">
  <h3 style="margin-top: 0;">Checkpoint 1</h3>

  Use <code>gauss3_basic</code> to solve the system $A\vx = \vb$ where $A$ is a random $3 \times 3$ matrix and $\vb$ is a vector of $1$s.  You can define $A$ and $\vb$ with the commands
<pre><code>A = np.random.rand(3, 3)
b = np.ones(3)
</code></pre>
  Check that the answer given by your function is correct (think about how you can do this).
</div>
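
One possible way to check a solution is to substitute it back into the system and compare $A\vx$ with $\vb$. The sketch below assumes `np.linalg.solve` as a stand-in for your own solver, uses illustrative variable names, and fixes the random seed only so that the example is repeatable.

```python
import numpy as np

np.random.seed(0)              # fixed seed so the sketch is repeatable
A_rand = np.random.rand(3, 3)  # random 3x3 matrix, as in the checkpoint
ones = np.ones(3)

# Stand-in solver (assumption): replace this with your own gauss3_basic.
x = np.linalg.solve(A_rand, ones)

# If x is correct, A @ x reproduces b up to rounding error.
residual = A_rand @ x - ones
ok = np.allclose(A_rand @ x, ones)
print(ok, np.max(np.abs(residual)))
```

If your solver returns a $3 \times 1$ column, as `back_sub3` above does, `x.ravel()` gives the matching 1D shape before comparing with a 1D `b`.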

In [22]:
def gauss3_basic(A, b):
    A = A.copy()
    b = b.copy()
    A, b = gauss_elim3(A, b)
    return back_sub3(A, b)

A = np.array([[3+16j, 29, -22], [-3, 3+16j, -10], [5, 11, -10+16j]], dtype=complex)
b = np.zeros(3, dtype=complex)
x = gauss3_basic(A, b)
print(x)

[[ 3.00000000e+00+1.60000000e+01j  2.90000000e+01+0.00000000e+00j
  -2.20000000e+01+0.00000000e+00j]
 [ 4.44089210e-16-1.11022302e-16j  3.98490566e+00+1.07471698e+01j
  -1.07471698e+01+3.98490566e+00j]
 [-3.88578059e-16+3.33066907e-16j  1.77635684e-15+0.00000000e+00j
   0.00000000e+00+1.77635684e-15j]] [0.+0.j 0.+0.j 0.+0.j]
[[0.+0.j]
 [0.+0.j]
 [0.+0.j]]


## Problems with the basic method
Now try using `gauss3_basic` to solve $A\vx = \vb$ where $\vb$ is a vector of $1$s and

$$
  \textrm{(i)} \qquad
  A = \begin{bmatrix}
        \phm0 & \phm1 & \phm2 \\
        \phm1 & -1 & \phm8 \\
        -4 & -1 & \phm3 \\
      \end{bmatrix},
  \qquad \qquad
  \textrm{(ii)} \qquad
  A = \begin{bmatrix}
        2 & \phm2 & -1 \\
        2 & \phm2 & \phm8 \\
        1 & -1 & \phm9 \\
      \end{bmatrix}.
$$

What do you think went wrong?

Reminder: Make sure your arrays are arrays of `float`s.
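
When diagnosing the result, it helps to know how NumPy floats treat division by zero. In this minimal sketch (the names `pivot` and `entry` are illustrative), the division does not stop the program. It returns an infinite value, which later arithmetic turns into `nan`.

```python
import numpy as np

pivot = np.float64(0.0)   # a zero on the diagonal
entry = np.float64(1.0)   # the entry we would like to eliminate

# Silence the RuntimeWarnings NumPy would otherwise print.
with np.errstate(divide="ignore", invalid="ignore"):
    r = -entry / pivot    # the multiplier is -inf rather than an error
    bad = r * pivot       # inf * 0 is nan, which then spreads through the row

print(r, bad)
```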

In [154]:
# (i)  A[0, 0] is 0, so the very first row operation divides by zero.
# (ii) After the first two row operations A[1, 1] becomes 0, so the third row operation divides by zero.
A1 = np.array(([0, 1, 2], [1, -1, 8], [-4, -1, 3]), dtype=float)
A2 = np.array(([2, 2, -1], [2, 2, 8], [1, -1, 9]), dtype=float)
b = np.ones(3, dtype=float)
gauss3_basic(A1, b)
gauss3_basic(A2, b)
print(A1[1:, 1])

[-1. -1.]




Below, make a copy of `gauss_elim3` called `gauss_elim_pivot3` and modify it so that it can cope with this type of problem.  Make a copy of `gauss3_basic` called `gauss3` that uses the new `gauss_elim_pivot3` function.

**Hint:** You can index with lists, not just numbers, so e.g. `A[[0, 2]]` will give you rows 0 and 2.  This makes swapping rows easy: `A[[0, 1]] = A[[1, 0]]` will swap the first two rows.
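
A minimal sketch of such a swap, using a made-up matrix `M` and vector `v`. The same swap has to be applied to the right-hand side so that the equations stay matched up.

```python
import numpy as np

M = np.array([[1., 2.], [3., 4.], [5., 6.]])  # illustrative matrix
v = np.array([10., 20., 30.])                 # illustrative right-hand side

# Swap rows 0 and 2 of the matrix and the matching entries of the vector.
M[[0, 2]] = M[[2, 0]]
v[[0, 2]] = v[[2, 0]]
print(M, v)
```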

**Hint 2:** You may find `np.argmax` useful.  It returns the index of the maximum element.  Alternatively, you can find the maximum location with `if` and `>`.
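
One detail to watch with `np.argmax` on part of a column: the index it returns counts from the start of the slice, not from the top of the matrix. The sketch below uses a made-up matrix `M` and step number `k` to show the offset.

```python
import numpy as np

M = np.array([[2., 1., 0.],
              [0., 1., 5.],
              [0., -4., 3.]])   # illustrative matrix
k = 1                            # column (and row) currently being processed

# Search only rows k and below. argmax indexes into that slice,
# so add k to get a row number of the full matrix.
pivot_row = np.argmax(np.abs(M[k:, k])) + k
print(pivot_row)
```

Taking `np.abs` first means a large negative entry such as $-4$ is preferred over a small positive one.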

<div class="alert alert-warning">
  <h3 style="margin-top: 0;">Checkpoint 2</h3>

  Use your new function to solve these two systems $A\vx = \vb$.  Check that the answer given by your function is correct.
</div>

In [8]:

def gauss_elim_pivot3(A, b):
    A = A.copy()
    b = b.copy()
    # Column 0: move the largest-magnitude entry into the pivot position.
    max_index = np.argmax(np.abs(A[0:, 0]))
    A[[0, max_index]] = A[[max_index, 0]]
    b[[0, max_index]] = b[[max_index, 0]]
    A, b = rowop(A, b, 1, 0, -A[1][0]/A[0][0])
    A, b = rowop(A, b, 2, 0, -A[2][0]/A[0][0])
    # Column 1: search rows 1 and 2 only, hence the + 1 offset.
    max_index = np.argmax(np.abs(A[1:, 1])) + 1
    A[[1, max_index]] = A[[max_index, 1]]
    b[[1, max_index]] = b[[max_index, 1]]
    A, b = rowop(A, b, 2, 1, -A[2][1]/A[1][1])
    return A, b

def gauss3(A, b):
    A = A.copy()
    b = b.copy()
    A, b = gauss_elim_pivot3(A, b)
    return back_sub3(A, b)

A1 = np.array(([0, 1, 2], [1, -1, 8], [-4, -1, 3]), dtype=float)
A2 = np.array(([2, 2, -1], [2, 2, 8], [1, -1, 9]), dtype=float)
b = np.ones(3, dtype=float)
x1 = gauss3(A1, b)
x2 = gauss3(A2, b)
sum_value_A1 = 0
sum_value_A2 = 0
print(A1)
print(x1)
print(A2)
print(x2)
for i in range(3):
    sum_value_A1 = 0
    sum_value_A2 = 0
    for j in range(3):
        sum_value_A1 += A1[i][j]*x1[j]
        sum_value_A2 += A2[i][j]*x2[j]
    print(f"A1 Row {i} * x1 = {sum_value_A1}")
    print(f"A2 Row {i} * x2 = {sum_value_A2}")

[[ 0.  1.  2.]
 [ 1. -1.  8.]
 [-4. -1.  3.]]
[[-0.22222222]
 [ 0.55555556]
 [ 0.22222222]]
[[ 2.  2. -1.]
 [ 2.  2.  8.]
 [ 1. -1.  9.]]
[[ 0.75]
 [-0.25]
 [ 0.  ]]
A1 Row 0 * x1 = [1.]
A2 Row 0 * x2 = [1.]
A1 Row 1 * x1 = [1.]
A2 Row 1 * x2 = [1.]
A1 Row 2 * x1 = [1.]
A2 Row 2 * x2 = [1.]


## General systems of $n$ linear equations
Below, make copies of your original `gauss_elim3`, `back_sub3`, and `gauss3_basic` functions (the versions without pivoting) called `gauss_elim`, `back_sub`, and `gauss_basic`, and modify them so that `gauss_basic` can solve a general system of $n$ linear equations, where $n$ can be any positive integer.  (You don't need to worry about pivoting here.)

**Hint:** you will need to use two nested `for` loops to do the elimination process, followed by another two nested `for` loops for the back substitution process.  Think about which matrix entries you would need to ‘eliminate’ (i.e. make into zeros), and the order in which you would eliminate them, if you were doing Gaussian elimination on a large matrix.  Your first set of nested `for` loops should go through these entries in the same order, performing the necessary row operation at each step to eliminate that entry.  Your second set of nested loops should perform a similar procedure for back substitution.
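
To make the elimination order concrete, this sketch only lists the (row, column) positions that would be eliminated for an illustrative size `n = 4`. It performs no row operations, so it shows the loop structure rather than a solution.

```python
n = 4  # illustrative size

# Entries to eliminate, in the order Gaussian elimination visits them:
# column by column, working down each column below the diagonal.
order = []
for col in range(n - 1):
    for row in range(col + 1, n):
        order.append((row, col))

print(order)
```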

<div class="alert alert-warning">
  <h3 style="margin-top: 0;">Checkpoint 3</h3>

  Use your code to solve $A\vx = \vb$ where $A$ is a random $10 \times 10$ matrix and $\vb$ is a vector of $1$s.  Check that the answer given by your function is correct.
</div>

In [6]:
def gauss_elim(A, b):
    A = A.copy()
    b = b.copy()
    for i in range(len(A)):
        # No pivoting here: A[i][i] is assumed to be nonzero.
        for j in range(i+1, len(A)):
            A, b = rowop(A, b, j, i, -A[j][i]/A[i][i])
    return A, b

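def back_sub(A, b):
    # np.zeros gives a float array, so this version only suits real-valued systems.
    x = np.zeros(len(A))
    # Work upwards from the last row, using the x values already found.
    for i in reversed(range(len(A))):
        x[i] = (b[i] - np.sum(A[i][i+1:]*x[i+1:]))/A[i][i]
    return x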
        
def gauss_basic(A, b):
    A = A.copy()
    b = b.copy()
    A, b = gauss_elim(A, b)
    return back_sub(A, b)

A = np.random.rand(10, 10)
b = np.ones(10, dtype=float)
x = gauss_basic(A, b)

print(A)
for i in range(10):
    sum_value_A = 0
    for j in range(10):
        sum_value_A += A[i][j]*x[j]
    print(f"A Row {i} * X = {sum_value_A}")

    

[[0.52841264 0.9518528  0.2326402  0.27005863 0.64892044 0.5540158
  0.25178432 0.76735655 0.21555475 0.30714899]
 [0.57141256 0.41495023 0.25807481 0.8245794  0.20966663 0.93242235
  0.30529906 0.86312607 0.24886879 0.33384041]
 [0.18223678 0.10089395 0.02933131 0.7647584  0.25927486 0.18883317
  0.35692532 0.37957023 0.86121296 0.12852486]
 [0.40322416 0.18926006 0.42933395 0.08277748 0.34153945 0.99139404
  0.15419263 0.72489741 0.65530812 0.97152536]
 [0.91312521 0.82777595 0.76187377 0.53189324 0.85863271 0.37764819
  0.77003766 0.58044337 0.64962717 0.57320556]
 [0.48400183 0.69608418 0.62953188 0.74280714 0.83086551 0.73947331
  0.17918879 0.60145506 0.91320514 0.36598444]
 [0.84066097 0.74414356 0.98580326 0.05303135 0.79188358 0.01116999
  0.89561754 0.86373669 0.33801734 0.03506211]
 [0.33795624 0.52584554 0.28677812 0.52473533 0.82768921 0.22217368
  0.95474193 0.07661543 0.53429984 0.17584501]
 [0.80091648 0.56770384 0.01462499 0.11738328 0.50618784 0.76432688
  0.85997169 

In [8]:
A = np.array([[3+16j, 29, -22], [-3, 3+16j, -10], [5, 11, -10+16j]])
b = np.zeros(3, dtype=complex)
x = gauss_basic(A, b)
print(x)

[0. 0. 0.]
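
The printed solution is real-valued even though `A` is complex, which is consistent with `back_sub` allocating `x` with `np.zeros(len(A))`, a float array. A possible adjustment, sketched here with illustrative names, is to choose the data type from the inputs with `np.result_type`.

```python
import numpy as np

A_c = np.array([[1 + 2j, 0], [0, 3]])   # illustrative complex matrix
b_c = np.zeros(2)                       # illustrative real right-hand side

x_real = np.zeros(len(A_c))                                   # float64: no room for imaginary parts
x_auto = np.zeros(len(A_c), dtype=np.result_type(A_c, b_c))   # complex128 for these inputs

print(x_real.dtype, x_auto.dtype)
```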


