<a href="https://colab.research.google.com/github/johanhoffman/DD2363_VT23/blob/main/template-report-lab-X.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Lab 2: Iterative methods**
**Lovisa Strange**

# **Abstract**

In this report, some iterative algorithms are presented. First, two variants of the same algorithm to solve a system of linear equations are presented. Then, a method for solving a non-linear scalar equation is presented. Additionally, these methods are tested and the result is presented.  

#**About the code**

In [4]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2024 Lovisa Strange (lstrange@kth.se)

# This file is part of the course DD2365 Advanced Computation in Fluid Mechanics
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This template is maintained by Johan Hoffman
# Please report problems to jhoffman@kth.se

'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

The modules needed to run this code is presented here.

In [1]:
# Load neccessary modules.
from google.colab import files

import time
import numpy as np

from matplotlib import pyplot as plt
from matplotlib import tri
from matplotlib import axes
from mpl_toolkits.mplot3d import Axes3D

# **Introduction**

Iterative methods are methods that compute an approximate solution in each step. Typically, the algorithm runs until a ceratain number of iterations have been made or until an error approximation or similar is small enough. This means that we get a worse approximation if we only let the algorithm run for a few iterations. On the other hand, if the approximate solution does not need to be precise, for example if it will be used as a starting value for another algorithm, they save a lot of computing power that an exact solving method would have to use.

An iterative method has a rate of convergence $q$ if $$
lim_{k \to \inf} \frac{||x-x^{(k+1)}||}{||x-x^{(k)}||^q} = C
$$
for $C>0.$ The rate of convergence is a mesure on how quickly a method converges, so a large value of $q$ means that it converges more quickly.

It also important to be able to approximate the error of an iterative method, for example to know when to stop iterating. If the exact solution is known, we can compare the result with the known solution. However, most of the time, the exact solution is not known. Then we can use the residual, which can be computed by putting the computed solution $x^{(k)}$ into the original equation. For a system of equations, we get that $$
r^{(k)} = b - Ax^{(k)}
$$

In this report, one iterative method for nonlinear equations and two iterative methods for linear systems will be presented.

# **Method**

One iterative method that can be used for linear systems of equations is Richardson iteration (Methods in Computational Science, p.149). It is a form of fixed point iteration (Methods in Computational Science, p.148), which is a method of the form $$
x^{(k+1)} = g(x^{(k)}),
$$
which solves the equation $x = g(x).$ More specifically, we can use a fixed point iteration method for linear systems, where we have $$g(x) = Mx +c$$ where $M \in R^{n \times n}$ and $c ∈ R^n.$

For Richardson iteration, we can solve the system $Ax=b$ by setting $M=I-\alpha A$ and $c = \alpha b,$ $\alpha$ a real number. We get the iteration $$
x^{(k+1)} = (I-\alpha A)x^{(k)}+ \alpha b.
$$
Additionally, we know that Richardson iteration converges if $$
||I-\alpha A||<1.
$$
This factor also is an upper limit for how fast the method converges for a given A. So, to increase the rate of convergence, we want to make $||I-\alpha A||$ as small as possible, but still getting the same system. To do this, we can (left) precondition this system. We can use a matrix $B \approx A^{-1}$ so that  $$BAx = Bb.$$ Then, the rate of convergence is bounded by $$||I-\alpha BA||.$$   



## Jacobi iteration

One variant of preconditioned Richardson iteration is using the Jacobi preconditioner (Methods in Computational Science, p.150). Here we use the matrix $$
B = D^{-1},
$$
where $D$ is the diagonal matrix created by taking the diagonal from A, and $\alpha = 1 $.

The Jacobi iteration is guaranteed to converge if A is diagonally dominant, that is if the diagonal element on each row is at least as big as the sum of the rest of the elements on that row.

An alternative way to formulate Jacobi iteration is by matrix splitting (Methods in Computational Science, p.151). In this case, we spltit $A$ into two parts by $$
A = A_1 + A_2 = D + (A-D)
$$
where D is as above. We then get $$
M = -D^{-1}(A-D) = I-D^{-1}A
$$
and $$
c = D^{-1}b.
$$

The convergence criteria is $$
||I-D^{-1}A|| <1.
$$
Below, the code for constructing the inverse of the diagonal matrix D constructed from an arbitrary non-singular matrix A.

In [2]:
## Inverse of diagonal matrix created from matrix A

# Input: Matrix A
# Output: Inverse of diagonal of A

def diag_inverse(A):

  D_inv = np.zeros(A.shape)

  for i in range(A.shape[0]):
    for j in range(A.shape[0]):
      if i==j:
        D_inv[i,j] = 1/A[i,j]

  return D_inv



Then, the code for the Jacobi iteration can be found below.

In [32]:
## Jacobi iteration, partly based on Algorithm 7.1 (Richardson iteration), p.149, Methods in Computational Science

# Input: Matrix A, vector b
# Output: Vector x so that Ax = b

def jacobi_iteration(A,b):
  x = np.zeros(b.shape)
  tolerance = 10**(-16) # tolerance of algorithm

  D_inv = diag_inverse(A)

  BA = np.matmul(D_inv,A)
  Bb = np.matmul(D_inv,b)

  r = 1

  while np.linalg.norm(r)/np.linalg.norm(Bb) > tolerance:
    r = np.matmul(BA,x)

    r[:] = Bb[:]-r[:]

    x[:] = x[:] + r[:]

    print("||Ax-b|| = ",np.linalg.norm(np.matmul(A,x)-b))


  return x

## Gauss Seidel iteration

Another iterative method based on Richardsson iteration is Gauss Seidel iteration (Methods in Computational Science, p.151). Similarly to Jacobi iteration, this method can be seen as a version of preconditioned Richardson iteration. Here, we do an incomplete LU-factorisation, so that $$
A \approx LU.
$$
Then, we can precondition the system with $$
B = U^{-1} L^{-1}
$$
and use $\alpha = 1$ to improve the rate of convergence as before.

As before, we can also view the Gauss Seidel iteration as the splitting (Methods in Computational Science, p.152) $$
A = A_1 + A_2 = L + (A-L).
$$
Then, we get the matrix $$
M = -L^{-1}(A-L) = I-L^{-1}A
$$
and the vector

$$
c=L^{-1}b.
$$
The algorithm converges if $$
||I-L^{-1}A||.
$$

Below, the algorithm for forward substitution to solve the system $$
Lx = b
$$
to find the effect of $L^{-1}$ on a vector $b.$

In [4]:
# Forward substitution (based on algorithm from Lecture 1, slide 55)

## Input: Lower triangular matrix L, vector b
## Output: solution to Lx = b

def forward_substitution(L,b):
  n = len(b)
  x = np.zeros(n)
  x[0] = b[0]/L[0,0]
  for i in range(1,n):
    sum = 0
    for j in range(0, i-1):
      sum += L[i,j]*x[j]
    x[i] = (b[i]-sum)/L[i,i]

  return x

Here, the lower triangular part L of a matrix A is constructed. Also, the forward-substitution algorithm from above is used to find how the inverse of L acts on a vector b.

In [15]:
## Inverse of lower triangular matrix made from A

# Input: Matrix A, vector B
# Output: Inverse of L, lower triangular part of A

def lower_inverse_on_vector(A,b):

  L = np.zeros(A.shape)
  for i in range(A.shape[0]):
    for j in range(A.shape[0]):
      if j<= i:
        L[i,j] = A[i,j]
  L_inverse_b = forward_substitution(L,b)
  return L_inverse_b

Finally, the Gauss Seidel iteration algorithm is presented, using the algorithm from above.

In [31]:
## Gauss Seidel iteration, partly based on Algorithm 7.1 (Richardson iteration), p.149, Methods in Computational Science

# Input: Matrix A, vector b
# Output: Vector x so that Ax=b

def gauss_seidel_iteration(A,b):
  x = np.zeros(b.shape)
  tolerance = 10**(-16)

  BA = np.zeros(A.shape)

  for col in range(A.shape[0]):
    BA[:,col] = lower_inverse_on_vector(A,np.transpose(A)[col])

  Bb = lower_inverse_on_vector(A,b)

  r = 1

  while np.linalg.norm(r)/np.linalg.norm(Bb) > tolerance:
    r = np.matmul(BA,x)

    r[:] = Bb[:]-r[:]

    x[:] = x[:] + r[:]
    print("||Ax-b|| = ",np.linalg.norm(np.matmul(A,x)-b))

  return x

## Newton's method for scalar nonlinear equation

To solve a scalar non-linear equation $$
f(x) = 0,
$$
one methods can be used is Newton's method. (Methods in Computational Science, p. 174)

Newtons method is a fixed point method. A general fixed point iteration can be written as  $$
x^{(k+1)} = g(x^{(k)}) = x^{(k)} + \alpha f(x^{(k+1)}).
$$
To get Newtons method, we chose $$
f(x) = - 1/f'(x),
$$
which leads to a quadratic rate of convergence.

The method can be viewed as creating the tangent in a point of the function, and using that point as the new approximation.

Below, a non-linear function as well as Newtons method is presented.

In [29]:
## Newtons method, based on Algorithm 8.2, p.174, Methods in Computational Science

# Input: Function f, starting guess x0
# Output: Approximate root x

def newtons_method(f,x0):
  x = x0
  tolerance = 10**(-16)

  while abs(f(x,"func")) > tolerance: # f(x,"func") gives f(x)

    x = x - f(x,"func")/f(x,"derivative") # f(x,"derivative") gives f'(x)
    print("|f(x)|= ", f(x,"func"))


  return x


# **Results**
In this section, the results from the algorithms presented in the last section are shown.

## Jacobi iteration

To test the convergence of Jacobi iteration, we create diagonally dominant matrix $$
A= \begin{bmatrix}
5&0&-1\\-2&10&1\\0&3&8
\end{bmatrix}
$$
and a vector
$$b = \begin{bmatrix}
1\\1\\0
\end{bmatrix}$$
By printing the residual $$
||Ax-b||
$$in each iteration, we can se that it converges to 0, until the tolerance in the algorithm is reached.

By constructing an exact solution $y$, we can also see if the norm of the resulting vector tends to 0.

In [33]:
A = np.array([[5,0, -1],[-2,10,1],[0,3,8]]) # diagonally dominant
b = np.array([1,1,0])

x = jacobi_iteration(A,b)

print()

print("x = ", x)
exact_solution = np.array([74/391,56/391,-21/391])

print("|x-y| = ", np.linalg.norm(x-exact_solution))

||Ax-b|| =  0.5
||Ax-b|| =  0.1311964176340193
||Ax-b|| =  0.01875000000000009
||Ax-b|| =  0.004804172990744791
||Ax-b|| =  0.00148850084501993
||Ax-b|| =  0.00029635696990848405
||Ax-b|| =  7.349674513485788e-05
||Ax-b|| =  1.5855122835260745e-05
||Ax-b|| =  4.7670132223313976e-06
||Ax-b|| =  1.463960843238776e-06
||Ax-b|| =  2.867704562348136e-07
||Ax-b|| =  1.01327219420185e-07
||Ax-b|| =  3.1087240179759763e-08
||Ax-b|| =  7.72849394225748e-09
||Ax-b|| =  2.5517653426723526e-09
||Ax-b|| =  7.368803402755469e-10
||Ax-b|| =  2.1018753053018427e-10
||Ax-b|| =  6.54915453505514e-11
||Ax-b|| =  1.8807016092336133e-11
||Ax-b|| =  5.594787020055707e-12
||Ax-b|| =  1.68660737728474e-12
||Ax-b|| =  4.915859068392439e-13
||Ax-b|| =  1.4705430982842269e-13
||Ax-b|| =  4.374799983577502e-14
||Ax-b|| =  1.2666277535224123e-14
||Ax-b|| =  4.010656666373001e-15
||Ax-b|| =  9.104505742017336e-16
||Ax-b|| =  4.871083751574258e-16
||Ax-b|| =  2.220446049250313e-16
||Ax-b|| =  2.220446049250313e-16



By loking at the numerical result, we can see that the residual goes to 0. We can also see that the difference between the exact and numerical solution is smaller than the tolerance level of the algorithm, and is very close to 0.

## Gauss Seidel iteration

To test the convergence of Jacobi iteration, we create diagonally dominant matrix $$
A= \begin{bmatrix}
6&0&1\\3&50&1\\0&2&3
\end{bmatrix}
$$
and a vector
$$b = \begin{bmatrix}
1\\2\\0
\end{bmatrix}.$$

Again, we construct the residual $$
||Ax-b||
$$
and seeing if this converges to 0, and also look at the difference between the exact solution and the computed one.

In [34]:
A = np.array([[6,0, 1],[3,5,1],[0,2,3]]) # diagonally dominant
b = np.array([1,2,0])

x = gauss_seidel_iteration(A,b)

print()

print("x = ", x)
exact_solution = np.array([17/84,9/28,-3/14])

print("|x-y| = ", np.linalg.norm(x-exact_solution))



||Ax-b|| =  0.9433981132056605
||Ax-b|| =  0.42687494916218977
||Ax-b|| =  0.2362672686222582
||Ax-b|| =  0.11139962541772676
||Ax-b|| =  0.05870032704390762
||Ax-b|| =  0.030124752046631627
||Ax-b|| =  0.015062842298417211
||Ax-b|| =  0.007893255230152725
||Ax-b|| =  0.004011512893632189
||Ax-b|| =  0.0020546469088300453
||Ax-b|| =  0.0010603179739570208
||Ax-b|| =  0.0005412314635050782
||Ax-b|| =  0.00027833052232868803
||Ax-b|| =  0.00014284414506531882
||Ax-b|| =  7.318967147645076e-05
||Ax-b|| =  3.760060339760802e-05
||Ax-b|| =  1.9281475952110146e-05
||Ax-b|| =  9.892693386252103e-06
||Ax-b|| =  5.0775574335642285e-06
||Ax-b|| =  2.604454790005896e-06
||Ax-b|| =  1.3365201758124283e-06
||Ax-b|| =  6.857643396087107e-07
||Ax-b|| =  3.5183295668104024e-07
||Ax-b|| =  1.8053657887626318e-07
||Ax-b|| =  9.262868218433063e-08
||Ax-b|| =  4.752707347781646e-08
||Ax-b|| =  2.438626251892556e-08
||Ax-b|| =  1.2512188441740704e-08
||Ax-b|| =  6.4199730121574435e-09
||Ax-b|| =  3.2940424

Again, we can see that the residual goes to 0. Additionally, the difference between the exact and numerical solution is very small.

## Newton's method for scalar nonlinear equation

We have the non-linear function $$
f(x) = x^2+x-12
$$
and similarly to the two previous cases, we look at the residual $$
|f(x)|
$$
which should go to 0.

Additionally, we construct an exact solution $y $ and compute $|x-y|$

In [22]:
def f(x,type):
  if type == "derivative":
    return 2*x +1
  else:
    return x**2+x-12

In [35]:
x = newtons_method(f,2)

exact_solution = 3
print("|x-y| = ", x-exact_solution)


|f(x)|=  1.4400000000000013
|f(x)|=  0.03786705624543352
|f(x)|=  2.917336959562533e-05
|f(x)|=  1.737099353249505e-11
|f(x)|=  0.0
|x-y| =  0.0


We can see that the residual goes to 0, as well as that the difference between $x$ and $y$ is very small.

# **Discussion**

All three of the methods behaved as expected. They all converged to the exact solution. Also, depending on the tolerance set, the computed result was closer or further away from the exact solution. We can also see that Newton's method converged quicker than the other methods, which is expected since the method has a quadratic rate of convergence.