# **Lab 3: Iterative methods**
**Fabián Levicán**

# **Abstract**

This is the third lab in the course DD2363 Methods in Scientific Computing. It uses Jupyter to implement four iterative methods for solving linear and non-linear equations. Possible objectives are to become familiar with the differences between "discrete" and "continuous" methods, and between direct and iterative methods. The functions implemented are jacobi, gaussSeidel, newtonScalar, and newtonVector. The approximate solutions they return are then compared to known solutions of a few equations, and the results are favorable.

# **About the code**

In [1]:
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2020 Fabián Levicán (fils2@kth.se)

# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

In [0]:
# Load necessary modules.
from google.colab import files

import math
import time
import numpy as np

# **Introduction**

Iterative methods are (in general) algorithms that, given an equation and possibly an initial solution and a number of iterations, return an approximation of the solution to the equation. They are called iterative because the $k$-th step uses the solution obtained in the $(k-1)$-th step, and the solutions (ideally) form a sequence that is convergent to the exact solution.
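As a minimal illustration of a sequence of iterates converging to a solution, the sketch below applies the fixed-point iteration $x_{k+1} = \cos(x_k)$. The helper `fixed_point` and its parameters `tol` and `max_its` are hypothetical names that are not part of the lab code, and the stopping rule (difference between successive iterates) is an assumption chosen for brevity.

```python
import math


def fixed_point(g, x0, tol=1e-10, max_its=1000):
    """Iterate x_{k+1} = g(x_k) until two successive iterates differ by less than tol.

    Hypothetical helper for illustration only; returns (x, iterations used),
    or (None, max_its) if the tolerance is never reached.
    """
    x = x0
    for k in range(max_its):
        x_next = g(x)  # the k-th step only uses the previous iterate
        if abs(x_next - x) < tol:
            return x_next, k + 1
        x = x_next
    return None, max_its


# Solve x = cos(x), starting from 0
root, its = fixed_point(math.cos, 0.0)
print(root, its)
```

Each step only needs the previous iterate, which is the defining feature described above.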

The following are implementations of the iterative methods mentioned in the abstract. Linear equations of the form $Ax = b$, and non-linear equations of the form $f(x) = 0$, are henceforth assumed. The author decided to implement newtonScalar and newtonVector separately, although he is aware that one is a special case of the other.



# **Methods**

The method jacobi takes a matrix $A$ and a vector $b$ as input. It returns None if the matrix is not square, if the matrix has a zero entry in the diagonal, if the residual $\|Ax - b\|$ becomes too large, or if the maximum number of iterations is reached while the residual is still too large. The author decided not to use more sophisticated criteria for convergence, as he couldn't find a condition that is both necessary and sufficient (only conditions that are one or the other). Instead, the method halts as described above. The method returns an approximate solution $x$ if the residual is small enough.

Two copies of the solution are stored and used in every iteration. Every entry of the most recent copy at every iteration is updated according to the formula $x_i^{(k + 1)} = \frac{1}{a_{ii}}\left (b_i - \sum_{j \neq i} a_{ij}x_j^{(k)}\right )$, which follows from the Jacobi iteration $x^{(k + 1)} = D^{- 1}(b - Rx^{(k)})$, where $D$ is the diagonal of $A$ and $R$ is the matrix containing the rest of the entries of $A$ (and $0$s elsewhere).

The initial solution is assumed to be the zero vector.
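The matrix form of the iteration, $x^{(k + 1)} = D^{-1}(b - Rx^{(k)})$, can also be written directly with array operations. The following is only a sketch under that reading of the formula: `jacobi_step` is a hypothetical helper, not the lab's jacobi function, and the fixed count of 100 steps is an arbitrary choice for the example.

```python
import numpy as np


def jacobi_step(A, b, x):
    """One Jacobi update x_new = D^{-1} (b - R x), using array operations."""
    d = np.diag(A)        # diagonal entries a_ii as a 1-D array
    R = A - np.diag(d)    # off-diagonal part of A (zeros on the diagonal)
    return (b - R @ x) / d


# A system for which the iteration converges
A = np.array([[2.0, 1.0], [5.0, 7.0]])
b = np.array([11.0, 13.0])

x = np.zeros(2)           # zero initial solution, as in the text
for _ in range(100):
    x = jacobi_step(A, b, x)
print(x)
```

Because `jacobi_step` builds a new array from the old one, the two copies of the solution mentioned above appear here as the argument `x` and the returned value.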

In [0]:
def jacobi(A, b):
  # Initialize variables
  # The initial solution is assumed to be the zero vector
  # The algorithm halts if the residual lies outside the interval [epsilon, maxResidual]
  # or if its is greater than maxIts = 10000
  n = A.shape[0]
  x1 = np.zeros(n)
  x2 = np.zeros(n)
  residual = 1
  epsilon = 0.0001
  maxResidual = 10000
  maxIts = 10000
  its = 0

  # Basic error handling
  if(A.shape[0] != A.shape[1]):
    print("Error: The matrix is not square.")
    return None
  flag = False
  for i in range(n):
    if(A[i][i] == 0):
      flag = True
  if(flag):
    print("Error: The matrix has a zero entry in the diagonal.")
    return None

  while(residual >= epsilon and residual <= maxResidual and its <= maxIts):
    # Update x2 (i. e., x_(k+1))
    for i in range(n):
      sum = 0
      for j in range(n):
        if(j != i):
          sum += A[i][j]*x1[j]
      x2[i] = 1.0/A[i][i]*(b[i] - sum)

    # Update x1 (i. e., x_k)
    for i in range(n):
      x1[i] = x2[i]

    # Calculate the residual
    residual = np.linalg.norm((A @ x2) - b)

    # Update its
    its += 1

  # Return the approximate solution
  if(np.linalg.norm((A @ x2) - b) < epsilon):
    return x2
  else:
    return None

The method gaussSeidel is almost identical to the method jacobi, except that only one copy of the solution is stored and used. This is because every entry at every iteration is updated according to the formula $x_i^{(k + 1)} = \frac{1}{a_{ii}}\left (b_i - \sum_{j < i} a_{ij}x_j^{(k + 1)} - \sum_{j > i} a_{ij}x_j^{(k)}\right )$, which follows from using forward substitution on the Gauss-Seidel iteration $Lx^{(k + 1)} = b - Ux^{(k)}$. Here $L$ is the lower triangular component of $A$ (including the diagonal) and $U$ is the strictly upper triangular component of $A$. If the entries of the solution are updated in increasing order, the formula never needs an entry of the previous solution that has already been overwritten: it only uses entries of the current solution ($j < i$) and entries of the previous solution that haven't been updated yet ($j > i$).
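To make the link between the matrix form $Lx^{(k + 1)} = b - Ux^{(k)}$ and the entrywise formula concrete, the sketch below computes one step both ways and checks that they agree. Both helpers are hypothetical and separate from the lab's gaussSeidel function. For brevity the triangular system is solved with the general-purpose `np.linalg.solve` rather than a dedicated forward substitution, which is an assumption of this example.

```python
import numpy as np


def gauss_seidel_step_matrix(A, b, x):
    """One step via L x_new = b - U x, with L the lower triangle including the diagonal."""
    L = np.tril(A)         # lower triangular part, diagonal included
    U = np.triu(A, k=1)    # strictly upper triangular part
    return np.linalg.solve(L, b - U @ x)


def gauss_seidel_step_sweep(A, b, x):
    """One step via the entrywise formula, updating a single copy in increasing order."""
    x = x.copy()           # copy only so the caller's array is left untouched
    n = len(b)
    for i in range(n):
        s = sum(A[i, j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i, i]
    return x


A = np.array([[10.0, -1.0, 2.0, 0.0],
              [-1.0, 11.0, -1.0, 3.0],
              [2.0, -1.0, 10.0, -1.0],
              [0.0, 3.0, -1.0, 8.0]])
b = np.array([6.0, 25.0, -11.0, 15.0])

x0 = np.zeros(4)
print(gauss_seidel_step_matrix(A, b, x0))
print(gauss_seidel_step_sweep(A, b, x0))
```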

In [0]:
def gaussSeidel(A, b):
  # Initialize variables
  # The initial solution is assumed to be the zero vector
  # The algorithm halts if the residual lies outside the interval [epsilon, maxResidual]
  # or if its is greater than maxIts = 10000
  n = A.shape[0]
  x = np.zeros(n)
  residual = 1
  epsilon = 0.0001
  maxResidual = 10000
  maxIts = 10000
  its = 0

  # Basic error handling
  if(A.shape[0] != A.shape[1]):
    print("Error: The matrix is not square.")
    return None
  flag = False
  for i in range(n):
    if(A[i][i] == 0):
      flag = True
  if(flag):
    print("Error: The matrix has a zero entry in the diagonal.")
    return None

  while(residual >= epsilon and residual <= maxResidual and its <= maxIts):
    # Update x
    for i in range(n):
      sum = 0
      for j in range(n):
        if(j != i):
          sum += A[i][j]*x[j]
      x[i] = 1.0/A[i][i]*(b[i] - sum)
    
    # Calculate the residual
    residual = np.linalg.norm((A @ x) - b)

    # Update its
    its += 1
  
  # Return the approximate solution
  if(np.linalg.norm((A @ x) - b) < epsilon):
    return x
  else:
    return None

The method newtonScalar takes a function $f$ and its first derivative (which is also a function) as inputs and returns None if the derivative at some iteration is $0$ and the evaluation $|f(x)|$ is still too large, or if the method doesn't converge after a certain number of iterations and the evaluation is still too large. The method returns an approximate solution $x$ if the evaluation is small enough.

At every iteration, the solution is updated according to the formula $x^{(k + 1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}$.

The initial solution is assumed to be zero.
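A short worked example of the update formula: the sketch below records the Newton iterates for $f(x) = x^2 - 2$. The helper `newton_iterates` and its starting value `x0` are hypothetical and not part of the lab code. A nonzero start is assumed here because $f'(0) = 0$ for this function, so the zero initial solution used by newtonScalar would not work.

```python
import math


def newton_iterates(func, derivative, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - func(x) / derivative(x))
    return xs


# f(x) = x^2 - 2 has the positive root sqrt(2)
xs = newton_iterates(lambda x: x**2 - 2.0, lambda x: 2.0 * x, 1.0, 5)
errors = [abs(x - math.sqrt(2.0)) for x in xs]
for k, (x, e) in enumerate(zip(xs, errors)):
    print(k, x, e)
```

The printed errors shrink very quickly once the iterates are close to the root, which is the behaviour usually expected from Newton's method near a simple root.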

In [0]:
def newtonScalar(func, derivative):
  # Initialize variables
  # The initial solution is assumed to be zero
  # The algorithm halts if the absolute value of func(x) is less than epsilon = 0.0001
  # or if its is greater than maxIts = 10000
  x = 0
  absfx = 1
  epsilon = 0.0001
  maxIts = 10000
  its = 0

  while(absfx >= epsilon and its <= maxIts):
    # Update iteration variables
    funcx = func(x)
    derivativex = derivative(x)

    # Check if the derivative at x is zero
    if(derivativex == 0):
      print("Error: The derivative at iteration " + str(its) + " is zero, and thus the algorithm can't continue.")
      break
    
    # Update x
    x -= funcx/derivativex

    # Update absfx (this is |f| at the iterate used in this step, before the update above)
    absfx = abs(funcx)

    # Update its
    its += 1
  
  # Return the approximate root
  if(abs(func(x)) < epsilon):
    return x
  else:
    return None

The method newtonVector is almost identical to the method newtonScalar, except that the function $f$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^n$, and the derivative is instead its Jacobian matrix $J$, which is a map from $\mathbb{R}^n$ to $\mathbb{R}^{n \times n}$. Both are passed as ordinary Python functions that take a NumPy array, and the dimension $n$ is passed as a third argument. Naturally, the evaluation is $\|f(x)\|$. The solution is updated according to the formula $x^{(k + 1)} = x^{(k)} - J^{-1}(x^{(k)})f(x^{(k)})$.
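The Newton update for vector functions does not require forming $J^{-1}$ explicitly: one can solve the linear system $J(x^{(k)})s = f(x^{(k)})$ and subtract $s$. The sketch below shows that variant as an alternative to an explicit inverse. `newton_step`, `f` and `J` are hypothetical names for this example, and the fixed count of 20 steps is an arbitrary choice.

```python
import numpy as np


def newton_step(func, jacobian, x):
    """Return x - J(x)^{-1} f(x) by solving J(x) s = f(x) instead of inverting J(x)."""
    s = np.linalg.solve(jacobian(x), func(x))
    return x - s


def f(v):
    # Circle (x - 2)^2 + (y - 8)^2 = 40 intersected with the line -2x + y + 6 = 0
    return np.array([(v[0] - 2.0)**2 + (v[1] - 8.0)**2 - 40.0,
                     -2.0 * v[0] + v[1] + 6.0])


def J(v):
    # Jacobian matrix of f
    return np.array([[2.0 * (v[0] - 2.0), 2.0 * (v[1] - 8.0)],
                     [-2.0, 1.0]])


x = np.zeros(2)   # zero initial solution, as in the lab code
for _ in range(20):
    x = newton_step(f, J, x)
print(x)
```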

In [0]:
def newtonVector(func, JacobianMatrix, n):
  # Initialize variables
  # The initial solution is assumed to be the zero vector
  # The algorithm halts if the norm of func(x) is less than epsilon = 0.0001
  # or if its is greater than maxIts = 10000
  x = np.zeros(n)
  normfx = 1
  epsilon = 0.0001
  maxIts = 10000
  its = 0

  while(normfx >= epsilon and its <= maxIts):
    # Update iteration variables
    funcx = func(x)
    JacobianMatrixx = JacobianMatrix(x)

    # Check if the Jacobian matrix at x is not invertible
    # (exact comparison with zero, so nearly singular matrices are not detected)
    if(np.linalg.det(JacobianMatrixx) == 0):
      print("Error: The Jacobian matrix at iteration " + str(its) + " is not invertible, and thus the algorithm can't continue.")
      break

    # Update x
    x -= np.linalg.inv(JacobianMatrixx) @ funcx

    # Update normfx (this is the norm of f at the iterate used in this step, before the update above)
    normfx = np.linalg.norm(funcx)

    # Update its
    its += 1

  # Return the approximate root
  if(np.linalg.norm(func(x)) < epsilon):
    return x
  else:
    return None

# **Results**

The methods jacobi and gaussSeidel are tested with various systems of linear equations. Both methods should print an error and return None for the first system, which has a zero entry in the diagonal. The methods should diverge (and thus return None) for the second and third systems, although only jacobi is asserted for the third, and both methods should converge for the final three systems. When the methods converge, the quantity $\|x - y\|$, where $x$ is the returned vector and $y$ is a vector known to be close enough to the solution, is asserted to be less than epsilon.

In [7]:
# This matrix has a zero entry in the diagonal
A1 = np.array([[-1.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, -1.0]])
b1 = np.array([-1.0, 1.0, -1.0])

# The methods should diverge for this matrix
A2 = np.array([[-2, 3, -5, 7, -11], [13, -17, 19, -23, 29], [-31, 37, -41, 43, -47], [53, -59, 61, -67, 71], [-73, 79, -83, 89, -97]])
b2 = np.array([42, 42, 42, 42, 42])

# The methods should diverge for this matrix (from Wikipedia)
A3 = np.array([[2.0, 3.0], [5.0, 7.0]])
b3 = np.array([11.0, 13.0])

# The methods should converge for this matrix (from Wikipedia)
A4 = np.array([[2.0, 1.0], [5.0, 7.0]])
b4 = np.array([11.0, 13.0])
y4 = np.array([7.1111, -3.2222])

# The methods should converge for this matrix (from Wikipedia)
A5 = np.array([[16.0, 3.0], [7.0, -11.0]])
b5 = np.array([11.0, 13.0])
y5 = np.array([0.8122, -0.6650])

# The methods should converge for this matrix (from Wikipedia)
A6 = np.array([[10.0, -1.0, 2.0, 0.0], [-1.0, 11.0, -1.0, 3.0], [2.0, -1.0, 10.0, -1.0], [0.0, 3.0, -1.0, 8.0]])
b6 = np.array([6.0, 25.0, -11.0, 15.0])
y6 = np.array([1.0, 2.0, -1.0, 1.0])

epsilon = 0.0001
assert jacobi(A1, b1) is None
assert gaussSeidel(A1, b1) is None
assert jacobi(A2, b2) is None
assert gaussSeidel(A2, b2) is None
assert jacobi(A3, b3) is None
assert np.linalg.norm(jacobi(A4, b4) - y4) < epsilon
assert np.linalg.norm(gaussSeidel(A4, b4) - y4) < epsilon
assert np.linalg.norm(jacobi(A5, b5) - y5) < epsilon
assert np.linalg.norm(gaussSeidel(A5, b5) - y5) < epsilon
assert np.linalg.norm(jacobi(A6, b6) - y6) < epsilon
assert np.linalg.norm(gaussSeidel(A6, b6) - y6) < epsilon
print("Tests passed successfully!")

Error: The matrix has a zero entry in the diagonal.
Error: The matrix has a zero entry in the diagonal.
Tests passed successfully!
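One well-known sufficient (but not necessary) condition for the convergence of both methods is strict diagonal dominance of $A$. The sketch below checks it for the test matrices above. `is_strictly_diagonally_dominant` is a hypothetical helper that is not used by jacobi or gaussSeidel, and the matrices are repeated here only so that the example is self-contained.

```python
import numpy as np


def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum over j != i of |a_ij| for every row i."""
    abs_A = np.abs(A)
    diag = np.diag(abs_A)
    off_diag = abs_A.sum(axis=1) - diag   # row sums without the diagonal entry
    return bool(np.all(diag > off_diag))


A2 = np.array([[-2, 3, -5, 7, -11], [13, -17, 19, -23, 29], [-31, 37, -41, 43, -47],
               [53, -59, 61, -67, 71], [-73, 79, -83, 89, -97]])
A3 = np.array([[2.0, 3.0], [5.0, 7.0]])
A4 = np.array([[2.0, 1.0], [5.0, 7.0]])
A5 = np.array([[16.0, 3.0], [7.0, -11.0]])
A6 = np.array([[10.0, -1.0, 2.0, 0.0], [-1.0, 11.0, -1.0, 3.0],
               [2.0, -1.0, 10.0, -1.0], [0.0, 3.0, -1.0, 8.0]])

for name, A in [("A2", A2), ("A3", A3), ("A4", A4), ("A5", A5), ("A6", A6)]:
    print(name, is_strictly_diagonally_dominant(A))
```

The three systems for which the methods converge satisfy the condition, while the two diverging ones do not. Since the condition is only sufficient, a matrix that fails it may still lead to convergence.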


The method newtonScalar is tested with:
1. $f(x) = x^2 + 1$, $f'(x) = 2x$. As $f'(x_0) = 0$, the method should print an error, and return None.
2. $f(x) = \cos(x)$, $f'(x) = -\sin(x)$. Again $f'(x_0) = 0$, so the method should print an error, and return None.
3. $f(x) = \sin(x)$, $f'(x) = \cos(x)$. As $f(x_0) = 0$, the method should return $0$ after iteration $0$.
4. $f(x) = x^3 + x$, $f'(x) = 3x^2 + 1$. As $f(x_0) = 0$, the method should return $0$ after iteration $0$.
5. $f(x) = x - \pi$, $f'(x) = 1$. The method should converge to $\pi$.
6. $f(x) = \cos(x) - x$, $f'(x) = -\sin(x) - 1$. The method should converge to the fixed point of the cosine function (approximately $0.7391$).

When the method converges (i. e., when it doesn't return None), the quantity $|x - y|$, where $x$ is the returned value and $y$ is a value known to be close enough to the solution, is asserted to be less than epsilon.

In [8]:
epsilon = 0.0001
assert newtonScalar(lambda x: x**2 + 1, lambda x: 2*x) is None
assert newtonScalar(math.cos, lambda x: -math.sin(x)) is None
assert abs(newtonScalar(math.sin, math.cos) - 0.0) < epsilon
assert abs(newtonScalar(lambda x: x**3 + x, lambda x: 3*x**2 + 1) - 0.0) < epsilon
assert abs(newtonScalar(lambda x: x - math.pi, lambda x: 1) - math.pi) < epsilon
assert abs(newtonScalar(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1) - 0.7391) < epsilon
print("Tests passed successfully!")

Error: The derivative at iteration 0 is zero, and thus the algorithm can't continue.
Error: The derivative at iteration 0 is zero, and thus the algorithm can't continue.
Tests passed successfully!


The method newtonVector is tested with:
1.  $f(x, y, z) = (\cos(x), x + y + z, x + y + z)$, $J(x, y, z) = \{\{-\sin(x), 0, 0\}, \{1, 1, 1\}, \{1, 1, 1\}\}$. As $-\sin(0) = 0$ (and the last two rows are equal), the Jacobian matrix is not invertible at the initial solution, so the method should print an error, and return None.
2.  $f(x, y) = (2x - 3y + 5, 4x - 7y + 10)$, $J(x, y) = \{\{2, -3\}, \{4, -7\}\}$. The method should converge to $(-2.5, 0)$.
3.  $f(x, y) = ((x - 2)^2 + (y - 8)^2 - 40, -2x + y + 6)$, $J(x, y) = \{\{2(x - 2), 2(y - 8)\}, \{-2, 1\}\}$. The method should converge to $(4, 2)$.
4.  $f(x, y) = (e^x - 1, \cos(y) - 1)$, $J(x, y) = \{\{e^x, 0\}, \{0, -\sin(y)\}\}$. As $\sin(0) = 0$, the method should print an error, but because $f(0, 0) = (0, 0)$, the method should return $(0, 0)$ after iteration $0$.

When the method converges (i. e., when it doesn't return None), the quantity $\|x - y\|$, where $x$ is the returned vector and $y$ is a vector known to be close enough to the solution, is asserted to be less than epsilon.
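Since newtonVector relies on a hand-written Jacobian matrix, it can be useful to compare that matrix with a finite-difference approximation before running the tests. The following is a sketch of such a check for the third system. `finite_difference_jacobian`, the step size `h`, and the evaluation point are assumptions made for this example and do not appear in the lab code.

```python
import numpy as np


def finite_difference_jacobian(func, x, h=1e-6):
    """Approximate the Jacobian of func at x with central differences."""
    x = np.asarray(x, dtype=float)
    fx = func(x)
    jac = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        # Column j holds the partial derivatives with respect to x_j
        jac[:, j] = (func(x + step) - func(x - step)) / (2.0 * h)
    return jac


def f(v):
    return np.array([(v[0] - 2.0)**2 + (v[1] - 8.0)**2 - 40.0,
                     -2.0 * v[0] + v[1] + 6.0])


def J(v):
    return np.array([[2.0 * (v[0] - 2.0), 2.0 * (v[1] - 8.0)],
                     [-2.0, 1.0]])


point = np.array([1.0, 3.0])   # arbitrary evaluation point
approx = finite_difference_jacobian(f, point)
print(np.max(np.abs(approx - J(point))))
```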

In [10]:
epsilon = 0.0001
assert newtonVector(lambda x: np.array([math.cos(x[0]), x[0] + x[1] + x[2], x[0] + x[1] + x[2]]), lambda x: np.array([[-math.sin(x[0]), 0, 0], [1, 1, 1], [1, 1, 1]]), 3) is None
assert np.linalg.norm(newtonVector(lambda x: np.array([2.0*x[0] - 3.0*x[1] + 5.0, 4.0*x[0] - 7.0*x[1] + 10.0]), lambda x: np.array([[2.0, -3.0], [4.0, -7.0]]), 2) - np.array([-2.5, 0])) < epsilon
# (from lumenlearning.com)
assert np.linalg.norm(newtonVector(lambda x: np.array([(x[0] - 2)**2 + (x[1] - 8)**2 - 40, -2*x[0] + x[1] +6]), lambda x: np.array([[2*(x[0] - 2), 2*(x[1] - 8)], [-2, 1]]), 2) - np.array([4.0, 2.0])) < epsilon
assert np.linalg.norm(newtonVector(lambda x: np.array([math.exp(x[0]) - 1.0, math.cos(x[1]) - 1]), lambda x: np.array([[math.exp(x[0]), 0.0], [0.0, -math.sin(x[1])]]), 2) - np.array([0.0, 0.0])) < epsilon
print("Tests passed successfully!")

Error: The Jacobian matrix at iteration 0 is not invertible, and thus the algorithm can't continue.
Error: The Jacobian matrix at iteration 0 is not invertible, and thus the algorithm can't continue.
Tests passed successfully!


All the tests pass successfully.

# **Discussion**

The author thinks it is surprising that the Gauss-Seidel method is in a very concrete way simpler than the Jacobi method (it stores only one copy of the solution), and yet it is theoretically faster. The author also thinks the treatment of lambda functions in Python is extremely intuitive and useful. The results were favorable and as expected.

The algorithms presented here could be improved by adding more sophisticated criteria for convergence, such as checking the spectral radius of the iteration matrix for the linear methods, and the norm of the derivative or Jacobian matrix for the non-linear methods.
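As a sketch of the first suggestion, the spectral radius of the Jacobi iteration matrix $-D^{-1}R$ can be computed with NumPy. The iteration converges for every initial solution exactly when this value is less than $1$. `jacobi_spectral_radius` is a hypothetical helper that is not part of the lab code, and the two matrices are the diverging and converging $2 \times 2$ systems from the tests.

```python
import numpy as np


def jacobi_spectral_radius(A):
    """Spectral radius of the Jacobi iteration matrix -D^{-1} R."""
    d = np.diag(A)
    R = A - np.diag(d)            # off-diagonal part of A
    M = -R / d[:, None]           # row i of R divided by a_ii
    return float(np.max(np.abs(np.linalg.eigvals(M))))


A3 = np.array([[2.0, 3.0], [5.0, 7.0]])   # jacobi returns None for this system
A4 = np.array([[2.0, 1.0], [5.0, 7.0]])   # jacobi converges for this system

print(jacobi_spectral_radius(A3))
print(jacobi_spectral_radius(A4))
```

Computing eigenvalues costs more than a few iterations for small systems, so such a check is mainly useful as a diagnostic rather than as part of the solver itself.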

The class notes were extensively consulted. The Jacobi method and Gauss-Seidel method articles in the English Wikipedia were used while writing this document. [This](https://www.lakeheadu.ca/sites/default/files/uploads/77/docs/RemaniFinal.pdf) paper was also used for the part on the Newton method for vector functions. One system of non-linear equations was obtained from a webpage on [this](http://lumenlearning.com) website.

The author collaborated with Pablo Aravena and Felipe Vicencio.