<a href="https://colab.research.google.com/github/johanhoffman/DD2363-VT20/blob/ejemyr/Lab-3/ejemyr_lab3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Lab 3: Iterative methods**
**Christoffer Ejemyr**

# Abstract

In this lab several iterative methods were investigated. They generally succeeded, but not always. Interestingly, the initial values can matter so much that a method never converges for some of them.

# About the code

A short statement on who is the author of the file, and if the code is distributed under a certain license. 

In [1]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2019 Christoffer Ejemyr (ejemyr@kth.se)

# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

To have access to the necessary modules you have to run this cell. If you need additional modules, this is where you add them.

In [0]:
# Load necessary modules.

import time
import numpy as np
import unittest
import math
import random

from matplotlib import pyplot as plt
from matplotlib import tri
from matplotlib import axes
from mpl_toolkits.mplot3d import Axes3D

# Introduction

In this lab we solve systems of linear equations and find zeros of functions, in both cases using iterative methods.

# Methods

### Standard basis
A simple vector generator: it returns the standard basis vector $e_i$ of length $n$, a zero vector in which element $i$ is replaced with $1$.

In [0]:
def standard_basis(n: int, i: int):
    e_i = np.zeros(n)
    e_i[i] = 1.
    return e_i

### Spectral radius

We define the spectral radius of a matrix $M$ as 

$$\rho(M) = \text{max}\lbrace|\lambda_1|, |\lambda_2|, \ldots, |\lambda_n|\rbrace$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $M$.


In [0]:
def spectral_radius(M):
    if not isinstance(M, np.ndarray) or M.ndim != 2:
        raise Exception("M matrix format not recognized.")
    # The spectral radius is the largest eigenvalue magnitude.
    return np.max(np.abs(np.linalg.eigvals(M)))
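As a quick sanity check of the definition, the sketch below evaluates the spectral radius of a small triangular matrix, whose eigenvalues can be read off the diagonal. The helper name `rho_demo` and the example matrix are illustrative assumptions and not part of the lab code.

```python
import numpy as np


def rho_demo(M):
    """Largest eigenvalue magnitude of a square matrix (hypothetical helper)."""
    return np.max(np.abs(np.linalg.eigvals(M)))


# Upper triangular matrix: its eigenvalues are the diagonal entries 0.5 and -2.
M = np.array([[0.5, 3.0],
              [0.0, -2.0]])
rho = rho_demo(M)
print(rho)  # the larger magnitude of the two diagonal entries
```

Note that the off-diagonal entry $3$ does not influence the result; only the eigenvalues matter.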

### Richardson iteration

Below I define the left preconditioned Richardson iteration. Using $B = I$ (leaving the parameter `B=None`) you get the non-preconditioned Richardson iteration.

In my implementation the method returns `None` when $\rho(I - \alpha BA) \geq 1$. This is because we cannot guarantee convergence in that case. Having $\rho(I - \alpha BA) \geq 1$ does not, however, necessarily make the iteration diverge for a particular $b$ and initial guess.

I've also changed the stopping criterion to $||b - Ax|| < \text{TOL}$ instead of $||B(b - Ax)|| < \text{TOL}$, since it generally is the residual $||b - Ax||$ you want to minimize.

In [0]:
def richardson_iteration(A, b, alpha, tol=1e-6, x0=None, B=None):
    """The left preconditioned Richardson iteration."""
    
    if not isinstance(A, np.ndarray) or A.ndim != 2:
        raise Exception("A matrix format not recognized.")
    if B is None:
        B = np.eye(A.shape[0])
    if not isinstance(B, np.ndarray) or B.ndim != 2:
        raise Exception("B matrix format not recognized.")
    if A.shape[0] != A.shape[1]:
        raise Exception("Matrix not square.")
    if (x0 is not None) and x0.size != A.shape[1]:
        raise Exception("Shapes of x0 and A do not agree.")
    if A.shape[0] != B.shape[1]:
        raise Exception("Shapes of A and B do not agree.")
    if B.shape[0] != b.size:
        raise Exception("Shapes of B and b do not agree.")
    
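    # Work on a float copy so that the caller's x0 is left untouched.
    if x0 is None:
        x = np.zeros(A.shape[1])
    else:
        x = np.array(x0, dtype=float)

    # Convergence is only guaranteed when rho(I - alpha * B * A) < 1.
    if spectral_radius(np.eye(B.shape[0]) - alpha * B.dot(A)) >= 1:
        return None

    # Refresh the residual after each update so the stopping test sees the current x.
    r = b - A.dot(x)
    i = 0
    while np.linalg.norm(r) > tol:
        x += alpha * B.dot(r)
        r = b - A.dot(x)
        i += 1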

    return x, i
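To illustrate how the choice of $\alpha$ interacts with the spectral radius condition, here is a minimal standalone sketch of the non-preconditioned case. The names `richardson_demo` and `rho_demo`, the $2 \times 2$ test matrix and the values of $\alpha$ are assumptions made for this example only.

```python
import numpy as np


def richardson_demo(A, b, alpha, tol=1e-10, max_itr=10_000):
    """Unpreconditioned Richardson iteration x <- x + alpha * (b - A x)."""
    x = np.zeros(b.size)
    r = b - A @ x
    itr = 0
    while np.linalg.norm(r) > tol and itr < max_itr:
        x = x + alpha * r
        r = b - A @ x
        itr += 1
    return x, itr


def rho_demo(A, alpha):
    """Spectral radius of the iteration matrix I - alpha * A."""
    M = np.eye(A.shape[0]) - alpha * A
    return np.max(np.abs(np.linalg.eigvals(M)))


# Symmetric positive definite example with eigenvalues (5 +- sqrt(5)) / 2.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

results = {}
for alpha in (0.1, 0.4):
    x, itr = richardson_demo(A, b, alpha)
    results[alpha] = (x, itr, rho_demo(A, alpha))
    print(f"alpha={alpha}: rho={results[alpha][2]:.3f}, iterations={itr}")

# For alpha = 0.6 the spectral radius exceeds 1, so convergence is not guaranteed.
rho_bad = rho_demo(A, 0.6)
print(f"alpha=0.6: rho={rho_bad:.3f}")
```

For this matrix a smaller spectral radius goes together with fewer iterations, which is why the check on $\rho(I - \alpha BA)$ is a useful guard before starting the loop.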

### Jacobi iteration

As the lecture notes point out, the Jacobi iteration is just the left preconditioned Richardson iteration with $B = (\alpha D)^{-1}$, where $D$ is the diagonal matrix with $\text{diag}(D) = \text{diag}(A)$. Since this gives $\alpha B = D^{-1}$, the implementation below passes $\alpha = 1$ and $B = D^{-1}$.

In [0]:
def check_jacobi_convergence(A):
    if (np.diag(A) != 0).all():
        B = np.diag(1. / np.diag(A))
        return spectral_radius(np.eye(B.shape[0]) - B.dot(A)) < 1
    else:
        return False

def jacobi_iteration(A, b, tol=1e-6, x0=None):
    if check_jacobi_convergence(A):
        B = np.diag(1. / np.diag(A))
        return richardson_iteration(A, b, 1., tol=tol, x0=x0, B=B)
    else:
        return None
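The equivalence with the Richardson form can be checked numerically for a single step. The sketch below compares the classical Jacobi update $x_{new} = D^{-1}(b - (A - D)x)$ with the Richardson update using $\alpha = 1$ and $B = D^{-1}$. The matrix, right-hand side and current iterate are arbitrary values chosen for illustration.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
x = np.array([0.3, -0.2, 0.7])  # arbitrary current iterate

D = np.diag(np.diag(A))
D_inv = np.diag(1.0 / np.diag(A))

# Classical Jacobi step: solve D x_new = b - (A - D) x.
x_jacobi = D_inv @ (b - (A - D) @ x)

# Richardson step with alpha = 1 and B = D^{-1}.
x_richardson = x + D_inv @ (b - A @ x)

print(np.linalg.norm(x_jacobi - x_richardson))
```

Both expressions agree up to rounding, since $D^{-1}(b - Ax + Dx) = x + D^{-1}(b - Ax)$.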

### Gauss-Seidel iteration

As the lecture notes point out, the Gauss-Seidel iteration is just the left preconditioned Richardson iteration with $B = (\alpha L)^{-1}$, where $L$ is the lower triangular matrix created by zeroing out the elements above the diagonal in $A$. As for Jacobi, the implementation below passes $\alpha = 1$ and $B = L^{-1}$.

In [0]:
def check_gauss_seidel_convergence(A):
    if (np.diag(A) != 0).all():
        B = np.linalg.inv(np.tril(A))
        return spectral_radius(np.eye(B.shape[0]) - B.dot(A)) < 1
    else:
        return False

def gauss_seidel_iteration(A, b, tol=1e-6, x0=None):
    if check_gauss_seidel_convergence(A):
        B = np.linalg.inv(np.tril(A))
        return richardson_iteration(A, b, 1., tol=tol, x0=x0, B=B)
    else:
        return None
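To get a feeling for how the two preconditioners differ, the following sketch compares the spectral radii of the Jacobi and Gauss-Seidel iteration matrices for one small diagonally dominant, tridiagonal matrix. The helper `rho_demo` and the matrix are assumptions for this example; the comparison says nothing about matrices in general.

```python
import numpy as np


def rho_demo(M):
    """Largest eigenvalue magnitude of a square matrix (hypothetical helper)."""
    return np.max(np.abs(np.linalg.eigvals(M)))


A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
I = np.eye(3)

B_jacobi = np.diag(1.0 / np.diag(A))        # D^{-1}
B_gauss_seidel = np.linalg.inv(np.tril(A))  # L^{-1}

rho_jacobi = rho_demo(I - B_jacobi @ A)
rho_gauss_seidel = rho_demo(I - B_gauss_seidel @ A)
print(rho_jacobi, rho_gauss_seidel)
```

For this particular matrix the Gauss-Seidel radius works out to the square of the Jacobi radius, so Gauss-Seidel should need roughly half as many iterations to reach the same tolerance.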

### Newton's method

In [0]:
def get_derivative(f, x: np.array, dx_vec: np.array):
    return (f(x + dx_vec) - f(x - dx_vec)) / (2 * np.linalg.norm(dx_vec))

def jacobian(f, x0, dx: float):
    n = x0.size
    Df = np.zeros((n,n))
    for i in range(0,n):
        Df[:, i] = get_derivative(f, x0, dx * standard_basis(n, i))
    return Df

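def newtons_method(f, x0, dx: float, tol=1e-6, max_itr=1e3):
    # Iterate on a float copy so that the caller's x0 is not modified in place.
    x = np.array(x0, dtype=float)
    i = 0
    while np.linalg.norm(f(x)) >= tol:
        if i >= max_itr:
            print("Max itr")
            return None
        i += 1

        # Approximate the Jacobian at the current iterate.
        Df = jacobian(f, x, dx)
        if np.allclose(np.linalg.det(Df), 0, atol=1e-9):
            print("Singular jacobian")
            return None

        # Newton step: solve Df * delta = f(x) and move x by -delta.
        x[:] = x - np.linalg.solve(Df, f(x))

    return x, i

def scalar_newton(f, x0, dx: float, tol=1e-6, max_itr=1e4):
    # Wrap a scalar start value in a float array so the vector routine can be reused.
    if not isinstance(x0, np.ndarray):
        x0 = np.array([x0], dtype=float)

    ans = newtons_method(f, x0, dx, tol=tol, max_itr=max_itr)
    if ans is None:
        return None, 0
    else:
        return ans[0][0], ans[1]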
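The update used above is $x_{k+1} = x_k - Df(x_k)^{-1} f(x_k)$, with the Jacobian approximated by central differences. The standalone sketch below applies the same idea to a small nonlinear system. The names `fd_jacobian` and `newton_demo`, the test system and the start value are assumptions made for illustration.

```python
import numpy as np


def fd_jacobian(f, x, dx=1e-6):
    """Central-difference approximation of the Jacobian of f at x."""
    n = x.size
    J = np.zeros((n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = dx
        J[:, i] = (f(x + step) - f(x - step)) / (2 * dx)
    return J


def newton_demo(f, x0, tol=1e-10, max_itr=50):
    """Newton's method with a finite-difference Jacobian."""
    x = np.array(x0, dtype=float)
    for itr in range(max_itr):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            return x, itr
        x = x - np.linalg.solve(fd_jacobian(f, x), fx)
    raise RuntimeError("Newton's method did not converge")


def f(v):
    """Circle of radius 2 intersected with the line x = y."""
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])


root, itr = newton_demo(f, [1.0, 2.0])
print(root, itr)
```

From this start value the iteration ends up at the intersection in the first quadrant, $(\sqrt{2}, \sqrt{2})$; a different start value may lead to the other intersection or to a singular Jacobian.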


### Arnoldi iteration

I used the algorithm in the lecture notes with slight modifications. Since the algorithm had problems with division by zero, I added (with some inspiration from Wikipedia) a test `H[j + 1, j] > 1e-12` to ensure that no `nan` values occur.

In [0]:
def arnoldi_iteration(A, b, k: int):
    if not isinstance(A, np.ndarray) or A.ndim != 2:
        raise Exception("A matrix format not recognized.")
    if not isinstance(b, np.ndarray) or b.ndim != 1:
        raise Exception("b vector format not recognized.")
    if A.shape[0] != b.size:
        raise Exception("Shapes of A and b do not agree.")
    
    H = np.zeros((k + 1, k))
    Q = np.zeros((A.shape[0], k + 1))
    Q[:, 0] = b / np.linalg.norm(b)
    
    for j in range(k):
        v = A.dot(Q[:, j])
        for i in range(j + 1):
            H[i, j] = np.dot(Q[:, i].conj(), v)
            v = v - H[i, j] * Q[:, i]

        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] > 1e-12:
            Q[:, j + 1] = v / H[j + 1, j]
        else:
            break
    return Q, H
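Two properties make the output of the Arnoldi iteration useful: the columns of $Q$ are orthonormal, and $AQ_k = Q_{k+1}H$ where $Q_k$ holds the first $k$ columns. The sketch below checks both numerically with a compact re-implementation called `arnoldi_demo`; that name, the random seed and the problem size are assumptions for this example.

```python
import numpy as np


def arnoldi_demo(A, b, k):
    """Arnoldi iteration returning Q (n x (k+1)) and H ((k+1) x k)."""
    n = b.size
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        v = A @ Q[:, j]
        # Modified Gram-Schmidt against all previous basis vectors.
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ v
            v = v - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] <= 1e-12:
            break  # breakdown: the Krylov subspace is exhausted
        Q[:, j + 1] = v / H[j + 1, j]
    return Q, H


rng = np.random.default_rng(0)
A = rng.random((6, 6))
b = rng.random(6)
k = 3
Q, H = arnoldi_demo(A, b, k)

relation_error = np.linalg.norm(A @ Q[:, :k] - Q @ H)
orthogonality_error = np.linalg.norm(Q.T @ Q - np.eye(k + 1))
print(relation_error, orthogonality_error)
```

Both errors should be at the level of rounding errors as long as no breakdown occurs.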

### GMRES algorithm

Since we already wrote a least squares solver in a previous lab, I use NumPy's `numpy.linalg.lstsq` method here. Compared with the algorithm in the lecture notes, I've also added a maximum number of iterations.

In [0]:
def gmres(A, b, max_itr=None, tol=1e-6):
    if not isinstance(A, np.ndarray) or A.ndim != 2:
        raise Exception("A matrix format not recognized.")
    if not isinstance(b, np.ndarray) or b.ndim != 1:
        raise Exception("b vector format not recognized.")
    if A.shape[0] != b.size:
        raise Exception("Shapes of A and b do not agree.")

    norm_b = np.linalg.norm(b)

    y = None
    # Dummy residual that guarantees at least one pass through the loop.
    r = tol * norm_b

    k = 0
    while np.linalg.norm(r) >= tol * norm_b:
        # Rebuild the Krylov basis Q and the Hessenberg matrix H of dimension k.
        Q, H = arnoldi_iteration(A, b, k)
        # Minimise ||norm_b * e_1 - H y||, which equals the residual ||b - A Q y||.
        y = np.linalg.lstsq(H, norm_b * standard_basis(k+1, 0), rcond=None)[0]
        r = norm_b * standard_basis(k+1, 0) - H.dot(y)
        k += 1
        if max_itr is not None and k >= max_itr:
            break

    # y was computed for dimension k - 1, so only the first k - 1 basis vectors are used.
    x = Q[:, 0:k-1].dot(y)
    return x, k
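The idea can be condensed into a short standalone sketch: build the Krylov basis with Arnoldi, solve the small least squares problem $\min_y ||\,||b||e_1 - Hy||$, and stop once that residual is small relative to $||b||$. The functions `arnoldi_demo` and `gmres_demo`, the random test system and the loop starting at $k = 1$ are assumptions of this sketch and differ slightly from the implementation above.

```python
import numpy as np


def arnoldi_demo(A, b, k):
    """Arnoldi iteration returning Q (n x (k+1)) and H ((k+1) x k)."""
    n = b.size
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        v = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ v
            v = v - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] <= 1e-12:
            break
        Q[:, j + 1] = v / H[j + 1, j]
    return Q, H


def gmres_demo(A, b, tol=1e-8):
    """GMRES sketch that rebuilds the Krylov basis for each dimension k."""
    norm_b = np.linalg.norm(b)
    for k in range(1, b.size + 1):
        Q, H = arnoldi_demo(A, b, k)
        rhs = np.zeros(k + 1)
        rhs[0] = norm_b
        # Small least squares problem in the Krylov coordinates.
        y = np.linalg.lstsq(H, rhs, rcond=None)[0]
        if np.linalg.norm(rhs - H @ y) < tol * norm_b:
            break
    return Q[:, :k] @ y, k


rng = np.random.default_rng(1)
n = 5
A = rng.random((n, n)) + n * np.eye(n)
b = rng.random(n)

x, k = gmres_demo(A, b)
relative_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(k, relative_residual)
```

In exact arithmetic the Krylov subspace contains the solution after at most $n$ steps, which is why the loop in the sketch is capped at `b.size`.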

# Testing

## Iteration algorithms

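The accuracy tests for the iteration solvers are very similar, so I defined a common `test_iteration_solver` method. It generates a random matrix $A$ of size $n \times n$, with $n$ drawn at random below `max_size`, and a random vector $x$ of size $n$, and then creates $b = Ax$. Systems that are close to singular, or for which the solver returns `None`, are skipped. For the computed solution $x_{est}$ it then checks that $||Ax_{est} - b|| \approx 0$ with an absolute tolerance of $10 \cdot \text{TOL}$; the error $||x_{est} - x||$ is not checked and the `decimal` parameter is currently unused. The process is repeated until `num_of_tests` systems have been solved.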

In [0]:
def test_iteration_solver(solver, decimal=4, num_of_tests=1000, max_size=10, alpha=None):
    i = 0
    tol = 1e-6

    while i < num_of_tests:
        n = np.random.randint(1, max_size)
        A = 1000 * np.random.rand(n, n)
        x_true = np.random.rand(n)
        b = A.dot(x_true)
        
        # Skip (nearly) singular matrices; the threshold is an absolute tolerance.
        if np.allclose(np.linalg.det(A), 0, atol=1e-9):
            continue

        ans = None
        if solver == richardson_iteration:
            ans = solver(A, b, alpha, tol=tol)
        else:
            ans = solver(A, b, tol=tol)
        if ans is None:
            continue
        i += 1

        x = ans[0]
        
        np.testing.assert_allclose(
            np.linalg.norm(A.dot(x) - b),
            0,
            atol=10 * tol)

class TestIterationSolvers(unittest.TestCase):
    def test_jacobi(self):
        test_iteration_solver(jacobi_iteration)
                
    def test_gauss_seidel(self):
        test_iteration_solver(gauss_seidel_iteration)

    def test_gmres(self):
        test_iteration_solver(gmres)

## Newton's method
For the scalar Newton's method I generated polynomials with roots spaced along the $x$-axis. I used Newton's method to find a zero of each polynomial and then checked its accuracy. I checked that $|f(x)|\approx 0$ and that $|x-r|\approx 0$ for some root $r$. I calculated the tolerance for $|x-r|$ by using the derivative at the calculated root and a linear approximation. The tolerance along the $x$-axis is then given by the condition that
$$|x - r| < \frac{\text{TOL}}{|f'(x)|}$$
where $\text{TOL}$ is the tolerance along the $y$-axis.

I tested $1000$ random polynomials in a predefined (convenient) subspace of polynomials. With the parameters used in the code below this means $\text{deg} \leq 2$, $|r| \leq 10000$, and a minimal distance between neighbouring roots drawn from $[0.1, 100]$.
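The tolerance along the $x$-axis follows from the linear approximation $f(x) \approx f'(x)(x - r)$ near a root $r$. A small numerical illustration, using a hand-picked quadratic and an approximate root that are assumptions for this example only:

```python
roots = [1.0, 4.0]


def f(x):
    """Quadratic with the two known roots above."""
    return (x - roots[0]) * (x - roots[1])


def df(x):
    """Exact derivative of f."""
    return 2.0 * x - (roots[0] + roots[1])


tol = 1e-6
x = 1.0 + 2e-7  # approximate root, chosen so that |f(x)| < tol

# Tolerance along the x-axis from the linear approximation f(x) ~ f'(x) (x - r).
bound = tol / abs(df(x))
dist = min(abs(x - r) for r in roots)

print(abs(f(x)) < tol, dist, bound)
```

Here the distance to the nearest root is below the bound, which is the property the test asserts for every computed root. Close to a multiple root $f'(x)$ becomes small and the bound becomes very loose, which is one reason to keep the roots apart.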

In [0]:
def get_rand_polynomial(deg: int, root_max_abs: float, minimal_root_dist: float):
    if 2 * root_max_abs < (deg - 1) * minimal_root_dist:
        raise Exception("Interval error")

    roots = []
    low = -root_max_abs
    high = root_max_abs - (deg - 1) * minimal_root_dist
    for i in range(deg):
        roots.append(random.uniform(low, high))
        low = roots[i] + minimal_root_dist
        high += minimal_root_dist

    def f(x):
        y = 1
        for root in roots:
            y *= (x - root)
        return y

    return f, roots

class TestNewtonScalar(unittest.TestCase):
    def test_rand_polynomial(self):
        max_deg = 2
        max_dist = 100
        max_root_max = 10000
        tol = 1e-6

        for i in range(1000):
            root_dist = random.uniform(0.1, max_dist)
            deg = random.randint(1, max_deg)
            root_max = random.uniform((deg - 1) * root_dist, max_root_max)
            
            f, roots = get_rand_polynomial(deg, root_max, root_dist)

            x0 = random.uniform(-root_max, root_max)
            # The third argument is the finite-difference step dx, not the tolerance.
            root = scalar_newton(f, x0, dx=1e-6, tol=tol)[0]

            np.testing.assert_allclose(f(root), 0, atol=tol)
            is_close = False
            dfdx = get_derivative(f, np.array([root]), np.array([1e-6]))[0]
            for r in roots:
                diff = r - root
                if np.allclose(diff, 0, atol=tol/abs(dfdx)):
                    is_close = True
                    break

            assert is_close


In [17]:
if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)

....
----------------------------------------------------------------------
Ran 4 tests in 9.055s

OK


# Results

Above we can see that the iterative solvers all solve systems of linear equations. The tolerance levels are generally met but, somewhat surprisingly, not always. One possible contributing factor is that `gmres` stops on the relative residual $\text{TOL} \cdot ||b||$, while the test checks an absolute residual.

The scalar Newton method succeeds for all polynomials whose zeros are not too close together. It cannot always be guaranteed to succeed, since you can create "deadlocks", but that is more common for symmetric functions such as $\cos(x)$.


# **Discussion**

With more time for the lab it would have been interesting to investigate both the number of iterations taken by the different methods and the absolute time. GMRES works very differently from the Jacobi and Gauss-Seidel methods, so even though it needs fewer iterations, the time complexity or absolute time might be very different.
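A minimal sketch of how such a comparison could be set up is shown below. It counts iterations and measures wall-clock time with `time.perf_counter` for the Jacobi and Gauss-Seidel preconditioners on a single small matrix. The function `preconditioned_richardson_demo`, the matrix and the tolerance are assumptions for illustration; a meaningful study would need larger systems, repeated runs, and GMRES included.

```python
import time

import numpy as np


def preconditioned_richardson_demo(A, b, B, tol=1e-10, max_itr=10_000):
    """Left preconditioned Richardson iteration with alpha = 1."""
    x = np.zeros(b.size)
    r = b - A @ x
    itr = 0
    while np.linalg.norm(r) > tol and itr < max_itr:
        x = x + B @ r
        r = b - A @ x
        itr += 1
    return x, itr


A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

preconditioners = {
    "jacobi": np.diag(1.0 / np.diag(A)),
    "gauss_seidel": np.linalg.inv(np.tril(A)),
}

stats = {}
for name, B in preconditioners.items():
    start = time.perf_counter()
    x, itr = preconditioned_richardson_demo(A, b, B)
    elapsed = time.perf_counter() - start
    # Store iteration count, elapsed time and final residual norm.
    stats[name] = (itr, elapsed, np.linalg.norm(b - A @ x))
    print(f"{name}: {itr} iterations, {elapsed:.2e} s")
```

The timings for such a tiny system are dominated by overhead, so only the iteration counts are informative here.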