<a href="https://colab.research.google.com/github/johanhoffman/DD2363_VT23/blob/main/Lab2/reinisfreibergs_lab2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 1: iterative methods 
**Reinis Freibergs**

# **Abstract**


The objective of this lab is to implement and test four iterative methods: the Jacobi iteration, the Gauss-Seidel iteration, Newton's method for a scalar nonlinear equation, and Newton's method for a system of nonlinear equations.

The linear solvers were tested on random matrices with a varying share of non-zero off-diagonal elements, and the Newton solvers on a quadratic polynomial and a two-equation system. The methods returned the expected outputs whenever they converged.


# **About the code**

In [138]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""


# Author: Reinis Freibergs, 2023

# Based on a template:
# Copyright (C) 2023 Johan Hoffman (jhoffman@kth.se)


# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This template is maintained by Johan Hoffman
# Please report problems to jhoffman@kth.se

'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

To have access to the necessary modules you have to run this cell. If you need additional modules, this is where you add them.

In [139]:
# Load necessary modules.
#from google.colab import files

import numpy as np

# **Introduction**

This lab examines iterative methods for solving systems of linear and nonlinear equations. In contrast to direct methods based on matrix factorization, iterative methods generate a sequence of approximate solutions. They can be much faster and need less memory, which is especially important for very large systems. Because the result is an approximation, the error and the rate of convergence have to be monitored.


This report gives implementations of the following iterative methods for systems of linear and nonlinear equations:

1. Jacobi iteration
2. Gauss-Seidel iteration
3. Newton's method for a scalar nonlinear equation
4. Newton's method for a system of nonlinear equations

All implementations are based on materials from the lecture notes.




# **Methods**

### Jacobi iteration

Stationary iterative methods are based on the fixed point iteration ($\textit{eq. 7.10}$) for solving the equation $x=g(x)$:

$$x^{(k+1)} = g(x^{(k)})$$

with the affine map $g(x) = Mx + c$. Here $M \in R^{n \times n}$ is the iteration matrix, $c \in R^n$ is a constant vector, and $\{x^{(k)}\}$ is the sequence of approximations with $x^{(k)} \in R^n$.

Chapter 7.7 in the book describes matrix splitting - a method to formulate a stationary iterative method by splitting the matrix $A$ into a sum of two matrices:

$$A = A_1 + A_2$$

where $A_1$ is an easily invertible matrix. The Jacobi iteration described in example $\textit{7.8}$ is based on the splitting:

$$A_1 = D, \quad A_2 = A - D$$

where $D = \operatorname{diag}(A)$. In that case the iteration matrix $M$ is given by:

$$M = -D^{-1}(A - D) = (I - D^{-1}A)$$

and the vector $c$:

$$c = D^{-1}b$$

Finally giving the form:

$$x^{(k+1)} = -D^{-1}(A - D)x^{(k)} + D^{-1}b$$


A sufficient condition for convergence is:

$$\Vert I - D^{-1}A \Vert < 1$$

In the infinity norm this holds exactly when $A$ is strictly diagonally dominant by rows, so strict diagonal dominance guarantees that the iteration converges.
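
The criterion can be checked numerically before iterating. The following is a minimal sketch and not part of the lab code: the helper name `jacobi_iteration_matrix` and the example matrix are assumptions chosen for illustration. It builds $M = I - D^{-1}A$ and compares its infinity norm with its spectral radius.

```python
import numpy as np

def jacobi_iteration_matrix(A):
    """Return M = I - D^{-1} A for the Jacobi splitting (hypothetical helper)."""
    d = np.diag(A)                                # diagonal entries, assumed non-zero
    return np.eye(A.shape[0]) - A / d[:, None]    # row i of A is divided by a_ii

# Example matrix (an assumption): strictly diagonally dominant by rows.
A_example = np.array([[4.0, 1.0, 1.0],
                      [1.0, 5.0, 2.0],
                      [0.0, 2.0, 6.0]])

M_example = jacobi_iteration_matrix(A_example)
norm_inf = np.linalg.norm(M_example, ord=np.inf)   # largest absolute row sum
spectral_radius = np.max(np.abs(np.linalg.eigvals(M_example)))

print(f"||M||_inf       = {norm_inf:.3f}")
print(f"spectral radius = {spectral_radius:.3f}")
```

A norm below one is enough to guarantee convergence. The spectral radius never exceeds the norm, and a spectral radius below one is the sharper condition for convergence from any starting vector.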

In [140]:
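def jacobi_iteration(A, b, tol, max_iter=10):
    """Solve Ax = b with the Jacobi iteration, starting from x = 0."""
    # Splitting A = D + (A - D), with D the diagonal part of A.
    d = A * np.eye(A.shape[0])
    a_2 = A - d
    # D is diagonal, so its inverse has entries 1/d[i, i];
    # zero entries are left at zero instead of being divided.
    d_inv = np.divide(1, d, out=np.zeros_like(d), where=d != 0)
    x = np.zeros_like(b)

    # Iterate until the relative residual drops below tol or max_iter is reached.
    iterations = 0
    while np.linalg.norm(A@x - b) / np.linalg.norm(b) > tol and iterations < max_iter:
        x = d_inv @ (-a_2 @ x + b)
        iterations += 1

    return x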
        

### Gauss-Seidel iteration

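The Gauss-Seidel iteration is similar to the Jacobi iteration, except that the matrix is split as:

$$A_1 = L, \quad A_2 = A - L$$

where $L$ is the lower triangular part of $A$, including the diagonal, obtained by zeroing out all elements above the diagonal. Each iteration therefore solves the triangular system $Lx^{(k+1)} = b - (A - L)x^{(k)}$, which is done by forward substitution instead of forming $L^{-1}$ explicitly.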
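
Written out per component, one Gauss-Seidel step updates the unknowns in order and reuses each new value immediately. The sketch below illustrates that this component form gives the same result as the matrix form with $L$. The helper name `gauss_seidel_sweep` and the 2×2 example are assumptions made for this illustration only.

```python
import numpy as np

def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel sweep in component form (hypothetical helper)."""
    x_new = np.array(x, dtype=float)
    for i in range(len(b)):
        # x_new[:i] already holds updated values, x_new[i+1:] still holds old ones
        s = A[i, :i] @ x_new[:i] + A[i, i+1:] @ x_new[i+1:]
        x_new[i] = (b[i] - s) / A[i, i]
    return x_new

# Small example system (an assumption).
A_example = np.array([[4.0, 1.0],
                      [2.0, 5.0]])
b_example = np.array([1.0, 2.0])
x_start = np.array([1.0, 1.0])

# Matrix form: solve L x_new = b - (A - L) x_old.
L = np.tril(A_example)
x_matrix_form = np.linalg.solve(L, b_example - (A_example - L) @ x_start)
x_component_form = gauss_seidel_sweep(A_example, b_example, x_start)

print(x_matrix_form, x_component_form)
```

Both forms give the same vector, which is why the implementation can rely on a single forward substitution per iteration.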

In [141]:
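def forward_substitution(L, b):
    """Solve Lx = b for a lower triangular matrix L."""
    n = len(b)
    x = np.zeros(shape=(n,))
    x[0] = b[0] / L[0, 0]
    for i in range(1, n):
        # subtract the contribution of the already computed unknowns
        total = 0
        for j in range(0, i):
            total += L[i, j]*x[j]
        x[i] = (b[i] - total)/L[i, i]

    return x

def gauss_seidel_iteration(A, b, tol, max_iter=10):
    """Solve Ax = b with the Gauss-Seidel iteration, starting from x = 0."""
    # Splitting A = L + (A - L), with L the lower triangular part of A.
    lower = np.tril(A)
    a_2 = A - lower

    x = np.zeros_like(b)

    # Iterate until the relative residual drops below tol or max_iter is reached.
    iterations = 0
    while np.linalg.norm(A@x - b) / np.linalg.norm(b) > tol and iterations < max_iter:
        x = forward_substitution(lower, (b - a_2 @ x))
        iterations += 1

    return x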
    
    

### Newton's method for scalar nonlinear equation $f(x)=0$

In contrast to the Jacobi and Gauss-Seidel iterations above, Newton's method applies to nonlinear equations.
It seeks a solution of:

$$f(x) = 0$$

with the fixed point iteration (8.2):

$$x^{(k+1)} = x^{(k)} + \alpha f(x^{(k)})$$

with $\alpha = -f'(x^{(k)})^{-1}$, thus giving the final form:

$$x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})} $$

which geometrically corresponds to following the tangent line at $x^{(k)}$ to its intersection with the x-axis.

The derivative $f'(x^{(k)})$ is approximated here with a central difference scheme.
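
The reason for preferring a central difference can be illustrated with a small comparison. The sketch below is not part of the lab code: the function names and the test function $\sin(x)$ are assumptions. For a step $\delta$, the forward difference has an error of order $\delta$, while the central difference has an error of order $\delta^2$.

```python
import numpy as np

def forward_difference(f, x, delta=1e-3):
    # one-sided difference, error of order delta
    return (f(x + delta) - f(x)) / delta

def central_difference(f, x, delta=1e-3):
    # symmetric difference, error of order delta**2
    return (f(x + delta) - f(x - delta)) / (2 * delta)

x_test = 1.0
exact = np.cos(x_test)   # exact derivative of sin at x_test

err_forward = abs(forward_difference(np.sin, x_test) - exact)
err_central = abs(central_difference(np.sin, x_test) - exact)

print(f"forward difference error: {err_forward:.2e}")
print(f"central difference error: {err_central:.2e}")
```

With the same step size, the central difference is several orders of magnitude more accurate here, at the cost of the same two function evaluations.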

In [142]:
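def derivative(f, x, delta=0.001):
    """Approximate f'(x) with a central difference of step delta."""
    return (f(x + delta) - f(x - delta)) / (2*delta)

def newtons_method(f, x0, tol=0.001, max_iter=100):
    """Find a root of the scalar function f with Newton's method, starting from x0."""
    x = x0
    iterations = 0
    # Iterate until |f(x)| drops below tol or max_iter is reached.
    while abs(f(x)) > tol and iterations < max_iter:
        df = derivative(f, x)
        # Newton step; assumes the derivative is non-zero
        x = x - f(x)/df
        iterations += 1

    return x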
    

### Newton's method for vector nonlinear equation $f(x)=0$

Newton's method for a system of nonlinear equations is analogous to the scalar case, except that $f$ is now vector-valued and the derivative is replaced by the Jacobian matrix, which contains all first-order partial derivatives:

$$
J = \begin{pmatrix}
  \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} \\[1ex]
  \vdots & \ddots & \vdots \\[1ex]
  \frac{\partial f_n}{\partial x_1} & \dots & \frac{\partial f_n}{\partial x_n}
\end{pmatrix}
$$

Each iteration solves the linear system $J(x^{(k)})\,\Delta x = -f(x^{(k)})$ and updates $x^{(k+1)} = x^{(k)} + \Delta x$.


In [143]:
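def newton_vector(f, x, tol=0.001, max_iter=100):
    """Find a root of the vector function f with Newton's method, starting from x."""
    iterations = 0
    # Iterate until ||f(x)|| drops below tol or max_iter is reached.
    while np.linalg.norm(f(x)) > tol and iterations < max_iter:
        # jacobian(x) is the analytic Jacobian defined in the Results section
        j = jacobian(x)
        # solve J dx = -f(x) instead of inverting the Jacobian
        dx = np.linalg.solve(j, -f(x))
        x = x + dx
        iterations += 1

    return x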

# **Results**

### Jacobi iteration

The Jacobi iteration is tested on random matrices. As convergence is only guaranteed for diagonally dominant matrices, we start by testing purely diagonal matrices and then move to progressively less diagonally dominant ones by introducing random off-diagonal elements, thus testing the limits of the method.


The reported quantities are the residual $\|Ax - b \|$ and the error $\|x-y \|$, where $y$ is a manufactured solution with $b=Ay$.
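
The two quantities are related: a small residual implies a small error only if the matrix is well conditioned, since $\|x-y\|/\|y\| \le \kappa(A)\,\|Ax-b\|/\|b\|$. The sketch below illustrates this with a manufactured solution. The random matrix, the seed and the perturbed vector `x_approx` (a stand-in for the output of an iterative solver) are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Strictly diagonally dominant test matrix (an assumption for this sketch).
A_example = rng.random((n, n)) + n * np.eye(n)

# Manufactured solution: choose y first, then build the right-hand side from it.
y = rng.random(n)
b_example = A_example @ y

# Stand-in for an approximate solution returned by an iterative solver.
x_approx = y + 1e-3 * rng.random(n)

relative_residual = np.linalg.norm(A_example @ x_approx - b_example) / np.linalg.norm(b_example)
relative_error = np.linalg.norm(x_approx - y) / np.linalg.norm(y)
bound = np.linalg.cond(A_example) * relative_residual

print(f"relative residual: {relative_residual:.2e}")
print(f"relative error:    {relative_error:.2e}")
print(f"error bound:       {bound:.2e}")
```

The manufactured solution makes the true error available, which is not the case when only $b$ is given.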

In [144]:
n = 5
A_orig = np.random.rand(n, n)
b = np.random.rand(n)
for nonzero_element_odds in [0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.5, 1]:

    # keep all diagonal elements and each off-diagonal element with the given probability;
    # where the random mask also selects a diagonal entry, that entry is doubled
    A = A_orig * (np.random.rand(n, n) < nonzero_element_odds) + np.diag(np.diag(A_orig))
    x = jacobi_iteration(A, b, tol=0.001)
    residual_1 = np.linalg.norm(A@x - b)

    y = np.random.rand(n)
    b_y = A @ y
    x_y = jacobi_iteration(A, b_y, tol=0.001)
    residual_2 = np.linalg.norm(x_y - y)
    
    print(f'non-zero probability: {nonzero_element_odds}')
    print(f'matrix:')
    print(f'{np.round(A,2)}' + '\n') 
    print(f'||Ax - b||: {residual_1}')
    print(f'||x - y||: {residual_2}')
    print('')

non-zero probability: 0
matrix:
[[0.63 0.   0.   0.   0.  ]
 [0.   0.65 0.   0.   0.  ]
 [0.   0.   0.31 0.   0.  ]
 [0.   0.   0.   0.76 0.  ]
 [0.   0.   0.   0.   0.95]]

||Ax - b||: 7.97218214573518e-17
||x - y||: 0.0

non-zero probability: 0.05
matrix:
[[0.63 0.   0.99 0.   0.  ]
 [0.   0.65 0.   0.   0.16]
 [0.   0.   0.31 0.   0.  ]
 [0.   0.   0.   0.76 0.  ]
 [0.   0.   0.   0.   0.95]]

||Ax - b||: 2.2929868617541516e-16
||x - y||: 1.1102230246251565e-16

non-zero probability: 0.1
matrix:
[[0.63 0.   0.   0.   0.  ]
 [0.   0.65 0.   0.   0.  ]
 [0.   0.   0.31 0.   0.  ]
 [0.   0.   0.   0.76 0.  ]
 [0.   0.   0.   0.   0.95]]

||Ax - b||: 7.97218214573518e-17
||x - y||: 0.0

non-zero probability: 0.15
matrix:
[[0.63 0.87 0.   0.   0.  ]
 [0.   0.65 0.   0.   0.  ]
 [0.   0.   0.31 0.   0.  ]
 [0.   0.   0.   0.76 0.  ]
 [0.   0.   0.   0.   0.95]]

||Ax - b||: 5.721958498152797e-17
||x - y||: 6.206335383118183e-17

non-zero probability: 0.2
matrix:
[[0.63 0.   0.   0.   0.  

### Gauss-Seidel iteration

The test is identical to the one for the Jacobi iteration:

In [145]:
n = 5
A_orig = np.random.rand(n, n)
b = np.random.rand(n)
for nonzero_element_odds in [0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.5, 1]:

    # keep all diagonal elements and each off-diagonal element with the given probability;
    # where the random mask also selects a diagonal entry, that entry is doubled
    A = A_orig * (np.random.rand(n, n) < nonzero_element_odds) + np.diag(np.diag(A_orig))
    x = gauss_seidel_iteration(A, b, tol=0.001)
    residual_1 = np.linalg.norm(A@x - b)

    y = np.random.rand(n)
    b_y = A @ y
    x_y = gauss_seidel_iteration(A, b_y, tol=0.001)
    residual_2 = np.linalg.norm(x_y - y)
    
    print(f'non-zero probability: {nonzero_element_odds}')
    print(f'matrix:')
    print(f'{np.round(A,2)}' + '\n') 
    print(f'||Ax - b||: {residual_1}')
    print(f'||x - y||: {residual_2}')
    print('')

non-zero probability: 0
matrix:
[[0.87 0.   0.   0.   0.  ]
 [0.   0.48 0.   0.   0.  ]
 [0.   0.   0.89 0.   0.  ]
 [0.   0.   0.   0.04 0.  ]
 [0.   0.   0.   0.   0.77]]

||Ax - b||: 5.551115123125783e-17
||x - y||: 0.0

non-zero probability: 0.05
matrix:
[[0.87 0.   0.   0.   0.  ]
 [0.   0.48 0.   0.   0.  ]
 [0.   0.   0.89 0.   0.  ]
 [0.   0.   0.   0.04 0.  ]
 [0.   0.   0.   0.   1.55]]

||Ax - b||: 5.551115123125783e-17
||x - y||: 0.0

non-zero probability: 0.1
matrix:
[[0.87 0.   0.   0.85 0.93]
 [0.   0.48 0.   0.   0.  ]
 [0.   0.   1.79 0.   0.  ]
 [0.   0.   0.   0.04 0.  ]
 [0.   0.64 0.   0.   0.77]]

||Ax - b||: 1.2412670766236366e-16
||x - y||: 5.551115123125783e-17

non-zero probability: 0.15
matrix:
[[0.87 0.   0.   0.   0.  ]
 [0.28 0.48 0.   0.   0.  ]
 [0.   0.   0.89 0.   0.  ]
 [0.   0.   0.   0.04 0.  ]
 [0.   0.   0.87 0.91 0.77]]

||Ax - b||: 1.1102230246251565e-16
||x - y||: 1.5142854222490728e-16

non-zero probability: 0.2
matrix:
[[0.87 0.   0.   0.   0

### Newton's method for scalar nonlinear equation $f(x)=0$

The method is tested on the second-order polynomial $f(x) = x^2 + bx - 5$. Only the coefficient $b$ is changed, which shifts the roots of the parabola along the x-axis while ensuring that real roots exist and the function stays differentiable. A second-order polynomial has two roots, here one positive and one negative; the positive one is found by choosing a positive starting point $x_0 = 2$.

The reported quantities are the residual $\vert f(x) \vert$ and the error $\vert x-y \vert$, where $y$ is the exact root from the quadratic formula. The value of $b$ is printed as well (labelled "parabola width coefficient" in the output), to show whether and by how much the residuals grow when the fixed initial guess lies further from the root.

In [146]:
def create_polynomial(b):
    
    def polynomial(x):
        return x**2 + b*x - 5
    
    # quadratic formula
    root = (-b + np.sqrt(b**2 - 4*1*(-5))) / (2*1)
    
    return polynomial, root

for n in [0, 1, 10, 100]:
    function, real_root = create_polynomial(n)
    newton_root = newtons_method(function, x0 = 2)
    residual1 = abs(function(newton_root))
    residual2 = abs(newton_root - real_root)
    
    print(f'parabola width coefficient: {n}')
    print(f'|f(x)|: {residual1}')
    print(f'|x - y|: {residual2}')
    print('')
    

parabola width coefficient: 0
|f(x)|: 0.00019290123456450203
|x - y|: 4.3133611320467224e-05

parabola width coefficient: 1
|f(x)|: 7.561436672798294e-05
|x - y|: 1.6500348166692547e-05

parabola width coefficient: 10
|f(x)|: 5.906383858444997e-06
|x - y|: 5.391765854900754e-07

parabola width coefficient: 100
|f(x)|: 1.7810819485930551e-10
|x - y|: 1.7780776850884195e-12
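
The first case ($b = 0$) can be traced step by step. The sketch below repeats the Newton iteration for $f(x) = x^2 - 5$ from $x_0 = 2$ and records the error after every step. As a simplifying assumption it uses the exact derivative $2x$ instead of the central difference, which for a quadratic differs only by rounding.

```python
import numpy as np

def f(x):
    return x**2 - 5          # the test polynomial with b = 0

def df(x):
    return 2 * x             # exact derivative (assumption: replaces the central difference)

exact_root = np.sqrt(5)
x = 2.0                      # same starting point as in the test
errors = [abs(x - exact_root)]
for _ in range(3):
    x = x - f(x) / df(x)     # Newton step
    errors.append(abs(x - exact_root))

for k, err in enumerate(errors):
    print(f"iteration {k}: |x - y| = {err:.3e}")
```

The error after two steps, about $4.3 \cdot 10^{-5}$, matches the first entry of the output above, which indicates that the solver stopped after two iterations in that case. One more step would reduce the error to below $10^{-8}$, as expected from the quadratic convergence of the method.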



### Newton's method for vector nonlinear equation $f(x)=0$

The method is tested on a single system of equations, $f(x, y) = 0$ with:

$$
f(x,y) = \begin{cases}
x^2 - 2y - 5 \\
xy + 10
\end{cases}
$$

The Jacobian is found analytically for this demonstration, but it could just as well be approximated with the same finite difference scheme:

$$
J(x, y) = \begin{pmatrix}
        2x & -2 \\[1ex]
        y & x
\end{pmatrix}
$$
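
A finite difference Jacobian could look as follows. This is a sketch under assumptions, not the code used in the lab: `numerical_jacobian` is a hypothetical helper, and the system and analytic Jacobian are restated as implemented in the code cell of this section so that the block runs on its own.

```python
import numpy as np

def system(x):
    """The test system, with x[0] as x and x[1] as y."""
    return np.array([x[0]**2 - 2*x[1] - 5,
                     x[0]*x[1] + 10])

def analytic_jacobian(x):
    return np.array([[2*x[0], -2.0],
                     [x[1],   x[0]]])

def numerical_jacobian(f, x, delta=1e-3):
    """Central-difference Jacobian, built one column per variable (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    jac = np.zeros((f(x).size, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = delta
        # column j holds the partial derivatives with respect to x[j]
        jac[:, j] = (f(x + step) - f(x - step)) / (2 * delta)
    return jac

x_test = np.array([-3.0, 3.0])
difference = np.abs(numerical_jacobian(system, x_test) - analytic_jacobian(x_test)).max()
print(f"largest deviation from the analytic Jacobian: {difference:.1e}")
```

For this system the two agree up to rounding, because central differences are exact for polynomials of degree two.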

The system has one real root, which was calculated with Mathematica:

$$
\begin{cases}
x = -3.3202 \\
y = 3.0119
\end{cases}
$$
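
The reference root can be cross-checked independently. Taking the system as implemented in the code, the second equation gives $y = -10/x$, and substituting this into $x^2 - 2y - 5 = 0$ and multiplying by $x$ leads to the cubic $x^3 - 5x + 20 = 0$. The sketch below solves this cubic with `np.roots`; the variable names are chosen for this example only.

```python
import numpy as np

# Coefficients of x^3 + 0*x^2 - 5*x + 20 = 0, obtained by eliminating y = -10/x.
cubic_roots = np.roots([1.0, 0.0, -5.0, 20.0])

# Keep only the real roots; the remaining two form a complex-conjugate pair.
real_x = cubic_roots[np.abs(cubic_roots.imag) < 1e-9].real
x_root = real_x[0]
y_root = -10.0 / x_root

print(f"x = {x_root:.4f}, y = {y_root:.4f}")
```

The cubic has a single real root, which agrees with the values listed above.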

As there is only one system, the method is tested by varying the initial guess.
As before, the reported quantities are the residual $\vert f(x) \vert$ and the error $\vert x-y \vert$, both given per component, where $y$ here denotes the reference root rather than the second variable. The initial guesses are printed as well, to show how far they are from the root.

In [147]:
def jacobian(x):
    # analytic Jacobian of the system below; x[0] is x and x[1] is y
    jac = np.array([[2*x[0], -2],
                    [x[1], x[0]]])

    return jac

def function(x):
    f = np.zeros(2)
    f[0] = x[0]**2 - 2*x[1] - 5
    f[1] = x[0]*x[1] + 10

    return f

real_roots = np.array([-3.3202006469833175893, 3.011866168114220354])
for x_init in [-5, -3, 1, 3]:
    for y_init in [1, 3, 10, 100]:


        newton_roots = newton_vector(function, np.array([x_init, y_init]))

        residual1 = abs(function(newton_roots))
        residual2 = abs(newton_roots - real_roots)
        
        print(f'x_init, y_init: {x_init, y_init}')
        print(f'|f(x)|: {residual1}')
        print(f'|x - y|: {residual2}')
        print('')

x_init, y_init: (-5, 1)
|f(x)|: [2.07580555e-05 1.17307753e-04]
|x - y|: [1.08130578e-05 2.55225522e-05]

x_init, y_init: (-5, 3)
|f(x)|: [2.23881825e-05 6.93110206e-05]
|x - y|: [2.29020824e-06 1.87980395e-05]

x_init, y_init: (-5, 10)
|f(x)|: [9.10303015e-07 1.08431627e-06]
|x - y|: [1.84923394e-07 1.58831281e-07]

x_init, y_init: (-5, 100)
|f(x)|: [7.43337560e-08 8.55559605e-08]
|x - y|: [1.48876769e-08 1.22631958e-08]

x_init, y_init: (-3, 1)
|f(x)|: [3.06202881e-06 4.84215977e-06]
|x - y|: [7.07161392e-07 8.16903557e-07]

x_init, y_init: (-3, 3)
|f(x)|: [0.00017222 0.000155  ]
|x - y|: [9.32676996e-06 5.51444922e-05]

x_init, y_init: (-3, 10)
|f(x)|: [2.21841611e-05 4.29450810e-05]
|x - y|: [5.68361344e-06 7.77867260e-06]

x_init, y_init: (-3, 100)
|f(x)|: [7.40565413e-08 9.71988712e-08]
|x - y|: [1.56844155e-08 1.50471360e-08]

x_init, y_init: (1, 1)
|f(x)|: [0.00010756 0.00014761]
|x - y|: [2.32384907e-05 2.33764160e-05]

x_init, y_init: (1, 3)
|f(x)|: [3.7624338  6.24601757]
|x

# **Discussion**

The testing procedure was identical for the Jacobi and Gauss-Seidel iterations. For purely diagonal or diagonally dominant matrices the residuals were approximately zero, but as the share of non-zero off-diagonal elements increased the iterations started to diverge, as expected. The divergence set in when around 30% of the off-diagonal elements were non-zero, which illustrates the importance of the convergence criterion. Both solvers stop after at most 10 iterations, so a large residual can also mean slow convergence. Further experiments with different tolerances, matrix dimensions and element magnitudes could be run, but these preliminary tests already show that the methods become unreliable once diagonal dominance is lost.
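
One case from the Jacobi output deserves a closer look: the matrix printed for probability 0.05 has an off-diagonal entry (0.99) larger than the diagonal entry in the same row (0.63), yet the residual is at machine precision. A possible explanation is that this matrix is upper triangular, which makes the Jacobi iteration matrix nilpotent, so the iteration terminates after finitely many steps. The sketch below checks this on a 2×2 analogue. The matrix is an assumption built from the printed values, and the right-hand side is arbitrary.

```python
import numpy as np

# 2x2 analogue of the printed matrix: upper triangular, not diagonally dominant.
A_example = np.array([[0.63, 0.99],
                      [0.00, 0.31]])
b_example = np.array([1.0, 1.0])

D_inv = np.diag(1.0 / np.diag(A_example))
M = np.eye(2) - D_inv @ A_example          # Jacobi iteration matrix
c = D_inv @ b_example

norm_inf = np.linalg.norm(M, ord=np.inf)   # larger than one for this matrix
M_squared = M @ M                          # zero matrix, since M is strictly upper triangular

# Two Jacobi steps from x = 0 already give the exact solution.
x = np.zeros(2)
for _ in range(2):
    x = M @ x + c

print(f"||M||_inf = {norm_inf:.3f}")
print(f"residual after two steps: {np.linalg.norm(A_example @ x - b_example):.1e}")
```

The norm criterion fails for this matrix although the iteration converges, which is consistent with the criterion being sufficient rather than necessary.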

Newton's method for scalar functions reached the tolerance in all cases, regardless of the coefficient $b$ of the parabola. A more complex function could be used for further evaluation.
For Newton's method for systems the initial guess proved more important: runs started close to the root converged, while those started furthest away (especially in the x direction) did not.


In all cases, further checks would be necessary to ensure that the method is applicable at all, such as checking whether the matrices are invertible and the functions differentiable.