# Numerical Methods

# Lecture 7: Numerical Linear Algebra III

## Contents

* I. Ill-Conditioned Matrices
* II. Round-Off Errors
* III. Algorithm Stability
* IV. Direct vs Iterative Methods
* V. Iterative Methods - Jacobi's Method
* VI. Iterative Methods - Gauss-Seidel Method
* VII. Sparse Matrices
* VIII. EXTRA: How-to Guide

## Learning objectives:

* Introduce ill-conditioned matrices (via matrix norms and condition number)
* Consider direct vs iterative/indirect methods
* Example iterative algorithms: the Jacobi and Gauss-Seidel methods
* Sparse matrices and a pointer to more advanced algorithms (supplementary readings)

## VII.    Sparse matrices

The matrices that result from the numerical solution of differential equations are generally *sparse* (<https://en.wikipedia.org/wiki/Sparse_matrix>), meaning that most entries are zero (the alternative is termed *dense*). Knowing which entries are zero lets us devise more efficient matrix storage methods, as well as more efficient implementations of the above algorithms (e.g. by skipping operations that we know involve multiplication by zero, whose result is always zero).
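
To make the storage saving concrete, the sketch below builds a small tridiagonal matrix and inspects the three arrays behind SciPy's compressed sparse row (CSR) format. The size `n_demo`, the names `dense_demo`, `csr_demo` and the helper `csr_matvec` are illustrative assumptions, not part of the lecture code; the hand-written product is only meant to show which entries get touched.

```python
import numpy as np
import scipy.sparse

n_demo = 5  # illustrative size (an assumption), small enough to read the output

# Tridiagonal matrix: 2 on the diagonal, -1 on the two neighbouring diagonals
dense_demo = (np.diag(2.0 * np.ones(n_demo))
              + np.diag(-1.0 * np.ones(n_demo - 1), 1)
              + np.diag(-1.0 * np.ones(n_demo - 1), -1))

csr_demo = scipy.sparse.csr_matrix(dense_demo)

# CSR keeps only the non-zero values plus two integer index arrays
print("non-zero values:", csr_demo.data)
print("column indices :", csr_demo.indices)
print("row pointers   :", csr_demo.indptr)
print("stored entries :", csr_demo.nnz, "out of", n_demo * n_demo)

def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR matrix by a vector, touching only the stored entries."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows)
    for row in range(n_rows):
        # entries of this row live in data[indptr[row]:indptr[row + 1]]
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

x_demo = np.arange(1.0, n_demo + 1)
y_demo = csr_matvec(csr_demo.data, csr_demo.indices, csr_demo.indptr, x_demo)
print("matches dense product:", np.allclose(y_demo, dense_demo @ x_demo))
```

The inner loop runs once per stored entry, so the work is proportional to the number of non-zeros rather than to the full size of the matrix.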

As an example, for the two iterative methods shown above (Jacobi and Gauss-Seidel), the cost of *each* iteration depends quadratically on the number of unknowns $n$, since we need to loop through all the entries of the $n\times n$ matrix $A$. For a fixed number of iterations the computational cost of these methods therefore scales as $n^2$. However, if we know that each row contains only a small, fixed number of non-zero entries (as, for example, in the matrix in the example below), we can simply skip the zero entries, and the cost *per iteration* becomes linear in $n$. These per-iteration scalings of $n^2$ for *dense* and $n$ for *sparse* matrices are typical of iterative methods. Unfortunately this does not mean that the overall cost of an iterative method is also $n^2$ or $n$, because the number of iterations needed to reach a given accuracy often also increases with problem size. The number of required iterations typically increases only slowly, however, so iterative methods are still considerably cheaper than direct methods, in particular for very large problems.
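
The per-iteration saving can be sketched with a Jacobi iteration written against a sparse matrix, so that each sweep is a single sparse matrix-vector product. This is a minimal illustration under assumptions: the helper `jacobi_sparse`, the test matrix `A_jac` and the size `n_jac` are made up for this example, and the matrix is chosen to be strictly diagonally dominant so that Jacobi converges.

```python
import numpy as np
import scipy.sparse

def jacobi_sparse(A_csr, b, tol=1e-10, max_it=10_000):
    """Jacobi iteration x_new = D^{-1} (b - (A - D) x) for a sparse matrix A.

    Each sweep costs one sparse matrix-vector product, i.e. work
    proportional to the number of stored non-zeros rather than n^2.
    """
    d = A_csr.diagonal()                 # diagonal entries of A
    R = A_csr - scipy.sparse.diags(d)    # off-diagonal part, still sparse
    x = np.zeros_like(b)
    for it in range(1, max_it + 1):
        x_new = (b - R @ x) / d
        if np.linalg.norm(x_new - x) < tol:
            return x_new, it
        x = x_new
    return x, max_it

n_jac = 200  # illustrative problem size (an assumption)
# Strictly diagonally dominant tridiagonal matrix (4 on the diagonal, -1 beside it)
A_jac = scipy.sparse.diags(
    [-1.0 * np.ones(n_jac - 1), 4.0 * np.ones(n_jac), -1.0 * np.ones(n_jac - 1)],
    offsets=[-1, 0, 1], format="csr")
b_jac = np.ones(n_jac)

x_jac, its_jac = jacobi_sparse(A_jac, b_jac)
print("iterations:", its_jac)
print("residual  :", np.linalg.norm(A_jac @ x_jac - b_jac))
print("entries touched per sweep:", A_jac.nnz, "versus dense:", n_jac**2)
```

Doubling `n_jac` roughly doubles the number of stored entries, whereas a dense loop over all of $A$ would do four times the work.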

A huge range of iterative solution methods exists, and the literature on this topic is massive. Below is an example of using SciPy to access the Conjugate Gradient (CG) algorithm, a popular method for the symmetric, definite matrices that often result from the numerical solution of differential equations.

In [12]:
import numpy as np
import scipy.linalg as sl
import scipy.sparse
import scipy.sparse.linalg

n = 50
main_diag = np.ones(n)   # just a vector of ones
off_diag = np.random.random(n-1)  # to make it a bit more interesting make the off-diagonals random
# symmetric tridiagonal matrix: -2 on the diagonal, random values in [0, 1) next to it
A = np.diag(-2*main_diag, 0) + np.diag(off_diag, 1) + np.diag(off_diag, -1)
# A random RHS vector
b = np.random.random(A.shape[0])

print(A) # print our A in "dense" matrix format

sA = scipy.sparse.csr_matrix(A) # The same matrix in a "sparse" matrix data structure where only non-zeros are stored
print('This is the same matrix but now stored in a sparse matrix data structure.')
print(sA)

# now use a scipy iterative algorithm (Conjugate Gradient) to solve

# First define a (callback) function which we are allowed to pass to the solver.
# It stores and prints the iteration number and residual norm: basic diagnostic output as the solver executes.
def gen_callback_cg():
    diagnostics = dict(it=0, residuals=[])
    def callback(xk):   # xk is the solution computed by CG at each iteration
        diagnostics["it"] += 1
        residual = sl.norm(sA @ xk - b)
        diagnostics["residuals"].append(residual)
        print(diagnostics["it"], residual)
    return callback

print('Now we execute the CG algorithm on our problem, with our callback function returning information on the residual at each iteration.')
# cg returns a tuple (solution, info); info == 0 signals convergence.
# The tolerance keyword is `rtol` in SciPy >= 1.12; older releases call it `tol`.
x_sol, info = scipy.sparse.linalg.cg(sA, b, x0=None, rtol=1e-10, maxiter=1000, callback=gen_callback_cg())

[[-2.          0.01579799  0.         ...  0.          0.
   0.        ]
 [ 0.01579799 -2.          0.96907927 ...  0.          0.
   0.        ]
 [ 0.          0.96907927 -2.         ...  0.          0.
   0.        ]
 ...
 [ 0.          0.          0.         ... -2.          0.293628
   0.        ]
 [ 0.          0.          0.         ...  0.293628   -2.
   0.02000908]
 [ 0.          0.          0.         ...  0.          0.02000908
  -2.        ]]
This is the same matrix but now stored in a sparse matrix data structure.
  (0, 0)	-2.0
  (0, 1)	0.01579798887096051
  (1, 0)	0.01579798887096051
  (1, 1)	-2.0
  (1, 2)	0.9690792657586554
  (2, 1)	0.9690792657586554
  (2, 2)	-2.0
  (2, 3)	0.4426848628892317
  (3, 2)	0.4426848628892317
  (3, 3)	-2.0
  (3, 4)	0.33287797242853423
  (4, 3)	0.33287797242853423
  (4, 4)	-2.0
  (4, 5)	0.8349961890669807
  (5, 4)	0.8349961890669807
  (5, 5)	-2.0
  (5, 6)	0.2759243129018931
  (6, 5)	0.2759243129018931
  (6, 6)	-2.0
  (6, 7)	0.5899642313423469
  

## An example of a sparse matrix

Let us consider an electric circuit arranged in a regular grid of $n$ rows and $m$ columns. The nodes in the grid are numbered from 0 to $nm-1$ as indicated in the diagram below.

![Circuit arranged in a regular grid with numbered nodes](images/circuit.png)

We want to calculate the electric potential $V_i$ in all of the nodes $i$. A node $i$ somewhere in the middle of the circuit is connected via resistors to nodes $i-1$ and $i+1$ to the left and right respectively, and to the nodes $i-m$ and $i+m$ in the rows above and below. For simplicity we assume that all resistors have the same resistance value $R$. The first and last nodes of the circuit (0 and $nm-1$) are connected to a battery via two additional resistors, with the same resistance value $R$.

The sum of the currents coming into a node is zero (if we use a sign convention where a current coming into a node is positive and a current going out is negative). The current between two nodes can be calculated using Ohm's law: $I=V/R$, where $R$ is the resistance of the resistor and $V$ is the potential difference between the two nodes, say $V=V_i-V_{i-1}$. Therefore we can write:

\begin{eqnarray}
  0 &=& I_{i-1\to i} + I_{i+1\to i} + I_{i-m\to i} + I_{i+m\to i} \\
    &=& V_{i-1\to i}/R + V_{i+1\to i}/R + V_{i-m\to i}/R + V_{i+m\to i}/R \\
    &=& (V_{i}-V_{i-1})/R + (V_{i}-V_{i+1})/R + (V_{i}-V_{i-m})/R + (V_{i}-V_{i+m})/R \\
    &=& (4V_{i}-V_{i-1}-V_{i+1}-V_{i-m}-V_{i+m})/R
\end{eqnarray}

This gives us one linear equation for each node in the circuit (with slight modifications for nodes that are not in the interior). These can be combined into a linear system $Ax=b$ which is assembled in the code below:

In [13]:
n = 4 # number of rows
m = 3 # number of columns
V_battery = 5.0 # voltage on the right of the battery

A = np.zeros((n*m, n*m))
for row in range(n):
    for column in range(m):
        i = row*m + column # node number
        if column>0: # left neighbour
            A[i,i-1] += -1.0
            A[i,i] += 1.0
        if column<m-1: # right neighbour
            A[i,i+1] += -1.0
            A[i,i] += 1.0
        if row>0: # neighbour above
            A[i,i-m] += -1.0
            A[i,i] += 1.0
        if row<n-1: # neighbour below
            A[i,i+m] += -1.0
            A[i,i] += 1.0

# connecting node 0 to the battery: I = (V_0 - 0)/R
A[0,0] += 1.0 
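# connecting last node nm-1 to the battery: I = (V_{nm-1} - V_battery)/R = V_{nm-1}/R - V_battery/R
A[n*m-1,n*m-1] += 1.0
# the V_battery/R term is a constant that does not depend on the unknowns, so it ends up in the rhs vector b
# (every equation has been multiplied through by R, so b holds V_battery rather than V_battery/R)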
b = np.zeros(n*m)
b[n*m-1] = V_battery
print(A)

[[ 3. -1.  0. -1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [-1.  3. -1.  0. -1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0. -1.  2.  0.  0. -1.  0.  0.  0.  0.  0.  0.]
 [-1.  0.  0.  3. -1.  0. -1.  0.  0.  0.  0.  0.]
 [ 0. -1.  0. -1.  4. -1.  0. -1.  0.  0.  0.  0.]
 [ 0.  0. -1.  0. -1.  3.  0.  0. -1.  0.  0.  0.]
 [ 0.  0.  0. -1.  0.  0.  3. -1.  0. -1.  0.  0.]
 [ 0.  0.  0.  0. -1.  0. -1.  4. -1.  0. -1.  0.]
 [ 0.  0.  0.  0.  0. -1.  0. -1.  3.  0.  0. -1.]
 [ 0.  0.  0.  0.  0.  0. -1.  0.  0.  2. -1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0. -1.  0. -1.  3. -1.]
 [ 0.  0.  0.  0.  0.  0.  0.  0. -1.  0. -1.  3.]]
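
Since each row of this matrix has at most five non-zero entries, the same system can be assembled directly in a sparse format, without ever allocating the dense $nm\times nm$ array. The following is a minimal sketch under assumptions: the helper name `assemble_circuit_sparse` is made up for this example, and it uses SciPy's `lil_matrix` for the entry-by-entry assembly before converting to CSR. The matrix is symmetric and positive definite, which is the setting CG is designed for; the solver's default tolerance is used here.

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg

def assemble_circuit_sparse(n, m, V_battery):
    """Assemble the grid-circuit system in sparse form (hypothetical helper)."""
    A = scipy.sparse.lil_matrix((n * m, n * m))  # LIL suits entry-by-entry filling
    for row in range(n):
        for column in range(m):
            i = row * m + column  # node number
            neighbours = []
            if column > 0:        # left neighbour
                neighbours.append(i - 1)
            if column < m - 1:    # right neighbour
                neighbours.append(i + 1)
            if row > 0:           # neighbour above
                neighbours.append(i - m)
            if row < n - 1:       # neighbour below
                neighbours.append(i + m)
            for j in neighbours:
                A[i, j] = -1.0
            A[i, i] = float(len(neighbours))
    # battery connections at the first and last node
    A[0, 0] += 1.0
    A[n * m - 1, n * m - 1] += 1.0
    b = np.zeros(n * m)
    b[n * m - 1] = V_battery
    return A.tocsr(), b

n_rows, n_cols = 4, 3
sA_circuit, b_circuit = assemble_circuit_sparse(n_rows, n_cols, 5.0)
print("stored non-zeros:", sA_circuit.nnz, "out of", (n_rows * n_cols) ** 2)

# cg returns (solution, info); info == 0 signals convergence
V, info = scipy.sparse.linalg.cg(sA_circuit, b_circuit, maxiter=1000)
print("info:", info)
print(V.reshape(n_rows, n_cols))  # potentials laid out like the grid
```

For the small $4\times 3$ grid the saving is modest, but the number of stored entries grows linearly with the number of nodes, while the dense array grows quadratically.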


## VIII.    EXTRA: How-to Guide

Always check that the system is solvable before actually trying to solve it! You could use the determinant and norms (via the condition number), or Gauss-Jordan elimination, to check.
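
A quick check of this kind can be sketched with NumPy. The helper `check_solvability` and the threshold `cond_limit` are assumptions made for this illustration, not fixed rules; what counts as "too ill-conditioned" depends on the accuracy you need.

```python
import numpy as np

def check_solvability(A, cond_limit=1e12):
    """Report simple diagnostics for a square system A x = b (illustrative helper).

    cond_limit is an assumed threshold, not a universal constant.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    rank = np.linalg.matrix_rank(A)
    cond = np.linalg.cond(A)  # condition number in the 2-norm
    return {
        "determinant": np.linalg.det(A),
        "rank": rank,
        "condition_number": cond,
        "singular": bool(rank < n),
        "ill_conditioned": bool(cond > cond_limit),
    }

A_good = np.array([[2.0, 1.0],
                   [1.0, 3.0]])
A_bad = np.array([[1.0, 2.0],
                  [2.0, 4.0]])  # second row is twice the first, so no unique solution

report_good = check_solvability(A_good)
report_bad = check_solvability(A_bad)
print(report_good)
print(report_bad)
```

The condition number is usually the more informative diagnostic, because the determinant changes with the scaling of the equations.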

For systems of linear equations:

- If the problem has very few unknowns and very few equations (3 or fewer, maybe more for those of you who are more math savvy), solve it by hand.

- If the problem has few unknowns and few equations, use Gaussian elimination, preferably through LU decomposition using the Doolittle algorithm, and implement partial pivoting to handle zeros on the diagonal.

- If the problem has lots of unknowns and lots of equations, and finding the exact solution would take longer than you can afford, prefer iterative methods such as the Jacobi method or the Gauss-Seidel method.
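
For the small, dense case in the second bullet, one option is to let SciPy do the factorisation. The sketch below uses `scipy.linalg.lu_factor` and `scipy.linalg.lu_solve`, which perform LU decomposition with partial pivoting through LAPACK (so this is not a hand-written Doolittle implementation). The matrix `A_small` and the right-hand sides are made-up values, chosen so that the first diagonal entry is zero and pivoting is genuinely needed.

```python
import numpy as np
import scipy.linalg

# A small system with a zero in the first diagonal position, so pivoting is required
A_small = np.array([[0.0, 2.0, 1.0],
                    [1.0, 1.0, 0.0],
                    [2.0, 0.0, 3.0]])
b_small = np.array([3.0, 2.0, 5.0])

# Factorise once: lu holds L and U packed together, piv records the row swaps
lu, piv = scipy.linalg.lu_factor(A_small)
x_small = scipy.linalg.lu_solve((lu, piv), b_small)
print("solution:", x_small)
print("residual:", np.linalg.norm(A_small @ x_small - b_small))

# The factorisation can be reused for another right-hand side at low cost
b_other = np.array([1.0, 0.0, 0.0])
x_other = scipy.linalg.lu_solve((lu, piv), b_other)
print("second solution:", x_other)
```

Reusing the factorisation is the main practical advantage of LU over running Gaussian elimination from scratch for every right-hand side.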

For the generally sparse systems that arise from the numerical solution of differential equations, consider the Conjugate Gradient algorithm or another algorithm that exploits sparsity to reduce the number of operations, especially for very large sparse matrices.