<hr style="border:3px solid coral"></hr>

# Homework #5

<hr style="border:3px solid coral"></hr>

* <a href="#prob1">Problem #1</a> Serial Jacobi solver

* <a href="#prob2">Problem #2</a> Parallel Jacobi solver

* <a href="#prob3">Problem #3</a> TBD

In [1]:
import numpy as np
from numpy import *
from matplotlib.pyplot import *

<a id="prob1"></a>

<hr style="border:3px solid coral"></hr>

## Problem #1 (serial)

<hr style="border:3px solid coral"></hr>

Approximate the solution to the following elliptic problem using a finite difference scheme and the Jacobi method.   

\begin{equation}
u''(x) = e^{-x^2}, \quad x \in [-1,1]
\end{equation}

subject to $u(-1) = u(1) = 0$.   

The true solution is given by 

\begin{equation}
u_{true}(x) = \frac{1}{2}\left[\sqrt{\pi}\; x \;\mbox{erf}(x) + e^{-x^2}\right] + c_2 x + c_1
\end{equation}


### To Do

* Determine coefficients $c_1$ and $c_2$ in the true solution so that boundary conditions are satisfied.

* Discretize the problem using a second order centered difference scheme.  

* Compute the solution using the Jacobi method.  Use the matrix-free method and the iteration

\begin{equation}
\mathbf u_{k+1} = \mathbf u_k + D^{-1}(\mathbf F - A\mathbf u_k)
\end{equation}

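where $A$ is the finite difference matrix, $D$ is its diagonal and $\mathbf F$ holds the right-hand side values. Use 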

\begin{equation}
\Vert \mathbf u_{k+1} - \mathbf u_k \Vert_\infty < \tau = 10^{-12}
\end{equation}

as the stopping criterion. 

* Report the inf-norm of the error and residual for $N = 32$, $N=64$ and $N=128$.   


\begin{eqnarray}
\mbox{error : } \quad e &= &\Vert \mathbf u - \mathbf u_{true} \Vert_\infty \\
\mbox{residual : } \quad r &= &\Vert \mathbf F - A \mathbf u \Vert_\infty
\end{eqnarray}

where $\mathbf u$ is the computed solution. 

Show that you are getting second order accuracy.   **Hint:** Show that the error is reduced by a factor of 4 each time you double $N$. 

* Plot your solution for $N = 64$. 
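
As a sketch of what the discretization and iteration steps amount to, assume that $A$ carries the $1/h^2$ factor of the centered difference, with $x_i = -1 + ih$ and $h = 2/N$. This scaling is one possible convention, not the only one. The discrete equation at node $i$ is then

\begin{equation}
\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2} = F_i, \qquad F_i = e^{-x_i^2}
\end{equation}

so every diagonal entry of $A$ equals $-2/h^2$, and the Jacobi update reads componentwise

\begin{equation}
u_i^{k+1} = u_i^k - \frac{h^2}{2}\left(F_i - \frac{u_{i-1}^k - 2u_i^k + u_{i+1}^k}{h^2}\right) = \frac{1}{2}\left(u_{i-1}^k + u_{i+1}^k - h^2 F_i\right)
\end{equation}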

#### Tips

* Solve for values $u_i = u(x_i)$, $x_i = -1 + ih$, $i = 0,1,\dots, N$ and $h = 2/N$. 

* Recall that the inf-norm $\Vert \cdot \Vert_\infty$ of a vector $\mathbf v$ is given by 

\begin{equation}
\Vert \mathbf v \Vert_\infty = \max_{i} |v_i|
\end{equation}

* To handle the boundary conditions, you can use the trick we discussed in class and assign values to $u_{-1}$ and $u_{N+1}$ using 

\begin{equation}
\frac{u_1 + u_{-1}}{2} = u_{true}(-1) = 0 \quad \Longrightarrow \quad u_{-1} = -u_{1}
\end{equation}

and 

\begin{equation}
\frac{u_{N-1} + u_{N+1}}{2} = u_{true}(1) = 0 \quad \Longrightarrow \quad u_{N+1} = -u_{N-1}
\end{equation}

Allocate space for the ghost values $u_{-1}$ and $u_{N+1}$. Before calling the matrix-vector multiply, make the assignments

    u[-1] = -u[1];
    u[N+1] = -u[N-1];
    ...
    matvec(N,u,Lu);

The matrix-vector routine can then be written as

    void matvec(int N, double *u, double *L)
    {
        for(int i = 0; i < N+1; i++)
            L[i] = (u[i-1] - 2*u[i] + u[i+1]); 
    }
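
The tips above can be prototyped in a few lines of NumPy before writing the C version. The following is a minimal sketch, not a reference solution: the names `jacobi_ghost` and `u_exact`, the zero initial guess, the iteration cap `kmax`, and the choice to fold the $1/h^2$ factor into $A$ are all assumptions made for illustration.

```python
import math
import numpy as np

def jacobi_ghost(N, f, tol=1e-12, kmax=200_000):
    """Matrix-free Jacobi for u'' = f on [-1,1] with u(-1) = u(1) = 0."""
    h = 2.0/N
    x = -1.0 + h*np.arange(N + 1)
    F = f(x)
    # Storage index j holds node i = j - 1, so w[0] and w[N+2] are ghosts
    w = np.zeros(N + 3)                 # zero initial guess (assumption)
    D = -2.0/h**2                       # diagonal of A (assumes A = L/h^2)
    for k in range(1, kmax + 1):
        w[0] = -w[2]                    # u_{-1}  = -u_1
        w[N + 2] = -w[N]                # u_{N+1} = -u_{N-1}
        Au = (w[:-2] - 2.0*w[1:-1] + w[2:])/h**2
        step = (F - Au)/D
        w[1:-1] += step                 # Au used only old values
        if np.max(np.abs(step)) < tol:
            break
    return x, w[1:-1].copy(), k

def u_exact(x):
    # c2 = 0 and c1 below make u(-1) = u(1) = 0
    c1 = -(math.sqrt(math.pi)*math.erf(1.0) + math.exp(-1.0))/2
    erf_x = np.array([math.erf(t) for t in x])
    return (math.sqrt(math.pi)*x*erf_x + np.exp(-x**2))/2 + c1

errors = {}
for N in (8, 16):
    x, u, k = jacobi_ghost(N, lambda t: np.exp(-t**2))
    errors[N] = np.max(np.abs(u - u_exact(x)))
    print(f"N = {N:3d}  iterations = {k:6d}  error = {errors[N]:.4e}")
print(f"error ratio : {errors[8]/errors[16]:.4f}")
```

Because the update is computed from `Au`, which depends only on the previous iterate, overwriting `w` in place is still a Jacobi sweep and not Gauss-Seidel.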

### Compute coefficients $c_1$, $c_2$

Check that $u(-1) = u(1) = 0$. 
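
One way to approach this step is to treat the two boundary conditions as a $2 \times 2$ linear system in $c_1$ and $c_2$. The sketch below is only an illustration of that idea. The helper names `g` and `boundary_coefficients` are hypothetical, and `math.erf` is used in place of `scipy.special.erf` to keep the example self-contained.

```python
import math
import numpy as np

def g(x):
    """Part of the true solution that does not involve c1 or c2."""
    return (math.sqrt(math.pi)*x*math.erf(x) + math.exp(-x**2))/2

def boundary_coefficients():
    # u(-1) = 0 :  c1 - c2 = -g(-1)
    # u( 1) = 0 :  c1 + c2 = -g( 1)
    M = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    rhs = np.array([-g(-1.0), -g(1.0)])
    c1, c2 = np.linalg.solve(M, rhs)
    return c1, c2

c1, c2 = boundary_coefficients()
print(f"c1 = {c1:.12f}, c2 = {c2:.12f}")

# Both boundary values should vanish up to rounding
for xb in (-1.0, 1.0):
    print(f"u({xb:+.0f}) = {g(xb) + c2*xb + c1:.2e}")
```

Since $x\,\mbox{erf}(x)$ and $e^{-x^2}$ are both even functions, the two right-hand sides coincide, which is why the linear term drops out.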

In [2]:
import scipy
from scipy.special import erf

def utrue(x,c1,c2):
    return (sqrt(pi)*x*erf(x) + exp(-x**2))/2 + c2*x + c1

In [25]:
from numpy.linalg import solve
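# Compute c1, c2.  The trial values below satisfy u(-1) = 0 only to
# about 1e-7 and do not satisfy u(1) = 0, so check both boundary values.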
utrue(-1,-1.4,-0.469236)  

-1.466018519913348e-07

In [4]:
%%file prob1.c

#include <stdio.h>
#include <stdlib.h>

#include <math.h>

// c1, c2 are fixed by u(-1) = u(1) = 0 :
//     c2 = 0,   c1 = -(sqrt(pi)*erf(1) + exp(-1))/2
double c1 = -0.93076385339815;
double c2 = 0.0;

double* allocate_1d(int n, int m)
{
    double *mem = (double*) malloc((n + 2*m)*sizeof(double));
    return &mem[m];
}

void free_1d(double **x, int m)
{
    free(&(*x)[-m]);
    *x = NULL;
}


double utrue(double x)
{
    double pi = M_PI;
    double utrue = (sqrt(pi)*x*erf(x) + exp(-x*x))/2.0 + c2*x + c1;
    return utrue;
    
}

double rhs(double x)
{
    double upp = exp(-x*x);
    return upp;
}

void matvec(int N, double *u, double *L)
{
    for(int i = 0; i < N+1; i++)
        L[i] = (u[i-1] - 2*u[i] + u[i+1]); 
}


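int jacobi(int N,double *F,double* u,double tol,int kmax, int prt)
{
    // A = L/h^2 on the mesh x_i = -1 + i*h, so every diagonal entry is D
    double h = 2.0/N;
    double D = -2.0/(h*h);
    double *Lu = allocate_1d(N+1,0);

    // # Jacobi iteration;  returns the number of sweeps performed
    int k;
    for(k = 0; k < kmax; k++)
    {
        // Ghost values impose u(-1) = u(1) = 0
        u[-1] = -u[1];
        u[N+1] = -u[N-1];
        matvec(N,u,Lu);

        // Lu was formed from u_k only, so u can be updated in place
        double err = 0;
        for(int i = 0; i < N+1; i++)
        {
            double dstep = (F[i] - Lu[i]/(h*h))/D;
            u[i] += dstep;
            err = fabs(dstep) > err ? fabs(dstep) : err;
        }
        if (prt)
            printf("%5d %12.4e\n",k,err);

        if (err < tol)
        {
            k++;
            break;
        }
    }
    free_1d(&Lu,0);
    return k;
}

int main(int argc, char** argv)
{
    if (argc < 2)
        return 1;       // Expects N on the command line
    int N = atoi(argv[1]);
    double a = -1.0;
    double b = 1.0;
    double h = (b-a)/N;

    // Nodes i = 0,...,N;  u also carries one ghost value at each end
    double *u = allocate_1d(N+1, 1);
    double *F = allocate_1d(N+1,0);
    double *Lu = allocate_1d(N+1,0);
    for(int i = 0; i < N+1; i++)
    {
        u[i] = 0;       // Initial guess (arbitrary)
        F[i] = rhs(a + i*h);
    }

    // The iteration cap of 10^6 is an arbitrary safeguard
    int iters = jacobi(N,F,u,1e-12,1000000,0);

    // Inf-norm of the error and of the residual F - Au
    u[-1] = -u[1];
    u[N+1] = -u[N-1];
    matvec(N,u,Lu);
    double err = 0, res = 0;
    for(int i = 0; i < N+1; i++)
    {
        err = fmax(err,fabs(u[i] - utrue(a + i*h)));
        res = fmax(res,fabs(F[i] - Lu[i]/(h*h)));
    }
    printf("N = %d\n",N);
    printf("Iteration count     : %12d\n",iters);
    printf("Error (inf-norm)    : %12.4e\n",err);
    printf("Residual (inf-norm) : %12.4e\n",res);

    // prob1.dat : N (int), a, b (double), u[0..N] (double)
    FILE *fout = fopen("prob1.dat","wb");
    fwrite(&N,sizeof(int),1,fout);
    fwrite(&a,sizeof(double),1,fout);
    fwrite(&b,sizeof(double),1,fout);
    fwrite(u,sizeof(double),N+1,fout);
    fclose(fout);

    free_1d(&u,1);
    free_1d(&F,0);
    free_1d(&Lu,0);
    return 0;
}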



Writing prob1.c


### Run code and get output (serial)

In [6]:
%%bash

rm -rf prob1.o prob1

mpicc -o prob1 prob1.c -lm

mpirun -n 1 ./prob1 64

### Accuracy of solution (serial)

In [7]:
# Errors from running at N=8,16,32,64, 128
# Iterations : 307, 1179, 4447, 16659, 62053
e = array([1.4834e-02, 3.6999e-03, 9.2446e-04, 2.3108e-04, 5.7765e-05]).reshape((5,1))

with printoptions(formatter={'float' : "{:.4f}".format}):
    print(log2(e[:-1]/e[1:]))


[[2.0033]
 [2.0008]
 [2.0002]
 [2.0001]]
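
The exponents printed above can be read as follows. Assuming the error behaves like $e(h) \approx C h^p$ for some constant $C$ (an assumption that holds only asymptotically as $h \to 0$), halving $h$ gives

\begin{equation}
\frac{e(h)}{e(h/2)} \approx \frac{C h^p}{C (h/2)^p} = 2^p \quad \Longrightarrow \quad p \approx \log_2 \frac{e(h)}{e(h/2)}
\end{equation}

For example, the first two errors give $1.4834 \times 10^{-2} / 3.6999 \times 10^{-3} \approx 4.009$ and so $p \approx 2.003$, which is the first entry of the output.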


### Check file size (serial)

In [None]:
import os

stats = os.stat("prob1.dat")
print(f"File size          : {stats.st_size:d} bytes")

fout = open("prob1.dat","rb")
# N is stored as a 4-byte integer
N = int(np.fromfile(fout,dtype=np.int32, count=1)[0])
fout.close()

esize = (N+1)*8 + 2*8 + 4
print(f"Expected file size : {esize:d} bytes")

In [None]:
dt = np.dtype([('N',np.int32),\
               ('a',np.float64), \
               ('b',np.float64),\
               ('u',(np.float64,N+1))])

fout = open("prob1.dat","rb")
N,a,b,u = np.fromfile(fout,dtype=dt, count=1)[0]
fout.close()
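
To see why the expected size is $(N+1) \cdot 8 + 2 \cdot 8 + 4$ bytes, it can help to write a record with the same layout from Python and read it back. This is a sketch under assumptions: the helper `record_dtype`, the file name `demo.dat` and the placeholder data are hypothetical, and the layout is taken to be packed with no padding between fields.

```python
import os
import tempfile
import numpy as np

def record_dtype(N):
    """Packed layout: int32 N, float64 a, float64 b, float64 u[0..N]."""
    return np.dtype([('N', np.int32), ('a', np.float64),
                     ('b', np.float64), ('u', (np.float64, N + 1))])

N = 8
rec = np.zeros(1, dtype=record_dtype(N))
rec['N'], rec['a'], rec['b'] = N, -1.0, 1.0
rec['u'] = np.linspace(0.0, 1.0, N + 1)    # placeholder data, not a solution

path = os.path.join(tempfile.mkdtemp(), "demo.dat")   # hypothetical file name
rec.tofile(path)

size = os.stat(path).st_size
expected = 4 + 2*8 + (N + 1)*8
print(f"File size          : {size:d} bytes")
print(f"Expected file size : {expected:d} bytes")

# Read N first (4 bytes), then build the dtype that depends on it
with open(path, "rb") as f:
    N_read = int(np.fromfile(f, dtype=np.int32, count=1)[0])
back = np.fromfile(path, dtype=record_dtype(N_read), count=1)[0]
print(back['N'], back['a'], back['b'], back['u'].shape)
```

Reading the leading integer with an 8-byte type would also consume the first half of `a`, so the integer width used when reading has to match the width used when writing.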

### Plot the solution (serial)

In [None]:
figure(1)
clf()

x = linspace(a,b,N+1);

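# Coefficients fixed by the boundary conditions u(-1) = u(1) = 0
c1, c2 = -(sqrt(pi)*erf(1) + exp(-1))/2, 0
ut = utrue(x,c1,c2)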

plot(x,u,'ro',ms=5,label='Computed solution');
plot(x,ut,'b-',lw=1,label='True solution');

kwargs = {'color' : 'k', 'lw' : 0.5, 'ls' : '-'}
axhline(**kwargs)
axvline(x=-1,**kwargs)
axvline(x=1,**kwargs)

legend(loc=4);

<a id="prob2"></a>

<hr style="border:3px solid coral"></hr>

## Problem #2 (parallel)

<hr style="border:3px solid coral"></hr>

Implement a parallel version of the solver you wrote in problem #1.
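
Before writing any MPI calls, the data movement can be checked with a single-process simulation. The sketch below is an assumption-laden illustration, not the required implementation: it mimics `P` ranks with a list of local arrays, copies neighbor values into ghost entries where an MPI code would exchange messages, and takes a plain `max` where an MPI code would use a reduction such as `MPI_Allreduce` with `MPI_MAX`. The name `parallel_jacobi_sim` and the block partition from `np.array_split` are hypothetical choices.

```python
import numpy as np

def parallel_jacobi_sim(N, P, f, tol=1e-12, kmax=200_000):
    """Jacobi on [-1,1] with the nodes split across P simulated ranks."""
    h = 2.0/N
    x = -1.0 + h*np.arange(N + 1)
    owned = np.array_split(np.arange(N + 1), P)    # node indices per rank
    assert all(len(idx) >= 2 for idx in owned)     # needed for the reflection
    F = [f(x[idx]) for idx in owned]
    w = [np.zeros(len(idx) + 2) for idx in owned]  # one ghost at each end
    D = -2.0/h**2
    for k in range(1, kmax + 1):
        # Ghost exchange; only ghost entries are written in this loop
        for p in range(P):
            w[p][0] = w[p - 1][-2] if p > 0 else -w[p][2]
            w[p][-1] = w[p + 1][1] if p < P - 1 else -w[p][-3]
        # Local sweep on every rank, then a global max of the step sizes
        local_err = []
        for p in range(P):
            Au = (w[p][:-2] - 2.0*w[p][1:-1] + w[p][2:])/h**2
            step = (F[p] - Au)/D
            w[p][1:-1] += step
            local_err.append(np.max(np.abs(step)))
        if max(local_err) < tol:
            break
    return x, np.concatenate([wp[1:-1] for wp in w]), k

rhs = lambda t: np.exp(-t**2)
x1, u1, k1 = parallel_jacobi_sim(16, 1, rhs)
x4, u4, k4 = parallel_jacobi_sim(16, 4, rhs)
print(f"iterations : {k1} (P=1), {k4} (P=4)")
print(f"max difference : {np.max(np.abs(u1 - u4)):.2e}")
```

The point of the comparison is that the partition should not change the iterates: each rank needs only one value from each neighbor per sweep, and every rank must see the same global step size so that all of them stop on the same iteration.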


In [None]:
%%file prob2.c

// # Parallel Jacobi solver

Compare your results to the serial results:

    N = 8
    Iteration count     :          307
    Error (inf-norm)    :   1.4834e-02
    Residual (inf-norm) :   9.0417e-13    
    
    N = 32
    Iteration count     :         4447
    Error (inf-norm)    :   9.2446e-04
    Residual (inf-norm) :   9.9720e-13    

In [None]:
%%bash

rm -rf prob2.o prob2

mpicc -o prob2 prob2.c -lm

mpirun -n 4 ./prob2 64

### Accuracy of solution (parallel)

In [None]:
# Errors from running at N=8, 16, 32, 64, 128

# Report errors

### Check file size

In [None]:
import os

stats = os.stat("prob2.out")
print(f"File size          : {stats.st_size:d} bytes")

fout = open("prob2.out","rb")
# N is stored as a 4-byte integer
N = int(np.fromfile(fout,dtype=np.int32, count=1)[0])
fout.close()

esize = (N+1)*8 + 2*8 + 4
print(f"Expected file size : {esize:d} bytes")

In [None]:
dt = np.dtype([('N',np.int32),\
               ('a',np.float64), \
               ('b',np.float64),\
               ('u',(np.float64,N+1))])

fout = open("prob2.out","rb")
N,a,b,u = np.fromfile(fout,dtype=dt, count=1)[0]
fout.close()

### Plot the solution

In [None]:
figure(1)
clf()

x = linspace(a,b,N+1);

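# Coefficients fixed by the boundary conditions u(-1) = u(1) = 0
c1, c2 = -(sqrt(pi)*erf(1) + exp(-1))/2, 0
ut = utrue(x,c1,c2)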

plot(x,u,'ro',ms=5,label='Computed solution');
plot(x,ut,'b-',lw=1,label='True solution');

kwargs = {'color' : 'k', 'lw' : 0.5, 'ls' : '-'}
axhline(**kwargs)
axvline(x=-1,**kwargs)
axvline(x=1,**kwargs)

legend(loc=4);

<a id="prob3"></a>

<hr style="border:3px solid coral"></hr>

## Problem #3

<hr style="border:3px solid coral"></hr>

TBD. 