In [2]:
%matplotlib notebook
%pylab

Using matplotlib backend: nbAgg
Populating the interactive namespace from numpy and matplotlib


<hr style="border-width:4px; border-color:coral; border-style:solid"/><br/>

## Check accuracy

<hr style="border-width:4px; border-color:coral; border-style:solid"/>

This notebook shows one way to check the accuracy of your MPI runs.  In this example, we create files `trap_01.out`, `trap_02.out`, `trap_04.out`, and so on for runs on 1, 2, 4, etc. processors. You may need to modify the code so that it works for your setup.

This code assumes that you are writing your output in binary.  The advantage of binary output is that it stores the full precision of the data in the smallest possible file size.

See the notebook 'binary_output' for details on creating binary files. 
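
As a minimal sketch of what reading such a file involves, the following round trip writes two records with NumPy and reads them back. The record layout (a 32-bit integer `N` followed by a double-precision error, with no padding) is assumed to match what your MPI program writes; the file name and the values are hypothetical.

```python
import os
import tempfile

import numpy as np

# Assumed record layout: 32-bit integer N, then a 64-bit float error.
# NumPy packs the fields without padding, so each record is 12 bytes.
dt = np.dtype([('N', 'int32'), ('Error', 'd')])

# Hypothetical records standing in for program output
records = np.array([(16, 0.0625), (32, 0.03125)], dtype=dt)

# Write the raw bytes (no header) to a temporary file
path = os.path.join(tempfile.mkdtemp(), 'example_01.out')
records.tofile(path)

# Read the records back with the same dtype
data = np.fromfile(path, dtype=dt)
print(data['N'], data['Error'])
print(os.path.getsize(path), 'bytes for', len(data), 'records')
```

If your program writes a C `struct` in a single call, the compiler may insert padding between the fields, and the dtype would need to account for that.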

In [3]:
# Range of processor counts
procs = [1,2,4,8]   # set to 1,2,4, and 8

# Range of N values: 2**24, ..., 2**28 (matches the index shown below)
Nvec = 2**array(range(24,29))

# Name of executable
exec_file = 'integral'

We'll use a Pandas DataFrame and a MultiIndex to store the data for all the runs.  This will allow us to easily compare errors across different processors.

A Pandas `MultiIndex` lets us store multi-dimensional data in a two-dimensional table.  Below, we store one entry for each tuple (P,N), where P is the processor count and N is the problem size.  The entry that we store is the error.
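
To see how the (P,N) indexing behaves before using it on real data, here is a small sketch with two processor counts and two values of N. The stored values (`1/N`) are placeholders for illustration only, not results from any run.

```python
import pandas

procs = [1, 2]
Nvec = [16, 32]

# One row for every (P, N) pair
index = pandas.MultiIndex.from_product([procs, Nvec], names=['Proc', 'N'])
df = pandas.DataFrame(index=index, columns=['Error'], dtype='float64')

# Fill one entry per (P, N) tuple with a placeholder value
for P in procs:
    for N in Nvec:
        df.loc[(P, N), 'Error'] = 1.0 / N

# Select all rows for a single processor count
print(df.loc[2])

# Move the 'N' level into the columns: one row per P, one column per N
table = df.unstack('N')
print(table)
```

The `unstack` call produces the same layout used later in this notebook to compare errors across processor counts.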

In [4]:
import pandas

idx = pandas.IndexSlice

index = pandas.MultiIndex.from_product([procs,Nvec],names=['Proc','N'])
cols = ['Error']

df_error = pandas.DataFrame(index=index,columns = cols).sort_index()
index

MultiIndex([(1,  16777216),
            (1,  33554432),
            (1,  67108864),
            (1, 134217728),
            (1, 268435456),
            (2,  16777216),
            (2,  33554432),
            (2,  67108864),
            (2, 134217728),
            (2, 268435456),
            (4,  16777216),
            (4,  33554432),
            (4,  67108864),
            (4, 134217728),
            (4, 268435456),
            (8,  16777216),
            (8,  33554432),
            (8,  67108864),
            (8, 134217728),
            (8, 268435456)],
           names=['Proc', 'N'])

We run all the jobs above and collect the data in a Pandas DataFrame.   For the same N, the errors should be identical (up to rounding error) regardless of the number of processors.

In [5]:
import subprocess
import shlex
import os

# Output file
filename = '{fexec:s}_{np:02}.out'.format

# mpirun command
shell_cmd = 'mpirun -n {np:d} {fexec:s} {N:d}'.format

dt = dtype([('N','int32'), ('Error','d')])      

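for nprocs in procs:    # avoid the name 'np', which %pylab binds to numpy
    output_fname = filename(fexec=exec_file,np=nprocs)    
    try:
        # Remove output left over from a previous session; the program
        # is assumed to append one record to this file per run.
        os.remove(output_fname)
    except OSError as error: 
        #print(error)
        pass
    for N in Nvec:
        cmd = shell_cmd(np=nprocs,fexec=exec_file,N=N)
        arg_list = shlex.split(cmd) 
        # check=True raises an exception if mpirun returns an error
        subprocess.run(arg_list,check=True)
        
    # Read the records (N, Error) for all values of N
    with open(output_fname,"rb") as fout:
        data = fromfile(fout,dtype=dt)

    df = pandas.DataFrame(data)
    df_error.loc[(nprocs,),:] = df[cols].values
    
# Convert the 'Error' column from dtype 'object' to double precision
df_error['Error'] = df_error['Error'].astype('double')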
df_error.unstack(1).style.format('{:.12e}'.format)

**Error**

| Proc \ N | 16777216 | 33554432 | 67108864 | 134217728 | 268435456 |
|---|---|---|---|---|---|
| 1 | 2.980241736728e-08 | 1.490111428692e-08 | 7.450461303460e-09 | 3.725175834468e-09 | 1.862495213611e-09 |
| 2 | 2.980233948513e-08 | 1.490112261360e-08 | 7.450568217937e-09 | 3.725276531696e-09 | 1.862684950726e-09 |
| 4 | 2.980232244321e-08 | 1.490114170943e-08 | 7.450572547807e-09 | 3.725304398294e-09 | 1.862634269045e-09 |
| 8 | 2.980233276828e-08 | 1.490116358083e-08 | 7.450585759461e-09 | 3.725276254141e-09 | 1.862649257056e-09 |
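
One possible way to quantify "identical up to rounding error" is to compute, for each N, the relative spread of the errors across processor counts. The sketch below uses the values copied from the table above; the tolerance `rtol` is an assumed threshold, not a value prescribed by this notebook.

```python
import numpy as np

# Errors copied from the table above.
# Rows: P = 1, 2, 4, 8.  Columns: N = 2**24, ..., 2**28.
errors = np.array([
    [2.980241736728e-08, 1.490111428692e-08, 7.450461303460e-09,
     3.725175834468e-09, 1.862495213611e-09],
    [2.980233948513e-08, 1.490112261360e-08, 7.450568217937e-09,
     3.725276531696e-09, 1.862684950726e-09],
    [2.980232244321e-08, 1.490114170943e-08, 7.450572547807e-09,
     3.725304398294e-09, 1.862634269045e-09],
    [2.980233276828e-08, 1.490116358083e-08, 7.450585759461e-09,
     3.725276254141e-09, 1.862649257056e-09],
])

# Relative spread across processor counts, one value per N
spread = (errors.max(axis=0) - errors.min(axis=0)) / errors.mean(axis=0)
for k, s in zip(range(24, 29), spread):
    print('N = 2**{:d}: relative spread = {:.1e}'.format(k, s))

rtol = 1e-3   # assumed tolerance; adjust for your problem
print('All runs agree to within rtol:', bool(np.all(spread < rtol)))
```

For this data the spread grows with N. That is what one would expect if the differences come from rounding in the partial sums, since each processor count adds the terms in a different order.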


In [None]:
figure(1)
clf()

# Plot results from the 1-processor run (all runs agree up to rounding error)
df_plot = df_error.loc[idx[1],:]
err = df_plot['Error'].values.astype('d')

loglog(Nvec,err,'.-',markersize=15,label='Error')

# Add best-fit line; its slope measures the convergence rate
nv = array(Nvec).astype('d')
c = polyfit(log(nv),log(err),1)
loglog(nv,exp(polyval(c,log(nv))),'r*-', markersize=8,
         label='Best-fit line (slope={:6.2f})'.format(c[0]),linewidth=1)

# Add title, xlabel, ylabel, xticks and a legend
def fix_xticks(Nvec):
    p0 = log2(Nvec[0])
    p1 = log2(Nvec[-1])
    xlim([2**(p0-0.5), 2**(p1+0.5)])
    
    # Make nice tick marks
    pstr = (['{:d}'.format(int(N)) for N in Nvec])
    xticks(Nvec,pstr)

fix_xticks(Nvec)  # Need numpy array, not a Pandas 'Series'
xlabel("N",fontsize=16)
ylabel("Error",fontsize=16)
title("Errors in Left-endpoint method",fontsize=18)
legend()
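
The slope of the best-fit line can also be checked without a plot. As a rough sketch, the code below takes the 1-processor errors from the table above and estimates the observed order of convergence two ways: from the ratio of successive errors, and from a least-squares fit in log-log space.

```python
import numpy as np

# N values and 1-processor errors copied from the table above
N = 2.0 ** np.arange(24, 29)
err = np.array([2.980241736728e-08, 1.490111428692e-08, 7.450461303460e-09,
                3.725175834468e-09, 1.862495213611e-09])

# Observed order between successive runs: p = log2(e(N) / e(2N))
p = np.log2(err[:-1] / err[1:])
print('Observed order:', p)

# Least-squares slope of log(err) versus log(N)
slope = np.polyfit(np.log(N), np.log(err), 1)[0]
print('Best-fit slope: {:.4f}'.format(slope))
```

An observed order close to 1 (a slope close to -1) means the error halves each time N doubles, which is the first-order behavior expected of the left-endpoint method.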