Iteration methods in parallel
===

In [1]:
from ipyparallel import Cluster
c = await Cluster(engines="mpi").start_and_connect(n=4, activate=True)
c.ids

Starting 4 engines with <class 'ipyparallel.cluster.launcher.MPIEngineSetLauncher'>


[0, 1, 2, 3]

In [2]:
%%px
from mpi4py import MPI
comm = MPI.COMM_WORLD

from ngsolve import *

mesh = Mesh(unit_square.GenerateMesh(maxh=0.1, comm=comm))

Set up a standard problem in parallel. We use a Jacobi preconditioner: it extracts the diagonal of each rank's local matrix and cumulates the entries over dofs that are identified across ranks (shared dofs).
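
The cumulation of local diagonals can be illustrated without MPI. The following is a minimal sketch under stated assumptions, not NGSolve code: two hypothetical "ranks" are modelled as plain Python lists, global dof 1 is shared between them, and the helper `cumulate` is a made-up name that stands in for the MPI exchange.

```python
# Toy model, no MPI: two "ranks" share global dof 1.
# local_to_global[p][i] is the global number of local dof i on rank p.
local_to_global = [[0, 1], [1, 2]]

# Local (sub-assembled) matrices; their sum is the global tridiag(-1, 2, -1).
local_mats = [
    [[2.0, -1.0], [-1.0, 1.0]],   # rank 0
    [[1.0, -1.0], [-1.0, 2.0]],   # rank 1
]

def cumulate(local_vals, local_to_global, ndof):
    """Sum all rank contributions per global dof and give every rank the total."""
    total = [0.0] * ndof
    for vals, l2g in zip(local_vals, local_to_global):
        for v, g in zip(vals, l2g):
            total[g] += v
    return [[total[g] for g in l2g] for l2g in local_to_global]

# Each rank extracts its local diagonal ...
local_diags = [[m[i][i] for i in range(len(m))] for m in local_mats]
# ... and the shared entries are summed, so both ranks see the global diagonal.
jacobi_diags = cumulate(local_diags, local_to_global, ndof=3)
print(jacobi_diags)
```

On the shared dof each rank only holds part of the diagonal entry; after the cumulation both ranks hold the full value of the global matrix diagonal.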

In [3]:
%%px
fes = H1(mesh, dirichlet='.*')
u,v = fes.TnT()
a = BilinearForm(grad(u)*grad(v)*dx).Assemble()
f = LinearForm(1*v*dx).Assemble()
gfu = GridFunction(fes)

pre = Preconditioner(a, "local") # Jacobi preconditioner
pre.Update()

## Richardson iteration

The right-hand side vector `f.vec` is distributed, while the vector `gfu.vec` of the grid function is consistent. We create helper vectors of the same types, a distributed `r` and a consistent `w`. In each step:

* the residual calculation `r = f - A u` is purely local
* the preconditioning step `w = pre * r` includes the type conversion from a distributed input to a consistent output
* the inner product used as the error measure acts on two vectors of opposite type
* the solution vector update is purely local
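
The four steps above can be traced in a small simulation. This is only a sketch under assumptions, not the NGSolve implementation: two hypothetical "ranks" are plain Python lists, global dof 1 is shared, and `cumulate`, `matvec` and `inner` are illustrative helper names, with `cumulate` standing in for the MPI communication.

```python
# Toy model, no MPI: two "ranks", global dofs 0, 1, 2, dof 1 is shared.
l2g = [[0, 1], [1, 2]]
A_loc = [[[2.0, -1.0], [-1.0, 1.0]],   # rank 0 part of tridiag(-1, 2, -1)
         [[1.0, -1.0], [-1.0, 2.0]]]   # rank 1 part
f_loc = [[1.0, 0.5], [0.5, 1.0]]       # distributed: the parts sum to f = (1, 1, 1)
u_loc = [[0.0, 0.0], [0.0, 0.0]]       # consistent: same value on both ranks at dof 1

def cumulate(x_loc):
    """Distributed -> consistent: add up all contributions per global dof."""
    total = [0.0] * 3
    for x, m in zip(x_loc, l2g):
        for xi, g in zip(x, m):
            total[g] += xi
    return [[total[g] for g in m] for m in l2g]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def inner(w_loc, r_loc):
    """Consistent times distributed: local dot products, then one global sum."""
    return sum(sum(wi * ri for wi, ri in zip(w, r)) for w, r in zip(w_loc, r_loc))

# Jacobi diagonal: local diagonals, cumulated over the shared dof.
diag = cumulate([[A[i][i] for i in range(2)] for A in A_loc])

errs = []
for it in range(100):
    # residual: purely local, the result is distributed
    r_loc = [[fi - ai for fi, ai in zip(f, matvec(A, u))]
             for f, A, u in zip(f_loc, A_loc, u_loc)]
    # preconditioner: cumulate r, then scale by the consistent diagonal
    w_loc = [[ri / di for ri, di in zip(r, d)]
             for r, d in zip(cumulate(r_loc), diag)]
    # error measure: pairs the consistent w with the distributed r
    errs.append(inner(w_loc, r_loc))
    # update: purely local, u stays consistent
    u_loc = [[ui + wi for ui, wi in zip(u, w)] for u, w in zip(u_loc, w_loc)]

print(u_loc)
```

Pairing a consistent with a distributed vector counts every shared dof exactly once, which is why the inner product needs no type conversion, only the final sum over ranks.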

In [5]:
%%px
r = f.vec.CreateVector()    # a distributed vector
w = gfu.vec.CreateVector()  # a consistent vector

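for it in range(100):
    r.data = f.vec - a.mat * gfu.vec   # residual, purely local
    w.data = pre * r                   # distributed -> consistent
    err = InnerProduct(w, r)           # consistent times distributed
    if comm.rank == 0:
        print(err)
    gfu.vec.data += w                  # purely local update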

[stdout:0] 3.510857393881553e-08
3.151415212035045e-08
2.8287727826656105e-08
2.539162540713897e-08
2.2792026451000954e-08
2.045857488233031e-08
1.8364022485610737e-08
1.648391072239753e-08
1.4796285123687831e-08
1.3281438922841584e-08
1.1921682935403925e-08
1.0701138998692319e-08
9.605554559063916e-09
8.62213624177576e-09
7.739400459999687e-09
6.947039318498569e-09
6.235800246133606e-09
5.59737795157157e-09
5.024317440643236e-09
4.509926962761598e-09
4.048199869904571e-09
3.633744475711653e-09
3.261721095588732e-09
2.9277855326830464e-09
2.6280383497562194e-09
2.358979334625724e-09
2.117466627425838e-09
1.900680032444471e-09
1.7060880860895808e-09
1.5314184964397437e-09
1.374631609213285e-09
1.233896590281844e-09
1.1075700466267946e-09
9.94176836087894e-10
8.923928418117022e-10
8.010295102531026e-10
7.190199721849927e-10
6.454065846303404e-10
5.793297482668886e-10
5.20017869695728e-10
4.667783513823313e-10
4.189895040471314e-10
3.760932870637924e-10
3.375887921017861e-10
3.03026393909

In [6]:
from ngsolve.webgui import Draw

gfu = c[:]['gfu']
Draw(gfu[0]);

WebGuiWidget(layout=Layout(height='50vh', width='100%'), value={'gui_settings': {}, 'ngsolve_version': '6.2.23…

Other iterative methods, such as [Conjugate Gradients](https://github.com/NGSolve/ngsolve/blob/124068e63765a679f9b6b85083671d5df9f2b085/python/krylovspace.py#L182), work very similarly.
The matrix operation goes from consistent to distributed without communication, and the preconditioner does the cumulation. The inner products are between vectors of different types:

In [None]:
%%px
from ngsolve.krylovspace import CGSolver
inv = CGSolver(a.mat, pre, printrates=comm.rank==0)

gfu.vec.data = inv * f.vec
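
To show where the vector types enter in conjugate gradients, here is a serial, textbook-style sketch of preconditioned CG. It is not the code of `CGSolver`; the names `pcg`, `matvec` and `precond` are made up for this illustration, and the comments only indicate which type each vector would have in a parallel run.

```python
def pcg(matvec, precond, f, tol=1e-12, maxit=100):
    """Preconditioned CG for an SPD operator, starting from u = 0."""
    dot = lambda x, y: sum(xi * yi for xi, yi in zip(x, y))
    u = [0.0] * len(f)
    r = list(f)                  # residual (distributed in parallel)
    w = precond(r)               # preconditioned residual (consistent)
    s = list(w)                  # search direction (consistent)
    wr = wr0 = dot(w, r)         # consistent times distributed
    for it in range(1, maxit + 1):
        As = matvec(s)           # consistent -> distributed, no communication
        alpha = wr / dot(s, As)  # consistent times distributed
        u = [ui + alpha * si for ui, si in zip(u, s)]
        r = [ri - alpha * qi for ri, qi in zip(r, As)]
        w = precond(r)           # the cumulation happens here
        wr_new = dot(w, r)
        if wr_new <= tol**2 * wr0:
            return u, it
        s = [wi + (wr_new / wr) * si for wi, si in zip(w, s)]
        wr = wr_new
    return u, maxit

# Small test problem: tridiag(-1, 2, -1) with a Jacobi preconditioner.
n = 5

def matvec(x):
    return [2 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]

def precond(r):
    return [ri / 2.0 for ri in r]

u, its = pcg(matvec, precond, [1.0] * n)
print(its, u)
```

Every inner product in the loop pairs a vector that would be consistent with one that would be distributed, so in a parallel run each of them would reduce to local dot products followed by a global sum.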

In [None]:
gfu = c[:]['gfu']
Draw(gfu[0]);