# libCEED tutorial

This tutorial walks through examples of using libCEED for efficient operator evaluation when solving PDEs. To keep concerns separated, the examples rely on PETSc for mesh handling and, where the PDE requires it, for time integration.

In [1]:
# This is a Python code cell.
# Any line in the cell that starts with ! is passed to the system shell and executed as a command.

# Let's build libCEED:

! make -B

/bin/sh: 1: gfortran: not found
/bin/sh: 1: gfortran: not found
make: 'lib' with optional backends: 
          CC build/interface/ceed-fortran.o
          CC build/interface/ceed-basis.o
          CC build/interface/ceed-elemrestriction.o
          CC build/interface/ceed-operator.o
          CC build/interface/ceed-vector.o
          CC build/interface/ceed.o
          CC build/interface/ceed-tensor.o
          CC build/interface/ceed-qfunction.o
          CC build/interface/ceed-types.o
          CC build/gallery/identity/ceed-identity.o
          CC build/gallery/poisson3d/ceed-poisson3dapply.o
          CC build/gallery/poisson3d/ceed-poisson3dbuild.o
          CC build/gallery/mass1d/ceed-massapply.o
          CC build/gallery/mass1d/ceed-mass1dbuild.o
          C

In [2]:
# Now let's build the examples that use PETSc, from the examples/petsc directory:

! make -B PETSC_DIR=/srv/conda/envs/notebook -C examples/petsc

make: Entering directory '/home/jovyan/examples/petsc'
mpicc -std=c99 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /srv/conda/envs/notebook/include -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /srv/conda/envs/notebook/include   -I/srv/conda/envs/notebook/include -I/home/jovyan/include -L/srv/conda/envs/notebook/lib -L/home/jovyan/lib  -Wl,-rpath,/srv/conda/envs/notebook/lib  -Wl,-rpath,/home/jovyan/lib  /home/jovyan/examples/petsc/area.c -o area \
  -lpetsc -lceed -lm
mpicc -std=c99 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /srv/conda/envs/notebook/include -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /srv/conda/envs/notebook/include   -I/srv/conda/envs/notebook/include -I/home/jovyan/include -L/srv/conda/envs/notebook/lib -L/home/jovyan/lib  -Wl,-rpath,/srv/conda/envs/notebook/lib  -Wl,-rpath,/home/jovyan/lib  /home/jovyan/

In [3]:
# Link the executables from the current directory to make it easy to run below

! cp -sf examples/petsc/bpsraw .
! cp -sf examples/petsc/bps .
! cp -sf examples/petsc/bpssphere .
! cp -sf examples/petsc/area .
! cp -sf examples/petsc/multigrid .
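
The same links could also be created from Python instead of the shell. The following is a minimal sketch, not part of the libCEED repository: the helper name `link_here` is hypothetical, and unlike `cp -s` with a relative source it writes absolute link targets. It runs against stand-in files in a scratch directory so that nothing real is modified.

```python
import tempfile
from pathlib import Path

def link_here(sources, dest_dir="."):
    """Create (or replace) a symlink in dest_dir for each source path."""
    dest_dir = Path(dest_dir)
    links = []
    for src in map(Path, sources):
        link = dest_dir / src.name
        if link.is_symlink() or link.exists():
            link.unlink()  # mimic the -f (force) flag of cp
        link.symlink_to(src.resolve())
        links.append(link)
    return links

# Demonstration with empty stand-in files in a temporary directory.
names = ["bpsraw", "bps", "bpssphere", "area", "multigrid"]
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "examples" / "petsc"
    build.mkdir(parents=True)
    for name in names:
        (build / name).touch()
    created = link_here([build / n for n in names], dest_dir=tmp)
    all_links = all(p.is_symlink() for p in created)
    print(sorted(p.name for p in created), all_links)
```

In the notebook itself, calling `link_here` with the five paths under `examples/petsc` and the default `dest_dir` would play the role of the `cp -sf` lines above.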

## BPs

The Center for Efficient Exascale Discretizations (CEED), part of the Exascale Computing Project (ECP), uses Benchmark Problems (BPs) to test and compare the performance of high-order finite element implementations.

In [4]:
# Let's run bpsraw. This example uses a structured rectangular grid.

! ./bpsraw -ceed /cpu/self/ref/serial -problem bp3 -degree 1


-- CEED Benchmark Problem 3 -- libCEED + PETSc --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/ref/serial
    libCEED Backend MemType            : host
    libCEED User Requested MemType     : none
  Mesh:
    Number of 1D Basis Nodes (P)       : 2
    Number of 1D Quadrature Points (Q) : 3
    Global nodes                       : 1331
    Process Decomposition              : 1 1 1
    Local Elements                     : 1000 = 10 10 10
    Owned nodes                        : 1331 = 11 11 11
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 2
    Final rnorm                        : 7.646897e-15
  Performance:
    Pointwise Error (max)              : 1.112075e-01
    CG Solve Time                      : 0.0072521 (0.0072521) sec
    DoFs/Sec in CG                     : 0.367
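
The `DoFs/Sec in CG` value printed above appears to be in millions of degrees of freedom per second. The sketch below assumes it is computed as global nodes × DoF per node × KSP iterations ÷ CG solve time. That formula is inferred from the printed numbers, not taken from the `bpsraw` source, and the function name `mdofs_per_sec` is hypothetical.

```python
def mdofs_per_sec(global_nodes, dof_per_node, ksp_iterations, solve_time_sec):
    """Millions of DoFs processed per second of CG solve time (assumed formula)."""
    return 1e-6 * global_nodes * dof_per_node * ksp_iterations / solve_time_sec

# Values copied from the serial bp3 run above.
rate = mdofs_per_sec(global_nodes=1331, dof_per_node=1,
                     ksp_iterations=2, solve_time_sec=0.0072521)
print(f"{rate:.3f}")  # 0.367
```

Under this assumption the result matches the reported 0.367, which suggests the metric counts every degree of freedom once per CG iteration.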

In [5]:
# Let's run in parallel

! mpiexec -n 2 ./bpsraw -ceed /cpu/self/ref/serial -problem bp3 -degree 1


-- CEED Benchmark Problem 3 -- libCEED + PETSc --
  PETSc:
    PETSc Vec Type                     : mpi
  libCEED:
    libCEED Backend                    : /cpu/self/ref/serial
    libCEED Backend MemType            : host
    libCEED User Requested MemType     : none
  Mesh:
    Number of 1D Basis Nodes (P)       : 2
    Number of 1D Quadrature Points (Q) : 3
    Global nodes                       : 2541
    Process Decomposition              : 2 1 1
    Local Elements                     : 1000 = 10 10 10
    Owned nodes                        : 1210 = 10 11 11
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 2
    Final rnorm                        : 8.189562e-15
  Performance:
    Pointwise Error (max)              : 1.091383e-01
    CG Solve Time                      : 0.0102704 (0.00936765) sec
    DoFs/Sec in CG                     : 0.49
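
The `Global nodes` counts in the two structured runs above follow from the element grid and the polynomial degree. The sketch below assumes a tensor-product hexahedral grid in which neighboring elements share the nodes on their common faces. The helper name `structured_nodes` is hypothetical and is not part of libCEED or PETSc.

```python
def structured_nodes(elements_per_dim, degree):
    """Count the nodes of a structured grid whose elements share face nodes."""
    total = 1
    for n in elements_per_dim:
        total *= n * degree + 1  # n elements of this degree give n*degree + 1 nodes per direction
    return total

# Serial run: 10 x 10 x 10 elements, degree 1 (P = 2 nodes per element edge).
serial_nodes = structured_nodes((10, 10, 10), degree=1)
# Two ranks in a 2 x 1 x 1 decomposition with 10 x 10 x 10 elements each: 20 x 10 x 10 overall.
parallel_nodes = structured_nodes((20, 10, 10), degree=1)
print(serial_nodes, parallel_nodes)  # 1331 2541
```

The `Owned nodes` line of the parallel run, 1210 = 10 × 11 × 11, would be consistent with the plane of 11 × 11 nodes on the rank interface being owned by the other rank.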

In [6]:
# Let's run it with a blocked backend

! mpiexec -n 2 ./bpsraw -ceed /cpu/self/ref/blocked -problem bp3 -degree 1


-- CEED Benchmark Problem 3 -- libCEED + PETSc --
  PETSc:
    PETSc Vec Type                     : mpi
  libCEED:
    libCEED Backend                    : /cpu/self/ref/blocked
    libCEED Backend MemType            : host
    libCEED User Requested MemType     : none
  Mesh:
    Number of 1D Basis Nodes (P)       : 2
    Number of 1D Quadrature Points (Q) : 3
    Global nodes                       : 2541
    Process Decomposition              : 2 1 1
    Local Elements                     : 1000 = 10 10 10
    Owned nodes                        : 1210 = 10 11 11
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 2
    Final rnorm                        : 8.159318e-15
  Performance:
    Pointwise Error (max)              : 1.091383e-01
    CG Solve Time                      : 0.00500435 (0.00356853) sec
    DoFs/Sec in CG 

### Running a suite of BPs
We can run `bps`, which uses unstructured meshes, in batch mode, so that `mpiexec` is invoked only once and noise is minimized.

In [7]:
# Let's run bps. This example uses an unstructured grid

! mpiexec -n 4 ./bps -problem bp3 -degree 2,4,6       \
    -ceed /cpu/self/opt/serial,/cpu/self/opt/blocked  \
    -local_nodes 600,5000 | tee bps.log


-- CEED Benchmark Problem 3 -- libCEED + PETSc --
  MPI:
    Hostname                           : jupyter-ceed-2dlibceed-2dk4wrq43m
    Total ranks                        : 4
    Ranks per compute node             : 4
  PETSc:
    PETSc Vec Type                     : mpi
  libCEED:
    libCEED Backend                    : /cpu/self/opt/serial
    libCEED Backend MemType            : host
    libCEED User Requested MemType     : none
  Mesh:
    Number of 1D Basis Nodes (P)       : 3
    Number of 1D Quadrature Points (Q) : 4
    Global nodes                       : 1881
    Local Elements                     : 71
    Owned nodes                        : 367
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 8
    Final rnorm                        : 8.888028e-12
  Performance:
    Pointwise Error (max)              : 6.605839e-03
    CG Solve Time

In [8]:
! ls

area		     build		 include	 python
AUTHORS		     ceed.pc.template	 interface	 README.rst
azure-pipelines.yml  CODE_OF_CONDUCT.md  Jenkinsfile	 requirements-gpu.txt
backends	     CONTRIBUTING.md	 lib		 requirements-test.txt
benchmarks	     doc		 LICENSE	 requirements.txt
bps		     Doxyfile		 Makefile	 setup.py
bps.log		     environment.yml	 multigrid	 tests
bpsraw		     examples		 NOTICE		 Tutorial.ipynb
bpssphere	     gallery		 pyproject.toml


### Plotting the BPs performance summary
You can open a new notebook by clicking on the File -> Open menu. Navigate to the `benchmarks` directory and select the `postprocess_altair.ipynb` notebook. This will open a Jupyter Notebook that uses `altair`, a package for interactive visualization.
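
For a quick look at `bps.log` without opening the plotting notebook, the log can also be read directly. The following is a minimal sketch, assuming each run starts with a `-- CEED Benchmark Problem N --` header and reports its fields as `label : value` lines, as in the output shown earlier. The function `parse_bps_log` is hypothetical and is not the parser used by `postprocess_altair.ipynb`.

```python
import re

HEADER = re.compile(r"^-- CEED Benchmark Problem (\d+) --")
FIELD = re.compile(r"^\s+(.+?)\s+:\s+(.*)$")

def parse_bps_log(text):
    """Split benchmark output into one dict of label -> value strings per run."""
    runs = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A header line opens a new run record.
            runs.append({"problem": "bp" + header.group(1)})
            continue
        field = FIELD.match(line)
        if field and runs:
            runs[-1][field.group(1)] = field.group(2).strip()
    return runs

# A short excerpt in the format printed by the bps run earlier in this tutorial.
sample = """\
-- CEED Benchmark Problem 3 -- libCEED + PETSc --
  libCEED:
    libCEED Backend                    : /cpu/self/opt/serial
  Mesh:
    Global nodes                       : 1881
  KSP:
    Total KSP Iterations               : 8
"""
runs = parse_bps_log(sample)
print(runs[0]["libCEED Backend"], runs[0]["Global nodes"])
```

To process the real log, pass the file contents instead of `sample`, for example `parse_bps_log(open("bps.log").read())`. Section lines such as `Mesh:` are skipped because they carry no value.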