<a href="https://colab.research.google.com/github/e-hengirmen/utility-tools/blob/master/Colab_Hybrid_MPI%2BOpenMP__introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Introduction to Hybrid MPI+OpenMP Parallel Programming 
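The area of a quarter of the unit circle is $\pi/4$. Similarly, $\pi$ can be obtained from the integral below, which we approximate with the midpoint rule on $N$ subintervals, where $x_i = (i+0.5)/N$:
$$\pi = \int_0^1 \frac{4}{1+x^2}\, dx \simeq \frac{1}{N} \sum_{i=0}^{N-1} \frac{4}{1+x_i^2}$$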
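
As a quick check of the identity, the integrand has the antiderivative $4\arctan x$, so
$$\int_0^1 \frac{4}{1+x^2}\, dx = 4\left[\arctan x\right]_0^1 = 4\left(\frac{\pi}{4} - 0\right) = \pi$$
Taking the sample points as the midpoints $x_i = (i+\tfrac{1}{2})/N$, the sum overestimates $\pi$ by about $\frac{1}{12N^2}$ to leading order, so the error should fall by roughly a factor of 100 each time $N$ grows tenfold.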

## Compiling and running OpenMP

In [None]:
%%sh 
cat > pi-openmp.c << EOF
#include <stdio.h>
#include <math.h>
#define N 9
int main()
{
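  double sum = 0.0;
  /* Each thread accumulates a private copy of sum; the copies are added at the end. */
  #pragma omp parallel for reduction(+:sum)
  for(int i = 0; i < N; i++)
    {
      double x = (i+0.5)/N;   /* midpoint of the i-th subinterval */
      sum += 4/(1 + x*x);
    }
  printf("pi = %.10lf error=%g\n", sum/N, sum/N-M_PI);
  return 0;
}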
EOF
gcc -fopenmp pi-openmp.c -lm && ./a.out

pi = 3.1426214566 error=0.0010288
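
The error printed above is what the midpoint rule's second-order convergence predicts for N = 9. The following is a minimal sketch for checking how the error shrinks as N grows. The helpers `midpoint_pi` and `print_convergence` and the constant `PI_REF` are hypothetical names introduced here for illustration; they are not part of the notebook.

```c
#include <stdio.h>

/* Reference value of pi, hard-coded so the sketch does not rely on M_PI. */
static const double PI_REF = 3.14159265358979323846;

/* Hypothetical helper: midpoint-rule estimate of pi with n subintervals.
   The pragma is simply ignored when compiled without -fopenmp. */
double midpoint_pi(int n)
{
  double sum = 0.0;
  #pragma omp parallel for reduction(+:sum)
  for (int i = 0; i < n; i++)
    {
      double x = (i + 0.5) / n;
      sum += 4.0 / (1.0 + x * x);
    }
  return sum / n;
}

/* Print the error for n = 9, 90, 900 and 9000. */
void print_convergence(void)
{
  for (int n = 9; n <= 9000; n *= 10)
    printf("N = %5d  error = %g\n", n, midpoint_pi(n) - PI_REF);
}
```

Calling `print_convergence()` from `main` should show the error dropping by roughly a factor of 100 per line, until floating-point rounding starts to dominate for much larger N.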


## MPI

In [None]:
%%sh 
cat > pi-mpi.c << EOF
#include <mpi.h>
#include <stdio.h>
#define N 1000000

int main(int argc, char *argv[])
{
  int rank;
  int size;
  double subsum = 0.0;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

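  /* Cyclic distribution: each rank handles indices rank, rank+size, rank+2*size, ... */
  for(int i = rank; i < N; i += size)
    {
      double x = (i+0.5)/N;
      subsum += 4/(1 + x*x);
    }
  double sum = 0.0;
  /* Add the partial sums of all ranks; only rank 0 receives the result. */
  MPI_Reduce(&subsum, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);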
  if (rank == 0)
     printf("pi = %.10lf\n", sum/N);
  MPI_Finalize();
  return 0;
}
EOF
mpicc pi-mpi.c && mpirun -n 2 --allow-run-as-root ./a.out

pi = 3.1415926536
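
The decomposition used in the MPI program can be reproduced without MPI to see why the result does not depend on the number of ranks. The sketch below is a serial stand-in under stated assumptions: `rank_partial_sum` and `simulated_reduce` are hypothetical helpers, and the plain loop over ranks only imitates what `MPI_Reduce` with `MPI_SUM` does across processes.

```c
/* Partial sum that the given rank would compute under a cyclic
   distribution of the n midpoints over `size` ranks. */
double rank_partial_sum(int rank, int size, int n)
{
  double subsum = 0.0;
  for (int i = rank; i < n; i += size)
    {
      double x = (i + 0.5) / n;
      subsum += 4.0 / (1.0 + x * x);
    }
  return subsum;
}

/* Serial stand-in for MPI_Reduce with MPI_SUM: add up the partial
   sums of all ranks and scale by the interval width 1/n. */
double simulated_reduce(int size, int n)
{
  double sum = 0.0;
  for (int rank = 0; rank < size; rank++)
    sum += rank_partial_sum(rank, size, n);
  return sum / n;
}
```

Because every index from 0 to n-1 belongs to exactly one rank, `simulated_reduce(1, n)` and `simulated_reduce(4, n)` should agree up to floating-point rounding, which differs slightly because the terms are added in a different order.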


## Hybrid MPI+OpenMP

In [None]:
%%sh 
cat > pi-hybrid.c << EOF
#include <omp.h>
#include <mpi.h>
#include <stdio.h>
#define N 1000000

int main(int argc, char *argv[])
{
  int rank;
  int size;
  double subsum = 0.0;

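  int provided;
  /* Request thread support; FUNNELED suffices because only the main thread calls MPI. */
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);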
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  omp_set_num_threads(2);
  
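  #pragma omp parallel
  {  
    int tid = omp_get_thread_num();
    printf("Thread %d within rank %d started.\n", tid, rank);
    /* Ranks split the indices cyclically (stride = size); "omp for" then divides
       this rank's iterations among its threads, so the stride must not include
       the thread count. */
    #pragma omp for reduction(+:subsum)
    for(int i = rank; i < N; i += size)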
      {
        double x = (i+0.5)/N;
        subsum += 4/(1 + x*x);
      }
  }
  double sum;
  MPI_Reduce(&subsum, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
  if (rank == 0)
     printf("pi = %.10lf\n", sum/N);
  MPI_Finalize();
  return 0;
}
EOF
mpicc -fopenmp pi-hybrid.c && mpirun -n 3 --allow-run-as-root ./a.out

Thread 1 within rank 1 started.
Thread 0 within rank 1 started.
Thread 1 within rank 2 started.
Thread 0 within rank 2 started.
Thread 1 within rank 0 started.
Thread 0 within rank 0 started.
pi = 3.1415926536
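
In a hybrid loop, the `omp for` directive already divides a rank's iterations among its threads, so the stride across ranks should be `size` alone. Multiplying the stride by the thread count would skip indices. The following sketch illustrates this by counting visited indices; `count_visited` is a hypothetical helper, not part of the notebook.

```c
/* Count how many of the n indices are visited when each of `size` ranks
   starts at its own rank number and advances by `stride`. */
int count_visited(int n, int size, int stride)
{
  int visited = 0;
  for (int rank = 0; rank < size; rank++)
    for (int i = rank; i < n; i += stride)
      visited++;
  return visited;
}
```

For example, `count_visited(12, 3, 3)` returns 12, whereas `count_visited(12, 3, 3 * 2)` returns 6: with 2 threads folded into the stride, half of the terms would be missing from the sum.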


In [None]:
%%sh
apt-get install -y hwloc likwid > /dev/null
hwloc-ls
likwid-topology

Machine (13GB)
  Package L#0 + L3 L#0 (55MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
    PU L#0 (P#0)
    PU L#1 (P#1)
  HostBridge L#0
    PCI 1af4:1000
--------------------------------------------------------------------------------
CPU name:	Intel(R) Xeon(R) CPU @ 2.20GHz
CPU type:	Intel Xeon Broadwell EN/EP/EX processor
CPU stepping:	0
********************************************************************************
Hardware Thread Topology
********************************************************************************
Sockets:		1
Cores per socket:	1
Threads per core:	2
--------------------------------------------------------------------------------
HWThread	Thread		Core		Socket		Available
0		0		0		0		*
1		1		0		0		*
--------------------------------------------------------------------------------
Socket 0:		( 0 1 )
--------------------------------------------------------------------------------
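
The topology above reports a single core with 2 hardware threads, while the hybrid example launches 3 ranks with 2 threads each, so 6 software threads share 2 hardware threads. A rough sketch for estimating this oversubscription from within a program follows. It assumes a Linux system where `sysconf(_SC_NPROCESSORS_ONLN)` is available (a common extension, not something POSIX requires), and both function names are hypothetical.

```c
#include <unistd.h>

/* Number of hardware threads currently online, as reported by the OS. */
long online_hw_threads(void)
{
  long n = sysconf(_SC_NPROCESSORS_ONLN);
  return n > 0 ? n : 1;  /* fall back to 1 if the query fails */
}

/* Software threads per hardware thread for a given launch configuration. */
double oversubscription(int ranks, int threads_per_rank, long hw_threads)
{
  return (double)(ranks * threads_per_rank) / (double)hw_threads;
}
```

With the values shown here, `oversubscription(3, 2, 2)` gives 3.0. That is fine for a correctness demo, but timings taken on such a setup would say little about parallel speedup.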