This course is aimed at programmers seeking to deepen their understanding of MPI and explore some of its more recent and advanced features. We cover topics including exploiting shared-memory access from MPI programs, communicator management and neighbourhood collectives. We also look at performance aspects such as which MPI routines to use for scalability, MPI internal implementation issues and overlapping communication and calculation.

Intended learning outcomes
- Understanding of how internal MPI implementation details affect performance
- Techniques for overlapping communications and calculation
- Knowledge of MPI memory models for RMA operations
- Understanding of best practice for MPI+OpenMP programming
- Familiarity with neighbourhood collective operations in MPI
Attendees should be familiar with MPI programming in C, C++ or Fortran, e.g. have attended the ARCHER2 MPI course.
Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on.
They are also required to abide by the ARCHER2 Code of Conduct.
Unless otherwise indicated all material is Copyright © EPCC, The University of Edinburgh, and is only made available for private study.
- 09:30 - 09:45 ARCHER2 and PRACE training
- 09:45 - 10:15 MPI Quiz ("Room Name" is: HPCQUIZ)
- 10:15 - 11:00 MPI History
- 11:00 - 11:30 Coffee
- 11:30 - 13:00 Point-to-point Performance
- 13:00 - 14:00 Lunch
- 14:00 - 15:30 MPI Optimisations
- 15:30 - 16:00 Coffee
- 16:00 - 17:00 Collectives
- 17:00 CLOSE
- 09:30 - 11:00 MPI + OpenMP (i)
- 11:00 - 11:30 Coffee
- 11:30 - 13:00 MPI + OpenMP (ii) - same slide deck as above
- 13:00 - 14:00 Lunch
- 14:00 - 14:30 RMA Access in MPI
- 14:30 - 15:30 New MPI shared-memory model
- 15:30 - 16:00 Coffee
- 16:00 - 17:00 Finish Exercises
- 17:00 CLOSE
SLURM batch scripts are set to run in the short queue and should work any time. However, on days when the course is running, we have special reserved queues to guarantee fast turnaround.
The reserved queue for today is called ta033_186. To use this queue, change the --qos and --reservation lines to:
#SBATCH --qos=standard
#SBATCH --reservation=ta033_186
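The edit described above can also be scripted. The following is a minimal sketch, not part of the course material: the file name demo-archer2.job and its contents are hypothetical stand-ins for a real job script, and the initial --qos and --reservation values are assumed for illustration only.

```shell
#!/bin/bash
# Sketch: point a SLURM job script at the course reservation.
# demo-archer2.job is a hypothetical stand-in for a real job script;
# its initial --qos/--reservation values are assumptions.
set -eu

job=demo-archer2.job
cat > "$job" <<'EOF'
#!/bin/bash
#SBATCH --job-name=halobench
#SBATCH --nodes=2
#SBATCH --qos=short
#SBATCH --reservation=shortqos
EOF

reservation=ta033_186   # the reservation name changes from day to day

# Rewrite only the two directives; every other line is left untouched.
# "-i.bak" edits in place and keeps the original as demo-archer2.job.bak.
sed -i.bak \
    -e "s/^#SBATCH --qos=.*/#SBATCH --qos=standard/" \
    -e "s/^#SBATCH --reservation=.*/#SBATCH --reservation=${reservation}/" \
    "$job"

# Show the resulting directives.
grep '^#SBATCH' "$job"
```

Setting the reservation variable to that day's queue name is the only change needed to reuse the sketch on another day.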
- A description of the 3D halo-swapping benchmark is in this README
- Download the code directly to ARCHER2 using: git clone https://github.com/davidhenty/halobench
  - compile with: make -f makefile-archer2
  - submit with: sbatch archer2.job
- Other things you could do with the halo-swapping benchmark:
  - change the buffer size to be very small (a few tens of bytes) or very large (bigger than the eager limit) to see if that affects the results;
  - run on different numbers of nodes.
- Note that you will need to change the number of repetitions to get reasonable runtimes: many more for smaller messages, many fewer for larger messages. Each test needs to run for at least a few seconds to give reliable results.
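One way to pick a repetition count is to time a short trial run and scale it up to a target duration. The sketch below is an illustration only and is not part of the benchmark: the function name scale_reps, the trial timings and the 5-second target are all hypothetical, and it assumes that runtime grows linearly with the number of repetitions.

```shell
#!/bin/bash
# Sketch: scale a repetition count so that a timing test runs long enough.
# scale_reps and all numbers below are hypothetical, not taken from halobench.
set -eu

# scale_reps TRIAL_REPS TRIAL_SECONDS TARGET_SECONDS
# Prints the repetition count expected to take about TARGET_SECONDS,
# assuming runtime is proportional to the number of repetitions.
scale_reps() {
    awk -v n="$1" -v t="$2" -v target="$3" 'BEGIN {
        reps = n * target / t
        r = int(reps)
        if (r < reps) r++      # round up to a whole repetition
        if (r < 1) r = 1       # always run at least once
        printf "%d\n", r
    }'
}

# Small messages: 1000 trial repetitions took 0.5 s, so many more are needed.
scale_reps 1000 0.5 5

# Large messages: 1000 trial repetitions took 40 s, so far fewer are needed.
scale_reps 1000 40 5
```

With these example timings, the small-message case needs ten times the trial count while the large-message case needs only an eighth of it, which matches the advice above.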
- The collectives exercises are included in this tar file
  - instructions are included as comments at the top of each file;
  - mpigather.c and mpigather.f90 illustrate using vectors for gather operations;
  - mpigather2d.c and mpigather2d.f90 extend to gathering a 2D array as described in the lectures;
  - solutions are included (e.g. mpigathersol.c);
  - there are also other codes that illustrate user-defined operations in reductions (not covered in this course).
The reserved queue for today is called ta033_187. To use this queue, change the --qos and --reservation lines to:
#SBATCH --qos=standard
#SBATCH --reservation=ta033_187
- Traffic modeling exercise sheet
- Traffic model source code and solutions (MPI / OpenMP)
  - note that the OpenMP Makefiles are not properly configured for ARCHER2. To compile correctly you need to change the following flags:
    - for the OpenMP C version: CFLAGS= -O3 -fopenmp
    - for the OpenMP Fortran version: FFLAGS= -O3 -homp
- Traffic model source code and solutions (MPI RMA)
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.