Accelerating Portable HPC Applications with Standard C++
===

## Learning Objectives

This half-day hands-on tutorial teaches how to accelerate portable HPC applications on CPUs and GPUs using the parallelism and concurrency features of the C++17 and C++20 standards. Attendees will take a canonical PDE solver for the unsteady heat equation from a single-threaded implementation to a multi-CPU/multi-GPU implementation that overlaps computation with communication. Along the way, they will learn about C++ features such as threads, atomics, barriers, and parallel algorithms, and how to integrate these features into hybrid HPC applications that use MPI. The tutorial concludes with an outlook on C++2x features that simplify overlapping communication and computation, and a summary of our experience applying ISO C++ as the programming model to accelerate large real-world HPC applications.

### Outline

This tutorial is split into three parts.

- [Topic 0: SAXPY]: Fundamentals of C++ parallel algorithms (beginner).
- [Topic 1: 2D Heat Equation]: Mini-application: integrating parallel algorithms with threads, atomics, and MPI (intermediate).
- [Topic 2: Parallel Tree Construction]: Mini-application: starvation-free concurrent algorithms (advanced).

[Topic 0: SAXPY]: topic0_saxpy/topic0_saxpy.ipynb
[Topic 1: 2D Heat Equation]: topic1_heat/topic1_heat.ipynb
[Topic 2: Parallel Tree Construction]: topic2_tree/topic2_tree.ipynb
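
To give a flavor of what Topic 0 covers, here is a minimal sketch of SAXPY (`y = a * x + y`) written with a C++17 parallel algorithm. It is an illustration only and not necessarily the implementation used in the notebook; the function name `saxpy` and its signature are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical helper for illustration: computes y[i] = a * x[i] + y[i].
// The execution policy is a parameter so the same code can run
// sequentially (std::execution::seq) or in parallel (std::execution::par,
// std::execution::par_unseq).
template <class ExecutionPolicy>
void saxpy(ExecutionPolicy&& policy, float a,
           const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    // Binary std::transform: reads x[i] and y[i], writes the result to y[i].
    // Writing back into the second input range is permitted.
    std::transform(std::forward<ExecutionPolicy>(policy),
                   x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
}
```

Calling `saxpy(std::execution::par_unseq, 2.0f, x, y)` asks the implementation to parallelize and vectorize the loop. Whether it runs on multiple CPU cores or on a GPU depends on the compiler and its flags, and some standard library implementations need an extra backend library to run the parallel policies.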

## Audience, Content Level, Prerequisites, and Duration

This half-day tutorial is relevant for those interested in parallel programming models, the C++ programming language, performance portability, and heterogeneous systems.

The content is structured into three topics and progresses from beginner to intermediate to advanced.

Beginner-level experience with C++11 and MPI is required.

## Getting Started

Let's start by checking the CUDA driver and the GPU that this lab runs on:

!nvidia-smi

Mon Apr 18 13:56:52 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.12       Driver Version: 515.12       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100 80G...  On   | 00000000:41:00.0 Off |                    0 |
| N/A   31C    P0    43W / 300W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Next, check the CPUs on the system:

!lscpu

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           49
Model name:                      AMD EPYC 7232P 8-Core Processor
Stepping:                        0
Frequency boost:                 enabled
CPU MHz:                         1419.549
CPU max MHz:                     3100.0000
CPU min MHz:                     1500.0000
BogoMIPS:                        6200.36
Virtualization:                  AMD-V
L1d cache:                       256 KiB
L1i cache:                       256 KiB
L2 cache:                        4 

---

## Licensing

This material is provided under the MIT License.