Libipc

Libipc is a small library containing example code that illustrates how to share GPU data between MPI processes attached to the same GPU, using Inter-Process Communication (IPC). The code uses the ROCm HIP API and was based on published CUDA code converted to HIP (and back to CUDA for the CUDA version).
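
The underlying mechanism is that one rank allocates device memory, exports an IPC handle for it, and the other ranks open that handle to obtain a device pointer addressing the same allocation. Below is a minimal HIP sketch of that exchange; the function name and the MPI_Bcast used to move the handle are illustrative assumptions, not the repository's exact code.

    #include <hip/hip_runtime.h>
    #include <mpi.h>

    // Sketch: rank 0 of "comm" owns the allocation, every other rank maps it via IPC.
    void* share_device_buffer(MPI_Comm comm, size_t bytes) {
        int rank;
        MPI_Comm_rank(comm, &rank);

        void* dptr = nullptr;
        hipIpcMemHandle_t handle;

        if (rank == 0) {
            // The owning rank allocates the buffer and exports an IPC handle for it.
            hipMalloc(&dptr, bytes);
            hipIpcGetMemHandle(&handle, dptr);
        }

        // The handle is a plain block of bytes, so an ordinary broadcast moves it.
        MPI_Bcast(&handle, sizeof(handle), MPI_BYTE, 0, comm);

        if (rank != 0) {
            // The other ranks map the same allocation into their own address space.
            hipIpcOpenMemHandle(&dptr, handle, hipIpcMemLazyEnablePeerAccess);
        }
        return dptr;  // device pointer to the shared buffer on every rank
    }

For a CUDA build, the hip* calls have direct cuda* equivalents. Ranks that opened the handle should call hipIpcCloseMemHandle before the owning rank frees the buffer.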

Context

This work was undertaken by the staff of the ARCHER2 Centre of Excellence and was motivated by a challenge from one user community.

Design requirements

  • Nodes with multiple GPUs
    • We assume a single GPU per rank (selected via the XXX_VISIBLE_DEVICES environment variable)
  • Multiple MPI ranks attached to the same GPU
    • We could use sub-communicators (gpu_node_communicator); see the sketch after this list
  • Rank-0 of each gpu_node_communicator allocates data
    • Data are shared with the other ranks of gpu_node_communicator with direct access
    • Minimal synchronization implied (at the level of MPI or GPU?)
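
One way to build the gpu_node_communicator mentioned above is to split MPI_COMM_WORLD first by node and then by the physical GPU each rank is bound to. The following is a sketch, assuming one visible device per rank selected through an environment variable such as HIP_VISIBLE_DEVICES or CUDA_VISIBLE_DEVICES; the function and parameter names are illustrative, not the repository's exact code.

    #include <cstdlib>
    #include <mpi.h>

    // Sketch: group the ranks of one node that are attached to the same GPU.
    MPI_Comm make_gpu_node_communicator(const char* visible_devices_var) {
        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        // First group the ranks that can share memory, i.e. the ranks on one node.
        MPI_Comm node_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, world_rank,
                            MPI_INFO_NULL, &node_comm);

        // With a single visible device per rank the runtime device id is always 0,
        // so the visibility mask itself is what identifies the physical GPU.
        const char* dev = std::getenv(visible_devices_var);
        int color = dev ? std::atoi(dev) : 0;

        // Ranks on the same node that point at the same physical GPU share a color.
        MPI_Comm gpu_node_communicator;
        MPI_Comm_split(node_comm, color, world_rank, &gpu_node_communicator);
        return gpu_node_communicator;
    }

Rank 0 of each resulting communicator can then allocate the data and broadcast its IPC handle to the other members, as in the earlier sketch.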

How to compile and run

First, you need to set up the proper environment for CUDA or HIP. We provide two corresponding scripts (sourceme_cuda.sh for A100 nodes and sourceme_hip.sh for MI250X nodes) that can be used on HPE Cray EX systems.

Then, you can use make ACC=cuda or make ACC=hip to compile the library and the examples. By default, the Fortran example is compiled. You can add EXT=cpp to compile the C++ example.

For the execution, we provide an example based on SLURM (run.sh).

Contributors

The original contributors to this work were:

  • Alfio Lazzaro
  • Douglas Shanks
  • Harvey Richardson
