I have read the README and ref/README, including the Troubleshooting Guide.
I have reviewed existing issues to ensure this has not already been reported.
Is your feature request related to a problem? Please describe.
The existing code supports parallelism only via OpenMP threading. This limits the utility of the kernel for evaluating the performance tradeoffs of different implementations. A proper evaluation must include analysis of multi-node as well as single-node parallel performance.
Describe the solution you'd like
A full-featured MPI capability is needed. This will include:
Ability to run with any number of MPI tasks between 1 and N
Ability to run with MPI+OpenMP threads such that each MPI rank spawns a configurable number of OpenMP threads
A customizable processor grid for distribution of MPI ranks (e.g. 4x4, 2x8, 1x16)
A default processor grid that maximizes the "square-ness" of the grid to minimize communication (see the sketch after this list)
Automatic domain decomposition of data onto the chosen processor grid
All options configurable at runtime in the input namelist
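As a rough illustration of the default grid selection, here is a minimal C sketch that uses MPI_Dims_create to pick an as-square-as-possible factorization of the rank count. The hard-coded override values stand in for whatever the runtime namelist would supply and are not part of the existing kernel:

```c
/* Sketch only: choosing a default "as square as possible" processor grid.
 * Assumes a 2D domain; the override values stand in for whatever the
 * runtime input (e.g. a namelist) would provide. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int nranks, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* dims = {0,0} lets MPI_Dims_create pick the most "square" factorization
     * of nranks; nonzero entries (e.g. user overrides) are honored as-is. */
    int dims[2] = {0, 0};          /* e.g. {2, 8} to force a 2x8 grid */
    MPI_Dims_create(nranks, 2, dims);

    if (rank == 0)
        printf("Using a %d x %d processor grid for %d ranks\n",
               dims[0], dims[1], nranks);

    MPI_Finalize();
    return 0;
}
```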
Describe alternatives you've considered
MPI is a ubiquitous and standard means of parallelizing across nodes of a supercomputer. While there may be other ways of achieving that, having MPI parallelism is necessary for establishing a baseline of performance against which other implementations should be compared.
Adding MPI will be a large change. If possible, it would be easier to evaluate and make progress if it could be broken down into smaller pieces. It isn't clear to me yet what those pieces should be. We can discuss here. To start, I'm going to throw out an initial breakdown for us to discuss and refine.
Implement halos for the existing code. This would simply extend the dimensions of existing arrays without adding any parallelism. The code would not run in parallel, and would loop over the same indices as it does in serial, but the arrays would be properly dimensioned for halos.
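For illustration only, here is a C sketch of this step under assumed array names and sizes (the kernel's actual arrays and halo width may differ): the allocation grows by the halo width on each side, while the loops still cover exactly the serial interior indices.

```c
/* Sketch: dimensioning an array for halos without changing the computation.
 * NX, NY, and HALO are placeholder values, not taken from the kernel. */
#include <stdlib.h>

#define NX   64
#define NY   64
#define HALO 1

int main(void) {
    /* Allocate with room for halo cells on every side. */
    int nxh = NX + 2 * HALO, nyh = NY + 2 * HALO;
    double *field = calloc((size_t)nxh * nyh, sizeof *field);

    /* Loops still run over the interior only, exactly as in the serial code;
     * the halo cells exist but are not yet filled or exchanged. */
    for (int j = HALO; j < NY + HALO; j++)
        for (int i = HALO; i < NX + HALO; i++)
            field[j * nxh + i] = 1.0;

    free(field);
    return 0;
}
```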
Implement calculation of domain decomposition for a default processor grid (as square as possible) for an arbitrary number of MPI ranks. This would be a routine that computes the local indices and allocates arrays for each tile. Allocations and loop indices would be adjusted as needed. Still no parallelism, but tests would be added to verify that the decomposition works.
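A possible shape for that routine, sketched in C with a placeholder block-with-remainder rule (the decomposition the kernel ultimately adopts may differ): each coordinate on the processor grid maps to a start index and a point count that differ by at most one across tiles.

```c
/* Sketch: computing one tile's global index range along one dimension of a
 * processor grid. The distribution rule here is an assumption, not
 * necessarily what the kernel would use. */
#include <stdio.h>

/* Give the first (n % p) tiles one extra point so sizes differ by at most 1. */
static void local_range(int n, int p, int coord, int *start, int *count) {
    int base = n / p, rem = n % p;
    *count = base + (coord < rem ? 1 : 0);
    *start = coord * base + (coord < rem ? coord : rem);
}

int main(void) {
    int nx = 100, px = 3;                 /* placeholder global size and grid */
    for (int c = 0; c < px; c++) {
        int s, cnt;
        local_range(nx, px, c, &s, &cnt);
        printf("tile %d: global i = [%d, %d), %d points\n", c, s, s + cnt, cnt);
    }
    return 0;
}
```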
Add MPI_Init() and implement a halo exchange. This would be a routine for exchanging the data in the halo. Add a test to verify that the halo exchange works for multiple numbers of MPI ranks. Add tests to show that running with different numbers of MPI ranks produces the same results.
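One way the exchange could look, sketched in C for the x direction only. The Cartesian communicator, halo width of 1, non-periodic boundaries, and point-by-point Sendrecv are all simplifying assumptions for illustration; a real implementation would also exchange in y and use derived datatypes or packed buffers for non-contiguous faces.

```c
/* Sketch: a 1D halo exchange along the x direction of a Cartesian grid.
 * Array shapes, the halo width, and non-periodic boundaries are assumptions. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int nranks;
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Build a Cartesian communicator from the default (near-square) grid. */
    int dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Dims_create(nranks, 2, dims);
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

    /* Placeholder local tile: ny rows of nx interior points plus halos. */
    int nx = 16, ny = 16, halo = 1, nxh = nx + 2 * halo;
    double *field = calloc((size_t)nxh * (ny + 2 * halo), sizeof *field);

    /* Neighbors along dimension 0 (MPI_PROC_NULL at physical boundaries,
     * which makes the Sendrecv calls below no-ops there). */
    int left, right;
    MPI_Cart_shift(cart, 0, 1, &left, &right);

    /* Exchange the edge point of each interior row; one point per call is
     * inefficient but keeps the sketch short. */
    for (int j = halo; j < ny + halo; j++) {
        double *row = field + (size_t)j * nxh;
        /* send rightmost interior point right, receive left halo from left */
        MPI_Sendrecv(&row[nx],        1, MPI_DOUBLE, right, 0,
                     &row[0],         1, MPI_DOUBLE, left,  0,
                     cart, MPI_STATUS_IGNORE);
        /* send leftmost interior point left, receive right halo from right */
        MPI_Sendrecv(&row[halo],      1, MPI_DOUBLE, left,  1,
                     &row[nx + halo], 1, MPI_DOUBLE, right, 1,
                     cart, MPI_STATUS_IGNORE);
    }

    free(field);
    MPI_Finalize();
    return 0;
}
```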
This is just a starting point. Please comment/suggest adjustments as needed. Breaking parallelization down into smaller pieces may prove quite difficult. However, the smaller we can make the steps toward the final goal, the easier it will be to implement, review, and merge each step along the way.
A decision was made not to pursue a full MPI parallelization. Instead, a simulation of parallel execution (see #35) is provided by running N copies of the serial kernel under MPI, including simulating the work of a halo exchange operation across those copies. Since no further work on MPI will be pursued, this issue is being closed.