Process sets - do we need a more precise definition for MPI - 5 #24

hppritcha · 2022-07-23T22:37:09Z

The MPI-4 standard does not define precisely what a process set is. Instead it has this wording:

Process sets are the mechanism for MPI applications to query the runtime.

and

Mechanisms for defining process sets and how system resources are assigned to these
sets is considered to be implementation dependent.

An external agent (PMIx) being used by some MPI implementations defines a process set as:

A PMIx Process Set is a user-provided or host environment assigned label associated with a given
set of application processes.

In the PMIx world, there is not a direct association of system resources with a process set. System resources in PMIx land aren't defined in terms of process sets.

We may wish to bear this in mind as we consider MPI methods for requesting additional resources (in PMIx land an allocation).

rhc54 · 2022-07-24T00:41:56Z

So things have been evolving with other groups - I can summarize them here in case it helps. The primary driver really is to dissociate processes (i.e., executing pieces of code) from resources (i.e., things that processes use to execute). Reason being that they wanted to define a process set and a collection of resources as separate entities, and then create various combinations of them at some later time.

A process set is therefore defined à la PMIx - it is a given collection of application definitions, each application consisting of some specified number of instances of executable code. We allow the specification to be abstract - e.g., instead of saying that the process set consists of N instances of a given application executable, you can say that it consists of M instances per resource type. You can name them for ease of reference - and we do allow dynamic definitions.

A resource set consists of a collection of allocated resources. The scheduler hands out a default resource set when the allocation is initially made, but the user may subdivide that into as many resource sets as they like. The user can also define abstract resource sets - i.e., resource sets that do not consist of specific resources (e.g., all of nodeA, 2 GPUs from nodeB) but instead specify an abstracted collection of resources (e.g., three dedicated nodes, 2 GPUs from a non-dedicated node). You can name these as well.

A compute set is defined by combining a process set with a resource set - and as you'd expect, you can name these for reference. This defines an executable unit that can be launched by the RTE via something like the PMIx_Spawn API, where the RTE is responsible for mapping the process set onto the resource set. The final result is called a compute block - i.e., a compute set that has been mapped, launched and is executing.
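To make the three terms concrete, here is a minimal Python sketch of the data model described above. All class and field names (`AppSpec`, `ProcessSet`, `ResourceSet`, `ComputeSet`) and the resource identifiers are made up for illustration and are not PMIx or MPI API. The round-robin mapper is only a stand-in for whatever mapping the RTE actually performs.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class AppSpec:
    """One application definition: an executable and an instance count."""
    executable: str
    instances: int


@dataclass(frozen=True)
class ProcessSet:
    """Named collection of application definitions (no resources attached)."""
    name: str
    apps: tuple


@dataclass(frozen=True)
class ResourceSet:
    """Named collection of allocated resources (hypothetical string IDs)."""
    name: str
    resources: tuple


@dataclass(frozen=True)
class ComputeSet:
    """A process set paired with a resource set: a launchable unit."""
    name: str
    pset: ProcessSet
    rset: ResourceSet

    def map_round_robin(self):
        """Stand-in for the RTE mapper: place instances on resources in turn."""
        slots = cycle(self.rset.resources)
        placement = []
        for app in self.pset.apps:
            for i in range(app.instances):
                placement.append((f"{app.executable}[{i}]", next(slots)))
        return placement


# Example: three instances of one executable mapped onto two nodes.
pset = ProcessSet("solver", (AppSpec("crack_model", 3),))
rset = ResourceSet("two_nodes", ("nodeA", "nodeB"))
cset = ComputeSet("crack_job", pset, rset)
placement = cset.map_round_robin()
```

In this sketch the process set and the resource set are defined independently and only meet inside `ComputeSet`, which mirrors the separation of processes from resources described above. A "compute block" would correspond to a `ComputeSet` whose placement has actually been launched.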

Traditional programming models simply asked to launch a process set, usually described on the cmd line (as opposed to formally calling it a process set). PMIx_Spawn knows that a request to launch a process set with no specified resource set is equivalent to using the default resource set, so a mechanism for handling current codes is easy to support.
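The backward-compatibility rule can be sketched in a few lines of Python. The function names and the `"default_rset"` label are hypothetical and do not reflect the real PMIx_Spawn signature. The only point illustrated is that a launch request without a resource set falls back to the default one.

```python
DEFAULT_RSET = "default_rset"  # hypothetical label for the scheduler's initial allocation


def resolve_resource_set(requested=None):
    """Return the resource set a launch should use.

    A request that names no resource set is treated as a request
    for the default resource set.
    """
    return DEFAULT_RSET if requested is None else requested


def spawn(pset_name, rset_name=None):
    """Hypothetical launch request: pair a process set with a resource set."""
    return {"pset": pset_name, "rset": resolve_resource_set(rset_name)}


legacy_request = spawn("a.out x 4")                # traditional launch, no resource set named
explicit_request = spawn("a.out x 4", "gpu_nodes")  # caller picks a named resource set
```

Existing codes that never mention resource sets would take the `legacy_request` path unchanged.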

In some programming models, it is really convenient to think in terms of compute sets as opposed to the traditional application since the compute set is capable of performing a complex task (e.g., modeling propagation of a specific crack), essentially acting like an object-based module. We are therefore exploring how to best reflect these definitions in PMIx - e.g., passing them to the PMIx_Spawn API for launch, or to the PMIx_Connect API to couple compute blocks together.

All still in its infancy, so nothing written in stone. Hope that helps provide some thoughts for your discussion.

schreiberx · 2022-07-25T11:54:29Z

My summary / perspective of the previous message (Thanks Ralph!):

process set: collection of instances of executable code
resource set: GPUs, CPUs, ..., mix of them
compute set: Set where each element is a tuple (element of process set, (sub)set of resources)

Now to MPI:
For me, "process set" refers simply to a set of MPI processes. This can be related to either the PMIx process set above (not clearly specifying which resources are used) or the compute set (linking it also to computing resources).

It's probably best to stick to the same abstract definition as PMIx, since MPI shouldn't be about the particular computing resources used. Here's a first shot:

A "process set" refers to a set of executions of MPI processes.
Different process sets can include the same MPI processes.
Process sets are not related to resources, but can be made related to resources by the runtime system.
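As a rough illustration of this proposed definition, the Python sketch below models process sets as named sets of process identifiers. `mpi://WORLD` is a process set name from MPI-4; the `app://...` names, the integer process IDs and the `resource_binding` table are assumptions made up for the example. It shows that one process can belong to several process sets, and that any link to resources lives in a separate, optional table.

```python
# Process sets: label -> set of MPI processes (modelled as plain integers).
process_sets = {
    "mpi://WORLD": frozenset({0, 1, 2, 3}),
    "app://ocean": frozenset({0, 1}),    # hypothetical name
    "app://coupler": frozenset({1, 2}),  # hypothetical name
}

# Optional association supplied by the runtime; process sets do not need it.
resource_binding = {"app://ocean": "nodeA"}  # hypothetical resource label


def sets_containing(proc, psets):
    """Return the sorted names of all process sets that include `proc`."""
    return sorted(name for name, members in psets.items() if proc in members)


# Process 1 is a member of three different process sets at once.
membership_of_1 = sets_containing(1, process_sets)

# Only some process sets have been related to resources by the runtime.
unbound = sorted(name for name in process_sets if name not in resource_binding)
```

Keeping `resource_binding` outside `process_sets` is the design choice being illustrated: the set definition stays resource-free, and the runtime may or may not attach resources later.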

rhc54 · 2023-10-20T20:38:44Z

I think that makes sense. You might also want to define an "execution block" or "execution set" which would be the combination of the processes and the resources that are allocated for their use. Can't say "that they are using" as that can be ephemeral - but "allocated for their use" should have some clearer meaning to both MPI and the runtime.

hppritcha added the mpi-5 label Jul 23, 2022
