Process sets - do we need a more precise definition for MPI - 5 #24

Open
hppritcha opened this issue Jul 23, 2022 · 3 comments
@hppritcha
Contributor

The MPI-4 standard does not define precisely what a process set is; instead, it has this wording:

Process sets are the mechanism for MPI applications to query the runtime.

and

Mechanisms for defining process sets and how system resources are assigned to these
sets is considered to be implementation dependent.

An external agent (PMIx) being used by some MPI implementations defines a process set as:

A PMIx Process Set is a user-provided or host environment assigned label associated with a given
set of application processes.

In the PMIx world, there is not a direct association of system resources with a process set. System resources in PMIx land aren't defined in terms of process sets.
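To make the PMIx-style definition concrete, here is a minimal sketch of a process set as nothing more than a label attached to a set of processes. The `Proc` type, the `define_pset` helper and the `app://` names are hypothetical and only for illustration; `mpi://WORLD` is the name MPI-4 itself uses. Note that no resource information appears anywhere in the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """A process named by (namespace, rank); a simplified, hypothetical stand-in."""
    nspace: str
    rank: int


# Hypothetical registry: pset label -> member processes.
# Deliberately, there is no resource field anywhere in this model.
psets = {}


def define_pset(label, members):
    """Associate a label with a set of application processes."""
    psets[label] = frozenset(members)


def psets_of(proc):
    """Return the sorted labels of every process set that contains `proc`."""
    return sorted(label for label, members in psets.items() if proc in members)


job = [Proc("job1", r) for r in range(4)]
define_pset("mpi://WORLD", job)            # name defined by MPI-4
define_pset("app://ocean", job[:2])        # hypothetical user-provided labels
define_pset("app://atmosphere", job[2:])

print(psets_of(Proc("job1", 0)))
```

Any question about which nodes or GPUs back `app://ocean` has to be answered by a separate structure, which is the point being made above.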

We may wish to bear this in mind as we consider MPI methods for requesting additional resources (an allocation, in PMIx terms).

@rhc54

rhc54 commented Jul 24, 2022

So things have been evolving with other groups - I can summarize them here in case it helps. The primary driver really is to dissociate processes (i.e., executing pieces of code) from resources (i.e., things that processes use to execute). The reason is that they wanted to define a process set and a collection of resources as separate entities, and then create various combinations of them at some later time.

A process set is therefore defined à la PMIx - it is a given collection of application definitions, each application consisting of some specified number of instances of executable code. We allow the specification to be abstract - e.g., instead of saying that the process set consists of N instances of a given application executable, you can say that it consists of M instances per resource type. You can name them for ease of reference - and we do allow dynamic definitions.
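As a rough illustration of the concrete versus abstract specification, the sketch below resolves "N instances" and "M instances per resource type" into actual instance counts once the available resources are known. The `AppSpec` type, its field names and the resource counts are assumptions made up for this example, not PMIx attributes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AppSpec:
    """Hypothetical application definition inside a process set."""
    executable: str
    instances: Optional[int] = None                 # concrete: exactly N instances
    per_resource: Optional[Tuple[str, int]] = None  # abstract: (resource type, M)


def instance_count(app, resources):
    """Resolve the number of instances given a count of each resource type."""
    if app.instances is not None:
        return app.instances
    rtype, m = app.per_resource
    return m * resources.get(rtype, 0)


# A named process set mixing one concrete and one abstract application definition.
process_set = {
    "name": "solver-pset",
    "apps": [
        AppSpec("./solver", instances=8),
        AppSpec("./gpu_worker", per_resource=("gpu", 2)),
    ],
}

available = {"node": 3, "gpu": 4}  # assumed resource counts
counts = {a.executable: instance_count(a, available) for a in process_set["apps"]}
print(counts)
```

The abstract entry only becomes a number once it meets a resource description, which is why it is useful to keep the two definitions separate.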

A resource set consists of a collection of allocated resources. The scheduler hands out a default resource set when the allocation is initially made, but the user may subdivide that into as many resource sets as they like. The user can also define abstract resource sets - i.e., resource sets that do not consist of specific resources (e.g., all of nodeA, 2 GPUs from nodeB) but instead specify an abstracted collection of resources (e.g., three dedicated nodes, 2 GPUs from a non-dedicated node). You can name these as well.

A compute set is defined by combining a process set with a resource set - and as you'd expect, you can name these for reference. This defines an executable unit that can be launched by the RTE via something like the PMIx_Spawn API, where the RTE is responsible for mapping the process set onto the resource set. The final result is called a compute block - i.e., a compute set that has been mapped, launched, and is executing.
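The relationship between these terms can be sketched as plain data types. Everything here is hypothetical naming for illustration, and the round-robin mapper is only a stand-in for whatever placement policy an RTE would actually apply.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class ProcessSet:
    name: str
    procs: tuple       # e.g. ("solver:0", "solver:1", ...)


@dataclass(frozen=True)
class ResourceSet:
    name: str
    resources: tuple   # e.g. ("nodeA", "nodeB")


@dataclass(frozen=True)
class ComputeSet:
    """A named pairing of a process set with a resource set; not yet mapped."""
    name: str
    pset: ProcessSet
    rset: ResourceSet


def map_compute_set(cset):
    """Toy mapper (assumption): assign processes to resources round-robin."""
    return dict(zip(cset.pset.procs, cycle(cset.rset.resources)))


pset = ProcessSet("crack-model", tuple(f"solver:{i}" for i in range(4)))
rset = ResourceSet("two-nodes", ("nodeA", "nodeB"))
cset = ComputeSet("crack-42", pset, rset)

# Once mapped (and, in a real system, launched), this would be a compute block.
block = map_compute_set(cset)
print(block)
```

The same `pset` could be paired with a different `ResourceSet` to form another compute set without touching either definition.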

Traditional programming models simply asked to launch a process set, usually described on the command line (as opposed to formally calling it a process set). PMIx_Spawn knows that a request to launch a process set with no specified resource set is equivalent to using the default resource set, so a mechanism for handling current codes is easy to support.
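The backward-compatibility rule amounts to a simple fallback. The sketch below uses a hypothetical `spawn` function and made-up resource set names; it does not reflect the real PMIx_Spawn signature, only the "no resource set means default resource set" behavior described above.

```python
DEFAULT_RSET = "default"

# Hypothetical table of named resource sets; the scheduler is assumed to have
# provided "default" with the initial allocation, and the user carved out "gpu-half".
resource_sets = {
    DEFAULT_RSET: ["node0", "node1", "node2", "node3"],
    "gpu-half": ["node2", "node3"],
}


def spawn(pset_name, rset_name=None):
    """Toy stand-in for a launch request (not the PMIx_Spawn signature)."""
    chosen = rset_name if rset_name is not None else DEFAULT_RSET
    return {"pset": pset_name, "rset": chosen, "resources": resource_sets[chosen]}


legacy = spawn("a.out x 16")                # traditional launch: no resource set named
explicit = spawn("a.out x 16", "gpu-half")  # same process set on a chosen resource set

print(legacy["rset"], explicit["rset"])
```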

In some programming models, it is really convenient to think in terms of compute sets as opposed to the traditional application since the compute set is capable of performing a complex task (e.g., modeling propagation of a specific crack), essentially acting like an object-based module. We are therefore exploring how to best reflect these definitions in PMIx - e.g., passing them to the PMIx_Spawn API for launch, or to the PMIx_Connect API to couple compute blocks together.

All still in its infancy, so nothing written in stone. Hope that helps provide some thoughts for your discussion.

@schreiberx

My summary / perspective of the previous message (Thanks Ralph!):

  • process set: collection of instances of executable code
  • resource set: GPUs, CPUs, ..., mix of them
  • compute set: Set where each element is a tuple (element of process set, (sub)set of resources)

Now to MPI:
For me, "process set" refers simply to a set of MPI processes. This can be related to either the PMIx process set above (not clearly specifying which resources are used) or the compute set (linking it also to computing resources).

It's probably best to stick to the same abstract definition as PMIx, since MPI shouldn't be about the particular computing resources used. Here's a first shot:

A "process set" refers to a set of executions of MPI processes.
Different process sets can include the same MPI processes.
Process sets are not related to resources, but can be made related to resources by the runtime system.
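A small sketch of what this proposed wording implies: process sets are plain sets of process identifiers that may overlap, and any link to resources lives in a separate table owned by the runtime. The `app://` names and the `placement` table are hypothetical; `mpi://WORLD` is the name defined by MPI-4.

```python
# Process sets as sets of MPI process identifiers (plain integers here).
world = frozenset(range(8))
psets = {
    "mpi://WORLD": world,
    "app://io": frozenset({0, 1}),
    "app://solver": frozenset({1, 2, 3, 4}),
}

# Different process sets can include the same MPI processes.
shared = psets["app://io"] & psets["app://solver"]

# Process sets carry no resource information. A runtime may associate one with
# resources in a separate, runtime-owned table (hypothetical).
placement = {"app://solver": ["nodeA", "nodeB"]}

print(sorted(shared), placement.get("app://io"))
```

Here `app://io` is a perfectly valid process set even though the runtime has recorded no resources for it.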

@rhc54

rhc54 commented Oct 20, 2023

I think that makes sense. You might also want to define an "execution block" or "execution set" which would be the combination of the processes and the resources that are allocated for their use. Can't say "that they are using" as that can be ephemeral - but "allocated for their use" should have some clearer meaning to both MPI and the runtime.
