
RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0 #1340

Merged

Conversation

@jsquyres (Member) commented Feb 4, 2016

Decrease the default value of the "mpi_add_procs_cutoff" MCA param from 1024 to 32.

@hppritcha Setting the milestone to v2.0.0 because the "partial add procs" behavior was supposed to be a key feature of v2.0.0. Setting that to only happen at 1024 procs seems a little high. Do you agree?

RFC for the rest of the community: does anyone have a concern about setting this value to 32?
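For context on what the cutoff actually gates: roughly speaking, jobs at or below the cutoff get peer resources pre-allocated for every rank in MPI_COMM_WORLD during MPI_Init, and larger jobs fall back to the "partial add procs" path where peers are only added on first use. A rough sketch of that decision follows; the helper names (and the exact comparison) are my illustration, not the real code -- only ompi_add_procs_cutoff is the actual variable behind the MCA param:

extern unsigned int ompi_add_procs_cutoff;   /* storage behind the mpi_add_procs_cutoff MCA param */
extern void add_all_procs_eagerly(void);     /* hypothetical: pre-allocate resources for every peer */
extern void enable_lazy_add_procs(void);     /* hypothetical: defer to the "partial add procs" path */

static void setup_procs(unsigned int world_size)
{
    /* small jobs pay the memory cost up front; large jobs add procs on demand */
    if (world_size <= ompi_add_procs_cutoff) {
        add_all_procs_eagerly();
    } else {
        enable_lazy_add_procs();
    }
}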

@jsquyres added this to the v2.0.0 milestone on Feb 4, 2016
@lanl-ompi (Contributor)

Test FAILed.

1 similar comment
@lanl-ompi (Contributor)

Test FAILed.

@hppritcha (Member)

I agree. Setting to 1024 by default is a recipe for not getting tested often.


@lanl-ompi (Contributor)

Test FAILed.

(void) mca_base_var_register ("ompi", "mpi", NULL, "add_procs_cutoff",
"Maximum world size for pre-allocating resources for all "
"remote processes. Increasing this limit may improve "
"communication performance at the cost of memory usage "
"(default: 1024)", MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL,
"communication performance at the cost of memory usage"
Contributor

@jsquyres you missed a , at the end of this line
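(Without the comma, the description string literal runs straight into MCA_BASE_VAR_TYPE_UNSIGNED_INT on the next line, which does not compile. For reference, the corrected call should look roughly like the following; the trailing bind/flag/level/scope arguments are my guess at typical values, since they are not visible in the hunk above:)

(void) mca_base_var_register ("ompi", "mpi", NULL, "add_procs_cutoff",
                              "Maximum world size for pre-allocating resources for all "
                              "remote processes. Increasing this limit may improve "
                              "communication performance at the cost of memory usage",
                              MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL, 0,
                              MCA_BASE_VAR_FLAG_SETTABLE, OPAL_INFO_LVL_3,
                              MCA_BASE_VAR_SCOPE_LOCAL, &ompi_add_procs_cutoff);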

@rhc54 (Contributor) commented Feb 4, 2016

@jsquyres The only quibble I have is with stating that upping the value might impact communication performance. I don't think that is true - it might impact the latency of the first message, but it will have no impact on performance.

@jsquyres force-pushed the pr/decrease-mpi_add_procs_cutoff branch from ac450cf to cd5bad8 on February 4, 2016 11:28
@jsquyres (Member Author) commented Feb 4, 2016

@hjelmn Per our IM discussion yesterday, can you share your performance testing results from setting mpi_add_procs_cutoff to low values (including 0)?

@ggouaillardet (Contributor)

BTW, which matters more for getting the best performance:

  • the number of MPI tasks?
  • the number of tasks per node?
    (IIRC, there are some optimizations for handling several tasks per node, so I guess the number of nodes is more relevant than the number of MPI tasks alone.)

@rhc54 (Contributor) commented Feb 4, 2016

Launch performance is more impacted when used in conjunction with the direct modex param. Memory footprint is the primary target for this param when used by itself.

So if we want to show a major improvement in launch performance, we'd need to turn both params "on". Not sure if the direct modex one has actually been tested enough for that step. Maybe we should turn it "on" by default in master for a while?

@ggouaillardet (Contributor)

Can you remind me what direct modex is?

  • is this the mpirun-less start (e.g. srun a.out)?
  • is this async_modex? (and if not, should we turn it on by default too?)

@rhc54 (Contributor) commented Feb 4, 2016

Sorry - yes, it is the async_modex parameter (which results in the PMIx direct modex operation being used). Basically, it means the current modex blocking barrier is replaced with a "no-op" and data is only fetched as required.
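To make that concrete, here is a conceptual sketch of the two behaviors in PMIx terms. This is not OMPI's actual code path: the key name is made up, error handling is omitted, and myproc is assumed to come from PMIx_Init.

#include <stdint.h>
#include <string.h>
#include <pmix.h>

/* full modex: a blocking collective at startup; returns only after every
   process has contributed (in practice an info key such as PMIX_COLLECT_DATA
   asks for the data to be gathered as part of the fence) */
static void full_modex(void)
{
    (void) PMIx_Fence(NULL, 0, NULL, 0);
}

/* direct (async) modex: no fence at startup; one peer's data is fetched on
   demand the first time we need to talk to it.  "myproc" is whatever
   PMIx_Init filled in for this process; the key name is hypothetical. */
static pmix_value_t *fetch_peer_on_demand(const pmix_proc_t *myproc, uint32_t peer_rank)
{
    pmix_proc_t peer;
    pmix_value_t *val = NULL;

    PMIX_PROC_CONSTRUCT(&peer);
    (void) strncpy(peer.nspace, myproc->nspace, PMIX_MAX_NSLEN);
    peer.rank = peer_rank;

    (void) PMIx_Get(&peer, "hypothetical.endpoint.key", NULL, 0, &val);
    return val;
}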


@@ -263,12 +263,11 @@ int ompi_mpi_register_params(void)
ompi_rte_abort(1, NULL);
}

ompi_add_procs_cutoff = 1024;
Member Author

@hjelmn I'm not sure why you say that this extra assignment is necessary (in jsquyres@cd5bad8#commitcomment-15887486). Even if we come through registration a 2nd time, we don't want to reset the default if the user already set a different value. What am I missing here?

Regardless, we should either be doing this extra assignment for all MCA params, or no MCA params -- if there's a reason to do the extra assignment for this MCA param, that same reason should apply to all MCA params, right?

Member

I think we should set the variable back to the default. That is the way I have done it pretty much everywhere throughout the code base. See the other registrations in this same function.

If we come through registration twice it means we finalized the project or component. I don't think we should use a possibly stale value.

Member

I should note this is the default behavior of dynamically loaded components: if the component is unloaded, all its variables will automatically be reset on the next dlopen. I was trying to be consistent.

Member Author

@hjelmn and I chatted on the phone -- he filled me in on what I was missing: there's a possible inconsistency in behavior with a program that does something like this:

MPI_T_Init(...);
MPI_T_Cvar_read(..., &value_pre);
MPI_T_Cvar_write(...);
MPI_T_Finalize(...);

MPI_Init(...);
MPI_T_Init(...);
MPI_T_Cvar_read(..., &value_post);
assert(value_pre == value_post);

When the cvar is part of a dynamically-loaded component, the assert will be true. If we remove the additional assignment, when the cvar is part of a statically-loaded component, the assert will be false. Nathan's second assignment ensures that the assert will be true in both cases.

This is actually a larger question for the MPI Tools Working Group in the MPI Forum: what consistency guarantees -- if any -- are provided by the MPI_T API when MPI_T and/or MPI is finalized?

For this PR, I'll remove my deletion of the 2nd assignment, and we'll leave OMPI's behavior in this area as it was. Nathan and I will bring up this what-does-MPI_T-guarantee issue with the Tools WG separately.
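For anyone who wants to experiment with this, here is a compilable version of the sequence above using the actual MPI_T calls. Error checking is omitted; "mpi_add_procs_cutoff" is what I believe OMPI registers as the cvar name for this parameter, and whether the write is even permitted depends on the cvar's scope.

#include <assert.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided, idx, count;
    unsigned int value_pre = 0, value_post = 0, new_value = 64;
    MPI_T_cvar_handle handle;

    /* first tools session, before MPI_Init: read the default, then overwrite it */
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_T_cvar_get_index("mpi_add_procs_cutoff", &idx);
    MPI_T_cvar_handle_alloc(idx, NULL, &handle, &count);
    MPI_T_cvar_read(handle, &value_pre);
    MPI_T_cvar_write(handle, &new_value);
    MPI_T_cvar_handle_free(&handle);
    MPI_T_finalize();

    /* second session: does the cvar come back as the default (assert holds) or
       as the value written above (assert fires)?  That is exactly the
       dynamic-vs-static component inconsistency described above. */
    MPI_Init(&argc, &argv);
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_T_cvar_get_index("mpi_add_procs_cutoff", &idx);
    MPI_T_cvar_handle_alloc(idx, NULL, &handle, &count);
    MPI_T_cvar_read(handle, &value_post);
    assert(value_pre == value_post);
    MPI_T_cvar_handle_free(&handle);
    MPI_T_finalize();
    MPI_Finalize();
    return 0;
}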

@jsquyres force-pushed the pr/decrease-mpi_add_procs_cutoff branch from cd5bad8 to ae254c1 on February 4, 2016 18:05
Decrease the default value of the "mpi_add_procs_cutoff" MCA param from 1024 to 0.
@jsquyres force-pushed the pr/decrease-mpi_add_procs_cutoff branch from ae254c1 to 902b477 on February 9, 2016 17:41
@jsquyres changed the title from "RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 32" to "RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0" on Feb 9, 2016
@jsquyres (Member Author) commented Feb 9, 2016

Per the Feb 9 2016 webex, we decided to set the value to 0, on the rationale that the dynamic add_procs machinery is not used for processes that are on the same server -- so 0 is not a whole lot different from 32. This PR is now updated to set the default add_procs cutoff to 0.

jsquyres added a commit that referenced this pull request Feb 9, 2016
RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0
@jsquyres merged commit d537ee9 into open-mpi:master on Feb 9, 2016
@jsquyres deleted the pr/decrease-mpi_add_procs_cutoff branch on February 9, 2016 18:36
hjelmn pushed a commit to hjelmn/ompi that referenced this pull request Sep 13, 2016
Correct the binding algorithm to decouple it from oversubscribe.