Initialize threading support properly with MPI2 #153

Merged
merged 5 commits into from Apr 30, 2014

Projects

None yet

5 participants

@roystgnr
libMesh - C++ Finite Element Library member

I haven't tested this with MPI1 yet.

With MPI2, apparently "MPI_Init" is shorthand for "MPI_Init_thread(...MPI_THREAD_SINGLE...)" which is a way to tell the MPI stack that it should feel free to use optimizations like non-threadsafe-malloc which can trash a threaded code. I haven't actually seen any incorrect output due to this bug yet, but we've seen some suboptimal performance that might relate to CPU pinning tricks mvapich2 plays when it thinks it's running single-threaded.

@friedmud
libMesh - C++ Finite Element Library member

This should only be necessary if you're doing MPI calls within threaded sections, right? We don't do that - so there is no need to turn on the thread support in MPI because it's just going to slow stuff down...

Also, I'm not really a fan of these warnings like the one that's always going to get printed out for anyone using MPI1 with threads. We take care to build things correctly for our users... but they will see this warning no matter what if they are using threading....

One that's been a pain lately is the CommWorld stuff... we get a warning about that being deprecated every time our users run... so when they're doing live demos for upper level DOE staff we always get a question of "What is that warning about???"......

Is there any way we can lump all of this kind of stuff into a --yes-i-understand-there-are-warnings-please-dont-print-them configure option?

(Note, the warnings are valid... we are planning on fixing the CommWorld thing... it's just that our users are making end user applications... and we would like to be able to be able to hide libMesh warnings from them).

@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member
@friedmud
libMesh - C++ Finite Element Library member

Roy - can you describe a scenario where using MPI_THREAD_SINGLE used in the libMesh style will lead to a race condition? I simply don't see how it's possible (but I fully admit that I might not be able to think about it).

I know that OpenMPI does nothing different for MPI_THREAD_FUNNELED.

I really feel like this is an extremely drastic change. We use threading (in the funneled style) with both TBB and Pthreads across multiple different MPI implementations (mpich2, openMPI, mvapich, mvapich2) on multiple architectures (I've even used it on Blue gene Q) and we run ~600 tests using threading all day every day and we haven't seen a problem (granted, that doesn't mean it doesn't exist!)

Please don't auto throttle to 1 thread. I would vote for an error if anything. Our users might not notice the fact that it's been throttled (they will just think threading doesn't work), but at least we can start a conversation with them if there is an error.

As for the warning stuff: sorry i mixed that in here... I just saw your direct use of std::cerr and in my head i saw another thing printing out that we can't control ;-). We use macros for warnings in MOOSE... and I think it's a good idea in libMesh. I will talk to the guys about it and get it prioritized.... we can certainly get this done.

@friedmud
libMesh - C++ Finite Element Library member

BTW this is the type of stuff I'm worried about:

http://www.open-mpi.org/community/lists/users/2011/05/16451.php

(Admittedly, that is old - I'm not saying that stuff is a particular problem currently)

We always turn off all threading support in MPI because I don't want to take chances on getting some junk turned on that kills performance (or has other issues).

@jedbrown
@friedmud
libMesh - C++ Finite Element Library member

That is a good thread (best one I've seen in my googling tonight) thanks for pointing me to it!

That said, I still stand by my statement that it is simply not possible for there to be a race condition with the way we use threads in libmesh. The best explanation for why funneled is necessary is here:

http://www.open-mpi.org/community/lists/users/2010/03/12244.php

However, even that example (st_malloc) doesn't apply to us. It assumes that you have other threads running simultaneously when an MPI call is being made... which doesn't happen in libMesh (especially not with the pthread implementation as threads don't even exist outside of threaded loops).

The way I read it, when using funneled there could be extra locking done around system calls in case some other thread calls those system calls (it's also clear that this is mostly a fictional problem as most system calls are thread safe these days, including malloc). With libmesh style threading it's simply not going to happen...

At the very least I think this pedantry should be a compile time option. Even default it to on if you like (I still vote for an error instead of auto throttling too). But, I would really like a configure option to turn this checking off....

Note that in that thread they point out that SINGLE and FUNNELED are the same for mpich2 as well:

http://www.open-mpi.org/community/lists/users/2010/03/12250.php

I really think that this checking is overzealous at this point....

@jedbrown
@jwpeterson
libMesh - C++ Finite Element Library member
@friedmud
libMesh - C++ Finite Element Library member

Agreed about TBB and OpenMP... It is a technical possibility.

Only downsides are:

  1. Must recompile MPI implementation to turn it on (with some implementations)
  2. Vendor supplied MPI might not support it... and this patch will mean we can't do threading on that platform (even though we're 99.99% sure it wouldn't cause a problem).
  3. Configuring MPI for funneled might require turning on all threading support in the MPI implementation (possibly causing large slowdowns).
  4. Everyone that is running fine right now will all of a sudden not be able to use threads if their MPI doesn't support funneled.

In our particular case we definitely have a (medium sized?) user base that has threading completely turned off in their MPI... it would definitely be handy to be able to turn off this checking at configure time until we can assess the impact and distribute MPI packages that are more compliant...

Besides, I think that we'll always want the ability to bypass this check in case we're using vendor MPI that isn't compliant and we just want to force it to work.....

@jwpeterson
libMesh - C++ Finite Element Library member
@roystgnr
libMesh - C++ Finite Element Library member
@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member
@jwpeterson
libMesh - C++ Finite Element Library member

I'm dismayed, though. Deliberately invoking undefined behavior is nuts. Should we add a few "free(NULL)" calls or some out-of-bounds writes to adjacent stack-allocated arrays next? It'll work on some implementations...

Totally agree with you that purposely invoking UB sucks. I think there are probably different degrees of UB though, the kind that comes from writing past the end of an array being slightly worse (for some definition of worse) than the kind that comes from disobeying possibly over-conservative MPI implementors :-)

@jwpeterson
libMesh - C++ Finite Element Library member

Should we add a few "free(NULL)"
This is valid.

+1 for pedantry!

@jedbrown
@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member
@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member

This is a gross digression, but I have to know: "int a[3], b[3]; a[4] = 0;" behavior can change from run to run? I thought that ASLR (and more random behavior) affected access off the end of the stack, not OOB access to other variables in the same stack frame.

@roystgnr
libMesh - C++ Finite Element Library member

Still investigating to see how extensive the breakage is:

It appears that openmpi-1.6 required threading compatibility to be specifically configured, but openmpi 1.7.2 handles SINGLE/FUNNELLED/SERIALIZED regardless of compilation options and only needs special support for MPI_THREAD_MULTIPLE.

@roystgnr
libMesh - C++ Finite Element Library member

Ah, back to the digression - I suppose there's no way to tell that a non-segfaulting OOB access on a is actually hitting b; if the compiler reordered them then you could be accessing off the end of the stack and experiencing dumb luck.

@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member

mvapich2 appears to be thread-savvy, except if it's configured to support CPU affinity and that isn't disabled in the user's env, in mpid_init.c it looks like they force MPI_THREAD_SINGLE. That's exactly the sort of situation we want to detect and scream about...

@jedbrown
@roystgnr
libMesh - C++ Finite Element Library member

Warning: I'm going to rebase and force push this to try and get it cleaned up to merge.

@friedmud
libMesh - C++ Finite Element Library member

Wait - I thought we said this was a bad idea still?

Or am I misremembering?

@roystgnr
libMesh - C++ Finite Element Library member

Trusting the return value from the MPI stack was a bad idea, since there are stacks which experimentally work fine with MPI_THREAD_FUNNELED but which don't promise to support any more than MPI_THREAD_SINGLE. But actually asking for MPI_THREAD_FUNNELED seems to be an improvement in some cases and doesn't seem to hurt us (unless you don't want to see even my warning message when the support isn't returned?) in any case.

@friedmud
libMesh - C++ Finite Element Library member

I don't want my users to see even a warning in this case. But I'm willing to configure it off if that's necessary...

@roystgnr
libMesh - C++ Finite Element Library member

Okay. I won't merge until the warning is a configure-time option; if it's okay with you I'll default that option to "on".

@friedmud
libMesh - C++ Finite Element Library member

We have --enable-warnings now... do you want to just make this part of that set?

@roystgnr
libMesh - C++ Finite Element Library member

Hmm... the help text will need to be updated, but the name is certainly appropriate. Sure!

@jwpeterson
libMesh - C++ Finite Element Library member
@moosebuild

Results of testing 0690663 using libmesh recipe:

Passed on: linux

View the results here: https://www.moosebuild.com/view_job/1191

@moosebuild

Results of testing 787b355 using libmesh recipe:

Failed on: linux

View the results here: https://www.moosebuild.com/view_job/1244

@moosebuild

Results of testing 61f5b37 using libmesh recipe:

Failed on: linux

View the results here: https://www.moosebuild.com/view_job/1245

roystgnr added some commits Oct 7, 2013
@roystgnr roystgnr Initialize threading support properly with MPI2 df16137
@roystgnr roystgnr Proceed anyway without MPI_THREAD_FUNNELED
Making appropriate warnings less imperative is the price we pay for
making overzealous warnings non-lethal.
e492a15
@roystgnr roystgnr Only output warnings from rank 0 620f428
@roystgnr roystgnr Add libmesh_warning() to --enable-warnings 89dfcc9
@roystgnr roystgnr Use libmesh_warning() when unsure of MPI+threads
This way anyone who doesn't want users to be spammed with
usually-false-positive warnings can shut them up.
34bd3e2
@moosebuild

Results of testing 34bd3e2 using libmesh recipe:

Passed on: linux

View the results here: https://www.moosebuild.com/view_job/1246

@jwpeterson
libMesh - C++ Finite Element Library member
@roystgnr
libMesh - C++ Finite Element Library member

That was just me flip-flopping an error. The subdivision PR broke my standard (--enable-everything) build, but then my first attempt to fix it broke builds without ifem enabled. It should be fixed for everyone now.

@roystgnr
libMesh - C++ Finite Element Library member

Nothing wrong with the moosebuild configuration. It's nice and slightly-less-embarrassing to have my screwups get first noticed by something nonsentient.

@roystgnr
libMesh - C++ Finite Element Library member

You might want to put human eyes on this PR again, though. As far as I'm concerned it's now ready to pull, ideally early next week.

@friedmud
libMesh - C++ Finite Element Library member

This looks really good to me - thanks for creating/using libmesh_warning()!

@jwpeterson
libMesh - C++ Finite Element Library member

Nothing wrong with the moosebuild configuration. It's nice and slightly-less-embarrassing to have my screwups get first noticed by something nonsentient.

We're working on fixing that last part ;-)

@roystgnr roystgnr merged commit 1872d1d into master Apr 30, 2014

1 check passed

Details default Successfully passed all tests
@jwpeterson jwpeterson deleted the mpi_init_thread branch Apr 30, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment