Remove global functions and data associated with a default COMM_WORLD #196

Closed
wants to merge 1 commit into
from

Projects

None yet

4 participants

@friedmud
Owner

This removes:

libMesh::COMM_WORLD
libMesh::Parallel::n_processors()
libMesh::Parallel::processor_id()

when the default COMM is disabled. This is extremely helpful when trying to track down places in application code that need to be updated.

I also added a different version of all of these called

libMesh::GLOBAL_COMM_WORLD
libMesh::Parallel::global_n_processors()
libMesh::Parallel::global_processor_id()

That always uses the communicator libMesh was initialized with. This is handy for when you truly do need "global" MPI information.

Owner

As you can see from the changed files... removing these even highlighted a few areas where default comm use had snuck back into libMesh (like in the UniqueID stuff). So having these functions in there is just a bad idea....

Owner

Good catch. But won't adding the GLOBAL_ functions just reintroduce the same risk?

Owner

My thought was that you now have to think about it a little more. libMesh::n_processors() sounds like "give me the number of processors for my current situation".... where libMesh::global_n_processors() sounds like "give me the total number of processors for this job".

I know it's subtle (and I put some comments in for those functions to try to explain what they are) but I do think this will help.

No matter how you look at it you do sometimes need the "global" MPI_COMM... but leaving in the old names will definitely harm people trying to update their code...

Owner

BTW - I can report that running libMesh on sub communicators DOES now work! I have updated MOOSE and all of our "sub-solve" stuff now works without needing to swap the communicators around!

BTW - our first publication that includes some results with the "sub-solve" capability (which we call MultiApp) is out: http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=9147072

We are submitting another manuscript soon with more detail....

Owner

Anyone else want to comment on this? I'm thinking of merging. It really will help people remove comm_world usage from their apps....

Owner

I'm leaning slightly against this, but if I'm outvoted I won't object.

It feels like a long-term problem (code which actually does need global information and is using the APIs correctly to get it will cease to work in the default configuration) for a short-term fix (we catch all the code which uses the current APIs incorrectly, but then nothing stops people from writing code which uses the new APIs incorrectly and we're back to square one).

On the other hand, you did identify a valid problem which needs fixing, and I still haven't thought of a better way to fix it.

If you wanted to add the global_ names but make the disabling of the original names a new non-default configure option, I'd have no further objections.

Owner

If we were to make an option, what would it be?

--disable-comm-world

??

How weird will be it be to have --disable-default-commm and --disable-comm-world? And... why wouldn't --disable-default-comm disable comm-world?

Moreover, it's already been shown that leaving these functions in there like they are leads to comm world usage sneaking back into the framework (just see my patches where I remove calls out of the UniqueID stuff).

I suppose that what I'm advocating is changing the name of n_processors() and processor_id() to have "global_" in front of it so it's a little more clear that those definitely aren't the ones associated with your sub-communicator.

In the meantime I've also left in the old names in the case where you aren't disabling the default comm - just for simple backwards compatibility. After a while we can remove those functions...

Owner
friedmud commented Feb 5, 2014

@benkirk any preference on this?

@jwpeterson any better ideas?

Owner
friedmud commented Feb 7, 2014

I've thought more about this and I still think this is a good idea.

Let's put it in and if there is too much wailing and gnashing of teeth then we'll talk about a configure option.

You can always get back to the old behavior by enabling the default comm....

Owner
roystgnr commented Feb 7, 2014

This breaks my personal application framework, it breaks FIN-S, it breaks the QUESO libMesh integration, and it breaks my coworkers' turbulence calibration apps. It breaks about a third of our example codes. "make check" itself is wailing and gnashing teeth!

Most importantly, it breaks the tradition that so far we'd been keeping up with the communicator API refactoring: we try not to break API compatibility without giving people a transition period: a libMesh release for which both new-style and old-style apps are compatible with the default configuration.

Owner
roystgnr commented Feb 7, 2014

Could we add the global_ functions, and backport them to the upcoming release?

Owner
roystgnr commented Feb 7, 2014

Then we could disable the old functions in a second patch, not backported and so not default active until the current release. That would at least make it possible to build a code that was both compatible with the latest release and with the git head in their default configurations.

Owner
friedmud commented Feb 7, 2014

I don't understand - it doesn't break things any more than switching off the default comm did. Surely you had to update your codes when Ben made that change recently... right? And the fix is the same --enable-default-comm-world. These are really changes that should have been made when we switched off the default comm... and they were just overlooked.

In fact - if this is breaking so many of your codes then you REALLY need this change because it means that you are still using COMM_WORLD in places... and you really shouldn't be.

I will fix "make check" - sorry I still haven't gotten in the habit of running that when I change libMesh - that is definitely a habit I need to change.

I'll go ahead and split it into two patches so the first can be backported.

Instead of screwing with this branch and PR I'll probably just put up another branch with the two patches and do a separate PR. If I do I'll close this one...

Owner
benkirk commented Feb 14, 2014

Is this PR deprecated now?

Owner

On Fri, Feb 14, 2014 at 10:18 AM, Benjamin S. Kirk <notifications@github.com

wrote:

Is this PR deprecated now?

Yeah I think the code associated with it has been checked in now.

John

@benkirk benkirk closed this Feb 14, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment