Checkpoint splitter #1103

Closed
friedmud wants to merge 6 commits

friedmud
Member

Forget Nemesis reading. It's too damn complicated.

We can do this ourselves. DistributedMesh is actually really awesome at reinitializing itself... it barely needs anything. Just the local elements and the nodes they're connected to... along with BC info. Done.
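
As a rough illustration of how little data that is, the sketch below picks out what one processor would need from a toy mesh: its local elements, plus the nodes those elements touch. `ToyElem`, `LocalPiece`, and `gather_local_piece` are hypothetical names for this example, not libMesh classes, and boundary info is left out for brevity.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a mesh element: an owning processor plus node ids.
struct ToyElem
{
  unsigned int processor_id;
  std::vector<std::size_t> node_ids;
};

// What a single processor would need in order to rebuild its piece of the mesh.
struct LocalPiece
{
  std::vector<std::size_t> elem_ids; // indices of elements owned by this processor
  std::set<std::size_t> node_ids;    // every node touched by those elements
};

// Collect the elements owned by `pid` and the nodes they are connected to.
LocalPiece gather_local_piece(const std::vector<ToyElem> & elems, unsigned int pid)
{
  LocalPiece piece;
  for (std::size_t e = 0; e < elems.size(); ++e)
    {
      if (elems[e].processor_id != pid)
        continue;
      piece.elem_ids.push_back(e);
      piece.node_ids.insert(elems[e].node_ids.begin(), elems[e].node_ids.end());
    }
  return piece;
}
```

Nodes shared with another processor's elements (node 2 in a three-element strip, say) simply show up in more than one piece.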

That said... there is still a small issue with this PR that must be resolved before it can be merged. It currently throws an assert() in devel/dbg mode if you leave mesh partitioning on when you prepare the newly read-in mesh for use.

You can see the way I'm needing to do the reading over here:

friedmud/moose@6e08eef

@jwpeterson @roystgnr could I get just the tiniest bit of help tracking that one down?

To easily create meshes that are split using this new capability, use this branch of my simple_libmesh_app:

https://github.com/friedmud/simple_libmesh_app/tree/splitter

refs idaholab/moose#7752, idaholab/moose#7744, idaholab/moose#7745, #1087, #1086

*/
const processor_id_type & current_n_processors() const { return _my_n_processors; }
processor_id_type & current_n_processors() { return _my_n_processors; }

Member

These aren't redundant with the ParallelObject APIs because during reading they come from the file rather than the CheckpointIO object?

If I'm understanding that correctly, could you make it clearer in the comments?

Member Author

Actually - those are for m->n splitting. Like run with m MPI processes... but create output files for n MPI processes.

To do that, a code can loop over the target processor IDs, set current_n_processors() and current_processor_id() for each one, and call write(); the file that comes out will be for that configuration.
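
The loop described here might look roughly like the following. `ToyCheckpointWriter` is a hypothetical stand-in, not the real CheckpointIO class: only the two reference-returning accessors mirror the ones in this diff, and the `write()` signature and file naming scheme are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef unsigned int processor_id_type; // stand-in for libMesh's typedef

// Hypothetical stand-in for a checkpoint writer with settable "target" ranks.
class ToyCheckpointWriter
{
public:
  ToyCheckpointWriter() : _my_processor_id(0), _my_n_processors(1) {}

  // Reference-returning accessors, mirroring the ones added in this PR.
  processor_id_type & current_processor_id() { return _my_processor_id; }
  processor_id_type & current_n_processors() { return _my_n_processors; }

  // Assumed behavior: report which file would be written for the current
  // (processor id, processor count) pair instead of touching the disk.
  std::string write(const std::string & base) const
  {
    return base + "-" + std::to_string(_my_n_processors) + "-" +
           std::to_string(_my_processor_id);
  }

private:
  processor_id_type _my_processor_id;
  processor_id_type _my_n_processors;
};

// Produce one output per target rank, regardless of how many ranks run this.
std::vector<std::string> split_for(ToyCheckpointWriter & writer,
                                   const std::string & base,
                                   processor_id_type n_target)
{
  assert(n_target > 0);
  std::vector<std::string> written;
  writer.current_n_processors() = n_target;
  for (processor_id_type p = 0; p < n_target; ++p)
    {
      writer.current_processor_id() = p;
      written.push_back(writer.write(base));
    }
  return written;
}
```

This serial sketch has a single caller write every target piece; in an m-process run, each rank would presumably handle only a subset of the n targets.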

I'll update the comment

@moosebuild

Job Test debug:linux-gnu on 771fe81 : invalidated by @friedmud

Rerun bcs/periodic.testlevel1

@permcody
Member

Hmm - that same test timed out yesterday as well. I don't recall it ever timing out before...

@friedmud
Member Author

@permcody It looks like it passed this time. It definitely doesn't have anything to do with this PR... so I don't know what's up.

@@ -1349,14 +1349,14 @@ void DistributedMesh::delete_remote_elements()
 {
 #ifdef DEBUG
 // Make sure our neighbor links are all fine
-MeshTools::libmesh_assert_valid_neighbors(*this);
+//MeshTools::libmesh_assert_valid_neighbors(*this);
Member

@roystgnr can we just delete these asserts altogether if they make the code too slow in DEBUG mode? Or maybe add some kind of "extra_paranoid_slow" debugging flag that people can enable. I don't think they should just be commented out...

Member

I definitely don't want to delete them entirely. "This assert would pass here" is very useful information to leave in the code even if we don't compile it.

Personally, I thought DEBUG was the extra_paranoid_slow flag. We've already got -O0 and GLIBCXX_DEBUG* active in dbg mode too; it's basically as unoptimized as it gets.
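
For illustration only, the opt-in flag floated in this thread could be sketched as below. `EXTRA_PARANOID_SLOW` is a hypothetical macro (it exists here only as a suggestion, not as a libMesh option), and `neighbors_are_symmetric` is a toy stand-in for an expensive check like `libmesh_assert_valid_neighbors`. The check stays in the source either way, so the "this assert would pass here" information is not lost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opt-in macro; define it at compile time to enable slow checks.
#ifdef EXTRA_PARANOID_SLOW
const bool extra_paranoid = true;
#else
const bool extra_paranoid = false;
#endif

// Toy stand-in for an expensive validity check: neighbor_of[i] is the index of
// element i's neighbor, or -1 for none. Valid if every link is reciprocated.
bool neighbors_are_symmetric(const std::vector<int> & neighbor_of)
{
  for (std::size_t i = 0; i < neighbor_of.size(); ++i)
    {
      const int j = neighbor_of[i];
      if (j < 0)
        continue;
      if (static_cast<std::size_t>(j) >= neighbor_of.size() ||
          neighbor_of[j] != static_cast<int>(i))
        return false;
    }
  return true;
}

// Runs the slow check only when the flag is on; returns whether it ran.
bool check_neighbors_if_paranoid(const std::vector<int> & neighbor_of)
{
  if (!extra_paranoid)
    return false;
  assert(neighbors_are_symmetric(neighbor_of));
  return true;
}
```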

Member Author

Sorry guys: I didn't mean to leave these changes in here. Those just slipped through. I'll revert those and force-push this.

BTW: As you can see with my newest commit... I'm still working on this a bit. One of my problems still won't run using this capability. Still hunting down why.

friedmud and others added 2 commits September 20, 2016 15:19
Only write out boundary info for truly local objects.
Need to write out all point neighbors of local elements (and the nodes that connect to them)
@friedmud
Member Author

Closing this in favor of #1106

@friedmud friedmud closed this Sep 27, 2016

5 participants