Explain about calling renumber when skipping renumbering #1666

Closed

Conversation

@roystgnr (Member)

In the long run we should perhaps rename that method to "delete_disconnected_nodes_and_update_counts_and_maybe_renumber", or we should actually break those into separate methods.

In theory this should just add some comments and remove a little redundancy.

In practice, something in #1621 is breaking Rattlesnake and this is the most likely of the very few remaining suspects.

@roystgnr (Member Author)

And now I get it. ReplicatedMesh::renumber_nodes_and_elements() doesn't really respect skip_renumber_nodes_and_elements: if there's been adaptive coarsening or other element deletion, it still renumbers as necessary to make the numbering contiguous again.

I'll try fixing that in a separate PR, and with luck #1658 was all we needed to support discontiguous numbering properly.
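To make the contract in question concrete, here is a minimal sketch against libMesh's public API; the mesh file name and the particular element deleted are placeholder assumptions:

#include "libmesh/libmesh.h"
#include "libmesh/replicated_mesh.h"

using namespace libMesh;

int main (int argc, char ** argv)
{
  LibMeshInit init (argc, argv);

  ReplicatedMesh mesh (init.comm());
  mesh.read ("input.e");            // placeholder mesh file

  // This sets skip_renumber_nodes_and_elements, promising users that
  // node and element ids will be left alone from here on.
  mesh.allow_renumbering (false);

  // Deleting an element (by hand, or via adaptive coarsening) leaves a
  // gap in the id sequence.
  mesh.delete_elem (mesh.elem_ptr (0));

  // prepare_for_use() calls renumber_nodes_and_elements(), which should
  // now be a no-op; before this fix, ReplicatedMesh renumbered anyway to
  // make the ids contiguous again.
  mesh.prepare_for_use ();

  return 0;
}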

@roystgnr (Member Author) commented Jul 6, 2018

So, this is baffling. Pinging @jwpeterson and anyone he knows who has exodiff experience.

Two XFEM tests now break here, so I was excited to have something more convenient than Rattlesnake to use for debugging. I replicated the failures, and for one I used ncdump to get at the differences between the gold file and the tested file...

And the only significant difference is the node_num_map array. Now that turning renumbering off actually turns renumbering off, we end up with a gap in the node numbers in the tested file, just as we'd expect when nodes are being added and deleted. And if I renumber node_num_map, change nothing else, and ncgen a replacement exodus file, then exodiff is happy with it.
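For anyone who wants to poke at this themselves, a minimal sketch using the ExodusII C API that dumps node_num_map; the file name is a placeholder, error checking is omitted, and a default 32-bit-id build is assumed:

#include <exodusII.h>
#include <cstdio>
#include <vector>

int main ()
{
  int cpu_ws = 0, io_ws = 0;
  float version;
  int exoid = ex_open ("out.e", EX_READ, &cpu_ws, &io_ws, &version);

  // node_num_map: entry i holds the global id of local node i+1.
  // Coordinates and connectivity are stored by local index, so a gap
  // in these ids (..., 36, 38, ...) changes nothing else in the file.
  long num_nodes = (long) ex_inquire_int (exoid, EX_INQ_NODES);
  std::vector<int> node_map (num_nodes);
  ex_get_id_map (exoid, EX_NODE_MAP, node_map.data());

  for (long i = 0; i != num_nodes; ++i)
    std::printf ("local %ld -> global %d\n", i + 1, node_map[i]);

  ex_close (exoid);
  return 0;
}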

So how does the node matching in exodiff work? The error

exodiff: ERROR: Unable to match node 37 in first file with node in second file.

makes no sense when the second file has a node with the exact same location, exact same connectivity, but simply a different number! I could swear I've seen exodiff handle much more scrambled node numbering in the past!

@jwpeterson (Member)

We usually run exodiff with the -m option, which ignores the node numbering and just tries to compare the spatial locations of the nodes. The exodiffing test does not seem to be using -m, however:

/opt/civet/build_0/moose/framework/contrib/exodiff/exodiff  -F 1e-10 -t 5.5e-06  /opt/civet/build_0/moose/modules/xfem/test/tests/moving_interface/gold/moving_bimaterial_out.e /opt/civet/build_0/moose/modules/xfem/test/tests/moving_interface/moving_bimaterial_out.e

That seems to be because map = false is set in the test spec for the test in question.

@jwpeterson (Member)

So it's possible that test is still working fine. The nodes may have been getting inadvertently renumbered up to now, and now that we are better enforcing allow_renumbering() == false, the numbering has changed for the better?

@roystgnr (Member Author) commented Jul 6, 2018

Aha! I should ping @bwspenc and @jiangwen84, then, so we can ask whether that map setting is overzealous?

@roystgnr (Member Author) commented Jul 6, 2018

> The nodes may have been getting inadvertently renumbered up to now, and now that we are better enforcing allow_renumbering() == false, the numbering has changed for the better?

Well, if by better you mean "actually always doing what our API says we'll do" then this is definitely a change for the better. There may be other users in the same depending-on-old-broken-behavior boat, though. Anyone using ReplicatedMesh and deleting (or coarsening away) nodes or elements might be affected.

@jwpeterson (Member) commented Jul 6, 2018

> Aha! I should ping @bwspenc and @jiangwen84, then, so we can ask whether that map setting is overzealous?

Looks like that test was added fairly recently in idaholab/moose@c520971; there's no comment about why map=false was specified, and it seems to me that it shouldn't be necessary for the test to work...

@roystgnr (Member Author) commented Jul 6, 2018

Hmmm... if I remove map = false from xfem/test/tests/moving_interface/tests then I get an error sooner:

xfem/test:moving_interface.moving_interface: exodiff: ERROR: Two elements in file 2 have the same midpoint (within tolerance). 
xfem/test:moving_interface.moving_interface:    Local element  19 at (0.7, 0.1, 0) and 
xfem/test:moving_interface.moving_interface:    Local element 20 at (0.7, 0.1, 0) 
xfem/test:moving_interface.moving_interface:    No unique element mapping possible. 

So that would explain why map=false was set.
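That fits with how the mapping has to work: with overlapping XFEM elements, any matching based on element midpoints is genuinely ambiguous. A sketch of the idea, not exodiff's actual implementation:

#include <array>
#include <cmath>
#include <cstdio>
#include <vector>

using Point = std::array<double, 3>;

// Match one element from file 1 against file 2 by midpoint distance.
// Returns the index of the unique match within tol, -1 if no element
// matches ("Unable to match node/element"), or -2 if two candidates
// match ("No unique element mapping possible").
int match (const Point & mid1, const std::vector<Point> & mids2, double tol)
{
  int found = -1;
  for (std::size_t j = 0; j != mids2.size(); ++j)
    {
      double d = std::hypot (mid1[0] - mids2[j][0],
                             mid1[1] - mids2[j][1],
                             mid1[2] - mids2[j][2]);
      if (d < tol)
        {
          if (found >= 0)
            return -2;          // second candidate: mapping is ambiguous
          found = (int) j;
        }
    }
  return found;
}

int main ()
{
  // In the XFEM output, elements 19 and 20 both sit at (0.7, 0.1, 0),
  // so any midpoint-based mapping between files must fail.
  std::vector<Point> mids2 = {{0.7, 0.1, 0.}, {0.7, 0.1, 0.}};
  std::printf ("%d\n", match (Point{0.7, 0.1, 0.}, mids2, 1e-6)); // prints -2
  return 0;
}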

@jwpeterson (Member)

> So that would explain why map=false was set.

Ah, yeah that would make sense for XFEM as I believe they end up with overlapping elements in the Mesh during the simulation and in the output. We could wait on this PR until the next libmesh update and regold the failing tests (assuming the only difference is the node numbering) at that time?

@jwpeterson (Member)

The Marmot and Rattlesnake failures, on the other hand, don't look like the same problem to me (the Marmot one has to do with PeriodicBoundaries?).

@roystgnr (Member Author) commented Jul 6, 2018

Hell, there's a Marmot failure too now?

Yeah, it looks like this will continue to be delayed.

@jwpeterson (Member)

Just FYI, both tests seem to involve a SolutionUserObject in some way: one reads a solution from an XDA file and one reads from an Exodus file. It's possible a different mesh numbering is causing problems with that...

@bwspenc (Contributor) commented Jul 9, 2018

@roystgnr Did you get your question about the XFEM test resolved? I was out last week, and am just digging through the things that piled up. It looks like you figured this out already, but we use that map=false option in all of the XFEM tests because they all generate output files with overlapping elements.

@roystgnr (Member Author) commented Jul 9, 2018

> Did you get your question about the XFEM test resolved?

I think so. The question of "why do we turn off exodiff remapping" is answered, at least.

I'm not entirely sure where to go from here, though.

I think the short-term question of "how do we avoid false positive test failures when remapping is off" will be answered by "keep renumbering disabled, force ReplicatedMesh to keep the numbering identical, and regenerate a new gold standard after this PR gets merged", but that's hardly a sufficient long-term answer. Anything creating and destroying elements on a distributed mesh will end up with a partitioning-dependent element numbering, which will require remapping to compare against a gold standard with exodiff.

@roystgnr (Member Author) commented Jul 4, 2020

This was causing breakage and we've moved past it now.

roystgnr closed this on Jul 4, 2020.