Explain about calling renumber when skipping renumbering #1666
Conversation
And now I get it. ReplicatedMesh::renumber_nodes_and_elements() doesn't really respect skip_renumber_nodes_and_elements: if there's been adaptive coarsening or other element deletion, it still renumbers as necessary to make the numbering contiguous again. I'll try fixing that in a separate PR, and with luck #1658 was all we needed to support discontiguous numbering properly.
In the long run we should perhaps rename that method to "delete_disconnected_nodes_and_update_counts_and_maybe_renumber" or we should actually break those into separate methods.
So, this is baffling. Pinging @jwpeterson and anyone he knows who has exodiff experience. Two XFEM tests now break here, so I was excited to have something more convenient than Rattlesnake to use for debugging.

I replicated the failures, and for one I used ncdump to get at the differences between the gold file and the tested file... and the only significant difference is the node_num_map array. When turning renumbering off actually turns renumbering off, we end up with a gap in the node numbers in the tested file, just as we'd expect when nodes are being added and deleted. And if I renumber node_num_map, change nothing else, and ncgen a replacement exodus file, then exodiff is happy with it.

So how does the node matching in exodiff work? The error makes no sense when the second file has a node with the exact same location and the exact same connectivity, but simply a different number! I could swear I've seen exodiff handle much more scrambled node numbering in the past!
We usually run exodiff with the
That seems to be because
So it's possible that test is still working fine. The nodes may have been getting inadvertently renumbered up to now; and so now that we are better enforcing
Aha! I should ping @bwspenc and @jiangwen84, then, so we can ask whether that map setting is overzealous?
Well, if by better you mean "actually always doing what our API says we'll do" then this is definitely a change for the better. There may be other users in the same depending-on-old-broken-behavior boat, though. Anyone using ReplicatedMesh and deleting (or coarsening away) nodes or elements might be affected.
Looks like that test was added fairly recently in idaholab/moose@c520971; no comment about why |
Hmmm... if I remove
So that would explain why
Ah, yeah, that would make sense for XFEM, as I believe they end up with overlapping elements in the Mesh during the simulation and in the output. We could wait on this PR until the next libmesh update and regold the failing tests (assuming the only difference is the node numbering) at that time?
The Marmot and Rattlesnake failures on the other hand don't look like the same problem to me (the Marmot one has to do with PeriodicBoundaries?) |
Hell, there's a Marmot failure too now? Yeah, it looks like this will continue to be delayed. |
Just FYI, both tests seem to involve a
@roystgnr Did you get your question about the XFEM test resolved? I was out last week, and am just digging through the things that piled up. It looks like you figured this out already, but we use that
I think so. The question of "why do we turn off exodiff remapping" is answered, at least. I'm not entirely sure where to go from here, though.

I think the short-term question of "how do we avoid false positive test failures when remapping is off" will be answered by "keep renumbering disabled, force ReplicatedMesh to keep it identical, and regenerate a new gold standard after this PR gets merged", but that's hardly a sufficient long-term answer. Anything creating and destroying elements on a distributed mesh will end up with a partitioning-dependent element numbering, which will require remapping to compare against a gold standard with exodiff.
This was causing breakage and we've moved past it now. |
In theory this should just add some comments and remove a little redundancy.
In practice, something in #1621 is breaking Rattlesnake and this is the most likely of the very few remaining suspects.