v0.9.3-final #197

Closed
benkirk opened this Issue Feb 4, 2014 · 28 comments

Owner

benkirk commented Feb 4, 2014

@roystgnr, could you help me confirm the commits that need to be merged into v0.9.3?

eeb2620
b1b891f

Is that it?

benkirk was assigned Feb 4, 2014

Owner

roystgnr commented Feb 4, 2014

Also 49c4f4f - I ran into that while investigating Derek's new patch, but it seems to be an older problem.

I'm still trying to get a handle on that weird ParallelMesh+adjoints_ex3 failure, though; sorry it's taking so long.

Owner

benkirk commented Feb 4, 2014

Thanks - I would have missed that, thinking it was dependent on Derek’s new stuff.

-Ben

Owner

roystgnr commented Feb 4, 2014

I've managed to at least trigger a lower level of failure now. I can get adjoints_ex3 to generate a Nemesis restart file which, although it looks fine in my Paraview 4.0.1, triggers assertion failures when I load it into libMesh. Can anyone else take a look at http://users.ices.utexas.edu/~roystgnr/badmesh/ ?

Owner

benkirk commented Feb 4, 2014

OK, so simply loading this on 4 processors freaks things out?

I'll see if I can do anything there...

Owner

roystgnr commented Feb 4, 2014

Just UnstructuredMesh::read() on four processors in dbg or devel mode should trigger the problem. One processor sees a global_node_idx of -1; another processor somehow gets a num_elems_global of 24 instead of 22.

Thanks!

Owner

benkirk commented Feb 10, 2014

Has this mesh been refined? If so, I'm not sure what to expect from Nemesis...

[Screenshot: the mesh as displayed in ParaView 4.1.0 64-bit]

Owner

roystgnr commented Feb 10, 2014

This is the mesh that gets written out after one adaptive refinement step. Your screenshot looks like what I'd expect. Have you tried loading it in a libMesh app yet?

Owner

benkirk commented Feb 10, 2014

Not yet, but what I meant is that the libMesh nemesis reader would not understand adapted meshes so I'm not surprised that could cause problems. Still, I'll run it through and see what I can find.

Owner

benkirk commented Feb 10, 2014

Yeah, the error I see is

*** Warning, This code is untested, experimental, or likely to see future API changes: ../src/mesh/nemesis_io_helper.C, line 63, compiled Feb 10 2014 at 15:44:30 ***
Assertion 'global_node_idx < to_uint(nemhelper->num_nodes_global)' failed.
global_node_idx = 4294967295
to_uint(nemhelper->num_nodes_global) = 131
[0] ../src/mesh/nemesis_io.C, line 431, compiled Feb 10 2014 at 15:44:30
Assertion 'sum_internal_elems+sum_border_elems == nemhelper->num_elems_global' failed.
sum_internal_elems+sum_border_elems = 22
nemhelper->num_elems_global = 24
[3] ../src/mesh/nemesis_io.C, line 726, compiled Feb 10 2014 at 15:44:30
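
A note on the first assertion: 4294967295 is what -1 becomes when it is converted to a 32-bit unsigned integer, which matches the earlier report of a global_node_idx of -1. The sketch below only illustrates that conversion. `to_unsigned_32` and `index_in_range` are hypothetical helpers written for this note, not libMesh's `to_uint` or its assertion macros, and a 32-bit index type is assumed.

```cpp
#include <cstdint>

// Hypothetical stand-in for a signed-to-unsigned conversion (not libMesh's to_uint).
// Converting a negative value to an unsigned type wraps modulo 2^32.
inline std::uint32_t to_unsigned_32(std::int32_t value)
{
  return static_cast<std::uint32_t>(value);
}

// Bounds check in the spirit of the failed assertion:
// an index is valid only if it is strictly below the global count.
inline bool index_in_range(std::int32_t idx, std::int32_t count)
{
  return to_unsigned_32(idx) < to_unsigned_32(count);
}
```

With these helpers, `index_in_range(-1, 131)` is false because the converted index is far larger than 131, which is the shape of the failure in the log above.
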
Owner

roystgnr commented Feb 10, 2014

Wait, our Nemesis reader doesn't understand adapted meshes? I didn't know that. We should probably toss a libmesh_not_implemented() in there somewhere appropriate.

Our Exodus reader handles adapted meshes, right?

Owner

roystgnr commented Feb 10, 2014

Anyway, thanks for the help; sorry this was a red herring. I'll get back to the regression myself.

Owner

friedmud commented Feb 10, 2014

Neither Exodus nor Nemesis can read adapted meshes. There is no such thing in Exodus land. There is no way to store a "family" or "tree of elements" in Exodus/Nemesis at all.

When we write out adapted Exodus/Nemesis we're just writing the active elements....
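
To make the point about losing the element tree concrete, here is a minimal sketch, assuming a toy refinement tree. `SketchElem`, `make_elem`, `collect_active_ids`, and `flattened_ids_after_one_refinement` are hypothetical names invented for this illustration; none of them are libMesh or Exodus APIs. It shows that writing only the active (leaf) elements yields a flat list with no record of which parent each element came from.

```cpp
#include <memory>
#include <vector>

// Hypothetical element type for illustration only; this is not libMesh's Elem class.
struct SketchElem
{
  int id = 0;
  std::vector<std::unique_ptr<SketchElem>> children;  // empty for an unrefined element

  // An element is "active" when it has not been refined further.
  bool active() const { return children.empty(); }
};

// Build a childless element with the given id.
std::unique_ptr<SketchElem> make_elem(int id)
{
  auto e = std::make_unique<SketchElem>();
  e->id = id;
  return e;
}

// Depth-first walk that records only the active (leaf) elements.
void collect_active_ids(const SketchElem & e, std::vector<int> & out)
{
  if (e.active())
    {
      out.push_back(e.id);
      return;
    }
  for (const auto & child : e.children)
    collect_active_ids(*child, out);
}

// Element 0 is refined into 1..4, then element 2 is refined into 5..8.
// The returned list is all that a "write the active elements" step would keep.
std::vector<int> flattened_ids_after_one_refinement()
{
  SketchElem root;  // id 0
  for (int id = 1; id <= 4; ++id)
    root.children.push_back(make_elem(id));

  SketchElem & refined = *root.children[1];  // the element with id 2
  for (int id = 5; id <= 8; ++id)
    refined.children.push_back(make_elem(id));

  std::vector<int> ids;
  collect_active_ids(root, ids);
  return ids;
}
```

In this toy case the tree holds nine elements, but the flattened list holds only the seven active ones. Elements 0 and 2, and the parent/child links through them, are gone, so a reader of the flat list cannot rebuild the hierarchy.
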

Owner

roystgnr commented Feb 11, 2014

Gah. This puts a crimp in my "let's just make our next restart file format be Exodus with some HDF5 extensions" dream.

Owner

benkirk commented Feb 11, 2014

Yeah, I've had some thoughts about augmenting those formats with an index representation of the tree...

Owner

benkirk commented Feb 12, 2014

So I'm a little confused about what bug you are seeing at this point.

  1. Is there anything I can do at this point to help?
  2. Should we wait on 0.9.3-final or move on with this bug outstanding?
Owner

roystgnr commented Feb 12, 2014

I'm getting a convergence failure when running adjoints_ex3 with ParallelMesh on 4 processors.

If you want to take a crack at replicating and figuring it out, I certainly wouldn't mind, but at this point I'd be fine seeing 0.9.3-final released just as soon as Derek's global_foo() is backported.

Owner

benkirk commented Feb 12, 2014

I wasn’t thinking we need to backport that for 0.9.3 as the default communicator is still active - am I off base on that?

Owner

roystgnr commented Feb 12, 2014

Basically I want to make it possible to write "forwards-compatible" software against a release as early as possible. Same argument for why Paul and I backported the DiffContext/FEMContext accessors.

Owner

benkirk commented Feb 12, 2014

A noble goal indeed... That pull request looks pretty good to me.

Owner

roystgnr commented Feb 12, 2014

If you want to do the pull and backport, I think Derek's stuff is ready.

I've narrowed down the adjoints_ex3 problem slightly - somehow we're missing a single DoF constraint in the ParallelMesh case.

Owner

roystgnr commented Feb 13, 2014

Now I've found the adjoints_ex3 problem and I'm testing a fix.

Owner

roystgnr commented Feb 13, 2014

This is probably worth delaying the release for; it's a real corner case, but it could affect any app with ParallelMesh, mixed finite elements, hanging nodes, and enough bad luck.

Owner

jwpeterson commented Feb 13, 2014

Yikes. Sounds like a good thing to delay a release for. How involved is the fix?

John

Owner

benkirk commented Feb 13, 2014

I'm totally fine holding off for this. I'll cherry-pick Derek's recent PR later tomorrow or this weekend - I'm about to get off an airplane and head up to Steamboat...

Owner

roystgnr commented Feb 13, 2014

I'll do the last cherry-picking; hopefully then we can run it through regression tests tomorrow morning and do the release tomorrow afternoon. Enjoy your trip.

Owner

benkirk commented Feb 13, 2014

Thanks, I especially want to ensure that both with and without default communicator options work as expected.

Owner

roystgnr commented Feb 14, 2014

We're passing all the GRIN-S and internal tests, and John tells me we're passing all the MOOSE tests. Ready to release when you are.

Owner

benkirk commented Feb 14, 2014

Beautiful, let's update the NEWS to describe the bugfix you found and Derek's new global functions, then we're good!

benkirk closed this Feb 18, 2014
