Fixed coupling between on/off displaced mesh #14311

Merged: 4 commits, Nov 19, 2019

Conversation

tophmatthews
Contributor

This test demonstrates that variables that exist only on the displaced mesh cannot be coupled with variables that exist only on the undisplaced mesh. With the debug executable it trips an assertion: Assertion `i*j < _val.size()' failed. This has been an ongoing problem for some years now, and I had falsely assumed it was no longer an issue. Alas, it prevents using debug builds to track down problems in complex simulations, and severely inhibits development...
This was close to being fixed at one point by @andrsd and @dschwen. Perhaps this will give motivation to push it over the line!

Closes #9659

@tophmatthews
Contributor Author

tophmatthews commented Nov 5, 2019

ping @andrsd @dschwen @permcody. Quoting @andrsd:

So, here is the problem (mostly for me so that I do not forget):

1. SMP full=true has to be active, so that we are computing all off-diagonal blocks.
2. Then we resize the Jacobian blocks in Assembly like so:
   jacobianBlock(vi, vj).resize(ivar.dofIndices().size(), jvar.dofIndices().size());
3. Now, u is on the displaced mesh and v is not. So in the non-displaced instance for variable v, we resize the block to (2, 0).
4. In Kernel::computeOffDiagJacobian for the v variable we get _test.size() == 2 and _phi.size() == 2.
5. And then we index into the previously sized block, which has incorrect dimensions.
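
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not MOOSE or libMesh code: `ToyBlock` and `fillBlock` are hypothetical names invented for illustration, and the bounds check only mirrors the debug assertion text quoted in this thread. It shows a block sized from dof counts as (2, 0) being indexed by a loop whose bounds come from the shape-function counts (2 and 2).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a dense local Jacobian block; NOT the libMesh class.
class ToyBlock
{
public:
  // Resize to m rows by n columns and zero the storage.
  void resize(std::size_t m, std::size_t n)
  {
    _m = m;
    _n = n;
    _val.assign(m * n, 0.0);
  }

  std::size_t m() const { return _m; }
  std::size_t n() const { return _n; }

  // Mirrors the debug-mode check quoted in this thread: i*j < _val.size().
  // Throws instead of aborting so the failure can be observed by the caller.
  double & operator()(std::size_t i, std::size_t j)
  {
    if (!(i * j < _val.size()))
      throw std::out_of_range("Assertion `i*j < _val.size()' failed.");
    return _val[i * _n + j];
  }

private:
  std::size_t _m = 0;
  std::size_t _n = 0;
  std::vector<double> _val;
};

// Fill a block by looping over the test and trial shape-function counts,
// as in the off-diagonal Jacobian loop described above.
// Returns true if every access passed the bounds check.
bool fillBlock(ToyBlock & ke, std::size_t n_test, std::size_t n_phi)
{
  try
  {
    for (std::size_t i = 0; i < n_test; ++i)
      for (std::size_t j = 0; j < n_phi; ++j)
        ke(i, j) += 1.0;
  }
  catch (const std::out_of_range &)
  {
    return false;
  }
  return true;
}
```

Resizing a `ToyBlock` to (2, 0) and calling `fillBlock(ke, 2, 2)` fails on the very first access, with `i*j = 0` and an empty value array; resizing to (2, 2) first lets the same loop complete.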

@tophmatthews
Contributor Author

Here is the full error:
Time Step 1, time = 1, dt = 1
 0 Nonlinear |R| = 2.345208e-04
Assertion `i*j < _val.size()' failed.
i*j = 0
_val.size() = 0


Stack frames: 34
0: 0   libmesh_dbg.0.dylib                 0x000000011f48c18a libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 458
1: 1   libmesh_dbg.0.dylib                 0x000000011f482f4e libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*) + 222
2: 2   libbison-dbg.0.dylib                0x000000010b3ac0fe libMesh::DenseMatrix<double>::operator()(unsigned int, unsigned int) + 382
3: 3   libmoose-dbg.0.dylib                0x0000000111edc8ca IntegratedBC::computeJacobianBlock(MooseVariableFEBase&) + 490
4: 4   libmoose-dbg.0.dylib                0x00000001127f1d7f ComputeFullJacobianThread::computeFaceJacobian(short) + 655
5: 5   libmoose-dbg.0.dylib                0x00000001127f9593 ComputeJacobianThread::onBoundary(libMesh::Elem const*, unsigned int, short) + 339
6: 6   libmoose-dbg.0.dylib                0x0000000112139270 ThreadedElementLoopBase<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, bool) + 880
7: 7   libmoose-dbg.0.dylib                0x0000000112892532 void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeFullJacobianThread>(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, ComputeFullJacobianThread&) + 130
8: 8   libmoose-dbg.0.dylib                0x000000011288f16b NonlinearSystemBase::computeJacobianInternal(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 2267
9: 9   libmoose-dbg.0.dylib                0x000000011289393f NonlinearSystemBase::computeJacobianTags(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 95
10: 10  libmoose-dbg.0.dylib                0x00000001122605be FEProblemBase::computeJacobianTags(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 1950
11: 11  libmoose-dbg.0.dylib                0x000000011225fd89 FEProblemBase::computeJacobianInternal(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&, std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 217
12: 12  libmoose-dbg.0.dylib                0x000000011225fca2 FEProblemBase::computeJacobian(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&) + 258
13: 13  libmoose-dbg.0.dylib                0x000000011225fa78 FEProblemBase::computeJacobianSys(libMesh::NonlinearImplicitSystem&, libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&) + 56
14: 14  libmoose-dbg.0.dylib                0x0000000112871622 Moose::compute_jacobian(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&, libMesh::NonlinearImplicitSystem&) + 146
15: 15  libmesh_dbg.0.dylib                 0x000000011fff38a3 libmesh_petsc_snes_jacobian + 2323
16: 16  libpetsc.3.10.dylib                 0x000000012391c3fa SNESComputeJacobian + 874
17: 17  libpetsc.3.10.dylib                 0x000000012394fb3f SNESSolve_NEWTONLS + 1375
18: 18  libpetsc.3.10.dylib                 0x0000000123921205 SNESSolve + 1717
19: 19  libmesh_dbg.0.dylib                 0x000000011fff21db libMesh::PetscNonlinearSolver<double>::solve(libMesh::SparseMatrix<double>&, libMesh::NumericVector<double>&, libMesh::NumericVector<double>&, double, unsigned int) + 3595
20: 20  libmesh_dbg.0.dylib                 0x000000012007c45d libMesh::NonlinearImplicitSystem::solve() + 621
21: 21  libmoose-dbg.0.dylib                0x0000000112a87b0d TimeIntegrator::solve() + 45
22: 22  libmoose-dbg.0.dylib                0x0000000112872c8d NonlinearSystem::solve() + 925
23: 23  libmoose-dbg.0.dylib                0x000000011225c5aa FEProblemBase::solve() + 170
24: 24  libmoose-dbg.0.dylib                0x0000000111e29464 FEProblemSolve::solve() + 36
25: 25  libmoose-dbg.0.dylib                0x0000000111e2ed30 PicardSolve::solveStep(double, double&, double, double&, bool, std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 1808
26: 26  libmoose-dbg.0.dylib                0x0000000111e2d2e0 PicardSolve::solve() + 3360
27: 27  libmoose-dbg.0.dylib                0x00000001126ef234 TimeStepper::step() + 52
28: 28  libmoose-dbg.0.dylib                0x0000000111e34bb3 Transient::takeStep(double) + 275
29: 29  libmoose-dbg.0.dylib                0x0000000111e34304 Transient::execute() + 148
30: 30  libmoose-dbg.0.dylib                0x0000000112d329ff MooseApp::executeExecutioner() + 239
31: 31  libmoose-dbg.0.dylib                0x0000000112d338c1 MooseApp::run() + 337
32: 32  bison-dbg                           0x000000010b26b483 main + 339
33: 33  libdyld.dylib                       0x00007fff5a88e3d5 start + 1
[0] /Users/topher/projects/moose/scripts/../libmesh/installed/include/libmesh/dense_matrix.h, line 898, compiled nodate at notime
We caught a libMesh error
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Matrix is missing diagonal entry 0
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 
[0]PETSC ERROR: ../bison-dbg on a  named pn1840603.lanl.gov by topher Tue Nov  5 09:02:27 2019
[0]PETSC ERROR: Configure options --prefix=/opt/moose/petsc-3.10.5/mpich-3.3_clang-8.0.0-opt --download-hypre=1 --with-ssl=0 --with-debugging=no --with-pic=1 --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 --download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1 --with-cxx-dialect=C++11 -CC=mpicc -CXX=mpicxx -FC=mpif90 -F77=mpif77 -F90=mpif90 -CFLAGS="-fPIC -fopenmp" -CXXFLAGS="-fPIC -fopenmp" -FFLAGS="-fPIC -fopenmp" -FCFLAGS="-fPIC -fopenmp" -F90FLAGS="-fPIC -fopenmp" -F77FLAGS="-fPIC -fopenmp"
[0]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1687 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/mat/impls/aij/seq/aijfact.c
[0]PETSC ERROR: #2 MatILUFactorSymbolic() line 6664 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/mat/interface/matrix.c
[0]PETSC ERROR: #3 PCSetUp_ILU() line 144 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/ksp/pc/impls/factor/ilu/ilu.c
[0]PETSC ERROR: #4 PCSetUp() line 932 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #5 KSPSetUp() line 391 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #6 KSPSolve() line 723 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #7 SNESSolve_NEWTONLS() line 224 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/snes/impls/ls/ls.c
[0]PETSC ERROR: #8 SNESSolve() line 4397 in /private/var/folders/bf/pqv0ttts3wz_bkcc7_ww66jr0000gn/T/moose_package_build_temp/petsc-default-mpich-clang-opt/petsc-3.10.5/src/snes/interface/snes.c
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1
:
system msg for write_line failure : Bad file descriptor

@moosebuild
Contributor

moosebuild commented Nov 5, 2019

Job Documentation on 0f587fc wanted to post the following:

View the site here

This comment will be updated on new commits.

@moosebuild
Contributor

Job Modules debug PETSc alt on 6330a08 : canceled by @permcody

canceling to free up build boxes (looks like this is bad anyway)

@moosebuild
Contributor

Job Modules debug PETSc submodule on 6330a08 : canceled by @permcody

canceling to free up build boxes (looks like this is bad anyway)

@tophmatthews changed the title from "Added tests to provide coupling between on/off displaced mesh" to "WIP: Added tests to provide coupling between on/off displaced mesh" on Nov 5, 2019
@tophmatthews
Contributor Author

This is only to show the error. In other words, I NEED HELP TO GET THIS WORKING!

@tophmatthews
Contributor Author

Here we go again...

Still need help on this!

@tophmatthews
Contributor Author

Hail Mary 🏈 to @lindsayad ... this seems like something you could fix quickly?

@lindsayad changed the title from "WIP: Added tests to provide coupling between on/off displaced mesh" to "Added tests to provide coupling between on/off displaced mesh" on Nov 18, 2019
@lindsayad
Member

lindsayad commented Nov 18, 2019

@tophmatthews I've added a commit here

@tophmatthews
Contributor Author

Closer, @lindsayad! Now I get:

Assertion `phi_size == _local_ke.n()' failed
The size of the phi container does not match the number of local Jacobian columns
at /Users/topher/projects/moose/framework/src/kernels/Kernel.C, line 149
Stack frames: 32
0: 0   libmesh_dbg.0.dylib                 0x0000000112ee218a libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 458
1: 1   libmoose-dbg.0.dylib                0x000000010d2105f0 Kernel::computeOffDiagJacobian(MooseVariableFEBase&) + 1104
2: 2   libmoose-dbg.0.dylib                0x000000010dc879a7 ComputeFullJacobianThread::computeJacobian() + 615
3: 3   libmoose-dbg.0.dylib                0x000000010dc901ab ComputeJacobianThread::onElement(libMesh::Elem const*) + 315
4: 4   libmoose-dbg.0.dylib                0x000000010d5c35cb ThreadedElementLoopBase<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, bool) + 539
5: 5   libmoose-dbg.0.dylib                0x000000010dd292c2 void libMesh::Threads::parallel_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeFullJacobianThread>(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, ComputeFullJacobianThread&) + 130
6: 6   libmoose-dbg.0.dylib                0x000000010dd25efb NonlinearSystemBase::computeJacobianInternal(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 2267
7: 7   libmoose-dbg.0.dylib                0x000000010dd2a6cf NonlinearSystemBase::computeJacobianTags(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 95
8: 8   libmoose-dbg.0.dylib                0x000000010d6ec8fe FEProblemBase::computeJacobianTags(std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 1950
9: 9   libmoose-dbg.0.dylib                0x000000010d6ec0c9 FEProblemBase::computeJacobianInternal(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&, std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 217
10: 10  libmoose-dbg.0.dylib                0x000000010d6ebfe2 FEProblemBase::computeJacobian(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&) + 258
11: 11  libmoose-dbg.0.dylib                0x000000010d6ebdb8 FEProblemBase::computeJacobianSys(libMesh::NonlinearImplicitSystem&, libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&) + 56
12: 12  libmoose-dbg.0.dylib                0x000000010dd083b2 Moose::compute_jacobian(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&, libMesh::NonlinearImplicitSystem&) + 146
13: 13  libmesh_dbg.0.dylib                 0x0000000113a498a3 libmesh_petsc_snes_jacobian + 2323
14: 14  libpetsc.3.10.dylib                 0x00000001174543fa SNESComputeJacobian + 874
15: 15  libpetsc.3.10.dylib                 0x0000000117487b3f SNESSolve_NEWTONLS + 1375
16: 16  libpetsc.3.10.dylib                 0x0000000117459205 SNESSolve + 1717
17: 17  libmesh_dbg.0.dylib                 0x0000000113a481db libMesh::PetscNonlinearSolver<double>::solve(libMesh::SparseMatrix<double>&, libMesh::NumericVector<double>&, libMesh::NumericVector<double>&, double, unsigned int) + 3595
18: 18  libmesh_dbg.0.dylib                 0x0000000113ad245d libMesh::NonlinearImplicitSystem::solve() + 621
19: 19  libmoose-dbg.0.dylib                0x000000010df1f7ad TimeIntegrator::solve() + 45
20: 20  libmoose-dbg.0.dylib                0x000000010dd09a1d NonlinearSystem::solve() + 925
21: 21  libmoose-dbg.0.dylib                0x000000010d6e88ea FEProblemBase::solve() + 170
22: 22  libmoose-dbg.0.dylib                0x000000010d2ae5a4 FEProblemSolve::solve() + 36
23: 23  libmoose-dbg.0.dylib                0x000000010d2b39b0 PicardSolve::solveStep(double, double&, double, double&, bool, std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> > const&) + 1808
24: 24  libmoose-dbg.0.dylib                0x000000010d2b1f60 PicardSolve::solve() + 3360
25: 25  libmoose-dbg.0.dylib                0x000000010db82d94 TimeStepper::step() + 52
26: 26  libmoose-dbg.0.dylib                0x000000010d2b9873 Transient::takeStep(double) + 275
27: 27  libmoose-dbg.0.dylib                0x000000010d2b8fc4 Transient::execute() + 148
28: 28  libmoose-dbg.0.dylib                0x000000010e1cbd0f MooseApp::executeExecutioner() + 239
29: 29  libmoose-dbg.0.dylib                0x000000010e1ccbd1 MooseApp::run() + 337
30: 30  moose_test-dbg                      0x000000010ba8f0df main + 143
31: 31  libdyld.dylib                       0x00007fff7aaf23d5 start + 1

@tophmatthews
Contributor Author

Also some weird ArrayKernel diffs:

kernels/array_kernels.test_standard_save_in: Sideset Distribution Factors:
kernels/array_kernels.test_standard_save_in:   --------- Time step 1, 0.0000000e+00 ~ 0.0000000e+00, rel diff:  0.00000e+00 ---------
kernels/array_kernels.test_standard_save_in: Global variables:
kernels/array_kernels.test_standard_save_in: Nodal variables:
kernels/array_kernels.test_standard_save_in:   --------- Time step 2, 1.0000000e+00 ~ 1.0000000e+00, rel diff:  0.00000e+00 ---------
kernels/array_kernels.test_standard_save_in: Global variables:
kernels/array_kernels.test_standard_save_in: Nodal variables:
kernels/array_kernels.test_standard_save_in:    u_vacuum_diag_save_in_0  rel diff:  4.1666667e-02 ~  0.0000000e+00 = 1.00000e+00 (node 9)
kernels/array_kernels.test_standard_save_in:    u_vacuum_diag_save_in_1  rel diff:  4.1666667e-02 ~  0.0000000e+00 = 1.00000e+00 (node 9)
kernels/array_kernels.test_standard_save_in: 
kernels/array_kernels.test_standard_save_in: exodiff: Files are different
kernels/array_kernels.test_standard_save_in: 

@lindsayad
Member

Try now...

@tophmatthews
Contributor Author

Yes! ... for non-AD... AD still gets:

Assertion `phi_size == ke.n()' failed
The size of the phi container does not match the number of local Jacobian columns
at /Users/topher/projects/moose/framework/src/bcs/ADIntegratedBC.C, line 197

but I imagine that may be a quick fix?

If we don't, we run into this issue: we call copyShapes for the jvar_number, which just ends up copying in the shape functions corresponding to the basis functions for that variable. Those shape functions are calculated in the Assembly class at the beginning of element residual computation (e.g. FEProblemBase::prepare -> Assembly::reinit), entirely divorced from variable degrees of freedom. Hence the phi obtained after copyShapes will always have a nonzero size, even if the variable doesn't live on the undisplaced/displaced mesh.
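
To make the point about shape functions versus degrees of freedom concrete, here is a minimal sketch of one possible guard. It is not the commit that landed in this PR, and `ToyCoupledVar` and `computeOffDiagGuarded` are hypothetical names invented for illustration. The assumption is that the shape-function count comes from the element's basis and is always positive, while the dof count is zero when the coupled variable does not live on the mesh being assembled, so the loop bound and the early exit are taken from the dof count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element view of a coupled variable; names are illustrative only.
struct ToyCoupledVar
{
  std::size_t n_shape_functions; // sized from the element's basis, always > 0
  std::size_t n_dofs;            // 0 when the variable does not live on this mesh
};

// Fills a row-major n_test x n_dofs block and returns the number of entries written.
std::size_t computeOffDiagGuarded(std::vector<double> & ke,
                                  std::size_t n_test,
                                  const ToyCoupledVar & jvar)
{
  // Size the block from the dof count, as the thread describes for Assembly.
  ke.assign(n_test * jvar.n_dofs, 0.0);

  // Guard: no dofs means no columns, so there is nothing to compute,
  // even though n_shape_functions is still nonzero.
  if (jvar.n_dofs == 0)
    return 0;

  // The column loop is bounded by the dof count, not the shape-function count.
  std::size_t written = 0;
  for (std::size_t i = 0; i < n_test; ++i)
    for (std::size_t j = 0; j < jvar.n_dofs; ++j)
    {
      ke[i * jvar.n_dofs + j] += 1.0;
      ++written;
    }
  return written;
}
```

With `ToyCoupledVar{2, 0}` the function returns 0 and leaves the block empty; with `ToyCoupledVar{2, 2}` and two test functions it writes four entries.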
@lindsayad
Member

Looks like we're passing now...

@tophmatthews
Contributor Author

Yes! The genie came through, yet again!! You should get extra points based on how long issues have been open...

@tophmatthews changed the title from "Added tests to provide coupling between on/off displaced mesh" to "Fixed coupling between on/off displaced mesh" on Nov 19, 2019
@tophmatthews
Contributor Author

This is good to go for review, @permcody... FINALLY!

Member

@permcody permcody left a comment

Nice work!

@permcody permcody merged commit 5cb1f32 into idaholab:next Nov 19, 2019
@tophmatthews tophmatthews deleted the disp_9659 branch November 19, 2019 19:14
@tophmatthews
Contributor Author

So, were the opt simulations screwy before this @lindsayad? Did we just get lucky?

Development

Successfully merging this pull request may close these issues.

Assertion `i*j < _val.size()' failed when coupling displaced and undisplaced variables with nonzero BC
4 participants