Potential issue with PETSc 3.9 debug builds #1889

Closed
jwpeterson opened this issue Oct 10, 2018 · 10 comments

@jwpeterson
Member

I have a preliminary report from @andfranklin (no stack trace yet, unfortunately) that we are once again incorrectly accessing a locked PETSc vector in PETSc 3.9 from one of the libmesh examples, basically the same thing that was reported in #1294. The problem does not seem to be present in PETSc 3.8. I'll try to add more information to this ticket as it becomes available.

cc: @fdkong

@jwpeterson
Member Author

To reproduce the problem, you must configure PETSc with --with-debugging=1.

@fdkong
Contributor

fdkong commented Oct 10, 2018

I will try to get MOOSE and libMesh working smoothly with PETSc 3.9.4.

@fdkong
Contributor

fdkong commented Oct 10, 2018

I have seen similar issues in MOOSE before, and will try again.

@pbauman
Member

pbauman commented Oct 10, 2018

Dammit. All my CIVET tests are using PETSc opt. It's a simple change to the module statement on my CIVET server to use PETSc dbg. I can turn that on when folks are ready.

@jwpeterson
Member Author

@pbauman is it possible for you to make an optional/attachable PETSc-debug recipe? That way we could attach it to a "test" PR and at least get some feedback without breaking all GRINS builds. I'm not sure exactly what your recipes look like, but you should be able to make an optional recipe by adding allow_on_pr = True to the [Main] section of your .cfg file...

@jwpeterson
Member Author

We have an optional PETSc-debug recipe like this on CIVET, but not for PETSc 3.9 yet (we currently do MOOSE testing with PETSc 3.8.x).

@andfranklin

I was able to reproduce the failure using PETSc 3.9.4 configured with:

./configure CC="$CC" CXX="$CXX" --download-mpich --download-fftw --download-hdf5 --download-fblaslapack --download-chaco --download-exodusii --download-hypre --download-metis --download-ml --download-mumps --download-netcdf --download-parmetis --download-scalapack --download-suitesparse --download-superlu --download-superlu_dist --download-zlib --download-pnetcdf --with-c2html=0

and SLEPc 3.9.2. The following was used to configure libmesh:

../configure --prefix=$PWD/install --enable-cxx11-required --enable-nodeconstraint --enable-petsc-required --with-cxx="$CXX" --with-cc="$CC" --with-metis=PETSc

As far as I could tell nothing surprising occurred when running make. The following error was thrown when running make check:

Making check in optimization/optimization_ex1
/Applications/Xcode.app/Contents/Developer/usr/bin/make  check-am
/Applications/Xcode.app/Contents/Developer/usr/bin/make  example-dbg example-devel example-opt   run.sh
make[4]: Nothing to be done for `../../../../examples/optimization/optimization_ex1/run.sh'.
  CXX      example_dbg-optimization_ex1.o
  CXX      example_devel-optimization_ex1.o
  CXX      example_opt-optimization_ex1.o
  CXXLD    example-dbg
  CXXLD    example-opt
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src/.libs'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/libsupc++/.libs'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src/.libs'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/libsupc++/.libs'
  CXXLD    example-devel
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/src/.libs'
ld: warning: directory not found for option '-L/Users/fx/devel/gcc/build_package/ibin/x86_64-apple-darwin16/libstdc++-v3/libsupc++/.libs'
/Applications/Xcode.app/Contents/Developer/usr/bin/make  check-TESTS
***************************************************************
* Running Example optimization_ex1:
*   ./example-dbg -tao_monitor -tao_view -tao_type nls 
***************************************************************
 
 Mesh Information:
  elem_dimensions()={2}
  spatial_dimension()=2
  n_nodes()=441
    n_local_nodes()=441
  n_elem()=100
    n_local_elem()=100
    n_active_elem()=100
  n_subdomains()=1
  n_partitions()=1
  n_processors()=1
  n_threads()=1
  processor_id()=0

 EquationSystems
  n_systems()=1
   System #0, "Optimization"
    Type "Optimization"
    Variables="u" 
    Finite Element Types="LAGRANGE" 
    Approximation Orders="SECOND" 
    n_dofs()=441
    n_local_dofs()=441
    n_constrained_dofs()=80
    n_local_constrained_dofs()=80
    n_vectors()=4
    n_matrices()=2
    DofMap Sparsity
      Average  On-Processor Bandwidth <= 14.8776
      Average Off-Processor Bandwidth <= 0
      Maximum  On-Processor Bandwidth <= 25
      Maximum Off-Processor Bandwidth <= 0
    DofMap Constraints
      Number of DoF Constraints = 80
      Average DoF Constraint Length= 0
      Number of Node Constraints = 0

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR:  Vec is locked read only, argument # 1
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.9.4, unknown 
[0]PETSC ERROR: /Users/andfranklin/Projects/libmesh/build/examples/optimization/optimization_ex1/.libs/example-dbg on a arch-darwin-c-debug named franklin-laptop.local by andfranklin Wed Oct 10 20:44:18 2018
[0]PETSC ERROR: Configure options CC=clang CXX=clang++ --download-mpich --download-fftw --download-hdf5 --download-fblaslapack --download-chaco --download-exodusii --download-hypre --download-metis --download-ml --download-mumps --download-netcdf --download-parmetis --download-scalapack --download-suitesparse --download-superlu --download-superlu_dist --download-zlib --download-pnetcdf --with-c2html=0
[0]PETSC ERROR: #1 VecGetArray() line 1575 in /Users/andfranklin/Projects/petsc/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: #2 VecSetValues_MPI() line 948 in /Users/andfranklin/Projects/petsc/src/vec/vec/impls/mpi/pdvec.c
[0]PETSC ERROR: #3 VecSetValues() line 850 in /Users/andfranklin/Projects/petsc/src/vec/vec/interface/rvector.c
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1
:
system msg for write_line failure : Bad file descriptor
FAIL: run.sh
====================================================
1 of 1 test failed
Please report to libmesh-users@lists.sourceforge.net
====================================================
make[4]: *** [check-TESTS] Error 1
make[3]: *** [check-am] Error 2
make[2]: *** [check] Error 2
make[1]: *** [check-recursive] Error 1
make: *** [check-recursive] Error 1

Hopefully this is helpful.

@jwpeterson
Member Author

Thanks for this info, @andfranklin. It looks like the error is in the optimization solver (TAO) code this time around, which makes sense: we have not gone in and fixed that code the way we have for the NonlinearSolver and DiffSolver classes.

@jwpeterson
Member Author

For reference, see also the related fixes in PetscDiffSolver (25154f3) and PetscNonlinearSolver (45241af). In this case the solution is probably to call

sys.get_dof_map().enforce_constraints_exactly(sys, sys.current_local_solution.get());

in the __libmesh_tao_objective, __libmesh_tao_gradient, and __libmesh_tao_hessian callback functions, and then enforce the constraints on system.solution around line 585 of tao_optimization_solver.C.
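To illustrate why enforcing constraints on a local copy avoids the "Vec is locked read only" error, here is a minimal standalone sketch. It does not use PETSc or libMesh: `ToyVec`, `enforce_toy_constraint`, and `call_with_locked_iterate` are hypothetical names invented for this example, and the assumption that the failing write targets the vector the solver hands to the callback is inferred from the stack trace above, not confirmed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-in for a vector with a read-only lock counter.
// Hypothetical type: this models the behaviour only, it is not the PETSc API.
class ToyVec {
public:
  explicit ToyVec(std::vector<double> values) : values_(std::move(values)) {}

  void lock_push() { ++locks_; }
  void lock_pop() { assert(locks_ > 0); --locks_; }
  bool locked() const { return locks_ > 0; }

  const std::vector<double> & values() const { return values_; }

  // Writing to a locked vector is an error, mirroring the message in the log.
  void set(std::size_t i, double v) {
    if (locked())
      throw std::logic_error("Vec is locked read only");
    values_.at(i) = v;
  }

private:
  std::vector<double> values_;
  int locks_ = 0;
};

// Hypothetical constraint (pin entry 0 to zero), standing in for
// enforce_constraints_exactly(). It has to write to the vector.
void enforce_toy_constraint(ToyVec & v) { v.set(0, 0.0); }

double sum_of_squares(const ToyVec & v) {
  double sum = 0.0;
  for (double x : v.values())
    sum += x * x;
  return sum;
}

// Problematic pattern: constrain the solver-owned iterate in place.
double objective_in_place(ToyVec & x) {
  enforce_toy_constraint(x);  // throws while x is locked
  return sum_of_squares(x);
}

// Suggested pattern: copy the iterate, then constrain and evaluate the copy.
double objective_on_copy(const ToyVec & x) {
  ToyVec local(x.values());   // fresh vector with no locks held
  enforce_toy_constraint(local);
  return sum_of_squares(local);
}

// Models a solver that locks its iterate for the duration of a callback.
template <typename Callback>
double call_with_locked_iterate(ToyVec & x, Callback callback) {
  x.lock_push();
  try {
    const double value = callback(x);
    x.lock_pop();
    return value;
  } catch (...) {
    x.lock_pop();
    throw;
  }
}
```

Passing `objective_in_place` to `call_with_locked_iterate` throws, while `objective_on_copy` succeeds and leaves the solver's iterate unchanged. In this analogy the copy plays roughly the role of `sys.current_local_solution`.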

jwpeterson added a commit to jwpeterson/libmesh that referenced this issue Oct 15, 2018
This should fix the error:

Object is in wrong state
Vec is locked read only, argument # 1

we currently get when running the libmesh examples using a PETSc built
with --with-debugging=1.

Refs libMesh#1889.
See also: libMesh#1294, libMesh#630.
@jwpeterson
Member Author

Fixed by #1898.
