step-32 triggers an assertion. #16332

Closed
bangerth opened this issue Dec 8, 2023 · 3 comments · Fixed by #16335
bangerth commented Dec 8, 2023

As reported by Hussein Shaito on the mailing list, mpirun -np 2 step-32.debug triggers this exception:

Number of active cells: 12,288 (on 6 levels)
Number of degrees of freedom: 186,624 (99,840+36,864+49,920)

Timestep 0:  t=0 years

   Rebuilding Stokes preconditioner...
   Solving Stokes system... 41 iterations.
   Maximal velocity: 60.4963 cm/year
   Time step: 18166.5 years
   17 CG iterations for temperature
   Temperature range: 973 4273.16

Number of active cells: 17,721 (on 7 levels)
Number of degrees of freedom: 277,935 (149,848+53,163+74,924)

Timestep 0:  t=0 years

--------------------------------------------------------
An error occurred in line <682> of file </home/bangerth/p/deal.II/1/dealii/source/lac/trilinos_vector.cc> in function
    dealii::TrilinosScalar dealii::TrilinosWrappers::MPI::Vector::operator()(dealii::TrilinosWrappers::MPI::Vector::size_type) const
The violated condition was: 
    false
Additional information: 
    You are trying to access element 11546 of a distributed vector, but
    this element is not stored on the current processor. Note: There are
    37375 elements stored on the current processor from within the range
    [37549,74923] but Trilinos vectors need not store contiguous ranges on
    each processor, and not every element in this range may in fact be
    stored locally.
    
    A common source for this kind of problem is that you are passing a
    'fully distributed' vector into a function that needs read access to
    vector elements that correspond to degrees of freedom on ghost cells
    (or at least to 'locally active' degrees of freedom that are not also
    'locally owned'). You need to pass a vector that has these elements as
    ghost entries.

Stacktrace:
-----------
#0  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: dealii::TrilinosWrappers::MPI::Vector::operator()(unsigned int) const
#1  ./step-32.debug: dealii::TrilinosWrappers::MPI::Vector::operator[](unsigned int) const
#2  ./step-32.debug: dealii::TrilinosWrappers::MPI::Vector::extract_subvector_to(dealii::ArrayView<unsigned int const, dealii::MemorySpace::Host> const&, dealii::ArrayView<double, dealii::MemorySpace::Host>&) const
#3  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: void dealii::DoFCellAccessor<2, 2, false>::get_dof_values<double, double*>(dealii::ReadVector<double> const&, double*, double*) const
#4  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: void dealii::DoFCellAccessor<2, 2, false>::get_dof_values<dealii::ReadVector<double>, double>(dealii::ReadVector<double> const&, dealii::Vector<double>&) const
#5  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: void dealii::DoFCellAccessor<2, 2, false>::get_interpolated_dof_values<double>(dealii::ReadVector<double> const&, dealii::Vector<double>&, unsigned short) const
#6  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: void dealii::FEValuesBase<2, 2>::CellIteratorContainer::get_interpolated_dof_values<double>(dealii::ReadVector<double> const&, dealii::Vector<double>&) const
#7  /home/bangerth/p/deal.II/1/install/lib/libdeal_II.g.so.9.6.0-pre: void dealii::FEValuesBase<2, 2>::get_function_values<double>(dealii::ReadVector<double> const&, std::vector<double, std::allocator<double> >&) const
#8  ./step-32.debug: Step32::BoussinesqFlowProblem<2>::local_assemble_stokes_system(dealii::TriaActiveIterator<dealii::DoFCellAccessor<2, 2, false>

This is awkward. We need to fix this before the next release.

@bangerth bangerth added this to the Release 9.6 milestone Dec 8, 2023
@bangerth bangerth added the Bug label Dec 8, 2023
@bangerth bangerth self-assigned this Dec 8, 2023
bangerth commented Dec 8, 2023

The place where this happens is

  template <int dim>
  void BoussinesqFlowProblem<dim>::local_assemble_stokes_system(
    const typename DoFHandler<dim>::active_cell_iterator &cell,
    Assembly::Scratch::StokesSystem<dim>                 &scratch,
    Assembly::CopyData::StokesSystem<dim>                &data)
  {
    const unsigned int dofs_per_cell =
      scratch.stokes_fe_values.get_fe().n_dofs_per_cell();
    const unsigned int n_q_points =
      scratch.stokes_fe_values.n_quadrature_points;

    const FEValuesExtractors::Vector velocities(0);
    const FEValuesExtractors::Scalar pressure(dim);

    scratch.stokes_fe_values.reinit(cell);

    const typename DoFHandler<dim>::active_cell_iterator temperature_cell =
      cell->as_dof_handler_iterator(temperature_dof_handler);
    scratch.temperature_fe_values.reinit(temperature_cell);

    if (rebuild_stokes_matrix)
      data.local_matrix = 0;
    data.local_rhs = 0;

    scratch.temperature_fe_values.get_function_values(
      old_temperature_solution, scratch.old_temperature_values);

The issue is that old_temperature_solution does not have ghost entries (as already suggested by the error message).

bangerth commented Dec 8, 2023

Neither does temperature_solution, for that matter, even though a ghosted vector is clearly the intent: we initialize the two via

    temperature_solution.reinit(temperature_relevant_partitioning,
                                MPI_COMM_WORLD);
    old_temperature_solution.reinit(temperature_solution);

I've got to leave it for another time to figure out what is happening.

bangerth commented Dec 8, 2023

Upon digging slightly deeper:

      {
        Assert (old_temperature_solution.has_ghost_elements(),    // ************* ok here
                ExcInternalError());

        TrilinosWrappers::MPI::Vector distributed_temp1(temperature_rhs);
        TrilinosWrappers::MPI::Vector distributed_temp2(temperature_rhs);

        std::vector<TrilinosWrappers::MPI::Vector *> tmp = {&distributed_temp1,
                                                            &distributed_temp2};
        temperature_trans.interpolate(tmp);

        // enforce constraints to make the interpolated solution conforming on
        // the new mesh:
        temperature_constraints.distribute(distributed_temp1);
        temperature_constraints.distribute(distributed_temp2);

        temperature_solution     = std::move(distributed_temp1);
        old_temperature_solution = std::move(distributed_temp2);

        Assert (old_temperature_solution.has_ghost_elements(),   // ************** fails here
                ExcInternalError());
      }

These lines were last touched by #15438.
