New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MappingQEulerian GMG bug #12791
MappingQEulerian GMG bug #12791
Conversation
Here is the call-stack:
|
My interpretation is that we are missing some ghost indices but I have no idea how to determine what I might need. Any ideas? @kronbichler ? |
MGLevelObject<LinearAlgebra::distributed::Vector<LevelNumberType>> | ||
displacement_level(min_level, max_level); | ||
mg_transfer_euler.interpolate_to_mg(dof_handler_euler, | ||
displacement_level, | ||
displacement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "bug" is here, and is fixed by adding this code before the interpolate_to_mg
call:
for (unsigned int level = 0; level <= max_level; ++level)
{
IndexSet relevant_mg_dofs;
DoFTools::extract_locally_relevant_level_dofs(dof_handler_euler,
level,
relevant_mg_dofs);
displacement_level[level].reinit(dof_handler_euler.locally_owned_mg_dofs(
level),
relevant_mg_dofs,
mpi_communicator);
}
This situation reveals an oversight in our multigrid classes, though: We somewhat naively assume that the interpolate_to_mg
would need the same indices as the restrict
and prolongate
functions (implied by copy_to_mg
). However, one might want to have more ghosts especially in the present context of MappingQEulerian
, as we also want to be able to fill the information on all ghost cells. Or, as in the case here, it might even fail on just the locally owned cells when some of the cells get handled by another MPI process in the transfer setting.
In our CFD application code, we use a similar strategy, namely to copy to a vector to more ghosts after the transfer has been execute (in order not to rely on the fact that copy_to_mg
preserves the ghost state on the level vectors), namely: https://github.com/exadg/exadg/blob/2eabd006f482001b22251632b3c97644a9918706/include/exadg/grid/mapping_dof_vector.h#L306-L322
The main question is whether we should always create vectors with all ghosts for the interpolate_to_mg
functions, i.e., here
dealii/include/deal.II/multigrid/mg_transfer_matrix_free.h
Lines 524 to 531 in f37e029
for (unsigned int level = min_level; level <= max_level; ++level) | |
if (dst[level].size() != dof_handler.n_dofs(level) || | |
dst[level].locally_owned_size() != | |
dof_handler.locally_owned_mg_dofs(level).n_elements()) | |
dst[level].reinit(this->vector_partitioners[level]); | |
// copy fine level vector to active cells in MG hierarchy | |
this->copy_to_mg(dof_handler, dst, src, true); |
dealii/include/deal.II/multigrid/mg_transfer_global_coarsening.h
Lines 682 to 684 in f37e029
for (unsigned int level = min_level; level <= max_level; ++level) | |
initialize_dof_vector(level, dst[level]); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This situation reveals an oversight in our multigrid classes, though: We somewhat naively assume that the
interpolate_to_mg
would need the same indices as therestrict
andprolongate
functions
That makes sense and I can confirm that this fixes the issue. I was mislead because I used tests/mappings/mapping_q_eulerian_08|09
as a basis for the code.
In our CFD application code, we use a similar strategy, namely to copy to a vector to more ghosts after the transfer has been execute
Is there an advantage to doing it afterwards? Is this because you use the transfer for something else?
The main question is whether we should always create vectors with all ghosts for the
interpolate_to_mg
functions
Do you think this would incur a significant cost? We probably only need this change when using MappingQEulerian
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why we do it afterwards is that we do not rely on the fact that the copy_to_mg
does not change the ghost layer. I do not think we make a guarantee here (or do we @peterrum, you have been working a lot with MG recently), so that was merely for being on the conservative side. But yes I agree, doing before the operation is in line with the original intent of these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I think what would be consistent to cover the following three cases:
- input MG-vectors have the correct ghost DoFs: nothing to do
- ... have size 0: resize so that it matches with 1)
- ... have different ghost DoFs: create temporal mg vectors, proceed with 1) and finally copy the results to the actual output mg vector
What do you think?
I really don't like if the function change somewhere the ghost state of the vector, which might lead to unexpected behaviors. This reminds me that there is still an open PR I need to finalize (copy_to_mg
- #11871)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this concept of the three cases, our transfer classes should not change the ghost state, and working towards eliminating the need for things like adjust_ghost_range_if_necessary
, by making the vectors we have inside the multigrid hierarchy compatible with the matrix-vector product (no copy) and transfer (try to avoid deep copy on as many fine levels as possible). For the transfer, we need to enable user-provided initialization of vectors (either a partitioner or a lambda, as discussed in the other PR) and in case we find them match (as done in MGTransferMatrixFree
), we use that ghost approach, and if not, we create a copy.
The main point of action would now be to see over our interfaces. The code I linked to above is already problematic because we check
dealii/include/deal.II/multigrid/mg_transfer_matrix_free.h
Lines 524 to 528 in f37e029
for (unsigned int level = min_level; level <= max_level; ++level) | |
if (dst[level].size() != dof_handler.n_dofs(level) || | |
dst[level].locally_owned_size() != | |
dof_handler.locally_owned_mg_dofs(level).n_elements()) | |
dst[level].reinit(this->vector_partitioners[level]); |
I think we should change that to
if (dst[level].size() == 0)
in this code above, and add an assert in the else
case checking AssertDimension(dst[level].locally_owned_size(), dof_handler.dof_handler.locally_owned_mg_dofs(level).n_elements())
. We then later check for the ghost state in dealii/include/deal.II/multigrid/mg_transfer_matrix_free.h
Lines 545 to 554 in f37e029
if (dst[level].get_partitioner().get() == | |
this->vector_partitioners[level].get()) | |
input = &dst[level]; | |
else | |
{ | |
ghosted_fine.reinit(this->vector_partitioners[level]); | |
ghosted_fine.copy_locally_owned_data_from(dst[level]); | |
ghosted_fine.update_ghost_values(); | |
input = &ghosted_fine; | |
} |
MGTransferGlobalCoarsening
and MGTransferMatrixFree
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, our documentation would need to be very clear on the error observed by @tjhei in the original post, that one cannot and should not rely the ghosts to be set adequately, and to ideally present the function with a vector including the relevant ghosts that get filled (either in the same vector, or by a copy inside our functions).
Oh, and thank you. |
I added the code block to _08 as well. Would it make sense to merge this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is very good now and indeed good to have the check.
I thought about this some more @peterrum and @kronbichler . I think Is that an acceptable change? |
This tests changes the mesh of tests/mappings/mapping_q_eulerian_08.cc to trigger the error
when updating a MappingQEulerian on multigrid (with 4 ranks only).
Any help is appreciated.