Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NonMatching::MappingInfo: precompute mapping data for an IteratorRange of cells/faces #14809

Merged
merged 1 commit into from Mar 14, 2023

Conversation

bergbauer
Copy link
Contributor

This PR extends the functionality of NonMatching::MappingInfo to precompute the mapping for a (filtered) IteratorRange of cells (and their faces).

It also adds access to precomputed JxW and normal vectors in case MappingInfo was constructed and reinitialized with appropriate Quadrature and UpdateFlags.

This enables writing FEPointEvaluation routines like this:

  evaluator.reinit(cell_index);

  evaluator.evaluate(solution_values_in, EvaluationFlags::gradients);

  for (unsigned int q = 0; q < evaluator.n_q_points; ++q)
    evaluator.submit_gradient(evaluator.JxW(q) * evaluator.get_gradient(q), q);

  evaluator.integrate(solution_values_out, EvaluationFlags::gradients);

which makes it very similar to FEEvaluation.

@peterrum peterrum self-requested a review February 20, 2023 09:04
Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great, thank you for the effort. My remarks mostly concern documentation and one interface change I am not happy with.

Comment on lines +947 to +960
unit_points =
mapping_info->get_unit_points(current_cell_index, current_face_number);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this allocate memory in the operator= in case the old size of unit_points is already correct? Probably not, but it would be good to check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I check this? We could get rid of unit_points in FEPointEvaluation entirely and always ask MappingInfo (maybe in a follow-up).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could move all code into MappingInfo (also the case where the user never sets up such an object explicitly, e.g. in particle codes that want to evaluate information once), that would be great. Regarding how to check it: I use a compiler/trace tool and monitor the allocations, first using toy codes, to see how the compiler reacts. Here I use this one:

#include <vector>
#include <iostream>

int main()
{
  std::vector<unsigned int> data(5, 3);
  std::vector<unsigned int> second(3, 6);
  for (unsigned int i=0; i<10; ++i)
    data = second;
  std::cout << data.size() << " " << data.capacity() << " " << second.size() << " " << second.capacity() << std::endl;
}

Using valgrind as valgrind ./a.out to check for memory errors, I now see

==269622==   total heap usage: 4 allocs, 4 frees, 73,760 bytes allocated

This is less than 10 repetitions, so it can't allocate memory inside the loop (one might wonder if compilers might optimize it away, as it does not change the result, but at least gcc does not here). The reason for the loop is that I see more than 2 allocs, which I would expect for the two vectors - gcc counts the full program here. The cause is that std::cout << does another one to put the data to the output stream, and the last one is in the global program init outside main (it is done inside call_init.part.0 and part of the program startup, outside the control of my program).
The output is also revealing, which is why I included it:

3 5 3 3

You see from the first two numbers that the vector data kept its original capacity of 5, even though the length was changed to 3 after assignment from second. This is a strong indication that it kept its memory region. You could output the address of &data[0] to ensure it kept the same. You can also look at the source code of stl_vector.h to see this (for me, the file is /usr/include/c++/12/bits/stl_vector.h, but the implementation is in )/usr/include/c++/12/bits/vector.tcc:

  template<typename _Tp, typename _Alloc>
    _GLIBCXX20_CONSTEXPR
    vector<_Tp, _Alloc>&
    vector<_Tp, _Alloc>::
    operator=(const vector<_Tp, _Alloc>& __x)
    {
       ...
          const size_type __xlen = __x.size();
       ...
          else if (size() >= __xlen)
            {
              std::_Destroy(std::copy(__x.begin(), __x.end(), begin()),
                            end(), _M_get_Tp_allocator());
            }
       ...

From this, you see that we only call std::copy (and destroy the elements that now got invalid due to reducing the size), but we did not deallocate/reallocate.

include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
Comment on lines 92 to 93
* Reinitialize the mapping information for the incoming vector of cells and
* corresponding vector of quadratures.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain n_unfilted_cells (also above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explanation for n_unfilted_cells. Do you think an example code would help?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a short code example would be helpful.

include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
Comment on lines 347 to 357
unit_points_index.resize(n_cells + 1);
unsigned int cell_index = 0;
unsigned int n_unit_points = 0;
for (const auto &cell : cell_iterator_range)
{
unit_points_index[cell_index] = n_unit_points;
n_unit_points += unit_points_vector[cell_index].size();

++cell_index;
}
unit_points_index[n_cells] = n_unit_points;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, I would prefer to loop over unit_points_vector instead. Maybe like this, if you would like to keep the range-based for loop:

Suggested change
unit_points_index.resize(n_cells + 1);
unsigned int cell_index = 0;
unsigned int n_unit_points = 0;
for (const auto &cell : cell_iterator_range)
{
unit_points_index[cell_index] = n_unit_points;
n_unit_points += unit_points_vector[cell_index].size();
++cell_index;
}
unit_points_index[n_cells] = n_unit_points;
unit_points_index.reserve(n_cells + 1);
unit_points_index.resize(1);
unit_points_index[0] = 0;
for (const auto &unit_points : unit_points_vector)
unit_points_index.push_back(unit_points_index.back() + unit_points.size());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced it with code similar to your suggestion for the cellwise reinit_* functions. For the faces I could not think of a better way because I need access to cell->face_indices() through the CellIterator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree for the faces.

Comment on lines 427 to 442
template <int dim, int spacedim>
template <typename Iterator>
void
MappingInfo<dim, spacedim>::reinit_surface(
const IteratorRange<Iterator> & cell_iterator_range,
const std::vector<ImmersedSurfaceQuadrature<dim>> &quadrature_vector,
const unsigned int n_unfiltered_cells)
{
do_cell_index_compression =
n_unfiltered_cells != numbers::invalid_unsigned_int;

if (update_flags_mapping & (update_JxW_values | update_normal_vectors))
update_flags_mapping |= update_covariant_transformation;

const unsigned int n_cells =
std::distance(cell_iterator_range.begin(), cell_iterator_range.end());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this and the following two functions are pretty similar, I wonder whether you see a way to implement the algorithm only once and use some unifying calls (e.g. transformation to arguments needed by other function) instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two reinit_faces are pretty similar indeed. I would suggest removing the reinit_faces function that only takes vectors of unit points on faces. I do not use this function anymore since I implemented the function that takes quadratures and it is also not tested by the test in this PR. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing the reinit_faces function that only takes vectors of unit points on faces. I do not use this function anymore since I implemented the function that takes quadratures and it is also not tested by the test in this PR. What do you think?

I agree, that would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}



template <int dim, int spacedim>
const std::vector<Point<dim>> &
MappingInfo<dim, spacedim>::get_unit_points() const
const std::vector<Point<dim>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this in person already, but I am not happy about this code change because we might create (many) copies as a return value. I think it would be better if we could return an ArrayView<Point<dim>> here. It breaks the interface, but so does the change you do, but I think the break is the least intrusive way while still ensuring good performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I see some slight decay in the performance of FEPointEvaluation comparing this branch with master (even after this change). Do you think this is the reason for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly yes. Memory allocations of many small arrays in inner loops are one of worst performance sinks. As said in the other post, we do not necessarily allocate memory in the assignment to a vector, but as far as I understand the code I think it needs to happen here, because we first must create a temporary std::vector object to hold the temporary data. Return-value optimization, which is an often-said argument why pass-by-value should be OK, is likely not possible because we do not simply return a variable stored internally. Thus, I suggest we change this here and see how many places it affects in the tests of deal.II, which might give us an indication of breakage of user codes.

include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
Copy link
Member

@peterrum peterrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. In a follow-up could also try to reuse mapping across cells. Most of the infrastructure for that we should have in place!

@@ -883,7 +953,7 @@ FEPointEvaluation<n_components, dim, spacedim, Number>::evaluate(
numbers::signaling_nan<gradient_type>());
}

if (unit_points.empty())
if (unit_points.size() == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this show it was.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @peterrum - just an interesting remark, compiler people have been working on detecting size() == 0 and turn it into something like empty() for performance reasons. To give insights on why there might be a difference (apart from code reading arguments, which are also there): The latter checks where the addresses of begin() and end() are the same (one comparison, plus possibly loads on begin() and end()), whereas size() first computes temp = end() - begin(), and then we compare temp == 0, so two instructions instead of one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this because ArrayView does not provide an empty() function. If we decide to keep the local vector copy I will revert this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding empty() to ArrayView. We can do that in a follow-up PR!

@@ -57,27 +62,125 @@ namespace NonMatching
MappingInfo(const Mapping<dim> &mapping, const UpdateFlags update_flags);

/**
* Reinitialize the mapping information for the incoming cell and unit
* Compute the mapping information for the incoming cell and unit
* points. (needed to resolve ambiguity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you convert this to a sentence?

*/
enum class State
{
none,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
none,
invalid,

{
Assert(state == State::single_cell,
ExcMessage(
"This mapping info is not reinitialized for a single cell"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"This mapping info is not reinitialized for a single cell"));
"This mapping info is not reinitialized for a single cell!"));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I have added it to every Assert in the two getter functions.

return mapping_data[cell_index_offset[cell_index] + face_number];
}
else
AssertThrow(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can convert most AssertThrow instances to Assert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I change this I have to return something for this invalid case.

Comment on lines +93 to +97
* It is possible to give an IteratorRange<FilteredIterator> to this
* function and together with @p n_unfiltered_cells specified this object
* can compress its storage while giving you access to the underlying data
* with the "unfiltered" index. The default argument for
* @p n_unfiltered_cells disables this built-in compression.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how this should work...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the test to show how to use this feature.

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are on a good path. I have some additional comments. Next to @peterrum's comments, I think we can get close to have the first version in.

/**
* Number of quadrature points of the current cell/face.
*/
unsigned int n_q_points;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this variable const and use const_cast<unsigned int&>(n_q_points) = ... in lines 896 and similar below? I would really like this variable to be seen as const to the outside, in analogy to

const unsigned int n_quadrature_points;
or
const unsigned int n_q_points;

Since all these calls happen inside a non-static member function of the class, I do not think we can trigger undefined behavior when a compiler would completely eliminate the const variable without having a storage location for it (in which case static_cast<unsigned int&> would fail with a compile error).

Comment on lines 751 to 752
unsigned int current_cell_index = numbers::invalid_unsigned_int;
unsigned int current_face_number = numbers::invalid_unsigned_int;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do not assign a value to member variables per the coding conventions. Please do it in the constructor instead. While making the change, please add a one-line comment for doxygen.

@@ -45,6 +46,10 @@ namespace NonMatching
class MappingInfo : public Subscriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember what exactly we discussed here, but I am a bit concerned about having this in the user-visible namespace, as MappingInfo does not tell very much. I used it inside an internal namespace of MatrixFree because I did not want to spell it out as MappingRelatedData or maybe even find a different name, to avoid confusion with

/**
* A class that stores all of the mapping related data used in
* dealii::FEValues, dealii::FEFaceValues, and dealii::FESubfaceValues
* objects. Objects of this kind will be given as <i>output</i> argument
* when dealii::FEValues::reinit() calls Mapping::fill_fe_values() for a
* given cell, face, or subface.
*
* The data herein will then be provided as <i>input</i> argument in the
* following call to FiniteElement::fill_fe_values().
*
* @ingroup feaccess
*/
template <int dim, int spacedim = dim>
class MappingRelatedData

In any case, I see the desire to have the same name as

/**
* The class that stores all geometry-dependent data related with cell
* interiors for use in the matrix-free class.
*
* @ingroup matrixfree
*/
template <int dim, typename Number, typename VectorizedArrayType>
struct MappingInfo
, but in that case I suggest to be clear with the doxygen description of the class, using inspiration from the other two classes (I admit MatrixFreeFunctions::MappingInfo is also not good, but we might extend there as well).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the current naming is not optimal but we should discuss where to put this separately from this PR. This is also connected to the question in which namespace and folder FEPointEvaluation belongs.

/**
* The reference points specified at reinit().
*/
std::vector<Point<dim>> unit_points;
ArrayView<const Point<dim>> unit_points;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this here to ArrayView to avoid the copy of unit points. I can also address this in a follow-up, when we get rid of this class member as proposed in https://github.com/dealii/dealii/pull/14809/files#r1125725149 .

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good now, apart from the topics we need to address in a follow-up patch. Could you please squash your commits?

@kronbichler
Copy link
Member

/rebuild

@bergbauer
Copy link
Contributor Author

Squashed and ready from my side. Any final comments @peterrum ?

@@ -883,7 +953,7 @@ FEPointEvaluation<n_components, dim, spacedim, Number>::evaluate(
numbers::signaling_nan<gradient_type>());
}

if (unit_points.empty())
if (unit_points.size() == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding empty() to ArrayView. We can do that in a follow-up PR!

std::vector<Point<dim>>(unit_points.begin(), unit_points.end());
this->unit_points = unit_points;

const_cast<unsigned int &>(n_q_points) = unit_points.size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you do this. I do the same in FEFaceValues to support mixed meshes. The reason for me to do that there was because n_q_points was already a const and public member variable. I was not happy with the solution there, since it is very error prone on user side if one should decide to store n_q_points in a variable. I would much more prefer if we could either introduce a getter function or, if we just need this to iterate over quadrature points, a function like quadrature_point_indices() in FEEvaluation (https://www.dealii.org/developer/doxygen/deal.II/classFEEvaluationData.html#a0eca800437c07c9f23ec6815e58f1432).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants