Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorize NonMatching::MappingInfo #15137

Merged

Conversation

bergbauer
Copy link
Contributor

This PR proposes to vectorize NonMatching::MappingInfo depending on the newly introduced template parameter Number. As this class interacts with FEPointEvaluation, which is vectorized internally also for scalar Number template arguments, the unit points are always stored vectorized to make them easily accessible.

To enable vectorized storage, separate AlignedVector objects are created for Jacobians, ... and a temporary internal::FEValuesImplementation::MappingRelatedData local object is used to use the interfaces which Mapping provides to compute mapping data.

The result is a 100% - 10% speedup in throughput of FEPointEvaluation depending on the polynomial degree and the dimension of the problem in my Laplace type benchmark.

@peterrum peterrum self-requested a review April 25, 2023 14:57
@peterrum
Copy link
Member

The result is a 100% - 10% speedup in throughput of FEPointEvaluation depending on the polynomial degree and the dimension of the problem in my Laplace type benchmark.

Very nice!

include/deal.II/matrix_free/fe_point_evaluation.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
*/
std::vector<unsigned int> unit_points_index;
AlignedVector<unsigned int> unit_points_index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be AlignedVector? In general, we've been trying to use AlignedVector only where necessary, i.e., large data structures (where we might profit from first-touch and similar) and for VectorizedArray fields. (The latter will be resolved once we require C++17, because then std::vector can respect alignment of types > 16 bytes, i.e., VectorizedArray.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inspired by MappingInfoStorage, see

AlignedVector<unsigned int> quadrature_point_offsets;

I can use std::vector if it aligns the memory properly for unsigned int.

include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Show resolved Hide resolved
@kronbichler
Copy link
Member

/rebuild

Comment on lines +189 to +192
aux.resize(dim - 1);
aux[0].resize(n_original_q_points);
if (dim > 2)
aux[1].resize(n_original_q_points);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jh66637 this?

Copy link
Contributor

@jh66637 jh66637 Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this was the initial intention for this change, but with these changes, empty quadratures can be handled. I think in case of an empty quadrature rule a shortcut could be made, i.e., if (quadrature.size() == 0) return; in the first line of the function (or sth. similar).

@bergbauer @peterrum What do you think about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion the shortcut should not be in this function but in the reinit* functions of MappingInfo. Above change is necessary to resize the inner AlignedVectors correctly which is necessary if the InternalData object is reused.

include/deal.II/matrix_free/fe_point_evaluation.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/fe_point_evaluation.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Show resolved Hide resolved
Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes to the right direction. I have some small comments left.

include/deal.II/base/vectorization.h Show resolved Hide resolved
internal::FEPointEvaluation::
EvaluatorTypeTraits<dim, n_components, Number>::set_value(
val_and_grad.first, v, values[qb * stride + v]);
for (unsigned int v = 0; v < stride && q + v < n_q_points_scalar; ++v)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the compiler able to see that the second condition is a subset of the outer loop in line 1574 and hence removes the if statement altogether in vectorized mode with stride = 1?

include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
include/deal.II/non_matching/mapping_info.h Outdated Show resolved Hide resolved
Copy link
Member

@peterrum peterrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

mapping_data.jacobians[point_index * n_lanes + v][d][s];
return vectorized_jacobian;
AssertIndexRange(point_index, n_q_points);
Assert(jacobian_ptr != nullptr, ExcMessage("jacobian_ptr is not set!"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message does not really help. I think in the case of FEEval the message tells you what to do:

Assert(this->jacobian != nullptr,
internal::ExcMatrixFreeAccessToUninitializedMappingField(
"update_gradients"));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting declaring an own exception with a verbose message what to do or to improve the current messages in place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a verbose exception now which tells the user what to do!

Comment on lines +68 to +81
read_value(const ScalarNumber vector_entry,
const unsigned int component,
value_type & result)
scalar_value_type &result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd. Different styles ScalarNumber vs. scalar_value_type. Could we write ScalarValueType. This is not really something what the user should use!?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This style dates back to the first implementation of this class and is therefore not the scope of this PR. I agree that it should be ScalarValueType. This should be done in a follow-up. The user should not use this but we expose it publicly with the using statements at the top of FEPointEvaluation.

mapping_data.initialize(quadrature.size(), update_flags_mapping);

// reuse internal_mapping_data for MappingQ to avoid memory allocations
if (const MappingQ<dim, spacedim> *mapping_q =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (const MappingQ<dim, spacedim> *mapping_q =
if (const MappingQ<dim, spacedim> * =

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@bergbauer bergbauer force-pushed the vectorized_fe_point_and_mapping_info branch from 9e42a0d to 0f1cad6 Compare May 2, 2023 16:07
@peterrum
Copy link
Member

peterrum commented May 2, 2023

Undefined symbols for architecture x86_64:
  "dealii::internal::VectorizedArrayTrait<dealii::VectorizedArray<double, 2ul> >::width", referenced from:
      dealii::internal::VectorizedArrayTrait<dealii::VectorizedArray<double, 2ul> >::get(dealii::VectorizedArray<double, 2ul>&, unsigned int) in data_out_resample.cc.o
  "dealii::internal::VectorizedArrayTrait<double>::width", referenced from:
      dealii::internal::VectorizedArrayTrait<double>::get(double&, unsigned int) in data_out_resample.cc.o

@bergbauer bergbauer force-pushed the vectorized_fe_point_and_mapping_info branch from 0f1cad6 to 17a4b28 Compare May 3, 2023 07:56
@kronbichler kronbichler merged commit e8d7c68 into dealii:master May 3, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants