-
Notifications
You must be signed in to change notification settings - Fork 707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize hanging-node constraints in MatrixFree #12560
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through the whole code and it looks fine. I have some minor comments.
|
78feffe
to
6b4cad4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent - I have a few small requests regarding variable names and documentation. Then this should be in a state to give it some try.
template <int fe_degree_, unsigned int side, bool transpose> | ||
static void | ||
interpolate_2D(const unsigned int fe_degree, | ||
bool type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give this a more descriptive name, e.g. flip_direction
? Below, you use not_flipped
, but I would also prefer the non-negated statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed it to is_subface_0
.
Assert(false, ExcNotImplemented()); | ||
|
||
(void)n_components; | ||
(void)fe_degree; | ||
(void)fe_eval; | ||
(void)transpose; | ||
(void)c_mask; | ||
(void)values; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that face integrals computed on FE_Q
with hanging nodes (e.g. for inhomogeneous Neumann conditions or weak Dirichlet conditions) end up here? But we should be able to use the same code here, shouldn't we? While there could be a case where FE_Q
goes into the contiguous unknown section, I think we should be able to rule that part out. It seems the matrix_vector_faces_xx
tests miss the part of FE_Q
combined with hanging nodes. We do not have to do it in this PR, but it should happen soon. I can do that next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that face integrals computed on FE_Q with hanging nodes (e.g. for inhomogeneous Neumann conditions or weak Dirichlet conditions) end up here?
At the moment, this function is not called for faces, since
dealii/include/deal.II/matrix_free/fe_evaluation.h
Lines 5307 to 5312 in 85f67b5
FEEvaluationBase<dim, n_components_, Number, is_face, VectorizedArrayType>:: | |
apply_hanging_node_constraints() const | |
{ | |
if (is_face || this->dof_info == nullptr || | |
this->dof_info->hanging_node_constraint_masks.size() == 0) | |
return; // nothing to do with faces |
It is more a way to please the compiler.
But we should be able to use the same code here, shouldn't we? While there could be a case where FE_Q goes into the contiguous unknown section, I think we should be able to rule that part out. It seems the matrix_vector_faces_xx tests miss the part of FE_Q combined with hanging nodes. We do not have to do it in this PR, but it should happen soon. I can do that next week.
I guess the code might work if the face dofs are embedded into the cell dofs. Of course, the mask would need to be updated.
@kronbichler I'll address the comments later today. There are two open question worth discussing:
|
@kronbichler I have done the requested changes. |
Why not choose the length 40 or so, and then assert that we don't exceed it? We won't exceed that length in any realistic scenario, and it is still less than 64 where the compiler might start to touch every page. Since all types are basic types without constructor, we don't pay for the additional elements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good to merge now. I have some comments on the generated machine code below.
@kronbichler I have made the changes. Once you give the final go, I will squash the commits! |
/rebuild |
@kronbichler May I ask you to take look at the last commit:
I had to implement this in this PR. Even though this feature is not tested here, it is used in the MeltPoolDG project. I furthermore have run successfully all tests in the MeltPoolDG project, which extensively uses AMR and is a good indication that the new implementation behaves as expected. FYI @mschreter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second commit looks exactly as I thought it should, 👍
a0bfb9a
to
62d421b
Compare
default_value = 0, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kronbichler I have introduced here a default value. Else I got some warnings in the tests in structures like this:
(direction == 0) ? dealii::internal::MatrixFreeFunctions::ConstraintTypes::face_y : 0;
Is the name fine or do you prefer another name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this does not necessarily mean that we have no constraints at all (in which case I would have chosen unconstrained
), but only that we do not add new content through a binary 'or' operation, right? In that case, I am fine with the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit I like unconstrained
, since this is the way we would use it in most cases (and we could replace all the comparisons with 0). The concrete instance that caused trouble looks like this:
const bool constrained_face =
(constraint_mask[v] &
(((direction == 0) ? dealii::internal::MatrixFreeFunctions::
ConstraintTypes::face_y :
dealii::internal::MatrixFreeFunctions::
ConstraintTypes::default_value) |
((direction == 1) ? dealii::internal::MatrixFreeFunctions::
ConstraintTypes::face_x :
dealii::internal::MatrixFreeFunctions::
ConstraintTypes::default_value)));
(i.e., we add new content through a binary 'or' operation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then we go with unconstrained
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I have renamed it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rebase this (to be sure regarding #11108), then we should merge.
Done! |
for (unsigned int j = 0; j < dofs_per_component; | ||
++j) | ||
if (1e-10 < | ||
std::abs(values_dofs[j][0] && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is something wrong with the parentheses here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed the same thing and put up a patch in #12619.
This is very weird. The OSX testers both fail on other branches but they both got green checkmarks here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only realised because I accidentally built the examples and clang raised a warning (-Wabsolute-value).
depends on #12577