Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize hanging-node constraints in MatrixFree #12560

Merged
merged 1 commit into from
Jul 30, 2021

Conversation

peterrum
Copy link
Member

@peterrum peterrum commented Jul 10, 2021

depends on #12577

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked through the whole code and it looks fine. I have some minor comments.

include/deal.II/matrix_free/dof_info.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/dof_info.templates.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
@peterrum
Copy link
Member Author

peterrum commented Jul 17, 2021

@kronbichler I have addressed 3 out 4 of your comments. There are still two topics I need to address:

  1. compute diagonal
  2. storage of plain indices

@peterrum peterrum force-pushed the mf_hn branch 2 times, most recently from 78feffe to 6b4cad4 Compare July 20, 2021 05:34
@peterrum peterrum changed the title [WIP] Optimize hanging-node constraints in MatrixFree Optimize hanging-node constraints in MatrixFree Jul 20, 2021
Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent - I have a few small requests regarding variable names and documentation. Then this should be in a state to give it some try.

include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
template <int fe_degree_, unsigned int side, bool transpose>
static void
interpolate_2D(const unsigned int fe_degree,
bool type,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give this a more descriptive name, e.g. flip_direction? Below, you use not_flipped, but I would also prefer the non-negated statement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed it to is_subface_0.

include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
Comment on lines 442 to 449
Assert(false, ExcNotImplemented());

(void)n_components;
(void)fe_degree;
(void)fe_eval;
(void)transpose;
(void)c_mask;
(void)values;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that face integrals computed on FE_Q with hanging nodes (e.g. for inhomogeneous Neumann conditions or weak Dirichlet conditions) end up here? But we should be able to use the same code here, shouldn't we? While there could be a case where FE_Q goes into the contiguous unknown section, I think we should be able to rule that part out. It seems the matrix_vector_faces_xx tests miss the part of FE_Q combined with hanging nodes. We do not have to do it in this PR, but it should happen soon. I can do that next week.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that face integrals computed on FE_Q with hanging nodes (e.g. for inhomogeneous Neumann conditions or weak Dirichlet conditions) end up here?

At the moment, this function is not called for faces, since

FEEvaluationBase<dim, n_components_, Number, is_face, VectorizedArrayType>::
apply_hanging_node_constraints() const
{
if (is_face || this->dof_info == nullptr ||
this->dof_info->hanging_node_constraint_masks.size() == 0)
return; // nothing to do with faces

It is more a way to please the compiler.

But we should be able to use the same code here, shouldn't we? While there could be a case where FE_Q goes into the contiguous unknown section, I think we should be able to rule that part out. It seems the matrix_vector_faces_xx tests miss the part of FE_Q combined with hanging nodes. We do not have to do it in this PR, but it should happen soon. I can do that next week.

I guess the code might work if the face dofs are embedded into the cell dofs. Of course, the mask would need to be updated.

include/deal.II/matrix_free/matrix_free.templates.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/tools.h Outdated Show resolved Hide resolved
tests/matrix_free/hanging_node_kernels_01.cc Show resolved Hide resolved
tests/matrix_free/hanging_node_kernels_01.cc Outdated Show resolved Hide resolved
@peterrum
Copy link
Member Author

Excellent - I have a few small requests regarding variable names and documentation. Then this should be in a state to give it some try.

@kronbichler I'll address the comments later today. There are two open question worth discussing:

  • how to allocate the memory for the temporal storage for the 1D interpolations
    typename Number::value_type temp[10 /*TODO*/];
  • When to set up the optimized hanging-node data structures? I am not really in favor of introducing a new flag in AdditionalData: as it is now the new code is well tested (since it is chosen always). I guess one could alternatively create internally a AffineConstraint object with make_hanging_node_constraints() and check compatibility with the user provided one. But I am not sure the check is always possible!?

@peterrum
Copy link
Member Author

@kronbichler I have done the requested changes.

@kronbichler
Copy link
Member

kronbichler commented Jul 25, 2021

how to allocate the memory for the temporal storage for the 1D interpolations

Why not choose the length 40 or so, and then assert that we don't exceed it? We won't exceed that length in any realistic scenario, and it is still less than 64 where the compiler might start to touch every page. Since all types are basic types without constructor, we don't pay for the additional elements.

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to merge now. I have some comments on the generated machine code below.

include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
include/deal.II/matrix_free/evaluation_kernels.h Outdated Show resolved Hide resolved
@peterrum
Copy link
Member Author

@kronbichler I have made the changes. Once you give the final go, I will squash the commits!

@peterrum
Copy link
Member Author

/rebuild

@peterrum
Copy link
Member Author

@kronbichler May I ask you to take look at the last commit:

Does this mean that face integrals computed on FE_Q with hanging nodes (e.g. for inhomogeneous Neumann conditions or weak Dirichlet conditions) end up here? But we should be able to use the same code here, shouldn't we? While there could be a case where FE_Q goes into the contiguous unknown section, I think we should be able to rule that part out. It seems the matrix_vector_faces_xx tests miss the part of FE_Q combined with hanging nodes. We do not have to do it in this PR, but it should happen soon. I can do that next week.

I had to implement this in this PR. Even though this feature is not tested here, it is used in the MeltPoolDG project.

I furthermore have run successfully all tests in the MeltPoolDG project, which extensively uses AMR and is a good indication that the new implementation behaves as expected.

FYI @mschreter

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second commit looks exactly as I thought it should, 👍

Comment on lines 116 to 117
default_value = 0,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kronbichler I have introduced here a default value. Else I got some warnings in the tests in structures like this:

(direction == 0) ? dealii::internal::MatrixFreeFunctions::ConstraintTypes::face_y : 0;

Is the name fine or do you prefer another name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this does not necessarily mean that we have no constraints at all (in which case I would have chosen unconstrained), but only that we do not add new content through a binary 'or' operation, right? In that case, I am fine with the name.

Copy link
Member Author

@peterrum peterrum Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I like unconstrained, since this is the way we would use it in most cases (and we could replace all the comparisons with 0). The concrete instance that caused trouble looks like this:

            const bool constrained_face =
              (constraint_mask[v] &
               (((direction == 0) ? dealii::internal::MatrixFreeFunctions::
                                      ConstraintTypes::face_y :
                                    dealii::internal::MatrixFreeFunctions::
                                      ConstraintTypes::default_value) |
                ((direction == 1) ? dealii::internal::MatrixFreeFunctions::
                                      ConstraintTypes::face_x :
                                    dealii::internal::MatrixFreeFunctions::
                                      ConstraintTypes::default_value)));

(i.e., we add new content through a binary 'or' operation)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then we go with unconstrained?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. I have renamed it!

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rebase this (to be sure regarding #11108), then we should merge.

@peterrum
Copy link
Member Author

Can you rebase this

Done!

for (unsigned int j = 0; j < dofs_per_component;
++j)
if (1e-10 <
std::abs(values_dofs[j][0] &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is something wrong with the parentheses here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed the same thing and put up a patch in #12619.

This is very weird. The OSX testers both fail on other branches but they both got green checkmarks here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only realised because I accidentally built the examples and clang raised a warning (-Wabsolute-value).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Matrix-free enhancements
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

None yet

4 participants