Use tensor product evaluation for MappingQGeneric::fill_fe_face_values #9867

Open

kronbichler opened this issue Apr 10, 2020 · 2 comments
kronbichler commented Apr 10, 2020

MappingQGeneric already supports the fast tensorized evaluation of quantities on cells via the following code:

// do the actual tensorized evaluation
SelectEvaluator<dim, -1, 0, n_comp, VectorizedArray<double>>::
  evaluate(data.shape_info,
           data.values_dofs.begin(),
           data.values_quad.begin(),
           data.gradients_quad.begin(),
           data.hessians_quad.begin(),
           data.scratch.begin(),
           evaluate_values,
           evaluate_gradients,
           evaluate_hessians);

We should introduce the same for FEFaceValues (the geometry part). The reason for this is that the constructor of FEFaceValues is very expensive for higher mapping degrees and many quadrature points. More precisely, if we have a MappingQGeneric description of degree p and r quadrature points per direction, the initialization in 3D evaluates (p+1)^3 polynomials at r^2 points, storing a table of (p+1)^3 r^2 values. What makes things worse is that we store this information for all 6 faces and for all 8 possible face orientations, so a total of 48 (p+1)^3 r^2 data items need to be evaluated by TensorProductPolynomials and stored. To give a number, for p=15 and r=20 this equates to 48 * 16^3 * 20^2 * 8 bytes, i.e., around 630 MB - just for that part of FEFaceValues! Parallelization won't help because we construct a separate FEFaceValues object on every thread and every MPI rank. This level of memory consumption makes the current implementation a blocker. The PR #9656 circumvented this problem for the setup of MatrixFree, which is one of the primary users of high mapping degrees and quadrature formulas with many points, but it would of course be better if we also provided this in other contexts, rather than saying we cannot use deal.II for those degrees and quadrature formulas.
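To double-check the arithmetic, here is a minimal back-of-the-envelope sketch (assuming, as in the estimate above, one double per tabulated value):

    #include <cstddef>
    #include <iostream>

    int main()
    {
      const std::size_t p = 15; // mapping degree
      const std::size_t r = 20; // quadrature points per direction
      // 6 faces x 8 orientations = 48 tables of (p+1)^3 x r^2 values
      const std::size_t n_values = 48 * (p + 1) * (p + 1) * (p + 1) * r * r;
      std::cout << n_values * sizeof(double) / 1.0e6 << " MB\n"; // ~629 MB
    }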

This needs the following ingredients:

  • We need to extract the reordering tables that adjust for the face orientation from the matrix-free context into a static function (a sketch of how a row of the resulting table would be applied follows after this list):
        const unsigned int n = quadrature_1d.size();
        face_orientations.reinit(8, n * n);
        for (unsigned int j = 0, i = 0; j < n; ++j)
          for (unsigned int k = 0; k < n; ++k, ++i)
            {
              // face_orientation=true, face_flip=false, face_rotation=false
              face_orientations[0][i] = i;
              // face_orientation=false, face_flip=false, face_rotation=false
              face_orientations[1][i] = j + k * n;
              // face_orientation=true, face_flip=true, face_rotation=false
              face_orientations[2][i] = (n - 1 - k) + (n - 1 - j) * n;
              // face_orientation=false, face_flip=true, face_rotation=false
              face_orientations[3][i] = (n - 1 - j) + (n - 1 - k) * n;
              // face_orientation=true, face_flip=false, face_rotation=true
              face_orientations[4][i] = j + (n - 1 - k) * n;
              // face_orientation=false, face_flip=false, face_rotation=true
              face_orientations[5][i] = k + (n - 1 - j) * n;
              // face_orientation=true, face_flip=true, face_rotation=true
              face_orientations[6][i] = (n - 1 - j) + k * n;
              // face_orientation=false, face_flip=true, face_rotation=true
              face_orientations[7][i] = (n - 1 - k) + j * n;
            }
  • We need to check whether the quadrature formula is a tensor product at the right spot (see the detection sketch after this list). We currently make the check here:
        tensor_product_quadrature = q.is_tensor_product();

    but at that point we have already expanded into all quadrature variants, losing the tensor product structure, due to:
        data.initialize_face(this->requires_update_flags(update_flags),
                             QProjector<dim>::project_to_all_faces(quadrature),
                             quadrature.size());

    For background information, the expansion into the 48 cases happens here (it is even worse for the subface values):
        for (const auto &mutation : q)
          {
            // project to each face and append
            // results
            for (unsigned int face = 0; face < n_faces; ++face)
              {
                project_to_face(mutation, face, help);
                std::copy(help.begin(), help.end(), std::back_inserter(q_points));
              }

            // next copy over weights
            for (unsigned int face = 0; face < n_faces; ++face)
              std::copy(mutation.get_weights().begin(),
                        mutation.get_weights().end(),
                        std::back_inserter(weights));
          }
  • Once we have convinced ourselves that we have a tensor product quadrature formula, which needs some restructuring in the internals of MappingQGeneric, we can set up a ShapeInfo similarly to the cell case (see the sketch after this list).
  • For the evaluation part, we would make a call in analogy to this one:
        // now let the matrix-free evaluators provide us with the
        // data on faces
        FEFaceEvaluationSelector<dim, -1, 0, dim, double, VectorizedDouble>::
          evaluate(shape_info,
                   cell_points.data(),
                   face_quads.data(),
                   face_grads.data(),
                   scratch_data.data(),
                   true,
                   true,
                   face_no,
                   GeometryInfo<dim>::max_children_per_cell,
                   faces[face].face_orientation > 8 ?
                     faces[face].face_orientation - 8 :
                     0,
                   my_data.descriptor[0].face_orientations);

    Note that here we feed in the orientation table and the correct orientation index, and we can also provide an index for the subface values.
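As a sketch for the first ingredient above: once the orientation table is available through a static function, applying one of its rows is a plain permutation of the data tabulated at the n * n face quadrature points. The helper name and the direction of the permutation are assumptions for illustration:

    #include <vector>

    // Hypothetical helper: reorder data tabulated at the n * n face
    // quadrature points according to one row of the orientation table.
    std::vector<double>
    apply_face_orientation(const std::vector<double>       &values,
                           const std::vector<unsigned int> &table_row)
    {
      std::vector<double> reordered(values.size());
      for (unsigned int i = 0; i < values.size(); ++i)
        reordered[i] = values[table_row[i]];
      return reordered;
    }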
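For the second ingredient, here is a minimal detection sketch based on the existing Quadrature API (is_tensor_product() and get_tensor_basis()); the free function and its placement are assumptions:

    #include <deal.II/base/quadrature.h>

    using namespace dealii;

    // Check that the face quadrature is a tensor product with the same 1d
    // rule in every direction (which the orientation tables rely on) and,
    // if so, extract that 1d rule.
    template <int dim>
    bool
    try_get_quadrature_1d(const Quadrature<dim - 1> &face_quadrature,
                          Quadrature<1>             &quadrature_1d)
    {
      if (face_quadrature.is_tensor_product() == false)
        return false;

      const auto basis = face_quadrature.get_tensor_basis();
      for (unsigned int d = 1; d < dim - 1; ++d)
        if (!(basis[d] == basis[0]))
          return false;

      quadrature_1d = basis[0];
      return true;
    }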
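For the third ingredient, the ShapeInfo setup could mirror the cell case along these lines; representing the mapping polynomials by an FE_Q of the same degree and the exact reinit() arguments are assumptions based on the matrix-free internals:

    #include <deal.II/base/quadrature.h>
    #include <deal.II/base/vectorization.h>
    #include <deal.II/fe/fe_q.h>
    #include <deal.II/matrix_free/shape_info.h>

    using namespace dealii;

    template <int dim>
    internal::MatrixFreeFunctions::ShapeInfo<VectorizedArray<double>>
    make_mapping_shape_info(const unsigned int   mapping_degree,
                            const Quadrature<1> &quadrature_1d)
    {
      // the support points of MappingQGeneric span the same polynomial
      // space as an FE_Q of equal degree
      const FE_Q<dim> fe(mapping_degree);

      internal::MatrixFreeFunctions::ShapeInfo<VectorizedArray<double>>
        shape_info;
      shape_info.reinit(quadrature_1d, fe);
      return shape_info;
    }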
kronbichler commented:
FYI @nfehn

kronbichler commented Apr 7, 2023

Note that #15035 can address this from another angle: while slightly less efficient than the full tensor product path, that PR addresses the fundamental problem of that function at high orders. With some optimizations for the faces that @bergbauer is already planning, we might end up with an algorithm of complexity mapping_degree^2 x n_q_points, which is not much worse than the best case mapping_degree x n_q_points, but without having to check the face quadrature formula at all.

Once #15035 is addressed, we should revisit this issue and evaluate the performance and memory consumption; it looks like we can avoid the high memory consumption mentioned in the original post.
