Avoid a couple FP subtractions.

The existing code performs dim subtractions, each of a term that is the product of two values. We can reorder this so that the products are accumulated first (which is a dot product) and the result is subtracted only once. This should allow for some vectorization. The performance gain is almost certainly negligible, but the change makes the code marginally easier to read. The indices involved allow for this because 'jacobian_pushed_forward_grads[i]' happens to be a Tensor<3,dim> and 'shape_gradients[k][i]' is a Tensor<1,dim>, so their product is equivalent to the summation over 'j' that was written out explicitly before.
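The equivalence described above can be sketched outside of deal.II. The following is only an illustration under stated assumptions: it uses plain std::array stand-ins rather than deal.II's Tensor class, all type and function names (Rank1, Rank2, Rank3, subtract_term_by_term, contract, subtract_contracted, max_abs_difference) are hypothetical, and it assumes that the product of a rank-1 and a rank-3 object contracts the rank-1 index with the first index of the rank-3 object, which is the index the removed loop addressed via 'jacobian_pushed_forward_grads[i][j]'.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-ins for Tensor<1,dim>, Tensor<2,dim> and Tensor<3,dim>.
constexpr std::size_t dim = 3;
using Rank1 = std::array<double, dim>;
using Rank2 = std::array<std::array<double, dim>, dim>;
using Rank3 = std::array<Rank2, dim>;

// Old form: for every j, scale the rank-2 slice t[j] by grad[j] and
// subtract it from the Hessian right away (dim separate subtractions).
Rank2 subtract_term_by_term(Rank2 hessian, const Rank1 &grad, const Rank3 &t)
{
  for (std::size_t j = 0; j < dim; ++j)
    for (std::size_t a = 0; a < dim; ++a)
      for (std::size_t b = 0; b < dim; ++b)
        hessian[a][b] -= t[j][a][b] * grad[j];
  return hessian;
}

// Contraction of a rank-1 object with the first index of a rank-3 object.
Rank2 contract(const Rank1 &grad, const Rank3 &t)
{
  Rank2 result{}; // value-initialized to zero
  for (std::size_t j = 0; j < dim; ++j)
    for (std::size_t a = 0; a < dim; ++a)
      for (std::size_t b = 0; b < dim; ++b)
        result[a][b] += grad[j] * t[j][a][b];
  return result;
}

// New form: accumulate the products first, then subtract the sum once.
Rank2 subtract_contracted(Rank2 hessian, const Rank1 &grad, const Rank3 &t)
{
  const Rank2 sum = contract(grad, t);
  for (std::size_t a = 0; a < dim; ++a)
    for (std::size_t b = 0; b < dim; ++b)
      hessian[a][b] -= sum[a][b];
  return hessian;
}

// Largest entrywise difference between two rank-2 objects.
double max_abs_difference(const Rank2 &x, const Rank2 &y)
{
  double result = 0.0;
  for (std::size_t a = 0; a < dim; ++a)
    for (std::size_t b = 0; b < dim; ++b)
      result = std::fmax(result, std::fabs(x[a][b] - y[a][b]));
  return result;
}
```

Both forms compute the same quantity mathematically. In floating point the results may differ in the last bits because the order of operations changes, so any comparison of the two should use a tolerance rather than exact equality.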
dealii · Aug 19, 2019 · b0edf30
1 parent 28e0ebe
Showing 1 changed file with 3 additions and 4 deletions.
diff --git a/include/deal.II/fe/fe_poly.templates.h b/include/deal.II/fe/fe_poly.templates.h
@@ -268,10 +268,9 @@ FE_Poly<PolynomialType, dim, spacedim>::fill_fe_values(
 
       for (unsigned int k = 0; k < this->dofs_per_cell; ++k)
         for (unsigned int i = 0; i < quadrature.size(); ++i)
-          for (unsigned int j = 0; j < spacedim; ++j)
-            output_data.shape_hessians[k][i] -=
-              mapping_data.jacobian_pushed_forward_grads[i][j] *
-              output_data.shape_gradients[k][i][j];
+          output_data.shape_hessians[k][i] -=
+            output_data.shape_gradients[k][i] *
+            mapping_data.jacobian_pushed_forward_grads[i];
     }
 
   if (flags & update_3rd_derivatives &&