BarycentricPolynomial::value() optimizations - get rid of `pow` #14117

kronbichler · 2022-07-08T09:56:42Z

We currently do this:

dealii/include/deal.II/base/polynomials_barycentric.h

Lines 682 to 690 in 5559d13

    
    const auto indices = index_to_indices(i, coefficients.size());
    const auto coef    = coefficients(indices);
    if (coef == Number())
      continue;
    auto temp = Number(1);
    for (unsigned int d = 0; d < dim + 1; ++d)
      temp *= std::pow(b_point[d], indices[d]);
    result += coef * temp;

Notice std::pow. Since the power is not known at compile time, the code steps into the very slow pow function. I believe there is some equivalent to Horner that applies powers by multiplication one at a time, probably also without unrolling the indices in index_to_indices, to avoid this bottleneck.

Found while looking at the simplex benchmark mentioned in #14068.


fernandohv3279 · 2024-01-04T21:22:31Z

Hello. I would like to try to implement this, however I don't understand the problem completely. If the goal is to eliminate std::pow, would it be enough to use a for loop?

As for a Horner-like implementation, it would be necessary to have a Horner representation of the polynomial. This could be computed at the beginning, when the polynomial is instantiated, is that a correct approach?

Thank you, and apologies for the dumb questions; I am only starting to get familiar with the project.

drwells · 2024-01-04T22:05:40Z

Glad to hear you're interested!

> If the goal is to eliminate std::pow, would it be enough to use a for loop?

That might not be the optimal solution since the loop bounds won't be known at compile-time. Another possibility would be to use some kind of switch statement on the exponent and then call the correct version of Utilities::fixed_power<N>().
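A minimal sketch of the switch idea, assuming C++17. The names `fixed_power_sketch` and `power_by_switch` are hypothetical; `fixed_power_sketch` is a local stand-in written out so the example compiles on its own, and is not the deal.II implementation of Utilities::fixed_power. The cutoff at exponent 4 is likewise an arbitrary assumption.

```cpp
// Hypothetical stand-in for a compile-time integer power; the exponent N is
// a template parameter, so the compiler can unroll all multiplications.
template <int N, typename Number>
constexpr Number fixed_power_sketch(const Number x)
{
  if constexpr (N == 0)
    return Number(1);
  else if constexpr (N == 1)
    return x;
  else if constexpr (N % 2 == 0)
    return fixed_power_sketch<N / 2>(x * x);
  else
    return x * fixed_power_sketch<N / 2>(x * x);
}

// Dispatch a run-time exponent to the matching compile-time version.
// Exponents above the (arbitrarily chosen) cutoff fall back to a plain loop.
template <typename Number>
Number power_by_switch(const Number x, const unsigned int n)
{
  switch (n)
    {
      case 0:
        return Number(1);
      case 1:
        return x;
      case 2:
        return fixed_power_sketch<2>(x);
      case 3:
        return fixed_power_sketch<3>(x);
      case 4:
        return fixed_power_sketch<4>(x);
      default:
        {
          Number result = Number(1);
          for (unsigned int i = 0; i < n; ++i)
            result *= x;
          return result;
        }
    }
}
```

Whether the switch actually beats a simple loop would need to be measured; for the low polynomial degrees typical of simplex elements, the handled cases probably cover most calls.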

> As for a Horner-like implementation, it would be necessary to have a Horner representation of the polynomial. This could be computed at the beginning, when the polynomial is instantiated, is that a correct approach?

I have no idea. I've looked in the past and I don't know what the optimal way to evaluate these kinds of multivariate polynomials is. Something like Horner might be excessively difficult since there are multiple independent variables.

I don't think either of those is a dumb question. If this fix were obvious, someone else probably would have done it already.

bangerth · 2024-01-05T16:26:03Z

I think the underlying reason why std::pow is slow is because it has to deal with fractional powers. But here, we only have integer powers (and positive integers on top), and one can imagine a much easier implementation that is based on recursive doubling.
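For illustration, recursive doubling (exponentiation by squaring) for a non-negative integer exponent could look roughly like the sketch below. The function name `integer_pow` is hypothetical and this is not the code that was eventually merged; it only shows that the number of multiplications grows with the logarithm of the exponent.

```cpp
// Hypothetical helper: computes base^exponent for a non-negative integer
// exponent using O(log(exponent)) multiplications.
template <typename Number>
constexpr Number integer_pow(Number base, unsigned int exponent)
{
  Number result = Number(1);
  while (exponent != 0)
    {
      // If the lowest bit is set, this power of two contributes a factor.
      if (exponent & 1u)
        result *= base;
      // Square the base and move on to the next bit of the exponent.
      base *= base;
      exponent >>= 1;
    }
  return result;
}
```

For the small exponents that occur in low-degree barycentric polynomials, a straight loop of multiplications may be just as fast; the doubling variant pays off mainly for larger exponents.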

A separate optimization is probably that we compute powers of b_point[d] multiple times, because of the outer loop over i here. One could look into whether that could be done more elegantly (or whether that's necessary at all).
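One way to avoid recomputing the same powers in every iteration of the outer loop is to tabulate them once per evaluation point and then look them up. The sketch below is only an assumption about how that might be arranged; `tabulate_powers` and `monomial_from_table` are hypothetical names, and `std::array` stands in for whatever point and index types the class actually uses.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a table with
// powers[d][k] == b_point[d]^k for k = 0, ..., max_degree.
template <std::size_t n_coords, typename Number>
std::array<std::vector<Number>, n_coords>
tabulate_powers(const std::array<Number, n_coords> &b_point,
                const unsigned int                  max_degree)
{
  std::array<std::vector<Number>, n_coords> powers;
  for (std::size_t d = 0; d < n_coords; ++d)
    {
      powers[d].resize(max_degree + 1);
      powers[d][0] = Number(1);
      // Each entry costs one multiplication on top of the previous one.
      for (unsigned int k = 1; k <= max_degree; ++k)
        powers[d][k] = powers[d][k - 1] * b_point[d];
    }
  return powers;
}

// Hypothetical helper: evaluates one monomial by table lookup instead of
// calling a power function for every barycentric coordinate.
template <std::size_t n_coords, typename Number>
Number monomial_from_table(
  const std::array<std::vector<Number>, n_coords> &powers,
  const std::array<unsigned int, n_coords>        &indices)
{
  Number temp = Number(1);
  for (std::size_t d = 0; d < n_coords; ++d)
    temp *= powers[d][indices[d]];
  return temp;
}
```

With such a table, the inner loop over d reduces to one lookup and one multiplication per coordinate. The table costs (dim + 1) * max_degree multiplications per evaluation point, so it only pays off when the polynomial has more than a handful of nonzero coefficients.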

blaisb · 2024-01-05T21:13:36Z

In this case, could we not write our own run-time pow(double, int) implementation that is specialized for integer exponents? That's relatively easy to code and would give us significant benefits. I think there is a reason why this is not in the standard, but I remember benchmarking a simple case where calculating a*a vs std::pow(a, 2.) led to a 50-70x difference in runtime.

bangerth · 2024-01-05T21:15:59Z

Yes, that's why we already have #13321 :-)

blaisb · 2024-01-05T21:16:59Z

I even commented on this issue last year... Damn...
Well someone could make that function. I can give it a jab in a few weeks.

masterleinad · 2024-01-05T21:42:02Z

> Well someone could make that function. I can give it a jab in a few weeks.

You mean something other than Utilities::fixed_power or Utilities::pow?

blaisb · 2024-01-10T13:36:33Z

Well, the question is whether we need anything other than Utilities::pow and Utilities::fixed_power. Both should be able to do the trick, no? Utilities::fixed_power when the exponent is fixed at compile time, and Utilities::pow at run time.

bangerth · 2024-01-10T16:21:53Z

Yes, Utilities::pow() is all you need now that it allows floating point numbers as arguments (#16439).

kronbichler added Starter project Finite Element Simplices labels Jul 8, 2022

kronbichler mentioned this issue Jul 8, 2022

Remove the cell dof indices cache #14068

Merged

kronbichler mentioned this issue Apr 5, 2024

Avoid std::pow in BarycentricPolynomials #16855

Merged

masterleinad closed this as completed in #16855 Apr 5, 2024
