Refactor/easing library #947
Conversation
Some performance measurements on an STM32F3 Cortex-M4 running at 8 MHz showed performance issues in pwm_test.c. Making sure all constants are single-precision float values allows the compiler to use the hardware FPU for some of the calculations. A quick and dirty measurement of the easing function execution times shows the improvements for some of the functions:

          |     original    | eliminate double promo
----------+--------+--------+----------+-------------
          |  min   |  max   |   min    |    max
----------+--------+--------+----------+-------------
sine      |  330   |  480   |   330    |    480
bounce    |   50   |  130   |    11    |     15
circular  |  340   |  370   |   220    |    250
quadratic |  160   |  250   |    14    |     75
cubic     | 1230   | 1320   |  1120    |   1240
quartic   | 1230   | 1320   |  1120    |   1240
quintic   | 1230   | 1320   |  1120    |   1240

All times in microseconds.
Looking at the execution times of the different easing functions, the long haul are the ones using pow(). Unfortunately there is no float equivalent. Manually unrolling the ones with a fixed exponent is tedious, but it seems worthwhile. Some rough execution time measurements on an STM32F3 Cortex-M4 with hardware FPU (again using pwm_test.c):

          |    original   |   float expr  |    no pow
----------+------+--------+------+--------+------+------
          |  min |  max   |  min |  max   |  min |  max
----------+------+--------+------+--------+------+------
sine      |  330 |  480   |  330 |  480   |  330 |  480
bounce    |   50 |  130   |   11 |   15   |   11 |   15
circular  |  340 |  370   |  220 |  250   |  150 |  190
quadratic |  160 |  250   |   14 |   75   |    9 |   12
cubic     | 1230 | 1320   | 1120 | 1240   |   11 |   11
quartic   | 1230 | 1320   | 1120 | 1240   |   11 |   11
quintic   | 1230 | 1320   | 1120 | 1240   |   11 |   11

All times are in microseconds.
Looks good to me. Have you tested these for mathematical correctness?
The math is identical to the original implementation. Is there anything specific you're looking for?
@@ -43,12 +43,12 @@ static inline float exponential_out(float step, float max_steps, float max_val)
{
	return (step == max_steps) ?
		max_val :
-		max_val - pow(max_val, 1.0 - (float)step/max_steps);
+		max_val - pow(max_val, 1.0f - (float)step/max_steps);
Since you are moving everything to `float` you could as well get rid of those `(float)` casts. This one would be the same as `max_val - pow(max_val, 1.0f - step/max_steps)`. This happens across much of the refactoring.
I'm not sure why the original implementation had that cast there, it was promoted to a `double` anyway. I presume that early on `step` and `max_step` were integer values and it was a remnant of refactoring. I can certainly clean those up.
The indentations were inherited from a previous implementation I had while validating this mathematically.
-		return max_val + max_val / 2 * pow(ratio -2, 5);
+	ratio -= 2;
Weird indentation
The weird indentation comes from the original source code having tab characters in some places, and GitHub uses tabstop=8. I didn't clean up the tabs because it introduces a lot of noise into the code review, but I can certainly do that.
-		return max_val / 2 * (pow(ratio - 2, 3) + 2);
+	ratio -= 2;
Weird indentation
}

/* Cubic */
static inline float cubic_in(float step, float max_steps, float max_val)
{
	float ratio = step / max_steps;

-	return max_val * pow(ratio, 3);
+	return max_val * (ratio * ratio * ratio);
Whenever you have a `pow` or `sqrt` in the code, your variables are automatically promoted to `double` and the operation is done at a higher precision than what is done now by simply multiplying `float`s. I am not sure how much this would impact precision, but I assume this is why @mlaz might have asked the previous question about accuracy.
Understood - but I cannot answer that question without a reference. The interface does not suggest that there is `double` precision involved; it's all based on `float`, so as a user that's as much as I can expect. In order to answer the question of whether this refactoring falsifies the result, we would need a definition of what level of deviation is acceptable.

In addition, none of the computations uses accumulation, which is where `float`s really start bleeding. All values are monotonically increasing or decreasing and the result is converted to a float. So if the initial value and the result fit into a `float`, then every intermediate value will also fit.
I believe that at some point not using the functions will in fact change the results; we just need to make sure it won't break the functions. The method I used to validate these mathematically is to look at the function here and here, then implement it in Wolfram (or some other equivalent tool), run the lib's code to produce some values, and compare them to what I get in Wolfram.
A general comment on the "precision" question. I love the library and have used it all the time ever since the PR was posted the first time. Yet, as is, it is IMHO not very practical because its resource requirements are too high; that is the reason I started looking into this. Having this library is insanely cool, but I cannot afford 0.5-1 ms execution times in a timer interrupt callback. Maybe I'm missing a use case here - is there one that requires double precision?
@mlampert as you can see this lib is far from being optimized. Some references here:
A bit off topic, but I'm wondering if it would be possible to combine a lookup table (generated by the easing function) with access to the EasyDMA feature of the nRF52840? (Perhaps ST also has the same kind of feature?)
Can't claim I understand the rationale, but it sounds like we don't want to move ahead with this change. I'll close the PR and we can create a new one once we know where we're going with this.
FWIW, I'd rather have performance over accuracy. Easing is generally used for animation and is not mission critical. But if we want both options, separate functions or a separate dependency seem viable.
@jacob I think we may agree here that in this particular case accuracy means having growth ratios near to what we have now. I never opposed having these changes merged. We just need to check, by whatever means, that the behaviour is consistent.
I ran into some performance issues on an 8 MHz STM32F3 (Cortex-M4) when playing with the easing library. I refactored the library a bit and think the performance improvements are worth the somewhat reduced readability.