Backport fmath performance and other fixes from OSL #2495
Merged
A variety of minor rewrites of certain math functions to ensure they
generate better machine code and auto-vectorize more cleanly.
Lots of inline -> OIIO_FORCEINLINE
Improve the clamp used in fast_sin and fast_cos
fast_safe_pow: improve code gen with OIIO_UNLIKELY
Introduce OIIO_FMATH_SIMD_FRIENDLY to allow an application to switch
between implementations that give the best scalar performance and
ones that sacrifice scalar perf for the best SIMD vectorization of loops
containing the fmath function. This is anticipated to be very rare,
and of course we strive to be simultaneously fastest on scalar & simd,
but we have one or two cases where such a tradeoff exists.
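As a rough illustration of the kind of switch described above, here is a minimal sketch. Only the macro name OIIO_FMATH_SIMD_FRIENDLY comes from this PR. The function `sketch_safe_inverse`, its two bodies, and the default value of 0 are assumptions made up for the example and are not taken from fmath.h.

```cpp
// An application would define the macro before including the header.
// Defaulting to 0 here is an assumption for this sketch.
#ifndef OIIO_FMATH_SIMD_FRIENDLY
#    define OIIO_FMATH_SIMD_FRIENDLY 0
#endif

// Hypothetical helper (not an fmath.h function): 1/x, or 0 when x == 0.
inline float sketch_safe_inverse(float x)
{
#if OIIO_FMATH_SIMD_FRIENDLY
    // Branch-free form: always divide, then select the result.
    // Does a division even for x == 0, but has no early return,
    // which tends to be easier for a compiler to vectorize.
    float safe_x = (x != 0.0f) ? x : 1.0f;
    float r      = 1.0f / safe_x;
    return (x != 0.0f) ? r : 0.0f;
#else
    // Scalar-oriented form: early out skips the division entirely.
    if (x == 0.0f)
        return 0.0f;
    return 1.0f / x;
#endif
}

// A loop of the sort the SIMD-friendly variant is meant to help.
inline void sketch_inverse_all(const float* in, float* out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = sketch_safe_inverse(in[i]);
}
```

Both variants return the same values; which one is actually faster depends on the compiler and target, which is what the added benchmarks are for.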
Lots of additional fmath benchmarks to help us judge how good they are
compared to std functions.
New safe_fmod which not only prevents division by zero but even in other
cases is much faster than std::fmod.
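To make the idea concrete, here is one possible shape for such a function. This is a sketch under assumptions, not necessarily the code in this PR: the name `sketch_safe_fmod`, the truncation-based formula, and the choice to return 0 for a zero divisor are all illustrative.

```cpp
#include <cmath>

// Hypothetical sketch of a "safe" fmod: returns 0 when b == 0 instead of
// dividing by zero, and otherwise computes the remainder from a truncated
// quotient rather than calling std::fmod.
inline float sketch_safe_fmod(float a, float b)
{
    if (b != 0.0f) {
        float n = std::trunc(a / b);  // quotient rounded toward zero
        return a - n * b;             // remainder takes the sign of a
    }
    return 0.0f;
}
```

Note that a formulation like this is not guaranteed to match std::fmod bit for bit: when the quotient is very large, rounding in `a / b` and `n * b` can make the results differ.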
New fast_neg is faster than -float, in cases where you are ok with
-(0.0f) being 0.0f instead of actual floating point -0.0f. If you have
no idea what I'm talking about or why it matters, you definitely will
like this function!
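The signed-zero difference can be seen with a small sketch. The subtraction-from-zero form below is an assumption about how such a function could be written, and the names `sketch_fast_neg` and `sketch_is_negative_zero` are invented for the example.

```cpp
#include <cmath>

// Hypothetical sketch: negate by subtracting from zero.
// Under default rounding, 0.0f - 0.0f is +0.0f, so the sign of zero
// is not flipped the way unary minus would flip it.
template<typename T>
inline T sketch_fast_neg(const T& x)
{
    return T(0) - x;
}

// Helper for the demonstration: true only for negative zero.
inline bool sketch_is_negative_zero(float v)
{
    return v == 0.0f && std::signbit(v);
}
```

For any nonzero input the result is the same as unary minus; only the input 0.0f differs, yielding +0.0f where `-x` would yield -0.0f. This assumes the compiler is not running in a fast-math mode that ignores signed zeros.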
Most of these improvements were backported from OSL, made by Alex Wells,
Intel.