
Optimizing the non-normalized multiplications / divisions #72

Merged: 3 commits into danm-de:master on Jul 1, 2024

Conversation

@lipchev (Contributor) commented on Jun 15, 2024

As per my last comment from #49, here are the proposed optimizations (a rough sketch of the reworked operators follows the list):

  • Multiply: removed the GCD reductions
  • Divide: only reducing the denominators
  • Divide: is now also normalizing the signs (dividing by a negative fraction no longer results in the denominator being negative)
  • Updated the Multiply/Divide tests using the StrictTestComparer
  • Updated the ConsecutiveMathOperationBenchmarks with an additional method for testing a mix of operations (+, -, *, /)
  • Remove the ToDecimalWithTrailingZeros and associated usages
  • Update the examples in the Readme.md
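For context, here is a minimal sketch of what the reworked operators boil down to. It is an illustration only: the Fraction record below is a simplified stand-in for the library's type (ignoring its actual constructors and the NaN/Infinity special values), not the code from the diff.

```csharp
using System.Numerics;

// Simplified stand-in for the library's Fraction type - illustration only.
public readonly record struct Fraction(BigInteger Numerator, BigInteger Denominator)
{
    // Multiply: no GCD reduction at all - just multiply the terms.
    public static Fraction Multiply(Fraction a, Fraction b) =>
        new(a.Numerator * b.Numerator, a.Denominator * b.Denominator);

    // Divide: (n1/d1) / (n2/d2) = (n1 * d2) / (d1 * n2).
    // Only the two denominators are reduced against each other, and the sign
    // is moved into the numerator so the resulting denominator stays positive.
    // Zero denominators (the library's special values) are ignored here.
    public static Fraction Divide(Fraction a, Fraction b)
    {
        var gcd = BigInteger.GreatestCommonDivisor(a.Denominator, b.Denominator);
        var numerator = a.Numerator * (b.Denominator / gcd);
        var denominator = (a.Denominator / gcd) * b.Numerator;
        if (denominator.Sign < 0)
        {
            numerator = -numerator;
            denominator = -denominator;
        }
        return new Fraction(numerator, denominator);
    }
}
```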

Commits:
- Multiply: removed the GCD reductions
- Divide: only reducing the denominators
@lipchev (Contributor, Author) commented on Jun 15, 2024

Having done all sorts of benchmarks with different types of reduction, here is my summary:

  1. Reducing the resulting terms is more likely to make any future operations faster: e.g. the next addition would otherwise require proportionally larger multiplications / additions (see the NonReducedOperations benchmark).
  2. Operations with fractions are generally faster when they have the same denominators.
  3. GCD reductions are not very effective unless the terms are reduced significantly or one of the terms is reduced to 1 (which avoids one BigInteger multiplication somewhere down the line).

From 2) we know that Add and Subtract should not be reduced, while Multiply doesn't offer much reduction potential due to 3). The only good candidate is the Divide operator, for which I've tested three options:
a) Don't reduce anything
b) Reduce the denominators before multiplying them with the numerators (the current implementation)
c) Cross-reduce the two terms before multiplying them (or what's left of them, anyway)

On an individual operation basis, a) and b) exhibit relatively similar performance overall, depending on the operands (one is faster for some values and slower for others). However, when taking into account the future operations (see 1.), performing N consecutive (random) operations with the division as per a) resulted in ~30% more allocations (and increasingly slower times). I've also tested against the version from master (which does the double reduction in Multiply): it was about the same as a), but with significantly more allocations than b), which was to be expected (as per 3.).

Finally, compared to the ~30% reduction in allocations that b) offers, c) shaves off no more than a further 1-2% of the overall allocations, at the cost of a significant hit to the per-operation performance (more than 10%).
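For completeness, here is a hedged sketch of what option c) would look like, continuing the simplified stand-in type from the earlier sketch (again, an illustration rather than the merged code; zero numerators and the library's special values are ignored):

```csharp
// (member of the same stand-in Fraction type as in the earlier sketch)
// Option c): besides reducing the denominators against each other, also
// cross-reduce the numerators:
//   (n1/d1) / (n2/d2) = (n1 * d2) / (d1 * n2)
// so n1 can be reduced against n2, and d2 against d1, before multiplying.
public static Fraction DivideWithCrossReduction(Fraction a, Fraction b)
{
    var gcdNumerators = BigInteger.GreatestCommonDivisor(a.Numerator, b.Numerator);
    var gcdDenominators = BigInteger.GreatestCommonDivisor(a.Denominator, b.Denominator);

    var numerator = (a.Numerator / gcdNumerators) * (b.Denominator / gcdDenominators);
    var denominator = (a.Denominator / gcdDenominators) * (b.Numerator / gcdNumerators);

    if (denominator.Sign < 0)
    {
        numerator = -numerator;
        denominator = -denominator;
    }
    return new Fraction(numerator, denominator);
}
```

Presumably the extra GCD computation and the additional exact divisions per call are what eat up the >10% per-operation performance mentioned above.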

@lipchev (Contributor, Author) commented on Jun 15, 2024

Let's start with the NonNormalizedMathBenchmarks on .NET Framework, as things are easier to see there (values represent absolute differences in ns between master and this branch):

[chart: NonNormalizedMathBenchmarks, .NET Framework, absolute differences in ns]

As you can see, the individual Division operation may be faster or slower depending on the operands, with things being more or less even overall.

As for the Multiply operator, apparently the original implementation really didn't like doing those extra reductions...

Here are the same results on .NET 8:

[chart: NonNormalizedMathBenchmarks, .NET 8, absolute differences in ns]

@lipchev (Contributor, Author) commented on Jun 15, 2024

Here are the results of all operations (absolute times in ns), starting with .NET Framework:

[chart: all operations, absolute times in ns, .NET Framework]

And here is the .NET 8.0 version (basically the same graph, with a different scale for the y axis):

[chart: all operations, absolute times in ns, .NET 8.0]

@lipchev (Contributor, Author) commented on Jun 15, 2024

And finally - the ConsecutiveMathOperationBenchmarks (the reason why we bother with reductions at all).

I've slightly modified the benchmark parameters from last time: this time I didn't bother with the 100 consecutive operations, as the fully-normalized operations are already blowing up the charts at 50 operations. Instead, I've added an extra-precision case (up to 6 digits).

I've also introduced a new benchmark which picks the next operation at random (+, -, *, /). This is the benchmark that demonstrates the importance of having some reduction to counteract the unconstrained expansion of the Multiply operator (or rather, the difference would become visible if we were to remove the reduction).
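The idea behind that benchmark is roughly the following (a hedged sketch, not the actual ConsecutiveMathOperationBenchmarks code; it assumes the library's Fractions namespace and the usual arithmetic operator overloads on Fraction):

```csharp
using System;
using Fractions;

public static class RandomOperationsSketch
{
    // The four operations, stored as delegates and picked at random per step
    // (this is the Func-based selection mentioned in the caveats further down).
    private static readonly Func<Fraction, Fraction, Fraction>[] Operations =
    {
        (a, b) => a + b,
        (a, b) => a - b,
        (a, b) => a * b,
        (a, b) => a / b
    };

    public static Fraction RunConsecutiveRandomOperations(Fraction[] operands, int seed)
    {
        var random = new Random(seed); // fixed seed keeps the run reproducible
        var result = operands[0];
        for (var i = 1; i < operands.Length; i++)
        {
            var operation = Operations[random.Next(Operations.Length)];
            result = operation(result, operands[i]);
        }
        return result;
    }
}
```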

Here's the overview:

[chart: ConsecutiveMathOperationBenchmarks, times, .NET Framework]

And here are the allocations (as before, the additions all cap at 32B):

[chart: ConsecutiveMathOperationBenchmarks, allocations, .NET Framework]

The picture with .NET 8.0 is similar (with the fully-normalized operations losing by an even larger margin):

[charts: ConsecutiveMathOperationBenchmarks, times and allocations, .NET 8.0]

Just a few observations in conclusion:

  1. Looking at the allocation differences between .NET Framework and .NET 8.0 on the Multiplications benchmarks: I attribute them to an improvement in GreatestCommonDenominator (with some intermediate allocations being eliminated), but I haven't actually verified it.
  2. If 1. is correct, then we could interpret the similar allocations of the Multiplications on .NET 8.0 for the Reduced and NonReduced fractions as an indication that the reduction in the size of the terms resulting from the multiplication of two random numbers isn't significant.
  3. Looking at the performance difference between X consecutive divisions vs X consecutive multiplications, we might be tempted to conclude that doing no reductions at all would be better, but that would not necessarily be true in the case of X random operations.

@lipchev (Contributor, Author) commented on Jun 16, 2024

Ok, let me bring in one more contender to the race: in the following graphs (.NET 8.0) I've tested the non-normalized fractions against the equivalent decimal operations (the red bars). At the same time, I've changed the random seed so that we have another data point (just making sure that the previous results weren't somehow affected by the particular selection of values / order of operations):

[charts: non-normalized Fraction vs decimal, times and allocations, .NET 8.0, new random seed]

Caveats:

  1. Multiplying something like 123.456789m * 987.654321m * 132.46798m * ... 50 times isn't likely to compute without rounding errors (I actually checked the values, and they only happen to differ on the last digit 🤷)
  2. Dividing something like 123.456789m / 987.654321m / 132.46798m / ... 50 times is very likely to produce a non-terminating decimal along the way (here we're getting a difference in the second-to-last digit)
  3. The way I've implemented the random-operation selection (storing and invoking the Func<a,b,c>) is not ideal: it's probably adding a little more (constant) overhead than necessary.

Still, all in all, I'd say we're comfortably within the same performance magnitude as decimal (2x-5x), with the results on .NET Framework being slightly worse in comparison (something like 3x-6x).
Obviously, we're taking a hit on the allocation front, but that should be manageable (assuming it isn't that often that one of the terms exceeds Int32.MaxValue).
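To make caveats 1) and 2) concrete, a rough side-by-side check of the kind described above might look like this. The factor values here are placeholders chosen close to 1 so that 50 multiplications don't overflow decimal, and Fraction.FromDecimal / ToDecimal are assumed to be the library's conversion helpers (a sketch, not the benchmark code):

```csharp
using System;
using Fractions;

// A chain of decimal multiplications vs. the exact Fraction chain,
// converted back to decimal only at the end.
var factors = new[] { 1.23456789m, 0.987654321m, 1.3246798m };

decimal decimalResult = 1m;
var fractionResult = Fraction.FromDecimal(1m);

for (var i = 0; i < 50; i++)
{
    var factor = factors[i % factors.Length];
    decimalResult *= factor;                        // accumulates rounding error
    fractionResult *= Fraction.FromDecimal(factor); // stays exact
}

Console.WriteLine(decimalResult);
Console.WriteLine(fractionResult.ToDecimal());
// The two outputs are expected to agree except for the trailing digit(s).
```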

@danm-de (Owner) commented on Jun 16, 2024

There is just the matter of ToDecimalWithTrailingZeros: given that there isn't any particularly meaningful schema regarding the number of zeros resulting from the Divide operator, what do you think we should do about it?

In fact, I don't see the need for this method. decimal cannot be used as a round-trip value (e.g. for serialization scenarios) because it cannot represent periodic values nor NaN/Infinity (The conversion from Fraction to decimal is not a bidirectional mapping)

I also cannot understand the reasoning behind interpreting trailing zeros as an indicator of the precision of the calculation (especially because they do not remain stable during calculations [1]). For display in a graphical interface (or other output read by humans), I would always prefer appropriate formatting in ToString rather than relying on the scaling stored in the data type.

[1]
[screenshot: example of trailing zeros changing during calculations]
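(The screenshot isn't reproduced here; as a related illustration of why trailing zeros make a poor precision indicator, even decimal's own scale shifts around as soon as values flow through arithmetic:)

```csharp
using System;

// decimal's scale (its trailing zeros) is not stable across arithmetic:
Console.WriteLine(1.50m + 0.50m); // 2.00  - addition keeps the wider scale
Console.WriteLine(1.50m * 2.0m);  // 3.000 - multiplication adds the scales
Console.WriteLine(1.50m * 2m);    // 3.00
```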

Commit:
- updated the Readme.md (replaced the incorrect usage of the term "improper" when referring to the non-reduced fractions)
@lipchev (Contributor, Author) commented on Jun 16, 2024

There is just the matter of ToDecimalWithTrailingZeros: given that there isn't any particularly meaningful schema regarding the number of zeros resulting from the Divide operator, what do you think we should do about it?

In fact, I don't see the need for this method. decimal cannot be used as a round-trip value (e.g. for serialization scenarios) because it cannot represent periodic values nor NaN/Infinity (The conversion from Fraction to decimal is not a bidirectional mapping)

Ok, I've removed it.

As I was updating the Readme.md section (I wasn't sure if the examples were still valid), I stumbled upon this Wikipedia section: Proper_and_improper_fractions, and from what I'm reading it appears that the term improper has a specific meaning (nothing to do with negative denominators):

In general, a common fraction is said to be a proper fraction, if the absolute value of the fraction is strictly less than one—that is, if the fraction is greater than −1 and less than 1.[14][15] It is said to be an improper fraction, or sometimes top-heavy fraction,[16] if the absolute value of the fraction is greater than or equal to 1. Examples of proper fractions are 2/3, −3/4, and 4/9, whereas examples of improper fractions are 9/4, −4/3, and 3/3.

There are several synonyms for the Irreducible fraction, but I couldn't find anything denoting a fraction that is NOT in its simplest form... I guess we could keep using (non-)normalized in its common sense. 😄
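(In code, the quoted definition of a proper fraction is just a comparison of absolute values; the helper below is hypothetical, not an existing library member:)

```csharp
using System.Numerics;

// "Proper" per the quoted definition: the absolute value of the fraction is < 1,
// i.e. |numerator| < |denominator| (for a non-zero denominator).
static bool IsProper(BigInteger numerator, BigInteger denominator) =>
    BigInteger.Abs(numerator) < BigInteger.Abs(denominator);
```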

@danm-de (Owner) commented on Jul 1, 2024

There are several synonyms for the Irreducible fraction, but I couldn't find anything denoting a fraction that is NOT in its simplest form... I guess we could keep using (non-)normalized in its common sense. 😄

I remember that I was already looking for a suitable term back then. In my local language we (surprisingly) don't have a term for it either.

@danm-de (Owner) commented on Jul 1, 2024

Thank you for all your work. I don't think there is anyone other than you who has worked so intensively on the project.

My colleagues and I have not examined performance behavior nearly as closely as you have now.

@danm-de merged commit cf05fcb into danm-de:master on Jul 1, 2024
1 check passed