Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for NaN and Infinity #26

Closed
lipchev opened this issue Mar 30, 2024 · 21 comments
Closed

Support for NaN and Infinity #26

lipchev opened this issue Mar 30, 2024 · 21 comments

Comments

@lipchev
Copy link
Contributor

lipchev commented Mar 30, 2024

Support for NaN and Infinity

Hey,
I've been digging into the code and I think we can add support for NaN and +/- Infinity. Here's what I'm thinking:

  1. Update GetReducedFraction(..) to return 0/0 (NaN) for 0 in the numerator and denominator. Right now, it's returning Zero.
  2. For a positive numerator and 0 in the denominator, return +1/0 (+Inf).
  3. For a negative numerator and 0 in the denominator, return -1/0 (-Inf).

We might need to tweak FractionState to {Zero, Unknown, Normalized} to avoid default(Fraction) being NaN. Also, in that case, Reduce() should check for FractionState.Zero and return a true Zero (0/1).
There might be a few more checks needed (like in the constructor from double), but I think it's doable.

I'm ready to make these changes and submit a PR if you're on board with this.

Let me know what you think!

@danm-de
Copy link
Owner

danm-de commented Mar 31, 2024

Thanks for this suggestion. Perhaps I should say something about the history of this project. Fractions started as a private PoC for calculating drug dosage for clinical applications. The software of my employer at the time had major problems in displaying calculated volume rates, active ingredient rates and summations of critical active ingredient quantities over a specific period of time.

(Sorry for the German language in the pictures - I just paste parts of my old slides ;-)

image

The data type double has too large rounding errors and is therefore disqualified for drug dosages prescribed in the nanogram range. The data type decimal is less susceptible to the typical errors of floating-point arithmetic due to the base of 10 - BUT - periodic decimals (which could occur briefly during computation) continued to be a problem.

image

So, the rounding errors were still an issue when using the decimal data type. And the doctors and nurses freaked out... Instead of an infusion rate of 5mg/kg/h, the UI said 4.997mg/kg/h OR even worse 0.998 pill instead of simply 1 pill.

image

That's how the Fraction data type came into play. Together with a unit conversion system (a project that I am unfortunately not allowed to publish as open source), we were able to display exact calculations in the blink of an eye. Performance was a key feature. In the weekly or monthly view, 100 to several 1000 calculations per second are performed. The data types double and decimal were fast enough here - Fraction had to be able to keep up. Delays in UI rendering were not accepted by medical staff. So all the code has been optimized for the highest possible execution speed.

We thought about whether we also wanted to represent complex numbers, infinity, etc. However, we simply couldn't think of any use cases for this in our "real world" application. That's why it is not available (yet). However, I don't have any problems if the data type is improved by the new features.

Point 1 of your proposal poses a problem: Since Fraction is a struct the default value must result in 0. This is the expected behavior of all number types in .NET (default(int) == 0, default(long) == 0L, ...).

I don't have a PhD in mathematics - so I can only make unvalidated assumptions here. I don't know if it makes strategic sense to represent infinity and/or NaN in the numerator or denominator. I would probably drill the 'FractionState' enum as a bit field and put the additional states there. We could stay compatible with serialized data, because IS_NORMALIZED = 0x00000001. On the other hand, it would be stupid if we had to check the bit field BEFORE every mathematical operation... :-/

@lipchev
Copy link
Contributor Author

lipchev commented Apr 10, 2024

Right, those are exactly the same concerns we have in UnitsNet which has historically struggled with the tradeoff between the double type and it's binary representation vs the decimal and its limited range.

I'm sure you can see how it all fits together seeing this one line of code (which is now evaluating as expected):
(Mass.FromGrams(1) / 3).ToUnit(MassUnit.SolarMass).ToUnit(MassUnit.Gram) * 3 == Mass.FromGrams(1)
I don't remember exactly but that's probably something like a conversion factor of "E+28".

As for the NaN / Infinity use cases- there are (IMO) some valid reasons for having them in UnitsNet, however- given that this is arguably a breaking change for the clients of this library, you should consider whether you think it's worth having it or not.

Regarding the performance, I'm pretty sure it wouldn't be any slower- in any case, we should be able to validate that using the baseline benchmarks from #29 .

In the meanwhile, there are a few extra features that I could offer- I'll create separate PRs for each of them, feel free to reject any you think you don't need.

@danm-de
Copy link
Owner

danm-de commented Apr 11, 2024

Thank you very much for the pull requests. I've already merged most of these into the master branch. Maybe you want to take a look before I release version 8.0.0 - I've made a few changes. You've added a few TODOs regarding string allocations. I "doubled" these methods and switched from String to ReadOnlySpan. I've also renamed the Round methods, which returned a BigInteger instead of Fraction.

@danm-de
Copy link
Owner

danm-de commented Apr 11, 2024

Regarding NaN/Infinity - the breaking change for the current users is manageable. In order to remain SemVer compatible, I would then increase the major number again and we would have to explicitly point this out in the changelog.

@lipchev
Copy link
Contributor Author

lipchev commented Apr 11, 2024

This all looks to me. As for the release, it's up to you- I am currently preparing another pull request for a DecimalFractionFormatter:

/// <summary>
///     Provides functionality to format the value of a Fraction object into a decimal string representation following the
///     standard numeric formats, as implemented by the double type.
/// </summary>

Here is the current implementation
https://github.com/angularsen/UnitsNet/blob/50524cd8f88d727bfb40091328b4585535f89f55/UnitsNet/QuantityValue.ToString.cs#L60

And here are the tests:
https://github.com/angularsen/UnitsNet/blob/50524cd8f88d727bfb40091328b4585535f89f55/UnitsNet.Tests/DefaultQuantityValueFormatterTests.cs

This is obviously an optional feature, so shouldn't matter for most people- so it's up to you.

Additionally, I was thinking of adding the [DataContract]/[DataMember] annotations, but was waiting for finilizing the public API contract, so that we don't add more breaking changes later on (like changing the State enum)

And another "extra-extra" feature (that would probably be suitable for an extension nuget) is a an expression parser that I wrote:
https://github.com/angularsen/UnitsNet/blob/50524cd8f88d727bfb40091328b4585535f89f55/CodeGen/Helpers/ExpressionAnalyzer/ExpressionEvaluator.cs

In short it is parsing and expanding strings containing expressions like "123e-4 * (x*123.4) + 42", similar to what this does:
https://github.com/miroiu/string-math

I haven't written any tests for it yet, but have successfully parsed all the expressions from our json files:
https://github.com/angularsen/UnitsNet/tree/master/Common/UnitDefinitions

@danm-de
Copy link
Owner

danm-de commented Apr 12, 2024

I like the approach of the DecimalFractionFormatter. That's actually exactly what I wanted to do - but never found the time to do it. Possibly a necessary building block for Fractions to be able to fully implement this: https://learn.microsoft.com/en-us/dotnet/standard/generics/math (INumber<T>). Although there would still be quite a long way to go.

Regarding the DataContract/DataMember attributes, what are they for? Fraction is a struct type and therefore cannot be deserialized easily. Maybe there are new possibilities with System.Text.Json that I don't know yet?

Expression parsing is the task of a CAS (Computer Algebra System). I think Fraction should remain as a simple data type (just like long, int or decimal). Of course, with the necessary features to be able to use it in a variety of ways.

Since 8.0.0 is allowed to contain breaking changes - what features do you require? (counting your pull-requests you seem to have a lot in mind with it) Unfortunately, we can't easily change the behavior of the ToString() and IFormattable.ToString(String, IFormatProvider) methods 😕 that would break a lot of dependent projects. The symbols currently used ('G', 'd', ..) contradict the Microsoft standard.

I'd wait for your 'GO' before releasing a new NuGet package.

@lipchev
Copy link
Contributor Author

lipchev commented Apr 12, 2024

Nah, I don't expect a change of the default ToString() behavior- this is meant as an option- in our case our Quantities (e.g. Mass / Volume etc) have their own ToString() implementation which formats the Fraction using the DecimalFormatter (as we have to keep it backwards-compatible). This is the use case I foresee for other people as well.

The [DataContact]/[DataMember] are set on the private fields and this makes WCF (yes, there are still people using that) to have automatic proxy generation out of the box. This can naturally be used to serialize into an XML file if necessary. Here are some tests we've got that should make it clear.

I've got another small PR with some performance improvements on the FromDouble method that I'd like to push before I get on with the NaN stuff. I'll create the PR in the next hour or so..

@lipchev
Copy link
Contributor Author

lipchev commented Apr 12, 2024

As for the release version, it's up to you, but given that we haven't introduced any breaking changes yet, I'd probably release it as the "latest stable v7" so that people who want to opt-out of the NaN stuff can stick with v7.5 (or whatever the number is) and still benefit from the good stuff we're bringing in...

@danm-de
Copy link
Owner

danm-de commented Apr 29, 2024

I do not think we need any more improvements in the v7.x branch. I've merged the NaN pull-request into the master.

@danm-de
Copy link
Owner

danm-de commented Apr 29, 2024

The master-branch is up-to-date with your latest pull-requests.

I'll wait with 8.x.x NuGet packages until we agree that master is stable.

@danm-de
Copy link
Owner

danm-de commented May 1, 2024

To pick up our old conversation in #51. I just see that it is not easily possible to remove FractionState (or replace the access to the property with an _normalizationNotApplied backing field.

Specifically, the two constructors are in conflict:

private Fraction(BigInteger numerator, BigInteger denominator, FractionState state)

and

public Fraction(BigInteger numerator, BigInteger denominator, bool normalize)

@lipchev
Copy link
Contributor Author

lipchev commented May 1, 2024

Yes, i had to switch the order (will post the PR in a bit):

    /// <summary>
    ///     Initializes a new instance of the Fraction struct with the specified numerator and denominator.
    /// </summary>
    /// <param name="normalizationNotApplied">Indicates whether the fraction is not normalized.</param>
    /// <param name="numerator">The numerator of the fraction.</param>
    /// <param name="denominator">The denominator of the fraction.</param>
    private Fraction(bool normalizationNotApplied, BigInteger numerator, BigInteger denominator) {
        _numerator = numerator;
        _denominator = denominator;
        _normalizationNotApplied = normalizationNotApplied;
    }

@danm-de
Copy link
Owner

danm-de commented May 1, 2024

I am working on another proposal right now.

@danm-de
Copy link
Owner

danm-de commented May 1, 2024

Yes, i had to switch the order (will post the PR in a bit):

OK, then I can discard my branch - the changes correspond to mine.

@danm-de
Copy link
Owner

danm-de commented May 6, 2024

Are there any changes that need to be implemented for the 8.0.0 version, or can I update the documentation and prepare the release?

@lipchev
Copy link
Contributor Author

lipchev commented May 6, 2024

I haven't got anything prepared, but some house cleaning might not be a bad idea- a solution reformat, and re-comment (I think I saw somewhere that "..NaN is not supported").
Generally speaking we still haven't implemented the INumber stuff, but that could wait.

@danm-de
Copy link
Owner

danm-de commented May 6, 2024

Regarding the clean-up work - I'll take a look at it later this week.

@danm-de
Copy link
Owner

danm-de commented Jul 1, 2024

@lipchev
If you have no further objections, I would like to release version 8.0.0. If you are okay with that, please take another look at the changelog to see if you are missing anything there.

@lipchev
Copy link
Contributor Author

lipchev commented Jul 1, 2024

@lipchev If you have no further objections, I would like to release version 8.0.0. If you are okay with that, please take another look at the changelog to see if you are missing anything there.

I think there are just some parts of the main Readme.md that could be improved: there is currently no mention of Nan/Infnity on it, as well as some of the overloads such as FromDecimal(x, reduceTerms:false) are missing.. I'll see about making another PR (now that everything's merged)- but all in all, I don't see a reason not to release.. 👍

@lipchev
Copy link
Contributor Author

lipchev commented Jul 2, 2024

@lipchev If you have no further objections, I would like to release version 8.0.0. If you are okay with that, please take another look at the changelog to see if you are missing anything there.

I think there are just some parts of the main Readme.md that could be improved: there is currently no mention of Nan/Infnity on it, as well as some of the overloads such as FromDecimal(x, reduceTerms:false) are missing.. I'll see about making another PR (now that everything's merged)- but all in all, I don't see a reason not to release.. 👍

Lol, I was just finishing up the Readme.md section about "Working with NaN and Infinity when I realized that there is actually a bug in the Divide operator- when dividing a fraction by infinity the result should be Fraction.Zero (or NaN).. I don't know if that got in during the optimization phase or it was there from the start, since there are actually no tests for dividing_with_infinity - what we do in When_dividing_with_infinity is actually dividing infinity instead of with infinity (which must be a carry-over from When_multiplying_with_infinity, where the operation is commutative).

@danm-de
Copy link
Owner

danm-de commented Jul 3, 2024

@lipchev
The new version 8.0.0 has been released. Thanks for your work! If the project is used as an integral part of other open source projects, the current organization must be reconsidered. Currently I'm the only person with write permissions (both in GitHub and NuGet). Just in case, I don't want to be a bottleneck when it comes to future code reviews etc.

@danm-de danm-de closed this as completed Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants