Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some cleanup of the System.Number class #20619

Merged
merged 10 commits into from Oct 29, 2018
Merged

Some cleanup of the System.Number class #20619

merged 10 commits into from Oct 29, 2018

Conversation

@tannergooding
Copy link
Member

@tannergooding tannergooding commented Oct 25, 2018

This is the first part of https://github.com/dotnet/corefx/issues/33053, and does some initial cleanup of the System.Number class:

  • Resolve the general formatting errors that were showing up in Number.Formatting and Number.Parsing
  • Removing duplicated code from the Parse/TryParse calls (doing the same for Format/TryFormat is not as easy)
  • Some minor renames/relocations
  • Updating NumberBuffer to take custom-sized digit buffers
}

private static void FormatCurrency(ref ValueStringBuilder sb, ref NumberBuffer number, int nMaxDigits, NumberFormatInfo info)
private static unsafe void FormatCurrency(ref ValueStringBuilder sb, bool sign, int scale, char* dig, int nMaxDigits, NumberFormatInfo info)
Copy link
Member Author

@tannergooding tannergooding Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Format* functions previously took a ref NumberBuffer; they were changed to take the sign, scale, and digit pointer individually so they could be shared between the IntegerBuffer and NumberBuffer implementations.

Loading

}
}

internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)
Copy link
Member Author

@tannergooding tannergooding Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ref NumberBuffer number parameters for the integer case where changed to ref IntegerBuffer integer

Loading

@@ -12,6 +12,11 @@ internal static partial class Number
{
private static unsafe void Dragon4(double value, int precision, ref NumberBuffer number)
{
const double Log10V2 = 0.30102999566398119521373889472449;
Copy link
Member Author

@tannergooding tannergooding Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were moved out of the NumberBuffer.cs file because they are only used by Dragon4

Loading

@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 25, 2018

CC. @danmosemsft, @eerhardt, @jkotas.

I explicitly called out the few cases where it wasn't just copy/paste. I made this PR separate since it is just the initial refactoring and will make the actual implementation PR easier to review.

Loading

// * 19 for int64
// * 20 for uint64
// * 39 for int128/uint128
internal const int MaxDigits = 50;
Copy link
Member

@stephentoub stephentoub Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the comment, will a subsequent PR lower the value of MaxDigits to 39?

Loading

Copy link
Member Author

@tannergooding tannergooding Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one can be fixed in a follow-up PR, but is not required for correctness.

Loading

// * 113 for Single
// * 768 for Double
// * 11563 for Quad
internal const int MaxDigits = 50;
Copy link
Member

@stephentoub stephentoub Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, the comment makes this sound wrong. I assume it'll be fixed subsequently as part of all of this work?

Loading

Copy link
Member Author

@tannergooding tannergooding Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will be fixed in a follow-up PR.

Loading

@jkotas
Copy link
Member

@jkotas jkotas commented Oct 25, 2018

+4,879 −3,775

The PR is adding a net new 1000+ lines. Where is it coming from? Is it a duplicated 1000 lines? I was expecting much smaller duplication out from this.

Loading

@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 26, 2018

Split into 3 commits to make this easier to review:

  • The first just formats the document (to get rid of the noise from copying/pasting the code between files)
  • The second contains the refactoring (splitting NumberBuffer into IntegerBuffer and FloatingPointBuffer)
  • The third splits the code into the respective files

Loading

@@ -1460,9 +1457,471 @@ internal static unsafe char ParseFormatSpecifier(ReadOnlySpan<char> format, out
'\0';
}

internal static unsafe void NumberToString(ref ValueStringBuilder sb, ref NumberBuffer number, char format, int nMaxDigits, NumberFormatInfo info)
internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)
Copy link
Member Author

@tannergooding tannergooding Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas, the bulk of the "new lines" comes from duplicating BufferToString and BufferToStringFormat.

I'll think about this some more and see if we can reduce this duplication.

Loading

return false;
}

private static unsafe bool ParseNumber(ref char* str, char* strEnd, NumberStyles styles, ref FloatingPointBuffer floatingPoint, NumberFormatInfo info, bool parseDecimal)
Copy link
Member Author

@tannergooding tannergooding Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the other big "duplicated" code, but it will likely need to change a bit between the two implementations. I'll also give it some more thought, to see if we can share some of the code here.

Loading

@jkotas
Copy link
Member

@jkotas jkotas commented Oct 26, 2018

It may be interesting to look at storing pointer to the buffer in the structure - it is an extra indirection in a few places, but it should not really show up on the radar. Or just use the big buffer everywhere.

Loading

@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 26, 2018

It may be interesting to look at storing pointer to the buffer in the structure - it is an extra indirection in a few places, but it should not really show up on the radar. Or just use the big buffer everywhere.

Right. I was considering just using the big buffer everywhere for float/double. It shouldn't be expensive with the .locals init stripping we do and is what the MSVCRT does already.

Loading

}
}

internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)
Copy link
Member Author

@tannergooding tannergooding Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented here, the majority of the code duplication comes from NumberBufferToString, NumberBufferToStringFormat, and ParseNumber.

After looking more closely at the methods, I don't believe there is much that can be done with code sharing. These methods depend on RoundIntegerBuffer and RoundFloatingPointBuffer on each code path which will end up dealing with the buffers in very different ways. The former thinks in terms of value, scale and uses "normal" rounding logic; while the latter needs to think in terms of mantissa, exponent, and uses IEEE-compliant rounding logic (which will ultimately make the end formatting/parsing logic differ between these as well).

However, I think there is a bit of duplication we can remove in other parts of these code files. For example, we have quite a bit of duplication between several of the X and TryX code paths, namely due to the X code path needing to decide if it wants to throw an OverflowException or FormatException (which could be handled by returning an enum or a secondary bool).

Loading

Copy link
Member Author

@tannergooding tannergooding Oct 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, actually, we might be able to do it by keeping a single buffer:

ref struct NumberBuffer
{
    public int precision;
    public int scale;
    public int exponent; // New field, only used by floating-point buffers
    public bool sign;
    public NumberBufferKind kind;
    public Span<char> digits; // Allocated at creation so it is sized as needed
}

The rounding only happens in a few places, at the end of parsing and isn't in the hot-path, so an additional check on kind shouldn't hurt. We only need scale for integer types, but we can make use of both scale and exponent for floating-point (the former can still be used to help with the fast path check).

Loading

@tannergooding tannergooding changed the title Splitting the Number.Parsing and Number.Formatting code into "Number" and "Integer" Some cleanup of the System.Number class Oct 26, 2018
@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 26, 2018

@jkotas, @stephentoub. This has changed a bit.

We are back to having a single NumberBuffer type, but it now takes a custom-sized digit buffer to handle the differences between integer and floating-point values.

The NumberBuffer type will need to carry an additional exponent field that will be unused for Integer parsing/formatting, but I don't believe that will be a problem.

Loading

@stephentoub
Copy link
Member

@stephentoub stephentoub commented Oct 26, 2018

Could you do some perf testing to ensure that some of these refactorings don't impact things like int.TryParse?

Loading

// DriftFactor = 1 - Log10V2 - epsilon (a small number account for drift of floating point multiplication)
private const double DriftFactor = 0.69;
// We need 1 additional byte, per length, for the terminating null
private const int DecimalNumberBufferLength = 50 + 1;
Copy link
Member Author

@tannergooding tannergooding Oct 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept DecimalNumberBufferLength at the current value (50 + 1). However, I believe we can make it 30 (29 digits + 1 for rounding).

Loading

Copy link

@pentp pentp Oct 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the rounding digit is 5 then up to 20 additional digits are checked in NumberBufferToDecimal (so 50 in total). The number 20 looks arbitrary, theoretically it would be correct to check the entire string...

Loading

Copy link
Member Author

@tannergooding tannergooding Oct 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pentp, maybe I am misremembering, but I thought decimal only supported 29 digits max. That is, it has 96-bits for the mantissa and it uses 8-bits for the exponent, which is strictly restricted to a value between 0 and 28, inclusive.

This should mean that you need to consider 29 digits, plus 1 for rounding. Anything more than that shouldn't be impactful.

Loading

Copy link

@pentp pentp Oct 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It supports only 29 digits, but if the rounding digit is 5 and the number is even, then it checks the following digits to decide if it should round up or down.

Loading

Copy link
Member Author

@tannergooding tannergooding Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pentp, So you still only need 30 digits, but you also need a bool that indicates whether or not there was any trailing non-zero digits in the rest of the input string (same as double and float), correct?

Loading

Copy link

@pentp pentp Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct.

Loading

@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 28, 2018

Here are the perf results from my workbox:

Numbers for the Integer tests generally look good some are marginally faster/slower, but all looked to be within the tolerance ranges (running a few times, and looking at benchview, some of the tests aren't super stable). The floating-point tests are similar, but most tend to look slightly on the slower side for the parsing (which all went through TryParse); the formatting all tended to be faster.

Loading

@stephentoub
Copy link
Member

@stephentoub stephentoub commented Oct 29, 2018

Thank you for checking.

Loading

@tannergooding
Copy link
Member Author

@tannergooding tannergooding commented Oct 29, 2018

@jkotas, @stephentoub. Any other feedback here?

Loading

jkotas
jkotas approved these changes Oct 29, 2018
Copy link
Member

@jkotas jkotas left a comment

Thanks

Loading

@tannergooding tannergooding merged commit aef0bc2 into dotnet:master Oct 29, 2018
31 checks passed
Loading
dotnet-maestro-bot pushed a commit to dotnet-maestro-bot/corefx that referenced this issue Oct 29, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
dotnet-maestro-bot pushed a commit to dotnet-maestro-bot/corert that referenced this issue Oct 29, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
tannergooding added a commit to dotnet/corefx that referenced this issue Oct 29, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
jkotas added a commit to dotnet/corert that referenced this issue Oct 29, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
kouvel added a commit to kouvel/coreclr that referenced this issue Nov 3, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success
A-And added a commit to A-And/coreclr that referenced this issue Nov 20, 2018
* Formatting Number.Formatting.cs and Number.Parsing.cs

* Removing some duplicated parsing code by having the Parse method call TryParse

* Moving two constants from NumberBuffer to Dragon4

* Rename FloatPrecision to SinglePrecision

* Updating the casing of the NumberBuffer fields

* Updating NumberBuffer to allow taking a custom-sized digit buffer.

* Updating the various NumberBufferLength constants to be the exact needed lengths

* Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit.

* Fixing TryParseNumber to use the correct maxDigCount

* Ensure the TryParseSingle out result is assigned on success
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants