Description
I expect BigInteger to behave like other integer types when printed, but for bin and hex that's currently not the case.
This can mess with naive algorithms such as trying to find the bits in a given range using Substring, since the string is not always of the same length.
It's also somewhat annoying when debugging objects that print themselves using BigInteger, e.g., I have a "bit vector" class and a quick glance at something starting with 0... makes me go "wait, I expected this value to be negative when interpreted as a signed integer...".
Reproduction Steps
using System;
using System.Numerics;
Console.WriteLine(9.ToString("D1"));
Console.WriteLine(new BigInteger(9).ToString("D1"));
Console.WriteLine(1.ToString("B1"));
Console.WriteLine(new BigInteger(1).ToString("B1"));
Console.WriteLine(15.ToString("X1"));
Console.WriteLine(new BigInteger(15).ToString("X1"));
Run (e.g. on dotnetfiddle) and you get:
9
9
1
01
F
0F
Expected behavior
BigInteger's bin and hex printing should be consistent with both its own decimal printing and standard integer types' bin/hex/dec printing.
Actual behavior
There is always a leading zero, even when the result can fit exactly in the requested number of digits.
Regression?
Yes for bin, no for hex. Changing the dotnetfiddle to .NET 8 outputs:
9
9
1
1
F
0F
Known Workarounds
Checking the length and truncating the string when needed
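A minimal sketch of that workaround, assuming the value is non-negative and the string is only used for display. `ToUnsignedStyleHex` is a hypothetical helper name, not a .NET API, and its output will not round-trip through `BigInteger.Parse`, because the dropped zero is what marks the value as positive.

```csharp
using System;
using System.Numerics;

// Hypothetical helper (not part of .NET): formats a non-negative BigInteger
// in hex and drops the extra leading zero that BigInteger adds to mark the
// value as positive. Display only: the result may parse back as negative.
static string ToUnsignedStyleHex(BigInteger value, int minDigits)
{
    if (value.Sign < 0)
        throw new ArgumentOutOfRangeException(nameof(value), "Only non-negative values are supported.");

    string text = value.ToString("X" + minDigits);

    // Strip one leading '0' only when it pushes the string past the
    // requested width; zeros that are genuine padding are kept.
    if (text.Length > minDigits && text.Length > 1 && text[0] == '0')
        return text.Substring(1);

    return text;
}

Console.WriteLine(ToUnsignedStyleHex(new BigInteger(15), 1));
Console.WriteLine(ToUnsignedStyleHex(new BigInteger(15), 4));
```

The same length check would apply to the "B" specifier on .NET 9, with the same caveat about round-tripping.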
Configuration
.NET 9, x64, Windows 11. I don't have other OSes to try on and I don't know what dotnetfiddle runs on but I assume this isn't arch- or OS-dependent.
Other information
No response
Activity
elgonzo commented on May 15, 2025
It's been like this since forever, at least for the hex formatting. While I personally also find this inconsistent with how the X specifier works for other integral types, and lament that the documentation doesn't mention it as far as I can tell (also since forever), there is a method to the "madness".
The underlying reason for the behavior seems to be the ability to roundtrip the BigInteger type using the X specifier, as is possible for other integral types. The problem with doing what you expect, however, is that BigInteger can hold negative numbers while also having an arbitrarily large value range.
How would you represent, for example, -1 as a hex value here?
For long that would be easy: FFFFFFFFFFFFFFFF, because long is just 64-bit, so the resulting hex string is still relatively short.
For an as yet hypothetical Int256 type, the hex string would be a longer FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF but still of (borderline) manageable length. Parsing a hex number with fewer digits than that into one of these types is relatively easy by implying leading zeros that "fill" the missing hex positions.
But for BigInteger, this simple approach is impractical. What would be the hex-formatted two's complement of -1? We can't do it like for int, long or even Int256. I don't know what a practical limit for the BigInteger value range is, but I guess the hex representation of the two's complement of a negative value would amount to many more F characters, filling more than just a few text lines, if we chose the same approach as for int/long/etc.
I don't think I am going out on a limb here by claiming that nobody really wants the hex formatting of a negative BigInteger to be a desolate landscape of F characters filling a whole paragraph without a good reason :-)
Therefore, to distinguish between positive values and the two's complement of negative values while keeping the hex strings of manageable size for values of small-ish orders of magnitude, the hex formatting of any positive value that would also be the two's complement of a negative value gets prefixed with a 0 to avoid it being mistaken for that negative value:
new BigInteger(-1).ToString("X1") results in the string "F"
new BigInteger(15).ToString("X1") results in the string "0F" (two hex digits required to distinguish the value 15 from the two's complement of -1)
This is also the underlying working principle of the BigInteger.Parse method:
BigInteger.Parse("0F", NumberStyles.HexNumber) will yield the BigInteger value 15.
BigInteger.Parse("F", NumberStyles.HexNumber) will yield the BigInteger value -1.
The same reasoning also applies to binary formatting. In .NET 8, BigInteger couldn't parse binary numbers, but this was made possible in .NET 9. So the binary formatting of BigInteger needed to be updated to allow roundtripping of binary-formatted numbers.
Regarding "there is always a leading zero": for the hex formatting, not always. Only for positive values whose hex representation would also be a valid two's complement of a negative value.
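The rule described above can be illustrated at the bit level. This is a hand-written sketch, not the .NET implementation; `ShortestTwosComplementBits` is a hypothetical helper that keeps emitting low bits until only sign extension remains, then prepends one sign bit.

```csharp
using System;
using System.Numerics;
using System.Text;

// Hypothetical helper (not the .NET implementation): builds the shortest
// two's complement bit string for a value, most significant bit first.
static string ShortestTwosComplementBits(BigInteger value)
{
    var bits = new StringBuilder();

    // Emit low bits until what remains is pure sign extension:
    // 0 for non-negative values, -1 for negative ones.
    while (value != BigInteger.Zero && value != BigInteger.MinusOne)
    {
        bits.Insert(0, value.IsEven ? '0' : '1');
        value >>= 1; // arithmetic shift, preserves the sign
    }

    // Exactly one sign bit in front: 0 for non-negative, 1 for negative.
    bits.Insert(0, value.IsZero ? '0' : '1');
    return bits.ToString();
}

Console.WriteLine(ShortestTwosComplementBits(new BigInteger(1)));   // 01
Console.WriteLine(ShortestTwosComplementBits(new BigInteger(-1)));  // 1
Console.WriteLine(ShortestTwosComplementBits(new BigInteger(15)));  // 01111
```

For 1 this gives "01", matching the .NET 9 output reported in the issue. Hex output follows the same idea at four-bit granularity, which is why 15 needs a second digit while 7 would not.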
huoyaoyuan commented on May 15, 2025
The output of .NET 8 is incorrect. The binary and hexadecimal representation of BigInteger is always considered as signed, using the specified MSB as sign. 1 in binary always means -1, and 01 in binary means 1.
KalleOlaviNiemitalo commented on May 16, 2025
Common Lisp would simply print the negative hexadecimal number with a minus sign, just like in any other radix. printf in C uses "%x" for unsigned integers only, and I suppose .NET inherited the ToString "X" behaviour from there. If compatibility now prevents the behaviour of BigInteger.ToString("X") from being changed, perhaps one can instead define a different format string for signed hexadecimal numbers.
SolalPirelli commented on May 17, 2025
Thanks all, perhaps this is more of a documentation bug then?
AFAIK this isn't explicitly documented anywhere.
In fact I interpreted the encouragement to use "R" as a specifier to round-trip BigInteger to mean other formats may not roundtrip (here, "Recommended for the BigInteger type").
I guess the analogy with other integer types has flaws no matter what behavior is chosen, especially since BigInteger doesn't have the traditional signed/unsigned distinction other types have.
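For what it's worth, a quick check along these lines suggests that both "R" and "X" (without a precision) round-trip through `BigInteger.Parse`. This is only a sketch over a few sample values, not a guarantee for every format or precision; `RoundTrips` is a hypothetical helper.

```csharp
using System;
using System.Globalization;
using System.Numerics;

// Hypothetical helper: formats a value with the given specifier, parses the
// text back with the given styles, and reports whether the value survived.
static bool RoundTrips(BigInteger value, string format, NumberStyles styles)
{
    string text = value.ToString(format, CultureInfo.InvariantCulture);
    BigInteger parsed = BigInteger.Parse(text, styles, CultureInfo.InvariantCulture);
    return parsed == value;
}

BigInteger[] samples =
{
    BigInteger.MinusOne,
    new BigInteger(15),
    new BigInteger(255),
    BigInteger.Pow(2, 100),
};

foreach (BigInteger v in samples)
{
    bool viaR = RoundTrips(v, "R", NumberStyles.Integer);
    bool viaX = RoundTrips(v, "X", NumberStyles.HexNumber);
    Console.WriteLine($"{v}: R round-trips = {viaR}, X round-trips = {viaX}");
}
```

The hex case only works because of the leading zero discussed above: without it, 15 would format as "F" and parse back as -1.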
tannergooding commented on May 28, 2025
A lot of this is covered under https://learn.microsoft.com/en-us/dotnet/api/system.numerics.biginteger.tostring?view=net-9.0
There's potentially more explicit documentation that could be provided and changes are welcome.
A lot of the format specifiers are standardized across types. It is not guaranteed that all formats or format specifiers will produce a roundtrippable value in all scenarios. R is meant to guarantee this regardless of type. For the built-in numeric types, we try to ensure the default case is roundtrippable and that we generally produce something roundtrippable for other values when no precision is specified.
While many languages support something like -F (or -0xF for actual language syntax) to mean -15, this is not as common to see in the wild. Hex and binary are often used to get the "raw bits" and defaulting to giving -F instead of F1 (or FFF1, etc) is potentially misleading and confusing to people typically working with binary/hex.
tannergooding commented on May 28, 2025
All big integers are signed. The nuance is that it isn't of fixed width and so you don't know how many leading sign bits are needed.
This causes it to default to the shortest 2's complement sequence, so for binary 0b1 is -1 (you only need the sign bit) and 0b01 is +1 (you need a sign bit and the magnitude after).
dotnet-policy-service commented on May 28, 2025
This issue has been marked needs-author-action and may be missing some important information.
Add note on BigInteger bin/hex formatting of positive values