BigInteger.ToString always has a leading zero in bin and hex #115618

Closed

dotnet/docs#46473

Labels: area-System.Numerics, needs-author-action

SolalPirelli opened on May 15, 2025

Description

I expect BigInteger to behave like other integer types when printed, but for bin and hex that's currently not the case.

This can mess with naive algorithms such as trying to find the bits in a given range using Substring, since the string is not always of the same length.

It's also somewhat annoying when debugging objects that print themselves using BigInteger, e.g., I have a "bit vector" class and a quick glance at something starting with 0... makes me go "wait, I expected this value to be negative when interpreted as a signed integer...".

Reproduction Steps

using System;
using System.Numerics;


Console.WriteLine(9.ToString("D1"));
Console.WriteLine(new BigInteger(9).ToString("D1"));

Console.WriteLine(1.ToString("B1"));
Console.WriteLine(new BigInteger(1).ToString("B1"));

Console.WriteLine(15.ToString("X1"));
Console.WriteLine(new BigInteger(15).ToString("X1"));

Run (e.g. on dotnetfiddle) and you get:

Expected behavior

BigInteger's bin and hex printing should be consistent with both its own decimal printing and standard integer types' bin/hex/dec printing.

Actual behavior

There is always a leading zero, even when the result can fit exactly in the requested number of digits.

Regression?

Yes for bin, no for hex. Changing the dotnetfiddle to .NET 8 outputs:

Known Workarounds

Checking the length and truncating the string when needed
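A minimal sketch of that workaround, assuming the value is non-negative (for negative values the leading digit carries the sign, so trimming would change the meaning). `BigIntegerFormatting.ToUnsignedHex` and its `minDigits` parameter are hypothetical names for illustration, not part of .NET:

```csharp
using System;
using System.Numerics;

static class BigIntegerFormatting
{
    // Hypothetical helper: hex digits of a non-negative value without the
    // extra leading "0" that BigInteger emits as a sign marker.
    public static string ToUnsignedHex(BigInteger value, int minDigits = 1)
    {
        if (value.Sign < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Only non-negative values are supported.");

        // Strip every leading zero, then pad back to the requested width.
        // PadLeft also restores the single "0" when the value itself is zero.
        string trimmed = value.ToString("X").TrimStart('0');
        return trimmed.PadLeft(Math.Max(minDigits, 1), '0');
    }
}
```

With this helper, `BigIntegerFormatting.ToUnsignedHex(new BigInteger(15))` yields `"F"`, matching `15.ToString("X1")`, and the same trimming idea carries over to the "B" specifier.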

Configuration

.NET 9, x64, Windows 11. I don't have other OSes to try on and I don't know what dotnetfiddle runs on but I assume this isn't arch- or OS-dependent.

Other information

No response

It has been like this forever, at least for hex formatting. While I personally also find this inconsistent with how the X specifier works for other integral types, and lament that the documentation does not mention it as far as I can tell (also since forever), there is a method to the "madness".

The underlying reason for the behavior seems to be the ability to round-trip a BigInteger through the X specifier, as is also possible for other integral types. The problem with doing what you expect, however, is that BigInteger can hold negative numbers while also having an ~~ungodly~~ arbitrarily large value range.

How would you represent for example -1 as a hex value here?

For long that would be easy: FFFFFFFFFFFFFFFF, because long is just 64-bit, so the resulting hex string is still relatively short.
For a still-hypothetical Int256 type, the hex string would be a longer FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF but still of (borderline) manageable length. Parsing a hex number with fewer digits than that into one of these types is relatively easy: the missing hex positions are filled with implied leading zeros.
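For the fixed-width case, a short sketch of what `long` does today: a full 16-digit string is read as two's complement, while a shorter string is treated as if it had leading zeros.

```csharp
using System;
using System.Globalization;

long minusOne = -1;

// A fixed-width type always prints all 64 bits of a negative value.
Console.WriteLine(minusOne.ToString("X"));                                   // FFFFFFFFFFFFFFFF

// Fewer digits than the full width: leading zeros are implied, so "F" is 15.
Console.WriteLine(long.Parse("F", NumberStyles.HexNumber));                  // 15

// All 16 digits present: the top bit is set, so the value is negative.
Console.WriteLine(long.Parse("FFFFFFFFFFFFFFFF", NumberStyles.HexNumber));   // -1
```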

But for BigInteger, this simple approach is impractical. What would be the hex-formatted two's complement of -1? We can't do it as for int, long, or even Int256. I don't know what the practical limit of the BigInteger value range is, but I guess a fixed-width hex representation of a negative value would need enough F characters to fill more than just a few lines of text if we chose the same approach as for int, long, etc.

I don't think I am going out on a limb here by claiming that nobody really wants the hex formatting of a negative BigInteger to be a desolate landscape of F characters filling a whole paragraph without a good reason :-)

Therefore, to distinguish positive values from the two's complement of negative values while keeping hex strings manageable for values of small-ish magnitude, any positive value whose hex digits could also be read as the two's complement of a negative value gets prefixed with a 0:

new BigInteger(-1).ToString("X1") results in the string "F"
new BigInteger(15).ToString("X1") results in the string "0F" (two hex digits required to distinguish the value 15 from the two's complement of -1)

This is also the underlying working principle of the BigInteger.Parse method:

BigInteger.Parse("0F", NumberStyles.HexNumber) will yield the BigInteger value 15.
BigInteger.Parse("F", NumberStyles.HexNumber) will yield the BigInteger value -1.
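The formatting and parsing rules above can be checked together with a small round-trip sketch. It uses only the two values quoted in this comment; the expected outputs in the comments are the ones stated above.

```csharp
using System;
using System.Globalization;
using System.Numerics;

foreach (BigInteger value in new[] { new BigInteger(-1), new BigInteger(15) })
{
    // Format with "X", then parse the result back as a hex number.
    string hex = value.ToString("X");
    BigInteger parsed = BigInteger.Parse(hex, NumberStyles.HexNumber);

    // -1 -> "F"  -> -1
    // 15 -> "0F" -> 15
    Console.WriteLine($"{value} -> \"{hex}\" -> {parsed}");
}
```

The leading 0 is what keeps 15 from being read back as -1.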

The same reasoning would also apply to binary formatting. In .NET 8, BigInteger couldn't parse binary numbers, but this was made possible in .NET 9. So, the binary formatting of BigInteger needed to be updated to allow roundtripping of binary-formatted numbers.
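A sketch of the binary case, assuming .NET 9 or later (where, per the above, BigInteger can parse binary strings) and the sign-digit rule described in this thread:

```csharp
using System;
using System.Globalization;
using System.Numerics;

// Assumption: running on .NET 9+, where BigInteger accepts NumberStyles.BinaryNumber.
// The leading digit acts as the sign bit, just as with hex.
Console.WriteLine(BigInteger.Parse("01", NumberStyles.BinaryNumber)); // 1
Console.WriteLine(BigInteger.Parse("1", NumberStyles.BinaryNumber));  // -1

// Round trip: format with "B", then parse the result back.
BigInteger one = BigInteger.One;
string bits = one.ToString("B");
Console.WriteLine(BigInteger.Parse(bits, NumberStyles.BinaryNumber) == one); // True
```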

> There is always a leading zero,

For hex formatting, not always: only for positive values whose hex representation would also be a valid two's complement of a negative value.

huoyaoyuan

Member

The output of .NET 8 is incorrect. The binary and hexadecimal representation of BigInteger is always considered as signed, using the specified MSB as sign. 1 in binary always means -1, and 01 in binary means 1.

KalleOlaviNiemitalo

Common Lisp would simply print the negative hexadecimal number with a minus sign, just like in any other radix.

printf in C uses "%x" for unsigned integers only, and I suppose .NET inherited the ToString "X" behaviour from there. If compatibility now prevents the behaviour of BigInteger.ToString("X") from being changed, perhaps one can instead define a different format string for signed hexadecimal numbers.

SolalPirelli

Author

Thanks all, perhaps this is more of a documentation bug then?

AFAIK this isn't explicitly documented anywhere.
In fact I interpreted the encouragement to use "R" as a specifier to round-trip BigInteger to mean other formats may not roundtrip (here, "Recommended for the BigInteger type").

I guess the analogy with other integer types has flaws no matter what behavior is chosen, especially since BigInteger doesn't have the traditional signed/unsigned distinction other types have.

tannergooding

Member

> Thanks all, perhaps this is more of a documentation bug then?

A lot of this is covered under https://learn.microsoft.com/en-us/dotnet/api/system.numerics.biginteger.tostring?view=net-9.0

There's potentially more explicit documentation that could be provided and changes are welcome.

> In fact I interpreted the encouragement to use "R" as a specifier to round-trip BigInteger to mean other formats may not roundtrip

A lot of the format specifiers are standardized across types. It is not guaranteed that all formats or format specifiers will produce a roundtrippable value in all scenarios. R is meant to guarantee this regardless of type. For the built-in numeric types, we try to ensure the default case is roundtrippable and that we generally produce something roundtrippable for other values when no precision is specified.

> Common Lisp would simply print the negative hexadecimal number with a minus sign, just like in any other radix.

While many languages support something like -F (or -0xF for actual language syntax) to mean -15, this is not as common to see in the wild. Hex and binary are often used to get the "raw bits" and defaulting to giving -F instead of F1 (or FFF1, etc) is potentially misleading and confusing to people typically working with binary/hex.

tannergooding

Member

> I guess the analogy with other integer types has flaws no matter what behavior is chosen, especially since BigInteger doesn't have the traditional signed/unsigned distinction other types have.

All big integers are signed. The nuance is that it isn't of fixed width and so you don't know how many leading sign bits are needed.

This causes it to default to the shortest 2's complement sequence, so for binary 0b1 is -1 (you only need the sign bit) and 0b01 is +1 (you need a sign bit and the magnitude after).
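To make "shortest 2's complement sequence" concrete, here is a hand-rolled sketch that builds the bit string without the "B" specifier. `TwosComplement.ShortestBits` is a hypothetical helper, not a .NET API. It relies on `BigInteger.GetBitLength()` (available since .NET 5), which reports the bit count without the sign bit.

```csharp
using System;
using System.Numerics;
using System.Text;

static class TwosComplement
{
    // Hypothetical helper: shortest two's complement bit string of a value,
    // i.e. one sign bit followed by the magnitude bits that are still needed.
    public static string ShortestBits(BigInteger value)
    {
        // GetBitLength excludes the sign bit, so add one for it.
        long width = value.GetBitLength() + 1;

        var sb = new StringBuilder();
        for (long i = width - 1; i >= 0; i--)
        {
            // >> on BigInteger is an arithmetic shift, so negative values
            // stay sign-extended and the top bit comes out as 1.
            bool bitSet = !((value >> (int)i) & BigInteger.One).IsZero;
            sb.Append(bitSet ? '1' : '0');
        }
        return sb.ToString();
    }
}
```

Under these assumptions, `ShortestBits(-1)` gives `"1"` (only the sign bit is needed) and `ShortestBits(1)` gives `"01"` (a sign bit plus the magnitude), matching the 0b1 and 0b01 examples above.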