[API Proposal]: Support binary format specifier 'b' and 'B' (standard numeric format strings) #83619

RaphaelTetreault · 2023-03-17T22:03:01Z

Background and motivation

.NET/C# supports a wide range of standard numeric format specifiers when calling .ToString on an integer value. Binary numeric formatting is currently only available by calling Convert.ToString([integer], toBase: 2). It would be convenient if int.ToString([format specifier][precision specifier]) accepted the unused characters b and B as the format specifier, followed by an optional precision specifier denoting the minimum number of binary digits to display.
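
For comparison, here is a minimal sketch of the workaround available today: Convert.ToString with base 2, plus string.PadLeft to emulate a minimum digit count. The variable names are illustrative only.

```csharp
using System;

int value = 42;

// Convert.ToString(int, int) emits the binary digits without padding.
string raw = Convert.ToString(value, toBase: 2);

// Pad manually to emulate a 16-digit precision specifier.
string padded = raw.PadLeft(16, '0');

Console.WriteLine(raw);    // 101010
Console.WriteLine(padded); // 0000000000101010
```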

API Proposal

namespace System.Globalization;

[Flags]
public partial enum NumberStyles
{
    AllowBinarySpecifier = 0x00000400,
    BinaryNumber = AllowLeadingWhite | AllowTrailingWhite | AllowBinarySpecifier,
}

API Usage

The API would be like so:

string byteBinary = ((byte)42).ToString("B");
string intBinary = (42).ToString("b16");

such that:

Console.WriteLine(byteBinary);
Console.WriteLine(intBinary);

will output:

101010
0000000000101010

Alternative Designs

Enable the ToString method of all integer types (SByte, Int16, Int32, Int64, Int128, Byte, UInt16, UInt32, UInt64, UInt128) to accept the format specifiers b and B with an optional precision specifier, in order to output the value's binary representation as a string.

Risks

There should be none as the b and B format specifiers are not currently in use.
https://learn.microsoft.com/en-us/dotnet/standard/base-types/standard-numeric-format-strings

dotnet-issue-labeler · 2023-03-17T22:03:06Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost · 2023-03-18T04:47:36Z

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.


huoyaoyuan · 2023-03-21T09:00:15Z

https://learn.microsoft.com/en-us/dotnet/fsharp/whats-new/fsharp-6#formatting-for-binary-numbers

F# added support for %B formatting recently.

tannergooding · 2023-03-23T15:07:22Z

The design can/should be that all integer types support the B format specifier. We will not want to limit this to only a subset of types/values.

stephentoub · 2023-03-23T15:28:02Z

Would we need to consider the parsing direction as well? e.g. NumberStyles.AllowBinarySpecifier (even though the existing AllowHexSpecifier is, IMHO, poorly named)

tannergooding · 2023-03-23T17:02:28Z

It's probably worth considering them together, yes. I updated the OP to include AllowBinarySpecifier and BinaryNumber -- CC. @RaphaelTetreault as an FYI that I updated it

RaphaelTetreault · 2023-03-23T17:31:37Z

@tannergooding Thanks for the notice. My understanding here is that it would be of value to also implement binary string integer parsing? If so, there are at least three relevant issues currently open related to that.

[API Proposal]: Convert.ToString Methods #61719
Allow int.ToString and Parse to support radixes other than base-10 and base-16 #50491
Support plan for parsing of Binary numeric literal strings, e.g., Convert.Int32("0b1011") #19642

tannergooding · 2023-03-23T17:39:03Z

Yes, but those are either about Convert.ToString or propose a new API surface.

In this case, your proposal actually fits the existing semantics e.g. int.ToString("B") is binary like int.ToString("X") is hex.

It then makes sense to consider the inverse (parsing) at the same time, and that simply requires two new members on NumberStyles.
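
A minimal sketch of the round trip this implies, assuming a runtime that ships both the B format specifier and the NumberStyles.BinaryNumber member proposed in this issue (the thread targets the 8.0.0 milestone for the primitive integer types). The variable names are illustrative only.

```csharp
using System;
using System.Globalization;

// Format: "B16" asks for binary with at least 16 digits.
string text = 42.ToString("B16");
Console.WriteLine(text); // 0000000000101010

// Parse: NumberStyles.BinaryNumber reads the digits back as base 2.
int roundTripped = int.Parse(text, NumberStyles.BinaryNumber);
Console.WriteLine(roundTripped); // 42
```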

I'll close the other three in favor of this proposal.

terrajobst · 2023-04-11T18:59:56Z

Video

Looks good as proposed
- b and B are the formatting modifiers

namespace System.Globalization;

[Flags]
public partial enum NumberStyles
{
    AllowBinarySpecifier = 0x00000400,
    BinaryNumber = AllowLeadingWhite | AllowTrailingWhite | AllowBinarySpecifier,
}

stephentoub · 2023-04-19T09:59:57Z

Everything here is done except for BigInteger. It still needs formatting and parsing support added.

lateapexearlyspeed · 2023-05-01T10:31:20Z

Hi, I am working on the BigInteger part. Because BigInteger is not a fixed-length numeric type, I would like to ask for a definition of its binary format, so that each value (including its sign) has a unique binary representation. The following is just an example of one possible format and its purpose; please clarify the eventual definition:

No digits (no precision specifier):
- positive number:
  - When the highest bit in the top byte is not 1, e.g. 54:
    11_0110
  - When the highest bit in the top byte is 1, e.g. 214:
    0_1101_0110 - uses an additional '0' bit in the next higher byte to distinguish it from negative -42 (1101_0110)
- negative number, e.g. -42:
  1101_0110
Digits specified and larger than the minimum necessary bits:
- positive number, e.g. 22 with 7 digits requested:
  001_0110 - just adds two '0' bits until the total is 7 digits
- negative number, e.g. -42 with 9 digits requested:
  1111_1111_1101_0110 - needs eight added '1' bits, to distinguish it from positive 470 (1_1101_0110)

RaphaelTetreault · 2023-05-05T19:32:46Z

The precision specifier asks for the desired number of digits. As mentioned, this means any positive number needs a leading 0 while a negative number needs a leading 1. Making an assumption about how BigInteger works, I would think it just prints all required digits of the underlying byte buffer and prepends a 0 if needed (negative numbers would always start with 1). I see that as the most reasonable option, given it is the minimum number of bits required to unambiguously represent a non-fixed-length integer.

+234 == 0b11101010, thus output 0_11101010
-106 == 0b11101010, thus output   11101010

The other option which might cause incompatibility with binary parsing would be to substitute the appended (conditional) 0 or 1 with a fixed + or - sign. I'm not sure what the implications would be for fixed-length integer parsing, eg:

int.Parse("+1111", NumberStyles.BinaryNumber); // +15
int.Parse("-1111", NumberStyles.BinaryNumber); //  -1

Would that be allowed/supported?

Back to the original example but using signs, BigInteger could behave like so when printed:

+234 == 0b11101010, thus output +11101010 // 8 bits
-106 == 0b11101010, thus output  -1101010 // 7 bits

This style of signed binary is sort of like what Google Search does. You can try searching +234 in binary and -106 in binary and you will see negative signs on negative values, with no sign being interpreted as positive.

If it's worth anything, my vote goes to the former formatting.

tannergooding · 2023-05-05T19:38:42Z

There is an existing behavior for how hex formatting works and we'd need to be consistent with it.

https://source.dot.net/#System.Runtime.Numerics/System/Numerics/BigNumber.cs,260b817fae02d08e,references

lateapexearlyspeed · 2023-05-06T07:06:16Z

Hi @RaphaelTetreault, thanks. Yes, I agree we should not prefix a sign character in binary.

-106 == 0b11101010, thus output 11101010

Just to confirm: the output binary should be 2's complement, so -106 should output 1001_0110.

What I want to confirm is the binary format definition for the non-fixed-length BigInteger, so that each value has a unique binary representation in both the "no digits" (no precision specifier) case and the "digits" (with precision specifier) case, and so that in every case the output binary string can be parsed back to a BigInteger.

@tannergooding Yes, I noticed the existing hex formatting definition earlier. The sample binary proposal was made by aligning with the existing hex formatting as far as possible, and it gives examples for 5 cases. However, I feel there are still some detailed differences between hex and binary, so I need to confirm whether it is appropriate or needs updating. Could you have a look at it?

Just to highlight one of the differences here:

Because the highest 1 (non-sign bit) in the 2's complement binary of -921 is the 10th bit (counting from 0), which is lower than the 13th bit, the current hex format of BigInteger is C67 rather than FC67 (one hex character covers 4 bits). So what should the binary output of BigInteger be for -921? (In my sample proposal, it would output 1111_1100_0110_0111, aligned to a whole byte rather than to 1 or 4 bits.)
These are the kinds of considerations I mean (including the case where digits are specified).

huoyaoyuan · 2023-05-08T19:05:57Z

Notes related to this when working with #28657:

The decimal parsing/formatting of BigInteger will easily be shared. The binary/hexadecimal paths won't be shared because of the difference between BigInteger and small integers (arithmetic operations that create new values have a huge cost). A rewritten path for Hex/Bin is desired for BigInteger.

tannergooding · 2023-05-08T19:10:13Z

You'd need to return the shortest string. Hex returns C67 because that's sufficient and it returns at most 0 leading zero. It is always a multiple of 4-bits because 4 is the smallest unit needed for every hex digit.

In the case of binary, this would be at most 1 leading zero.

lateapexearlyspeed · 2023-05-09T10:28:04Z

So let me try to confirm my understanding of the desired binary format.

based on:

You'd need to return the shortest string

and the existing hex format convention (prefixing a zero for positive numbers in some cases, to distinguish them from negative numbers with the same bits), one "shortest" string format for binary is as follows:

-921's shortest binary is: 100 0110 0111 (a minimum of 11 digits, keeping only one high 1)
to distinguish from it, +1127's binary will be: 0100 0110 0111 (a minimum of 12 digits, because we prefix one zero to distinguish it from the above)

This way, the binary string of a positive number always gets an additional leading zero, so that negative numbers can use the shortest binary format.

If this is not correct, please just provide the desired binary content for -921, thanks @tannergooding :)

tannergooding · 2023-05-09T15:21:12Z

That sounds correct. This then matches that for hex, 0..7 is 0..7 because the smallest unit is 4 bits and therefore the sign is present and 0. While 8..F is -8..-15, because the sign bit is set. Thus to represent 8-15 you need 08..0F

For binary, since the smallest unit is 1 bit, you therefore always need a leading 0 for positive numbers, except 0 itself:

etc
-4 = 100
-3 = 11
-2 = 10
-1 = 1
0 = 0
1 = 01
2 = 010
3 = 011
4 = 0100
etc

lateapexearlyspeed · 2023-05-10T10:01:07Z

We have almost the same understanding now, except for the use of 2's complement? I can see the existing hex format output of BigInteger uses it, so:

hex 8..F is -8..-15

not always be, eg, hex '9' should be -7

If new binary format should also use 2's complement, then binary examples of negative numbers as above should be (note that -3):

-4 = 100
-3 = 101
-2 = 10
-1 = 1

Should we also use 2's complement for binary ?

tannergooding · 2023-05-10T14:33:59Z

not always be, eg, hex '9' should be -7

Yes, sorry. This should have said "hex 8..F is -8..-1".

if new binary format should also use 2's complement, then binary examples of negative numbers as above should be (note that -3):

Yes. 11 would be an alternative representation of -1, much as FF is also -1 for hex. Just messed up my mental math when writing it down 😄
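
The rule agreed above (2's complement, shortest string, exactly one sign bit kept) can be sketched as follows. This is a hand-written illustration, not the runtime's implementation: ToShortestBinary is a hypothetical helper name, and it relies only on BigInteger.ToByteArray, which returns little-endian 2's complement bytes.

```csharp
using System;
using System.Numerics;
using System.Text;

// Hypothetical helper: shortest 2's complement binary string for a BigInteger.
static string ToShortestBinary(BigInteger value)
{
    // Little-endian 2's complement; the top bit of the last byte is the sign.
    byte[] bytes = value.ToByteArray();

    var sb = new StringBuilder(bytes.Length * 8);
    for (int i = bytes.Length - 1; i >= 0; i--)
    {
        sb.Append(Convert.ToString(bytes[i], 2).PadLeft(8, '0'));
    }
    string bits = sb.ToString();

    // Drop redundant copies of the sign bit, keeping exactly one.
    char sign = value.Sign < 0 ? '1' : '0';
    int start = 0;
    while (start + 1 < bits.Length && bits[start] == sign && bits[start + 1] == sign)
    {
        start++;
    }
    return bits.Substring(start);
}

foreach (int n in new[] { -4, -3, -2, -1, 0, 1, 2, 3, 4 })
{
    Console.WriteLine($"{n} = {ToShortestBinary(n)}");
}
```

With this sketch, -3 comes out as 101 and -921 as 10001100111, while positive values such as 1127 carry one leading 0 (010001100111).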

lateapexearlyspeed · 2023-06-01T14:52:55Z

Hi, the PR is ready for review now. Could you please help review it? Thanks!

tannergooding · 2023-07-24T22:44:12Z

The BigInteger support is, unfortunately, not going to make it for .NET 8. We had several bug fixes around BigInteger crop up and several other higher priority features that needed to land.

We can finish reviewing/merging the remaining PR anytime after main opens for .NET 9 next month.

lateapexearlyspeed · 2023-08-23T11:29:01Z

Hi @tannergooding, just a kind reminder that the PR can be reviewed now, as the main branch should already be open for .NET 9. Thanks!

RaphaelTetreault added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 17, 2023

ghost added the untriaged New issue has not been triaged by the area owner label Mar 17, 2023

huoyaoyuan added the area-System.Numerics label Mar 18, 2023

tannergooding added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Mar 23, 2023

This was referenced Mar 23, 2023

Allow int.ToString and Parse to support radixes other than base-10 and base-16 #50491

Closed

[API Proposal]: Convert.ToString Methods #61719

Closed

Support plan for parsing of Binary numeric literal strings, e.g., Convert.Int32("0b1011") #19642

Closed

terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 11, 2023

tannergooding mentioned this issue Apr 12, 2023

[API Proposal]: Add Support for ToBase In BigInteger Parsing #84684

Closed

stephentoub assigned stephentoub and tannergooding Apr 15, 2023

stephentoub mentioned this issue Apr 15, 2023

Add binary support to number formatting #84889

Merged

stephentoub unassigned tannergooding Apr 18, 2023

stephentoub mentioned this issue Apr 18, 2023

Add binary parsing support to integer types #84998

Merged

stephentoub removed their assignment Apr 19, 2023

stephentoub added the help wanted [up-for-grabs] Good issue for external contributors label Apr 19, 2023

stephentoub added this to the 8.0.0 milestone Apr 19, 2023

lateapexearlyspeed mentioned this issue Apr 26, 2023

Format/Parse binary from/to BigInteger #85392

Merged

ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 26, 2023

tannergooding modified the milestones: 8.0.0, Future Jul 24, 2023

tannergooding closed this as completed in #85392 Oct 12, 2023

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 12, 2023

ghost locked as resolved and limited conversation to collaborators Nov 11, 2023
