proposal: permit blank (_) separator in integer number literals #28493
Comments
griesemer added the Go2 label Oct 30, 2018
gopherbot added this to the Proposal milestone Oct 30, 2018
gopherbot added the Proposal label Oct 30, 2018
What would be the reason for allowing the underscore to appear anywhere instead of limiting it to (multiples of) commonly used places? E.g.,
ISTM being too permissive only degrades readability.
@ericlagergren It seems more Go-like to leave the grammar more free-form with regards to placement of _, but leave the enforcement of style (how many digits in each group) to tooling and code review.
I agree. Though, it’s always possible to make the language more permissive in the future without breaking existing programs. The reverse isn’t true.
@ericlagergren What's a commonly used placement of _? That's difficult, if not impossible, to answer and shouldn't be decided by the language. Also, note that other languages don't try to pin that down either.
Well, if you asked *me*, I’d say:
decimals: 3
hex: 2, 4, ...
octal: ??
binary: 2, 4, 8, ...
But, I get your point. :)
alanfo commented Oct 31, 2018
I think this is an excellent idea if we are to have binary literals (which otherwise become unreadable after one or two bytes) and also helps with the readability of other long numbers. Several other languages have a similar feature. As far as placement is concerned, we'll just have to leave that to the good taste of gophers :)
I don’t think this improves readability. For example, which is more readable? x := 100000000 The latter is, IMO, more descriptive: it leverages the qualities of untyped constants and the lineage of the time package, and it exists today without new syntax.
alanfo commented Oct 31, 2018
Well, exact powers of 10 can always be dealt with in some other manner. One could also do this:
z2 := int(1e8)
But what about these:
// binary literals
// decimal literals
To my eyes, the underscored literals are much easier to read.
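A minimal sketch of the alternatives mentioned in these two comments, assuming a named untyped constant called million (a hypothetical name, not taken from the thread). It only illustrates that a plain literal, a scaled untyped constant, and int(1e8) all denote the same value; it says nothing about which reads best.

```go
package main

import "fmt"

// million is a hypothetical named constant. It is untyped, so the
// expression 100 * million can be assigned to an int as long as the
// result is exactly representable as an integer.
const million = 1e6

// values returns the same quantity written three different ways.
func values() (int, int, int) {
	var a int = 100000000     // plain decimal literal
	var b int = 100 * million // untyped constant expression
	c := int(1e8)             // explicit conversion of an untyped float constant
	return a, b, c
}

func main() {
	a, b, c := values()
	fmt.Println(a == b, b == c)
}
```

Note that writing `x := 100 * million` without a type would give x the default type float64, which is one reason the sketch declares the variables as int.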
Are there any concrete examples of existing code where this would help readability significantly? Adding a separator does make sense for big numbers but are they common enough to warrant changing the language for? I've spent a bit of time working with Swift code and in my limited experience I've seen that underscores tend to get used inconsistently. I think they end up making it harder (for me anyway) to parse the code. Go already has quite a few ways to write constants, I'm not sure we need more.
alanfo commented Oct 31, 2018
When programming in other languages which have digit separators (C#, Java, Kotlin), I generally split binary literals into nybbles and add thousand separators for decimal literals >= 1,000. That more or less mimics how I'd write them down on paper (with space and comma as respective separators).
Of course, that doesn't mean I find longer numbers without separators unreadable - I'm OK up to about twice those lengths - but, when you have something, you tend to use it. Very long decimal literals don't crop up very often (except perhaps powers of 10) but when they do it really is good to have the digit separator in the toolbox :)
If this proposal is accepted, then no doubt there will be people who abuse it or use it inconsistently but I don't really think it is practical to limit placement in some way. Other languages don't seem to bother either.
deanveloper commented Oct 31, 2018
RalphCorderoy commented Oct 31, 2018
I think it would be better to rule out adjacent or trailing underscores, and to always want at least one digit between the base indicator and the first underscore, e.g. 0_7 is invalid, effectively ruling out a leading underscore on the value's digits.
@RalphCorderoy Note that the proposed syntax already requires at least one digit after the base indicator before the first blank (_). Based on that rule, 0_7 would be decimal 7; but for this specific case of octals, I agree it should be invalid. I'm not convinced about ruling out adjacent separators. For one, other languages permit it, and not permitting it would truly make the literal syntax quite a bit more complex. As is, it simply follows the same pattern we already have for identifiers. Also, consider as a counter-example:
RalphCorderoy commented Oct 31, 2018
Hi @griesemer, the double underscore for a 16-bit boundary is a good example of why it should be permitted. With leading underscores, including octal, ruled out, is there a reason to permit trailing ones? It's allowed in identifiers, but this change is about separating digits for readability. The lexing of
deanveloper commented Oct 31, 2018
Definitely should not be permitted. It looks like the proposed syntax prevents this though.
I can't think of any reasons. Also, identifiers allow leading underscores as well, and we've ruled those out so I'm not sure if that logic applies.
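The placement questions raised in these comments (leading, trailing, and adjacent underscores, and the 0_7 case) can be probed with a short program. This is a sketch that assumes Go 1.13 or later, where strconv.ParseInt with base 0 follows the Go integer literal syntax as it eventually shipped; that syntax is stricter than the text under discussion at this point in the thread. The helper name accepts is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// accepts reports whether strconv.ParseInt, called with base 0 so that
// Go literal syntax (prefixes and underscores) applies, accepts s.
func accepts(s string) bool {
	_, err := strconv.ParseInt(s, 0, 64)
	return err == nil
}

func main() {
	// A few placements debated above: between digits, after a base
	// prefix, after a leading 0, doubled, trailing, and leading.
	for _, s := range []string{"1_000", "0x_ff", "0_7", "4__2", "42_", "_42"} {
		fmt.Printf("%q accepted: %v\n", s, accepts(s))
	}
}
```

With an explicit base such as 10, ParseInt does not permit underscores at all, so the base 0 argument matters here.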
I don't think this is the case. I'm aware that the compatibility guarantee is currently narrowed to "continues to compile", but that's not the same as "fully backwards compatible", which IMHO should also include not breaking any of the existing tools.
@cznic As you say, the compatibility guarantee only says that valid code will continue to work; it makes no particular guarantees about invalid code. The cost of updating tools to support new language features is something that has to be considered with any language change, but that is a separate issue from backward compatibility.
While doing embedded programming, I often use hexadecimal constants and I feel that 0x0f00_0000 is a big readability improvement over 0x0f000000, as it's too easy to miss a digit.
seaskyways commented Nov 1, 2018
1+
deanveloper commented Nov 1, 2018
For simple, small(er) numbers, this argument holds. But it starts breaking down once we start talking about numbers that are larger than a billion. I'd rather not have to look back and count 9 zeroes just to verify that The argument also starts to break down when we start to talk about numbers which aren't as simple as 100 million, for instance, a large prime number 10107476689
The argument also breaks down when you're not talking about decimal notation, which has already been brought up so I won't go into depth, but it's very useful to be able to break up binary/hex into nybbles/bytes. PS - the
tomvanwoow commented Nov 1, 2018
I don't believe it is a good idea to enforce restrictions on where the underscores must go, as I can imagine some people preferring
theodesp commented Nov 5, 2018
It looks like this proposal is very similar to Python's PEP 515. Ideally, we would also like to have rules for fmt. For example:
deanveloper commented Nov 5, 2018
@theodesp Again, I bring up international number formatting.
Is there any reason to put it in fmt? We do not have comma separators in fmt today, and I think for good reason. Number formats are localized, so I'm not sure it's a good idea for fmt. Maybe it'd be good for the text package, but we already have comma separation there, so I don't see much use.
Also, the # flag is already taken by "alternate form". For instance, it adds a leading "0x" for "%x". Not sure if you knew this or not. If you did, you forgot to put it in your outputs, and there is no "%#d". If you didn't, well, I hope you learned something new.
theodesp commented Nov 5, 2018
I knew it, this was just an example extension to the existing flags. I use fmt because it's the most obvious choice, and I don't think the underscores are part of any sort of internationalization, as the scope of this proposal is totally different.
deanveloper commented Nov 5, 2018
The main point that I was making was that separating by 3s is localized - the Indian number format does not do this. Go forces the "." separator in fmt because it's part of the language, but separating numbers by 3s is not part of the language, so I'm not sure if that's in the scope of fmt.
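To make the fmt exchange concrete, here is a small sketch. The "%#x" behavior is what fmt does today; groupDigits is a hypothetical helper, not part of fmt or any proposed API, and it takes the group size as a parameter precisely because, as noted above, grouping by threes is a locale convention rather than a language rule. It assumes plain ASCII digit strings without a sign or prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// altHex formats n using the existing "#" flag, which selects the
// alternate form and therefore adds a leading 0x for the %x verb.
func altHex(n int) string {
	return fmt.Sprintf("%#x", n)
}

// groupDigits is a hypothetical helper that inserts '_' between groups
// of n digits, counting from the right. It returns digits unchanged
// when n is not positive or the string fits in a single group.
func groupDigits(digits string, n int) string {
	if n <= 0 || len(digits) <= n {
		return digits
	}
	var b strings.Builder
	head := len(digits) % n // length of the short leading group, if any
	if head > 0 {
		b.WriteString(digits[:head])
	}
	for i := head; i < len(digits); i += n {
		if b.Len() > 0 {
			b.WriteByte('_')
		}
		b.WriteString(digits[i : i+n])
	}
	return b.String()
}

func main() {
	fmt.Println(altHex(255))                // 0xff
	fmt.Println(groupDigits("1234567", 3))  // 1_234_567
	fmt.Println(groupDigits("0f000000", 4)) // 0f00_0000
}
```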
rsc modified the milestones: Proposal, Go1.13 Jan 30, 2019
As a reminder, we introduced a new process for these Go 2-related language changes in our blog post blog.golang.org/go2-here-we-come. We are going to tentatively accept a proposal, land changes at the start of a cycle, get experience using it, and then make the final acceptance decision three months later, at the freeze. For Go 1.13, this would mean landing a change when the tree opens in February, and making the final decision when the tree freezes in May.
We are going to tentatively accept this proposal for Go 1.13 and plan to land its implementation when the tree opens. The issue state for "tentative accept" will be marked Proposal-Accepted but left open and milestoned to the Go release (Go1.13 here). At the freeze we will revisit the issue and close it if it is finally accepted.
After rethinking my initial implementation a bit, I need to revise my #28493 (comment): My initial implementation was trying overly hard to only accept valid literals and stop tokenization early. In retrospect this was clearly the wrong approach and led to an overly complex solution. It's much better to accept a more lenient number literal syntax (separators everywhere, etc.) and check after the fact (basically what https://go-review.googlesource.com/c/go/+/160243 is doing for strconv).
The resulting implementation is massively simpler (simpler in fact than what was there before and more reliably testable) and also leads to better error messages from the compiler. For instance, instead of tokenizing In short, separators are actually straight-forward to handle correctly.
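The "accept leniently, check after the fact" approach described above can be illustrated with a toy scanner for decimal literals. This is only a sketch and not the compiler's or strconv's code: scanDecimal and checkSeparators are hypothetical names, and the rule being checked (every '_' must sit between two digits) is an assumption chosen to match what Go 1.13 ended up requiring.

```go
package main

import (
	"errors"
	"fmt"
)

// scanDecimal is a hypothetical helper. It consumes the longest prefix
// of src made of decimal digits and '_' without judging placement, and
// only then validates the separators. A malformed literal is therefore
// reported as one bad token rather than split into several tokens.
func scanDecimal(src string) (lit string, err error) {
	i := 0
	for i < len(src) && (src[i] == '_' || '0' <= src[i] && src[i] <= '9') {
		i++
	}
	lit = src[:i]
	return lit, checkSeparators(lit)
}

// checkSeparators returns an error unless every '_' in lit is directly
// preceded and directly followed by a digit.
func checkSeparators(lit string) error {
	for i := 0; i < len(lit); i++ {
		if lit[i] != '_' {
			continue
		}
		if i == 0 || i == len(lit)-1 || lit[i-1] == '_' || lit[i+1] == '_' {
			return errors.New("'_' must separate successive digits")
		}
	}
	return nil
}

func main() {
	for _, src := range []string{"1_000+2", "42_ ", "4__2"} {
		lit, err := scanDecimal(src)
		fmt.Printf("token %q, error: %v\n", lit, err)
	}
}
```

Because the whole run of digits and underscores is consumed first, the error can name the complete offending literal, which is the kind of improved diagnostic the comment alludes to.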
gopherbot commented Feb 4, 2019
Change https://golang.org/cl/161098 mentions this issue:
gopherbot commented Feb 5, 2019
Change https://golang.org/cl/161199 mentions this issue:
gopherbot commented Feb 20, 2019
Change https://golang.org/cl/163079 mentions this issue:
griesemer commented Oct 30, 2018
This has come up before, specifically during discussions of #19308 and #28256. I am writing this down so we have a place for discussing this independently. I have no strong feelings about this proposal either way.
Proposal
This is a fully backward-compatible proposal for permitting the use of a blank (_) as a separator in number literals. Specifically, we change the integer literal syntax such that we also allow a "_" after the first digit (the change is the extra "_"):
And we change the floating-point number literal syntax correspondingly (the change is the extra "_"):
For complex number literals the change is implied by the change to floating-point literals.
Examples:
but also:
Discussion
The notation follows more or less the syntax used in other languages (e.g., Swift).
The implementation is straight-forward (a minor change to the number scanner). As the examples show, the separator, if judiciously used, may improve readability; or it may degrade it significantly.
A tool (go vet) might impose restrictions on how the separator is used for readability.
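For readers who want to see the separator in use, the following sketch assumes Go 1.13 or later, where the feature shipped in a somewhat stricter form than proposed here: an underscore may appear only after a base prefix or between successive digits. The function name sameValues is hypothetical; each comparison pairs an underscored literal with a plain spelling of the same value.

```go
package main

import "fmt"

// sameValues reports whether each underscored literal equals its plain
// spelling. The underscores only affect how the source reads.
func sameValues() bool {
	return 1_000_000 == 1000000 && // decimal, grouped by thousands
		0x0f00_0000 == 0x0f000000 && // hexadecimal, grouped by 16 bits
		0b1010_0101 == 0xa5 && // binary, grouped by nybbles
		0o7_55 == 493 && // octal with the 0o prefix
		3.141_592 == 3.141592 // floating-point mantissa
}

func main() {
	fmt.Println(sameValues())
}
```

Nothing in the language enforces a group size, so a literal such as 1_0_0_0 also compiles; keeping groupings conventional is left to code review and tooling, as the discussion suggests.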