Propose underscores in numeric literals #76
Costs and Drawbacks
-------------------
* Implementation costs are mostly related to lexers.
* Maintenance costs are related to compatibility. Compatibility can be handled with language extension like ``NumericUnderscores``.
You should probably have the proposal suggest this language extension right away, as a language extension is required for such changes.
We could fold it into BinaryLiterals
or something like this, but yes, I generally agree that a new extension is probably best.
Thanks for the feedback.
I agree with the language extension.
I will update the proposal to specify the language extension.
I explicitly described the language extension at this commit.
I support this proposal.
There’s a bunch of details to make it work nicely but it sounds like
everyone’s got those well in hand.
Heck, do we have a candidate patch yet? I might try if there’s not one
already!
On Sat, Sep 30, 2017 at 10:46 AM Ben Gamari wrote:
In proposals/0000-numeric-underscores.rst:
> + isUnderMillion = (< 1_000_000)
+
+ clip64M x
+ | x > 0x3ff_ffff = 0x3ff_ffff
+ | otherwise = x
+
+ test8bit x = (0b01_0000_0000 .&. x) /= 0
+
+Effect and Interactions
+-----------------------
+I believe that this proposal will improve the readability, quality and expressiveness of native numeric literals without degrading performance.
+
+Costs and Drawbacks
+-------------------
+* Implementation costs are mostly related to lexers.
+* Maintenance costs are related to compatibility. Compatibility can be handled with language extension like ``NumericUnderscores``.
We could fold it into BinaryLiterals or something like this, but yes, I
generally agree that a new extension is probably best.
.. code-block:: none

    x = 10 * 000 * 000 :: Int
FWIW, for primitive types like Int, GHC does constant-fold the expression.
Thank you for the helpful information.
.. code-block:: none

    decimal → digit{[_ | digit]}
This would allow a trailing underscore, e.g. 1000_1000_. Do we want such a thing? If we do, then there should be an example in the proposal.
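The point about the trailing underscore can be checked with a small recogniser. This is only a sketch of the rule as quoted above, not the GHC lexer; the name `matchesFirstDraft` is made up for illustration.

```haskell
import Data.Char (isDigit)

-- Recogniser for the rule as first proposed:
--   decimal → digit{[_ | digit]}
-- One digit, then any mix of underscores and digits.
-- (Hypothetical helper, not part of GHC.)
matchesFirstDraft :: String -> Bool
matchesFirstDraft (c : rest) =
  isDigit c && all (\x -> x == '_' || isDigit x) rest
matchesFirstDraft [] = False
```

Under this rule `matchesFirstDraft "1000_1000_"` holds, while a leading underscore as in `"_1000"` is rejected because the first character must be a digit.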
Thank you.
I had the trailing underscore in mind only for floating-point literals.
For example, the following would be possible:
gravity = 6.674_08_e-11
i'd personally think we don't want trailing underscores, but i'm open to
being convinced
On Sat, Sep 30, 2017 at 12:51 PM, Oleg Grenrus wrote:
In proposals/0000-numeric-underscores.rst:
> +I believe that this proposal will improve the readability, quality and expressiveness of native numeric literals without degrading performance.
+
+Costs and Drawbacks
+-------------------
+* Implementation costs are mostly related to lexers.
+* Maintenance costs are related to compatibility. Compatibility can be handled with language extension like ``NumericUnderscores``.
+* I think the user's learning curve is not a problem. They will soon get used to it.
+* Syntax highlighting for text editors and code browsers is affected.
+
+Alternatives
+------------
+For example, these expressions are current alternatives:
+
+.. code-block:: none
+
+ x = 10 * 000 * 000 :: Int
FWIW, for primitive types, like Int GHC does constant fold the expression.
------------------------------
In proposals/0000-numeric-underscores.rst:
> +I propose an extension to the existing syntax of numeric literals.
+
+Current syntax:
+
+.. code-block:: none
+
+ decimal → digit{digit}
+ octal → octit{octit}
+ hexadecimal → hexit{hexit}
+ binary → binit{binit}
+
+New syntax (this proposal):
+
+.. code-block:: none
+
+ decimal → digit{[_ | digit]}
This would allow trailing underscore, i.e. 1000_1000_, do we want such
thing? If we do, then there should be an example in proposal.
@cartazio, Thank you, I'm glad.
phadej is also mentioned.
I studied an alternative proposal for the trailing underscore. (1) current proposal:
(2) alternative proposal for trailing underscore:
Examples with (2) alternative proposal:
Strong 👍 from me. We often do things like
first one is weird (GHC can evaluate that at compile time, but still), second one is dangerous if you forget to update the comment after updating the code. Rust has this feature and I often use it.
The first "alternative" currently has multiplication by zero, which looks like a mistake. I'm generally in favor of the proposal. I don't think trailing underscores are useful.
Previous comment edited to remove bogus information. Sorry for the confusion.
Agreed that numerals with leading underscore should not be considered as a number. Trailing underscores don't seem useful but at least harmless so no strong feelings about that.
I misread the proposal before; it seems leading underscores are already
excluded.
I'm happy with this.
I'm strongly +1 on this too.
Hi everyone, I'm glad to get your responses! Thank you very much.
I chose the option to remove the trailing underscore. Please tell me if something is wrong.
I will wait for a few weeks of discussion period :) By the way, I wrote a patch for my studies [1].
woah, awesome!
i got a bit sick for a week + buried in mentoring a colleague, so i've not
yet had a chance to dig into this properly, :/
i will endeavor to spend some time this weekend/next week to review both
your draft patch and the current state of affairs in the doc :)
On Sat, Oct 14, 2017 at 5:25 AM, Takenobu Tani wrote:
I will wait for a few weeks of discussion period :)
By the way, I wrote a patch for my studies [1].
However, I would be glad if Carter or someone else wrote a more suitable patch.
In particular, it would be nice if someone could express the lexer more simply.
[1] ghc/ghc@master...takenobu-hs:wip/numeric-underscores
@cartazio thank you, I am grateful for your help :)
I did a first read of your patch.
One question is: how do we test for failures / disallowed patterns?
On Fri, Oct 20, 2017 at 8:02 AM Takenobu Tani wrote:
@cartazio <https://github.com/cartazio> thank you, I am grateful for your
help :)
I am not in a hurry, so please do it when you have time.
Please take care of yourself.
Thank you, I will add a testcase for failures / disallowed patterns in
Hi, I added the testcase in this [1].
Four weeks have passed since I submitted this proposal. cc @nomeata
Committee decision process started. Thanks for the ping, @bitemyapp, I missed the earlier one among too many GitHub notifications.
@nomeata, thank you very much :)
This or some honed refinement is also likely merited for inclusion in H2020.
@Takenobu could you explain the qualms you mentioned earlier about the
lexer changes you wrote? I must confess I didn’t quite understand them at the
time :)
@cartazio thank you always.
I think we should allow underscores between the base specifier and the digits for binary, octal and hex literals. For example, I think it would be handy to write
@yav thanks for the advice. Current syntax (Haskell Report 2010 + BinaryLiterals + HexFloatLiterals):
This proposal (revised):
The lines of
A possibly expositional question:
Would the spec change be cleaner / easier to write if we factor it into
“here’s all the valid literals sans underscore” plus “here’s the extended
space of literals we also accept”?
I’m totally uncertain if my suggestion makes sense or just adds complexity
:)
On Wed, Nov 8, 2017 at 7:11 AM Takenobu Tani wrote:
@yav <https://github.com/yav> thanks for the advice.
I considered it as follows.

Current syntax (Haskell Report 2010 + BinaryLiterals + HexFloatLiterals):

    decimal      → digit{digit}
    octal        → octit{octit}
    hexadecimal  → hexit{hexit}
    binary       → binit{binit}                                     -- BinaryLiterals

    integer      → decimal
                 | 0 (o | O) octal
                 | 0 (x | X) hexadecimal
                 | 0 (b | B) binary                                 -- BinaryLiterals

    float        → decimal . decimal [exponent]
                 | decimal exponent
                 | 0 (x | X) hexadecimal . hexadecimal [bin_exponent]  -- HexFloatLiterals
                 | 0 (x | X) hexadecimal bin_exponent                  -- HexFloatLiterals

    exponent     → (e | E) [+ | -] decimal
    bin_exponent → (p | P) [+ | -] decimal                          -- HexFloatLiterals

This proposal (revised):

    decimal      → digit[{_ | digit} digit]
    octal        → octit[{_ | octit} octit]
    hexadecimal  → hexit[{_ | hexit} hexit]
    binary       → binit[{_ | binit} binit]

    integer      → decimal
                 | 0 (o | O) [_] octal                              -- *** for 0o_123
                 | 0 (x | X) [_] hexadecimal                        -- *** for 0x_ffff
                 | 0 (b | B) [_] binary                             -- *** for 0b_0101

    float        → decimal . decimal [_] [exponent]
                 | decimal [_] exponent
                 | 0 (x | X) [_] hexadecimal . hexadecimal [[_]bin_exponent]  -- *** for 0x_ff.ff
                 | 0 (x | X) [_] hexadecimal [_] bin_exponent                 -- *** for 0x_ffp1

The lines of *** are the parts you pointed out.
Please tell me if there is any misunderstanding.
If this is correct, I will update the proposal file.
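The revised integer rules can be read as a small recogniser. The following is a rough sketch of this intermediate revision only (a single optional underscore after the base specifier); the names `digitsOf` and `matchesRevisedInteger` are hypothetical and do not come from the patch.

```haskell
import Data.Char (isDigit, isHexDigit, isOctDigit, toLower)

-- d[{_ | d} d]: starts and ends with a digit of the base,
-- with any mix of underscores and digits in between.
digitsOf :: (Char -> Bool) -> String -> Bool
digitsOf isD s@(c : _) =
  isD c && isD (last s) && all (\x -> x == '_' || isD x) s
digitsOf _ [] = False

isBinDigit :: Char -> Bool
isBinDigit c = c == '0' || c == '1'

-- integer → decimal
--         | 0 (o | O) [_] octal
--         | 0 (x | X) [_] hexadecimal
--         | 0 (b | B) [_] binary
matchesRevisedInteger :: String -> Bool
matchesRevisedInteger ('0' : p : rest)
  | toLower p == 'o' = prefixed isOctDigit rest
  | toLower p == 'x' = prefixed isHexDigit rest
  | toLower p == 'b' = prefixed isBinDigit rest
  where
    -- at most one underscore directly after the base specifier
    prefixed isD ('_' : ds) = digitsOf isD ds
    prefixed isD ds         = digitsOf isD ds
-- no base specifier matched: fall through to the decimal rule
matchesRevisedInteger s = digitsOf isDigit s
```

With these rules `0x_ffff` and `1__000` are accepted, while the trailing form `1000_1000_` is rejected because the literal must end in a digit.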
@takenobu-hs Yep, that's what I was thinking.
@yav thanks for confirmation. @cartazio thanks for a clear perspective. This proposal (revised):
Please let me know if it is insufficient.
@takenobu-hs I think we should allow multiple underscores next to each other, and at least one of your examples in the proposal suggests that you also think so. To accommodate that I think that

Also, would you mind updating the actual proposal with the current version of the syntax, so that we have the whole proposal in a single place?

Here's another suggestion: I would drop the "Current specification of numeric literals" section. It is enough to just have "current syntax" and "new syntax", or even just "new syntax".
* Add underscores between the base specifier and the digits. (from yav)
* Change the underscore before the exponent and after the base specifier to allow multiple underscores.
* Correspond to the HexFloatLiterals extension.
* Reconstruct the section on the current specification. (from yav)
* Reconstruct the notation of the new proposal. (from cartazio)
@yav thanks for the review.
In the current proposal, we could allow multiple underscores in decimal, octal, hexadecimal, and binary. In addition, I changed the underscore before the exponent and after the base specifier to multiple in this revision.
Thanks. I pushed the actual proposal with the current version of the syntax.
Thanks. That makes the proposal easier to understand. Please review the revised version.
@yav I updated
Ok, I've marked this as accepted. @takenobu-hs you should consider submitting your patch to phabricator for review.

I had a quick look at it and I have just one major suggestion: instead of having two sets of definitions for the literals (one with underscores and one without), it might be simpler to have only the definitions with underscores, and then have a separate function that validates the literals, ensuring that there are no underscores unless the extension is enabled.
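The suggestion above might look roughly like the following. This is a simplified sketch for decimal literals only, not the code in `Lexer.x`; `Flags`, `lexDecimal` and `validate` are hypothetical names standing in for the real extension flag and lexer actions.

```haskell
import Data.Char (isDigit)

-- Hypothetical stand-in for the extension flag.
newtype Flags = Flags { numericUnderscores :: Bool }

-- Step 1: a single, permissive definition that always admits
-- underscores. Returns the token and the remaining input.
lexDecimal :: String -> Maybe (String, String)
lexDecimal s@(c : _)
  | isDigit c =
      let raw = takeWhile (\x -> x == '_' || isDigit x) s
          -- hand back trailing underscores so the token ends in a digit
          tok = reverse (dropWhile (== '_') (reverse raw))
      in Just (tok, drop (length tok) s)
lexDecimal _ = Nothing

-- Step 2: a separate validation pass. Underscores are an error
-- unless the extension is on; otherwise they are dropped before
-- the digits are read. Expects a token produced by lexDecimal.
validate :: Flags -> String -> Either String Integer
validate flags tok
  | '_' `elem` tok && not (numericUnderscores flags) =
      Left "underscore in numeric literal (extension not enabled)"
  | otherwise = Right (read (filter (/= '_') tok))
```

The appeal of this split is that the grammar is written once, and the extension check becomes an ordinary function that is easy to test on its own.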
Hi everyone, thanks for all the advice and comments.
I will submit the patch to phabricator after improving the following.
Thanks for valuable suggestions. That was my concern.
It's beautiful. I like your unified definition. By the way, should I create a ticket for this proposal?
Sure, that’d be helpful.
I created the ticket (https://ghc.haskell.org/trac/ghc/ticket/14473). In addition, I added the ticket number to
Thanks.
Implement the proposal of underscores in numeric literals.
Underscores in numeric literals are simply ignored.

The specification of the feature is available here:
https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0009-numeric-underscores.rst

For a discussion of the various choices:
ghc-proposals/ghc-proposals#76

Implementation detail:

* Added dynamic flag
  * `NumericUnderscores` extension flag is added for this feature.
* Alex "Regular expression macros" in Lexer.x
  * Add `@numspc` (numeric spacer) macro to represent multiple underscores.
  * Modify `@decimal`, `@binary`, `@octal`, `@hexadecimal`, `@exponent`, and `@bin_exponent` macros to include `@numspc`.
* Alex "Rules" in Lexer.x
  * To be simpler, we have only the definitions with underscores. And then we have a separate function (`tok_integral` and `tok_frac`) that validates the literals.
* Validation functions in Lexer.x
  * `tok_integral` and `tok_frac` functions validate whether the literal contains underscores or not. If the `NumericUnderscores` extension is not enabled, check that there are no underscores.
  * `tok_frac` function is created by merging `strtoken` and `init_strtoken`.
  * `init_strtoken` is deleted because it is no longer used.
* Remove underscores from target literal string
  * `parseUnsignedInteger`, `readRational__`, and `readHexRational` use the customized `span'` function to remove underscores.
* Added Testcase
  * testcase for NumericUnderscores enabled: NumericUnderscores0.hs and NumericUnderscores1.hs
  * testcase for NumericUnderscores disabled: NoNumericUnderscores0.hs and NoNumericUnderscores1.hs
  * testcase for invalid patterns with NumericUnderscores enabled: NumericUnderscoresFail0.hs and NumericUnderscoresFail1.hs

Test Plan: `validate` including the above testcase

Reviewers: goldfire, bgamari
Reviewed By: bgamari
Subscribers: carter, rwbarton, thomie
GHC Trac Issues: #14473
Differential Revision: https://phabricator.haskell.org/D4235
The proposal has been accepted; the following discussion is mostly of historic interest.
This is a proposal to add underscores to numeric literals.
Underscores (_) in numeric literals are simply ignored.
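As a small illustration of what "simply ignored" means in practice, here is a sketch that reuses the literals from the proposal's examples. It assumes a GHC recent enough to ship the `NumericUnderscores` extension (8.6 or later, to my knowledge); the binding names are made up.

```haskell
{-# LANGUAGE BinaryLiterals #-}
{-# LANGUAGE NumericUnderscores #-}

-- Each literal denotes the same value it would without underscores.
oneMillion :: Int
oneMillion = 1_000_000

clipLimit :: Int
clipLimit = 0x3ff_ffff

bit8Mask :: Int
bit8Mask = 0b01_0000_0000
```

Compared with the `10 * 1000 * 1000` style discussed in the thread, the grouping is visible in the literal itself and no arithmetic is involved.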