Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Stylize strings as values. #1733

Merged
merged 1 commit into from Oct 19, 2019

Conversation

rkirsling
Copy link
Member

#1725 included a new section on Value Notation; this PR enacts it by updating `"foo"`*"foo"* throughout the spec. This means that all ECMAScript language values shall now be stylized as such.

<p><emu-xref href="#sec-date.prototype.toisostring"></emu-xref>: If the year cannot be represented using the Date Time String Format specified in <emu-xref href="#sec-date-time-string-format"></emu-xref> a RangeError exception is thrown. Previous editions did not specify the behaviour for that case.</p>
<p><emu-xref href="#sec-date.prototype.tostring"></emu-xref>: Previous editions did not specify the value returned by `Date.prototype.toString` when this time value is *NaN*. ECMAScript 2015 specifies the result to be the String value *"Invalid Date"*.</p>
<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the `"source"` property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `"/"`.</p>
<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the *"source"* property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `/`.</p>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: The removal of quotes around the / here is just to (hopefully?) simplify conflict resolution with #1724.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say you should prefer a coherent commit over avoiding merge conflicts.

@rkirsling rkirsling changed the title Editorial: Represent strings as values. Editorial: Stylize strings as values. Oct 11, 2019
@ljharb ljharb requested a review from a team October 11, 2019 04:46
@ljharb ljharb requested review from syg and jmdyck October 11, 2019 05:05
Copy link
Collaborator

@jmdyck jmdyck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the three spots noted, this looks good to me.

spec.html Outdated
@@ -4758,7 +4758,7 @@ <h1>Runtime Semantics: MV</h1>
The MV of <emu-grammar>StrUnsignedDecimalLiteral ::: DecimalDigits ExponentPart</emu-grammar> is the MV of |DecimalDigits| times 10<sub>ℝ</sub><sup>_e_</sup>, where _e_ is the MV of |ExponentPart|.
</li>
</ul>
<p>Once the exact MV for a String numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0, then the rounded value is *+0* unless the first non white space code point in the String numeric literal is `"-"`, in which case the rounded value is *-0*. Otherwise, the rounded value must be the Number value for the MV (in the sense defined in <emu-xref href="#sec-ecmascript-language-types-number-type"></emu-xref>), unless the literal includes a |StrUnsignedDecimalLiteral| and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th digit position. A digit is significant if it is not part of an |ExponentPart| and</p>
<p>Once the exact MV for a String numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0, then the rounded value is *+0* unless the first non white space code point in the String numeric literal is *"-"*, in which case the rounded value is *-0*. Otherwise, the rounded value must be the Number value for the MV (in the sense defined in <emu-xref href="#sec-ecmascript-language-types-number-type"></emu-xref>), unless the literal includes a |StrUnsignedDecimalLiteral| and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th digit position. A digit is significant if it is not part of an |ExponentPart| and</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a code point, not a String value, so should be changed to one of:

  • `-`
  • U+002D (HYPHEN-MINUS)
  • `-` U+002D (HYPHEN-MINUS)

(There's precedent for each.)

spec.html Outdated
@@ -19025,7 +19025,7 @@ <h2>Syntax</h2>
<emu-clause id="sec-directive-prologues-and-the-use-strict-directive">
<h1>Directive Prologues and the Use Strict Directive</h1>
<p>A <dfn id="directive-prologue">Directive Prologue</dfn> is the longest sequence of |ExpressionStatement|s occurring as the initial |StatementListItem|s or |ModuleItem|s of a |FunctionBody|, a |ScriptBody|, or a |ModuleBody| and where each |ExpressionStatement| in the sequence consists entirely of a |StringLiteral| token followed by a semicolon. The semicolon may appear explicitly or may be inserted by automatic semicolon insertion. A Directive Prologue may be an empty sequence.</p>
<p>A <dfn id="use-strict-directive">Use Strict Directive</dfn> is an |ExpressionStatement| in a Directive Prologue whose |StringLiteral| is either the exact code unit sequences `"use strict"` or `'use strict'`. A Use Strict Directive may not contain an |EscapeSequence| or |LineContinuation|.</p>
<p>A <dfn id="use-strict-directive">Use Strict Directive</dfn> is an |ExpressionStatement| in a Directive Prologue whose |StringLiteral| is either the exact code unit sequences *"use strict"* or *'use strict'*. A Use Strict Directive may not contain an |EscapeSequence| or |LineContinuation|.</p>
Copy link
Collaborator

@jmdyck jmdyck Oct 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is odd, in particular the phrase:
whose |StringLiteral| is either the exact code unit sequences ...
A |StringLiteral| is a Parse Node, or (by extension) the source text that it matches, but it's not a code unit sequence. So I recommend:

  • change code unit to code point (or code unit sequence to source text),
  • revert the backtick-to-asterisk change, and
  • maybe insert of after either.

Copy link
Member Author

@rkirsling rkirsling Oct 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good; this is the line I was most conflicted over myself.

spec.html Outdated
@@ -31624,7 +31624,7 @@ <h1>Runtime Semantics: Canonicalize ( _ch_ )</h1>
<pre><code class="javascript">["baaabaac", "ba", undefined, "abaac"]</code></pre>
</emu-note>
<emu-note>
<p>In case-insignificant matches when _Unicode_ is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, `"&szlig;"` (U+00DF) to `"SS"`. It may however map a code point outside the Basic Latin range to a character within, for example, `"&#x17f;"` (U+017F) to `"s"`. Such characters are not mapped if _Unicode_ is *false*. This prevents Unicode code points such as U+017F and U+212A from matching regular expressions such as `/[a-z]/i`, but they will match `/[a-z]/ui`.</p>
<p>In case-insignificant matches when _Unicode_ is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, *"&szlig;"* (U+00DF) to *"SS"*. It may however map a code point outside the Basic Latin range to a character within, for example, *"&#x17f;"* (U+017F) to *"s"*. Such characters are not mapped if _Unicode_ is *false*. This prevents Unicode code points such as U+017F and U+212A from matching regular expressions such as `/[a-z]/i`, but they will match `/[a-z]/ui`.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are code points, not string values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular line is some remaining confusion from #1724, as I thought we decided not to remove the quotes, but I ought to have asked that more explicitly over there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we made an explicit decision on this case in #1724, I just failed to flag it as code points there.

@jmdyck
Copy link
Collaborator

jmdyck commented Oct 13, 2019

The net effect is good now, but I think, for clarity, it would be better to restructure the PR as two commits:

  1. Handling the 3 code point exceptions. (The commit message could describe it as a belated addition to Editorial: Use consistent notation for code points. #1724.)
  2. All the backtick-to-asterisk cases.

@ljharb, what do you think?

@rkirsling
Copy link
Member Author

rkirsling commented Oct 13, 2019

PRs are generally squash-merged though, right? So in that sense, it might be better to move the third commit here over to #1724 (since the merge button hasn't been hit on it yet) and ensure that those lines aren't touched in this PR.

Edit: I now see that #1734 is an example of a PR where we didn't squash, so I suppose either way should work.

@jmdyck
Copy link
Collaborator

jmdyck commented Oct 13, 2019

Ah! I totally forgot that #1724 was still open. Moving those cases there makes sense.

@rkirsling
Copy link
Member Author

rkirsling commented Oct 13, 2019

Moved third commit to #1724 (and dropped the problematic lines from the first commit here).

@rkirsling rkirsling requested a review from jmdyck October 14, 2019 08:13
Copy link
Collaborator

@jmdyck jmdyck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there!

<p><emu-xref href="#sec-date.prototype.toisostring"></emu-xref>: If the year cannot be represented using the Date Time String Format specified in <emu-xref href="#sec-date-time-string-format"></emu-xref> a RangeError exception is thrown. Previous editions did not specify the behaviour for that case.</p>
<p><emu-xref href="#sec-date.prototype.tostring"></emu-xref>: Previous editions did not specify the value returned by `Date.prototype.toString` when this time value is *NaN*. ECMAScript 2015 specifies the result to be the String value *"Invalid Date"*.</p>
<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the `"source"` property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `"/"`.</p>
<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the *"source"* property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `/`.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say you should prefer a coherent commit over avoiding merge conflicts.

spec.html Outdated
@@ -31793,7 +31793,7 @@ <h1>CharacterClassEscape</h1>
<p>The production <emu-grammar>UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _s_ be SourceText of |LoneUnicodePropertyNameOrValue|.
1. If ! UnicodeMatchPropertyValue(`"General_Category"`, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the &ldquo;Property value and aliases&rdquo; column of <emu-xref href="#table-unicode-general-category-values"></emu-xref>, then
1. If ! UnicodeMatchPropertyValue(*"General_Category"*, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the &ldquo;Property value and aliases&rdquo; column of <emu-xref href="#table-unicode-general-category-values"></emu-xref>, then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another! UnicodeMatchPropertyValue "takes two parameters ..., each of which is a List of Unicode code points". So here, "General_Category" represents a List of code points, not a String value. (Ditto the occurrence on the next line.) This case should probably be handled in #1724.

Copy link
Member Author

@rkirsling rkirsling Oct 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I don't think the next line applies though, as it's referring to this (that's also the reason for “True” three lines after that).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the next line applies though

Okay, I can go along with that.

@rkirsling
Copy link
Member Author

Removed those two changes from this PR.

@Ms2ger
Copy link
Contributor

Ms2ger commented Oct 15, 2019

Are there plans to fix up all the active proposals?

@rkirsling
Copy link
Member Author

rkirsling commented Oct 15, 2019

@Ms2ger That is a very good question. It seems quite a hurdle to address all proposal repos at all stages at once. Perhaps we should enforce this just at the point of the actual 262 PR?

In particular, if you'd like me to help with updating #1668, I could do that.

ljharb pushed a commit to rkirsling/ecma262 that referenced this pull request Oct 16, 2019
@ljharb ljharb self-assigned this Oct 16, 2019
@syg
Copy link
Contributor

syg commented Oct 17, 2019

Are there plans to fix up all the active proposals?

Surely not for proposal spec text. The old convention doesn't impede readability and understanding. But of course we should take care to apply the new convention when merging proposals into master.

Copy link
Contributor

@syg syg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More a rubberstamp than review.

@ljharb ljharb requested a review from jmdyck October 17, 2019 18:16
@ljharb ljharb merged commit 55707d0 into tc39:master Oct 19, 2019
@rkirsling rkirsling deleted the use-asterisks-for-strings branch October 19, 2019 12:56
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Nov 14, 2019
The change from backticks to asterisks occurred a few weeks ago via PR tc39#1733.
Looks like the merge of PR tc39#1479 accidentally reverted it for this one line.
ljharb pushed a commit to jmdyck/ecma262 that referenced this pull request Nov 14, 2019
…als" (tc39#1778)

 - "Set X to be Y" -> "Set X to Y"
 - The change from backticks to asterisks occurred a few weeks ago via PR tc39#1733. Looks like the merge of PR tc39#1479 accidentally reverted it for this one line.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants