Normative: Extend ECMA-262 syntax into a superset of JSON
 - Narrow the StringLiteral restriction in sec-line-terminators
gibson042 authored and ljharb committed May 7, 2018
1 parent beddafc commit cef8d49
Showing 1 changed file with 20 additions and 15 deletions.
35 changes: 20 additions & 15 deletions spec.html
@@ -9569,7 +9569,7 @@ <h2>Syntax</h2>

<emu-clause id="sec-line-terminators">
<h1>Line Terminators</h1>
-<p>Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (<emu-xref href="#sec-automatic-semicolon-insertion"></emu-xref>). A line terminator cannot occur within any token except a |StringLiteral|, |Template|, or |TemplateSubstitutionTail|. Line terminators may only occur within a |StringLiteral| token as part of a |LineContinuation|.</p>
+<p>Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (<emu-xref href="#sec-automatic-semicolon-insertion"></emu-xref>). A line terminator cannot occur within any token except a |StringLiteral|, |Template|, or |TemplateSubstitutionTail|. &lt;LF&gt; and &lt;CR&gt; line terminators cannot occur within a |StringLiteral| token except as part of a |LineContinuation|.</p>
<p>A line terminator can occur within a |MultiLineComment| but cannot occur within a |SingleLineComment|.</p>
<p>Line terminators are included in the set of white space code points that are matched by the `\\s` class in regular expressions.</p>
<p>The ECMAScript line terminator code points are listed in <emu-xref href="#table-33"></emu-xref>.</p>
@@ -10165,7 +10165,7 @@ <h1>Static Semantics: MV</h1>
<emu-clause id="sec-literals-string-literals">
<h1>String Literals</h1>
<emu-note>
-<p>A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), U+2028 (LINE SEPARATOR), U+2029 (PARAGRAPH SEPARATOR), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values Unicode code points are UTF-16 encoded as defined in <emu-xref href="#sec-utf16encoding"></emu-xref>. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.</p>
+<p>A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values Unicode code points are UTF-16 encoded as defined in <emu-xref href="#sec-utf16encoding"></emu-xref>. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.</p>
</emu-note>
<h2>Syntax</h2>
<emu-grammar type="definition">
@@ -10181,11 +10181,15 @@ <h2>Syntax</h2>

DoubleStringCharacter ::
SourceCharacter but not one of `"` or `\` or LineTerminator
+&lt;LS&gt;
+&lt;PS&gt;
`\` EscapeSequence
LineContinuation

SingleStringCharacter ::
SourceCharacter but not one of `'` or `\` or LineTerminator
+&lt;LS&gt;
+&lt;PS&gt;
`\` EscapeSequence
LineContinuation

@@ -10228,7 +10232,7 @@ <h2>Syntax</h2>
</emu-grammar>
<p>The definition of the nonterminal |HexDigit| is given in <emu-xref href="#sec-literals-numeric-literals"></emu-xref>. |SourceCharacter| is defined in <emu-xref href="#sec-source-text"></emu-xref>.</p>
<emu-note>
-<p>A line terminator code point cannot appear in a string literal, except as part of a |LineContinuation| to produce the empty code points sequence. The proper way to cause a line terminator code point to be part of the String value of a string literal is to use an escape sequence such as `\\n` or `\\u000A`.</p>
+<p>&lt;LF&gt; and &lt;CR&gt; cannot appear in a string literal, except as part of a |LineContinuation| to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as `\\n` or `\\u000A`.</p>
</emu-note>
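
For illustration only (this is not part of the commit), the rule in the note above can be sketched in JavaScript. The helper name `evalLiteral` is hypothetical, and indirect `eval` is used merely as a convenient way to hand raw source text to the parser.

```javascript
// Hypothetical helper: parse and evaluate `source` as ECMAScript source text.
// Returns the resulting value, or the error name if parsing fails.
function evalLiteral(source) {
  try {
    return (0, eval)(source); // indirect eval, global scope
  } catch (err) {
    return err.name;
  }
}

// A raw <LF> or <CR> inside the quotes is not a valid string character.
const rawLF = evalLiteral('"a\nb"'); // "SyntaxError"
const rawCR = evalLiteral('"a\rb"'); // "SyntaxError"

// Backslash followed by <LF> is a LineContinuation and contributes nothing.
const continuation = evalLiteral('"a\\\nb"'); // "ab"

// An escape sequence is the way to put <LF> into the String value.
const escaped = evalLiteral('"a\\nb"'); // "a", U+000A, "b"

console.log(rawLF, rawCR, continuation, escaped.length);
```

The first two calls fail in every edition of the language; only the treatment of &lt;LS&gt; and &lt;PS&gt; is changed by this commit.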

<emu-clause id="sec-string-literals-static-semantics-stringvalue">
@@ -10275,6 +10279,12 @@ <h1>Static Semantics: SV</h1>
<li>
The SV of <emu-grammar>DoubleStringCharacter :: SourceCharacter but not one of `"` or `\` or LineTerminator</emu-grammar> is the UTF16Encoding of the code point value of |SourceCharacter|.
</li>
+<li>
+The SV of <emu-grammar>DoubleStringCharacter :: &lt;LS&gt;</emu-grammar> is the code unit value 0x2028.
+</li>
+<li>
+The SV of <emu-grammar>DoubleStringCharacter :: &lt;PS&gt;</emu-grammar> is the code unit value 0x2029.
+</li>
<li>
The SV of <emu-grammar>DoubleStringCharacter :: `\` EscapeSequence</emu-grammar> is the SV of the |EscapeSequence|.
</li>
@@ -10284,6 +10294,12 @@ <h1>Static Semantics: SV</h1>
<li>
The SV of <emu-grammar>SingleStringCharacter :: SourceCharacter but not one of `'` or `\` or LineTerminator</emu-grammar> is the UTF16Encoding of the code point value of |SourceCharacter|.
</li>
+<li>
+The SV of <emu-grammar>SingleStringCharacter :: &lt;LS&gt;</emu-grammar> is the code unit value 0x2028.
+</li>
+<li>
+The SV of <emu-grammar>SingleStringCharacter :: &lt;PS&gt;</emu-grammar> is the code unit value 0x2029.
+</li>
<li>
The SV of <emu-grammar>SingleStringCharacter :: `\` EscapeSequence</emu-grammar> is the SV of the |EscapeSequence|.
</li>
@@ -35734,7 +35750,7 @@ <h1>JSON.parse ( _text_ [ , _reviver_ ] )</h1>
1. Let _JText_ be ? ToString(_text_).
1. Parse _JText_ interpreted as UTF-16 encoded Unicode points (<emu-xref href="#sec-ecmascript-language-types-string-type"></emu-xref>) as a JSON text as specified in ECMA-404. Throw a *SyntaxError* exception if _JText_ is not a valid JSON text as defined in that specification.
1. Let _scriptText_ be the string-concatenation of `"("`, _JText_, and `");"`.
-1. Let _completion_ be the result of parsing and evaluating _scriptText_ as if it was the source text of an ECMAScript |Script|, but using the alternative definition of |DoubleStringCharacter| provided below. The extended PropertyDefinitionEvaluation semantics defined in <emu-xref href="#sec-__proto__-property-names-in-object-initializers"></emu-xref> must not be used during the evaluation.
+1. Let _completion_ be the result of parsing and evaluating _scriptText_ as if it was the source text of an ECMAScript |Script|. The extended PropertyDefinitionEvaluation semantics defined in <emu-xref href="#sec-__proto__-property-names-in-object-initializers"></emu-xref> must not be used during the evaluation.
1. Let _unfiltered_ be _completion_.[[Value]].
1. Assert: _unfiltered_ is either a String, Number, Boolean, Null, or an Object that is defined by either an |ArrayLiteral| or an |ObjectLiteral|.
1. If IsCallable(_reviver_) is *true*, then
@@ -35748,17 +35764,6 @@ <h1>JSON.parse ( _text_ [ , _reviver_ ] )</h1>
</emu-alg>
<p>This function is the <dfn>%JSONParse%</dfn> intrinsic object.</p>
<p>The `length` property of the `parse` function is 2.</p>
-<p>JSON allows Unicode code units 0x2028 (LINE SEPARATOR) and 0x2029 (PARAGRAPH SEPARATOR) to directly appear in String literals without using an escape sequence. This is enabled by using the following alternative definition of |DoubleStringCharacter| when parsing _scriptText_ in step 4:</p>
-<emu-grammar type="definition">
-DoubleStringCharacter ::
-SourceCharacter but not one of `"` or `\` or U+0000 through U+001F
-`\` EscapeSequence
-</emu-grammar>
-<ul>
-<li>
-The SV of <emu-grammar>DoubleStringCharacter :: SourceCharacter but not one of `"` or `\` or U+0000 through U+001F</emu-grammar> is the UTF16Encoding of the code point value of |SourceCharacter|.
-</li>
-</ul>
<emu-note>
<p>Valid JSON text is a subset of the ECMAScript |PrimaryExpression| syntax as modified by Step 4 above. Step 2 verifies that _JText_ conforms to that subset, and step 6 verifies that that parsing and evaluation returns a value of an appropriate type.</p>
</emu-note>
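
As a rough illustration of what this commit enables (not part of the change itself), the sketch below compares `JSON.parse` with ordinary evaluation of the same text when a string contains unescaped U+2028 and U+2029. It assumes an engine that already implements this change; on an older engine the `eval` call throws a SyntaxError and `agrees` is `false`. All variable names are illustrative.

```javascript
// Build the separators from code units so this file contains no raw LS/PS.
const LS = String.fromCharCode(0x2028); // LINE SEPARATOR
const PS = String.fromCharCode(0x2029); // PARAGRAPH SEPARATOR

// Valid JSON text: ECMA-404 permits raw U+2028 and U+2029 inside strings.
const jsonText = '{"text":"a' + LS + 'b' + PS + 'c"}';

// JSON.parse has always accepted this text.
const viaJson = JSON.parse(jsonText);

// Evaluating the same text as ECMAScript source only succeeds once
// DoubleStringCharacter admits <LS> and <PS>, as in this commit.
let viaEval = null;
try {
  viaEval = (0, eval)("(" + jsonText + ")");
} catch (err) {
  viaEval = null; // engine predates the change
}

const agrees = viaEval !== null && viaEval.text === viaJson.text;
console.log(viaJson.text.length, agrees);
```

The superset relation is syntactic. Evaluation can still differ, for example for a `__proto__` key, which is why step 4 excludes the extended PropertyDefinitionEvaluation semantics.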