Simplification of ES2015 RegExp semantics #89

Closed
wants to merge 3 commits into from
151 changes: 58 additions & 93 deletions spec.html
@@ -29395,61 +29395,30 @@ <h1>RegExp.prototype.exec ( _string_ )</h1>
1. If _R_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _S_ be ToString(_string_).
1. ReturnIfAbrupt(_S_).
1. Return RegExpBuiltinExec(_R_, _S_).
1. Return RegExpExec(_R_, _S_).
</emu-alg>

<!-- es6num="21.2.5.2.1" -->
<emu-clause id="sec-regexpexec" aoid="RegExpExec">
<h1>Runtime Semantics: RegExpExec ( _R_, _S_ )</h1>
<p>The abstract operation RegExpExec with arguments _R_ and _S_ performs the following steps:</p>
<emu-alg>
1. Assert: Type(_R_) is Object.
1. Assert: Type(_S_) is String.
1. Let _exec_ be Get(_R_, `"exec"`).
1. ReturnIfAbrupt(_exec_).
1. If IsCallable(_exec_) is *true*, then
1. Let _result_ be Call(_exec_, _R_, &laquo;_S_&raquo;).
1. ReturnIfAbrupt(_result_).
1. If Type(_result_) is neither Object or Null, throw a *TypeError* exception.
1. Return _result_.
1. If _R_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Return RegExpBuiltinExec(_R_, _S_).
</emu-alg>
<emu-note>
<p>If a callable `exec` property is not found this algorithm falls back to attempting to use the built-in RegExp matching algorithm. This provides compatible behaviour for code written for prior editions where most built-in algorithms that use regular expressions did not perform a dynamic property lookup of `exec`.</p>
</emu-note>
</emu-clause>

<!-- es6num="21.2.5.2.2" -->
<emu-clause id="sec-regexpbuiltinexec" aoid="RegExpBuiltinExec">
<h1>Runtime Semantics: RegExpBuiltinExec ( _R_, _S_ )</h1>
<p>The abstract operation RegExpBuiltinExec with arguments _R_ and _S_ performs the following steps:</p>
<emu-clause id="sec-innerregexpexec" aoid="InnerRegExpExec">
<h1>Runtime Semantics: InnerRegExpExec ( _R_, _S_, _lastIndex_, _extraFlags_ )</h1>
<p>The abstract operation InnerRegExpExec with arguments _R_, _S_, _lastIndex_ and _extraFlags_ performs the following steps:</p>
<emu-alg>
1. Assert: _R_ is an initialized RegExp instance.
1. Assert: Type(_S_) is String.
1. Let _length_ be the number of code units in _S_.
1. Let _lastIndex_ be ToLength(Get(_R_,`"lastIndex"`)).
1. ReturnIfAbrupt(_lastIndex_).
1. Let _global_ be ToBoolean(Get(_R_, `"global"`)).
1. ReturnIfAbrupt(_global_).
1. Let _sticky_ be ToBoolean(Get(_R_, `"sticky"`)).
1. ReturnIfAbrupt(_sticky_).
1. Let _flags_ be the concatenation of _R_'s [[OriginalFlags]] internal slot and _extraFlags_.
1. If _flags_ contains `"g"`, let _global_ be *true*, else let _global_ be *false*.
1. If _flags_ contains `"y"`, let _sticky_ be *true*, else let _sticky_ be *false*.
1. If _global_ is *false* and _sticky_ is *false*, let _lastIndex_ be 0.
1. Let _matcher_ be the value of _R_'s [[RegExpMatcher]] internal slot.
1. Let _flags_ be the value of _R_'s [[OriginalFlags]] internal slot.
1. If _flags_ contains `"u"`, let _fullUnicode_ be *true*, else let _fullUnicode_ be *false*.
1. Let _matchSucceeded_ be *false*.
1. Repeat, while _matchSucceeded_ is *false*
1. If _lastIndex_ &gt; _length_, then
1. Let _setStatus_ be Set(_R_, `"lastIndex"`, 0, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Return *null*.
1. Return { [[Matches]]: *null*, [[LastIndex]]: 0 }.
1. Let _r_ be _matcher_(_S_, _lastIndex_).
1. If _r_ is ~failure~, then
1. If _sticky_ is *true*, then
1. Let _setStatus_ be Set(_R_, `"lastIndex"`, 0, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Return *null*.
1. Return { [[Matches]]: *null*, [[LastIndex]]: 0 }.
1. Let _lastIndex_ be AdvanceStringIndex(_S_, _lastIndex_, _fullUnicode_).
1. Else,
1. Assert: _r_ is a State.
@@ -29458,9 +29427,6 @@ <h1>Runtime Semantics: RegExpBuiltinExec ( _R_, _S_ )</h1>
1. If _fullUnicode_ is *true*, then
1. _e_ is an index into the _Input_ character list, derived from _S_, matched by _matcher_. Let _eUTF_ be the smallest index into _S_ that corresponds to the character at element _e_ of _Input_. If _e_ is greater than or equal to the length of _Input_, then _eUTF_ is the number of code units in _S_.
1. Let _e_ be _eUTF_.
1. If _global_ is *true* or _sticky_ is *true*,
1. Let _setStatus_ be Set(_R_, `"lastIndex"`, _e_, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Let _n_ be the length of _r_'s _captures_ List. (This is the same value as <emu-xref href="#sec-notation"></emu-xref>'s _NcapturingParens_.)
1. Let _A_ be ArrayCreate(_n_ + 1).
1. Assert: The value of _A_'s `"length"` property is _n_ + 1.
@@ -29480,7 +29446,24 @@ <h1>Runtime Semantics: RegExpBuiltinExec ( _R_, _S_ )</h1>
1. Assert: _captureI_ is a List of code units.
1. Let _capturedValue_ be a string consisting of the code units of _captureI_.
1. Perform CreateDataProperty(_A_, ToString(_i_) , _capturedValue_).
1. Return _A_.
1. If _global_ is *true* or _sticky_ is *true*,
1. Return { [[Matches]]: _A_, [[LastIndex]]: _e_ }.
1. Otherwise, return { [[Matches]]: _A_ }.
</emu-alg>
</emu-clause>

<emu-clause id="sec-regexpexec" aoid="RegExpExec">
<h1>Runtime Semantics: RegExpExec ( _R_, _S_ )</h1>
<p>The abstract operation RegExpExec with arguments _R_ and _S_ performs the following steps:</p>
<emu-alg>
1. Let _lastIndex_ be ToLength(Get(_R_,`"lastIndex"`)).
1. ReturnIfAbrupt(_lastIndex_).
1. Let _result_ be InnerRegExpExec(_R_, _S_, _lastIndex_, `""`).
1. ReturnIfAbrupt(_result_).
1. If _result_ has a [[LastIndex]] entry,
1. Let _setStatus_ be Set(_R_, `"lastIndex"`, _result_.[[LastIndex]], *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Return _result_.[[Matches]].
</emu-alg>
</emu-clause>
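
The split between InnerRegExpExec and RegExpExec above can be modelled in plain JavaScript. This is only a sketch under stated assumptions, not the spec text and not how any engine implements it: `innerRegExpExec`, `regExpExec`, `advanceStringIndex` and `toLength` are hypothetical helper names, `re.flags` stands in for the [[OriginalFlags]] slot, and a sticky clone of the pattern stands in for the [[RegExpMatcher]] slot. Surrogate-pair edge cases at the start index are not handled.

```javascript
// Hypothetical stand-in for ToLength.
const toLength = (v) =>
  Math.min(Math.max(Math.trunc(Number(v)) || 0, 0), Number.MAX_SAFE_INTEGER);

// Hypothetical stand-in for AdvanceStringIndex.
function advanceStringIndex(s, index, fullUnicode) {
  if (!fullUnicode || index + 1 >= s.length) return index + 1;
  return index + (s.codePointAt(index) > 0xffff ? 2 : 1);
}

// Model of InnerRegExpExec: it never reads or writes re.lastIndex.
// The caller passes the start index in and receives the new one back.
function innerRegExpExec(re, s, lastIndex, extraFlags) {
  const flags = re.flags + extraFlags; // assumption: models [[OriginalFlags]] + extraFlags
  const global = flags.includes("g");
  const sticky = flags.includes("y");
  const fullUnicode = flags.includes("u");
  if (!global && !sticky) lastIndex = 0;

  // Assumption: a sticky clone models "run the matcher at exactly this index".
  const matcher = new RegExp(re.source, re.flags.replace(/[gy]/g, "") + "y");

  while (true) {
    if (lastIndex > s.length) return { matches: null, lastIndex: 0 };
    matcher.lastIndex = lastIndex;
    const matches = matcher.exec(s);
    if (matches === null) {
      if (sticky) return { matches: null, lastIndex: 0 };
      lastIndex = advanceStringIndex(s, lastIndex, fullUnicode);
      continue;
    }
    const e = matcher.lastIndex; // end index of the match
    return global || sticky ? { matches, lastIndex: e } : { matches };
  }
}

// Model of RegExpExec: the only place that touches re.lastIndex.
function regExpExec(re, s) {
  const result = innerRegExpExec(re, s, toLength(re.lastIndex), "");
  if ("lastIndex" in result) re.lastIndex = result.lastIndex;
  return result.matches;
}
```

In this model a caller such as @@search or @@split can call `innerRegExpExec` directly with its own start index and never mutate the RegExp object, which appears to be the point of the refactoring.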

@@ -29511,22 +29494,14 @@ <h1>get RegExp.prototype.flags</h1>
<emu-alg>
1. Let _R_ be the *this* value.
1. If Type(_R_) is not Object, throw a *TypeError* exception.
1. If _R_ does not have a [[OriginalFlags]] internal slot, throw a *TypeError* exception.
1. Let _result_ be the empty String.
1. Let _global_ be ToBoolean(Get(_R_, `"global"`)).
1. ReturnIfAbrupt(_global_).
1. If _global_ is *true*, append `"g"` as the last code unit of _result_.
1. Let _ignoreCase_ be ToBoolean(Get(_R_, `"ignoreCase"`)).
1. ReturnIfAbrupt(_ignoreCase_).
1. If _ignoreCase_ is *true*, append `"i"` as the last code unit of _result_.
1. Let _multiline_ be ToBoolean(Get(_R_, `"multiline"`)).
1. ReturnIfAbrupt(_multiline_).
1. If _multiline_ is *true*, append `"m"` as the last code unit of _result_.
1. Let _unicode_ be ToBoolean(Get(_R_, `"unicode"`)).
1. ReturnIfAbrupt(_unicode_).
1. If _unicode_ is *true*, append `"u"` as the last code unit of _result_.
1. Let _sticky_ be ToBoolean(Get(_R_, `"sticky"`)).
1. ReturnIfAbrupt(_sticky_).
1. If _sticky_ is *true*, append `"y"` as the last code unit of _result_.
1. Let _flags_ be the value of _R_'s [[OriginalFlags]] internal slot.
1. If _flags_ contains `"g"`, append `"g"` as the last code unit of _result_.
1. If _flags_ contains `"i"`, append `"i"` as the last code unit of _result_.
1. If _flags_ contains `"m"`, append `"m"` as the last code unit of _result_.
1. If _flags_ contains `"u"`, append `"u"` as the last code unit of _result_.
1. If _flags_ contains `"y"`, append `"y"` as the last code unit of _result_.
1. Return _result_.
</emu-alg>
</emu-clause>
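
To make the observable difference concrete: under the ES2015 steps the `flags` getter performs a Get on each flag property, so a shadowing own property changes its result. Under the rewritten steps only the internal slot would matter. The first half of the snippet exercises the ES2015 behaviour as shipped. `flagsFromSlot` is a hypothetical model of the rewritten getter, taking the slot value as a plain string.

```javascript
// ES2015 behaviour: the getter consults the (overridable) `global` property.
const re = /a/;
Object.defineProperty(re, "global", { value: true });
const observed = re.flags; // now reports "g" although the pattern was created without it

// Hypothetical model of the rewritten getter: only the slot value is consulted,
// and the flags are emitted in the fixed order g, i, m, u, y.
function flagsFromSlot(originalFlags) {
  let result = "";
  for (const flag of "gimuy") {
    if (originalFlags.includes(flag)) result += flag;
  }
  return result;
}
```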
@@ -29566,15 +29541,15 @@ <h1>RegExp.prototype [ @@match ] ( _string_ )</h1>
<emu-alg>
1. Let _rx_ be the *this* value.
1. If Type(_rx_) is not Object, throw a *TypeError* exception.
1. If _rx_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _S_ be ToString(_string_).
1. ReturnIfAbrupt(_S_).
1. Let _global_ be ToBoolean(Get(_rx_, `"global"`)).
1. ReturnIfAbrupt(_global_).
1. Let _flags_ be the value of _rx_'s [[OriginalFlags]] internal slot.
1. If _flags_ contains `"g"`, let _global_ be *true*, else let _global_ be *false*.
1. If _global_ is *false*, then
1. Return RegExpExec(_rx_, _S_).
1. Else _global_ is *true*,
1. Let _fullUnicode_ be ToBoolean(Get(_rx_, `"unicode"`)).
1. ReturnIfAbrupt(_fullUnicode_).
1. If _flags_ contains `"u"`, let _fullUnicode_ be *true*, else let _fullUnicode_ be *false*.
1. Let _setStatus_ be Set(_rx_, `"lastIndex"`, 0, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Let _A_ be ArrayCreate(0).
@@ -29625,18 +29600,18 @@ <h1>RegExp.prototype [ @@replace ] ( _string_, _replaceValue_ )</h1>
<emu-alg>
1. Let _rx_ be the *this* value.
1. If Type(_rx_) is not Object, throw a *TypeError* exception.
1. If _rx_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _S_ be ToString(_string_).
1. ReturnIfAbrupt(_S_).
1. Let _lengthS_ be the number of code unit elements in _S_.
1. Let _functionalReplace_ be IsCallable(_replaceValue_).
1. If _functionalReplace_ is *false*, then
1. Let _replaceValue_ be ToString(_replaceValue_).
1. ReturnIfAbrupt(_replaceValue_).
1. Let _global_ be ToBoolean(Get(_rx_, `"global"`)).
1. ReturnIfAbrupt(_global_).
1. Let _flags_ be the value of _rx_'s [[OriginalFlags]] internal slot.
1. If _flags_ contains `"g"`, let _global_ be *true*, else let _global_ be *false*.
1. If _global_ is *true*, then
1. Let _fullUnicode_ be ToBoolean(Get(_rx_, `"unicode"`)).
1. ReturnIfAbrupt(_fullUnicode_).
1. If _flags_ contains `"u"`, let _fullUnicode_ be *true*, else let _fullUnicode_ be *false*.
1. Let _setStatus_ be Set(_rx_, `"lastIndex"`, 0, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Let _results_ be a new empty List.
@@ -29705,18 +29680,14 @@ <h1>RegExp.prototype [ @@search ] ( _string_ )</h1>
<emu-alg>
1. Let _rx_ be the *this* value.
1. If Type(_rx_) is not Object, throw a *TypeError* exception.
1. If _rx_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _S_ be ToString(_string_).
1. ReturnIfAbrupt(_S_).
1. Let _previousLastIndex_ be Get(_rx_, `"lastIndex"`).
1. ReturnIfAbrupt(_previousLastIndex_).
1. Let _status_ be Set(_rx_, `"lastIndex"`, 0, *true*).
1. ReturnIfAbrupt(_status_).
1. Let _result_ be RegExpExec(_rx_, _S_).
1. Let _result_ be InnerRegExpExec(_rx_, _S_, 0, `""`).
1. ReturnIfAbrupt(_result_).
1. Let _status_ be Set(_rx_, `"lastIndex"`, _previousLastIndex_, *true*).
1. ReturnIfAbrupt(_status_).
1. If _result_ is *null*, return -1.
1. Return Get(_result_, `"index"`).
1. Let _matches_ be _result_.[[Matches]].
1. If _matches_ is *null*, return -1.
1. Return Get(_matches_, `"index"`).
</emu-alg>
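
For an ordinary RegExp the two versions of @@search should look the same from script: the ES2015 steps save `lastIndex`, zero it, run RegExpExec and restore it, while the rewritten steps pass 0 straight to InnerRegExpExec and never touch the property. A short check of the existing behaviour, using only built-ins:

```javascript
const re = /b/g;
re.lastIndex = 3;                    // deliberately past the first "b"
const position = "abcb".search(re);  // search always starts from index 0
const lastIndexAfter = re.lastIndex; // restored to the value set above
```

The difference only becomes observable when `lastIndex` or `exec` is intercepted, for example by a subclass or an accessor, since the rewritten steps perform no Get, Set or Call on the RegExp object.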
<p>The value of the `name` property of this function is `"[Symbol.search]"`.</p>
<emu-note>
@@ -29760,18 +29731,14 @@ <h1>RegExp.prototype [ @@split ] ( _string_, _limit_ )</h1>
<emu-alg>
1. Let _rx_ be the *this* value.
1. If Type(_rx_) is not Object, throw a *TypeError* exception.
1. If _rx_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _S_ be ToString(_string_).
1. ReturnIfAbrupt(_S_).
1. Let _C_ be SpeciesConstructor(_rx_, %RegExp%).
1. ReturnIfAbrupt(_C_).
1. Let _flags_ be ToString(Get(_rx_, `"flags"`)).
1. ReturnIfAbrupt(_flags_).
1. Let _flags_ be the value of _rx_'s [[OriginalFlags]] internal slot.
1. If _flags_ contains `"u"`, let _unicodeMatching_ be *true*.
1. Else, let _unicodeMatching_ be *false*.
1. If _flags_ contains `"y"`, let _newFlags_ be _flags_.
1. Else, let _newFlags_ be the string that is the concatenation of _flags_ and `"y"`.
1. Let _splitter_ be Construct(_C_, &laquo;_rx_, _newFlags_&raquo;).
1. ReturnIfAbrupt(_splitter_).
1. Let _A_ be ArrayCreate(0).
1. Let _lengthA_ be 0.
1. If _limit_ is *undefined*, let _lim_ be 2<sup>53</sup>-1; else let _lim_ be ToLength(_limit_).
@@ -29780,22 +29747,19 @@ <h1>RegExp.prototype [ @@split ] ( _string_, _limit_ )</h1>
1. Let _p_ be 0.
1. If _lim_ = 0, return _A_.
1. If _size_ = 0, then
1. Let _z_ be RegExpExec(_splitter_, _S_).
1. Let _z_ be InnerRegExpExec(_rx_, _S_, 0, `"y"`).
1. ReturnIfAbrupt(_z_).
1. If _z_ is not *null*, return _A_.
1. If _z_.[[Matches]] is not *null*, return _A_.
1. Assert: The following call will never result in an abrupt completion.
1. Perform CreateDataProperty(_A_, `"0"`, _S_).
1. Return _A_.
1. Let _q_ be _p_.
1. Repeat, while _q_ &lt; _size_
1. Let _setStatus_ be Set(_splitter_, `"lastIndex"`, _q_, *true*).
1. ReturnIfAbrupt(_setStatus_).
1. Let _z_ be RegExpExec(_splitter_, _S_).
1. Let _z_ be InnerRegExpExec(_rx_, _S_, _q_, `"y"`).
1. ReturnIfAbrupt(_z_).
1. If _z_ is *null*, let _q_ be AdvanceStringIndex(_S_, _q_, _unicodeMatching_).
1. Else _z_ is not *null*,
1. Let _e_ be ToLength(Get(_splitter_, `"lastIndex"`)).
1. ReturnIfAbrupt(_e_).
1. If _z_.[[Matches]] is *null*, let _q_ be AdvanceStringIndex(_S_, _q_, _unicodeMatching_).
1. Else _z_.[[Matches]] is not *null*,
1. Let _e_ be _z_.[[LastIndex]].
1. If _e_ = _p_, let _q_ be AdvanceStringIndex(_S_, _q_, _unicodeMatching_).
1. Else _e_ &ne; _p_,
1. Let _T_ be a String value equal to the substring of _S_ consisting of the elements at indices _p_ (inclusive) through _q_ (exclusive).
@@ -29804,12 +29768,12 @@ <h1>RegExp.prototype [ @@split ] ( _string_, _limit_ )</h1>
1. Let _lengthA_ be _lengthA_ +1.
1. If _lengthA_ = _lim_, return _A_.
1. Let _p_ be _e_.
1. Let _numberOfCaptures_ be ToLength(Get(_z_, `"length"`)).
1. Let _numberOfCaptures_ be ToLength(Get(_z_.[[Matches]], `"length"`)).
1. ReturnIfAbrupt(_numberOfCaptures_).
1. Let _numberOfCaptures_ be max(_numberOfCaptures_-1, 0).
1. Let _i_ be 1.
1. Repeat, while _i_ &le; _numberOfCaptures_.
1. Let _nextCapture_ be Get(_z_, ToString(_i_)).
1. Let _nextCapture_ be Get(_z_.[[Matches]], ToString(_i_)).
1. ReturnIfAbrupt(_nextCapture_).
1. Perform CreateDataProperty(_A_, ToString(_lengthA_), _nextCapture_).
1. Let _i_ be _i_ +1.
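
The rewritten @@split loop can be sketched as follows. This is a rough model under stated assumptions, not the spec algorithm verbatim: `stickyMatchAt` and `splitSketch` are hypothetical names, `stickyMatchAt` plays the role of InnerRegExpExec called with the extra `"y"` flag, the index always advances by one code unit (so the `"u"` handling of AdvanceStringIndex is omitted), and the default limit is simply unbounded.

```javascript
// Hypothetical stand-in for InnerRegExpExec(rx, S, q, "y"):
// attempt a match at exactly index q without mutating `re`.
function stickyMatchAt(re, s, q) {
  const matcher = new RegExp(re.source, re.flags.replace(/[gy]/g, "") + "y");
  matcher.lastIndex = q;
  const matches = matcher.exec(s);
  return matches === null
    ? { matches: null, lastIndex: 0 }
    : { matches, lastIndex: matcher.lastIndex };
}

function splitSketch(re, s, lim = Infinity) {
  const out = [];
  if (lim === 0) return out;
  if (s.length === 0) {
    // Empty input: the result is [""] unless the pattern matches the empty string.
    if (stickyMatchAt(re, s, 0).matches === null) out.push(s);
    return out;
  }
  let p = 0; // start of the current piece
  let q = 0; // position currently being probed
  while (q < s.length) {
    const z = stickyMatchAt(re, s, q);
    const e = z.lastIndex;
    if (z.matches === null || e === p) {
      q += 1; // no separator here, or an empty match at the piece start
      continue;
    }
    out.push(s.slice(p, q));
    if (out.length === lim) return out;
    p = e;
    // Captures of the separator are spliced into the output, as in the spec steps.
    for (let i = 1; i < z.matches.length; i++) {
      out.push(z.matches[i]);
      if (out.length === lim) return out;
    }
    q = p;
  }
  out.push(s.slice(p));
  return out;
}
```

Compared with the ES2015 steps, nothing here constructs a separate splitter through the species constructor or reads `lastIndex` back from an object; the end index travels in the returned record.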
@@ -29849,6 +29813,7 @@ <h1>RegExp.prototype.test( _S_ )</h1>
<emu-alg>
1. Let _R_ be the *this* value.
1. If Type(_R_) is not Object, throw a *TypeError* exception.
1. If _R_ does not have a [[RegExpMatcher]] internal slot, throw a *TypeError* exception.
1. Let _string_ be ToString(_S_).
1. ReturnIfAbrupt(_string_).
1. Let _match_ be RegExpExec(_R_, _string_).