Normative: Update GetSubstitution to match reality #1732

Open
wants to merge 3 commits into
base: master
from

@gibson042
Contributor

commented Oct 9, 2019

$nn patterns fall back to $n when there aren't at least nn captures

Fixes gh-1426

Preview: https://deploy-preview-1732--ecma262-snapshots.netlify.com/#sec-getsubstitution
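For readers following along, here is a small illustration of the behaviour described above. The patterns and input strings are made up for this example and are not taken from the PR or its tests; it only exercises `String.prototype.replace` as current engines implement it.

```javascript
// Illustration only: the regular expressions and inputs are invented.
const oneCapture = /(b)/;

// Only one capture exists, so "$10" cannot refer to capture 10.
// It is treated as "$1" followed by a literal "0".
const fallback = "abc".replace(oneCapture, "[$10]");

// With ten captures, "$10" does refer to the tenth capture.
const tenCaptures = /(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)/;
const tenth = "abcdefghij".replace(tenCaptures, "[$10]");

// "$2" with a single capture is left in place unchanged.
const untouched = "abc".replace(oneCapture, "[$2]");

console.log(fallback);  // a[b0]c
console.log(tenth);     // [j]
console.log(untouched); // a[$2]c
```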

1. The replacement is the string-concatenation of the String `"$<"`, ! GetSubstitution(_matched_, _str_, _position_, _captures_, _namedCaptures_, _groupName_), and the String `">"`.
1. Else,
1. Let _capture_ be ? Get(_namedCaptures_, _groupName_).
1. If _capture_ is *undefined*, the replacement is the empty String. Otherwise, the replacement is ? ToString(_capture_).

@ljharb

ljharb Oct 9, 2019

Member
Suggested change
1. If _capture_ is *undefined*, the replacement is the empty String. Otherwise, the replacement is ? ToString(_capture_).
1. If _capture_ is *undefined*, the replacement is the empty String. Otherwise, the replacement is ? ToString(_capture_).
1. If _capture_ is *undefined*, replace the text through `>` with the empty string.
1. Otherwise, replace the text through `>` with ? ToString(_capture_).
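To make the `$<…>` steps concrete, the following sketch shows the three cases as they can be observed through `String.prototype.replace`. The group names, patterns, and inputs are invented for illustration only.

```javascript
// Illustration only: group names and inputs are invented.

// Both named groups participate, so each "$<name>" is replaced
// with the corresponding capture.
const swapped = "2019-10".replace(
  /(?<year>\d+)-(?<month>\d+)/,
  "$<month>/$<year>"
);

// The group "y" exists in the pattern but did not participate in the
// match, so its capture is undefined and "$<y>" becomes the empty string.
const emptyForUndefined = "ab".replace(/(?<x>a)|(?<y>b)/, "[$<y>]");

// With no named groups in the pattern at all, "$<" is kept literally.
const literal = "abc".replace(/b/, "$<x>");

console.log(swapped);           // 10/2019
console.log(emptyForUndefined); // []b
console.log(literal);           // a$<x>c
```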
</emu-alg>
If _nn_ is `00` or the MV of _nn_ &gt; _m_, the replacement is the string-concatenation of the replacement for the first two matched code units (i.e., a replacement specified by one of the two preceding rows) and the remaining matched code units (i.e., a single |DecimalDigit|).

@jmdyck

jmdyck Oct 9, 2019

Collaborator
Suggested change
If _nn_ is `00` or the MV of _nn_ &gt; _m_, the replacement is the string-concatenation of the replacement for the first two matched code units (i.e., a replacement specified by one of the two preceding rows) and the remaining matched code units (i.e., a single |DecimalDigit|).
If _nn_ is `00` or the MV of _nn_ &gt; _m_, the replacement is the string-concatenation of the replacement for the first two matched code units (i.e., a replacement specified by one of the two preceding rows) and the remaining matched code unit (i.e., a single |DecimalDigit|).

("units" -> "unit")

@gibson042

gibson042 Oct 10, 2019

Author Contributor

I want to leave this, because the intent is to split the matched code units into two subsequences, replacing the first and preserving the second (which happens to be a single code unit, but is not required to be so for the logic to work).

1. Let _k_ be 0.
1. Repeat, while _k_ &lt; _replacementLength_
1. Let _replaceablePattern_ be the longest sequence of consecutive code units from _replacement_ beginning with the code unit at index _k_ such that _replaceablePattern_ matches the &ldquo;Code units&rdquo; column of a row in <emu-xref href="#table-45"></emu-xref>.
1. If _replaceablePattern_ is not empty,

@jmdyck

jmdyck Oct 9, 2019

Collaborator
Suggested change
1. If _replaceablePattern_ is not empty,
1. If _replaceablePattern_ is not empty, then

Also, this condition assumes that _replaceablePattern_ is empty if there's no match, but there's nothing that actually makes it so. It might be better to say `If no such match is found, then` (and swap the two arms).

@gibson042

gibson042 Oct 10, 2019

Author Contributor

The logic itself makes it so. "The longest sequence of consecutive code units from replacement beginning with the code unit at index k…" is either zero code units or more than zero code units. If it is more than zero, then replaceablePattern is not empty and we will append the corresponding replacement text to result. Otherwise, we will append the code unit at index k.

@jmdyck

jmdyck Oct 10, 2019

Collaborator

> "The longest sequence of consecutive code units from replacement beginning with the code unit at index k…" is either zero code units or more than zero code units.

Based on that quote, it could be, but the part you elided is `such that _replaceablePattern_ matches the "Code units" column of a row in [Table 53]`, and an empty sequence of code units does not match the "Code units" column of any row in the table.
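The scanning loop under discussion can be sketched roughly as follows. This is a simplified illustration, not the spec text: it covers only `$$`, `$&`, `$n`, and `$nn`, and the names `matchPattern` and `expandReplacement` are hypothetical. It makes the "no match" case explicit by returning `null` rather than relying on an empty sequence.

```javascript
// Simplified sketch; helper names are hypothetical, and only the
// "$$", "$&", "$n", and "$nn" patterns are handled.
const isDigit = (c) => c !== undefined && c >= "0" && c <= "9";
const captureText = (c) => (c === undefined ? "" : c);

// Returns { length, text } for the longest replaceable pattern that
// starts at index k, or null when no pattern matches there.
function matchPattern(replacement, k, matched, captures) {
  if (replacement[k] !== "$") return null;
  const next = replacement[k + 1];
  if (next === "$") return { length: 2, text: "$" };
  if (next === "&") return { length: 2, text: matched };
  if (!isDigit(next)) return null;

  const m = captures.length;
  // Try the two-digit form first, because the longest match wins.
  const next2 = replacement[k + 2];
  if (isDigit(next2)) {
    const nn = Number(next + next2);
    if (nn >= 1 && nn <= m) {
      return { length: 3, text: captureText(captures[nn - 1]) };
    }
  }
  // Fall back to the one-digit form; a following digit stays literal.
  const n = Number(next);
  if (n === 0) return null; // "$0" is not a replaceable pattern
  if (n > m) return { length: 2, text: "$" + next }; // left unchanged
  return { length: 2, text: captureText(captures[n - 1]) };
}

function expandReplacement(replacement, matched, captures) {
  let result = "";
  let k = 0;
  while (k < replacement.length) {
    const hit = matchPattern(replacement, k, matched, captures);
    if (hit === null) {
      // No pattern matches here: copy one code unit through unchanged.
      result += replacement[k];
      k += 1;
    } else {
      result += hit.text;
      k += hit.length;
    }
  }
  return result;
}

console.log(expandReplacement("[$10]", "b", ["b"])); // [b0]
```

Returning `null` for "no match" is one way to express the alternative wording suggested above; the loop then has two clearly separated arms.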

If the MV of _n_ &gt; _m_, the replacement is the matched code units (equivalent to no replacement).
<br>
Otherwise, if the element in _captures_ at index equal to the MV of _n_ minus 1 is *undefined*, the replacement is the empty String.
<br>
Otherwise, the replacement is the element in _captures_ at index equal to the MV of _n_ minus 1.
Comment for lines +29953  – +29957

@jmdyck

jmdyck Oct 9, 2019

Collaborator

If you were to use an `<emu-alg>`, that would allow (something like) `Let _nmv_ be the MV of _n_`, which you could then use at 3 points.

It would also let you replace `the element in _captures_ at index equal to the MV of _n_ minus 1` with `_captures_[_nmv_ - 1]`.

For actual names, I think I'd recommend _N_ for the code unit (which matches the variable in column 1), and then _n_ for its MV.

@gibson042

gibson042 Oct 10, 2019

Author Contributor

I considered <emu-alg>, but ultimately preferred the appearance without it: https://deploy-preview-1732--ecma262-snapshots.netlify.com/#table-45 (and I'd get rid of the <emu-alg> for $<…> if I could find a good way to do so). As always, though, I'm willing to change if there's consensus.

As for N vs. n, we must use the latter because the former is not defined as the expansion of a lexical grammar production and thus has no MV.

@jmdyck

jmdyck Oct 10, 2019

Collaborator

> As for N vs. n, we must use the latter because the former is not defined as the expansion of a lexical grammar production and thus has no MV.

Well, of course, you'd need to change the name in column 2 as well. Sorry that wasn't clear. Anyhow, the point there was to free up the name _n_ for the MV of _N_, but that isn't needed if you don't convert to <emu-alg>.
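For comparison, the three `$n` clauses written with a named intermediate value might look like the sketch below. The function name `resolveSingleDigit` is hypothetical, and it assumes `digit` is one of `"1"` through `"9"`; it is meant only to show how naming the MV once shortens the later steps.

```javascript
// Hypothetical helper mirroring the three "$n" clauses.
// Assumes `digit` is a single code unit from "1" to "9" and that
// `captures` holds the m captures of the match.
function resolveSingleDigit(digit, captures) {
  const nmv = Number(digit); // "Let nmv be the MV of n"
  const m = captures.length;
  if (nmv > m) return "$" + digit; // equivalent to no replacement
  const capture = captures[nmv - 1];
  return capture === undefined ? "" : capture;
}

console.log(resolveSingleDigit("1", ["b"])); // b
console.log(resolveSingleDigit("2", ["b"])); // $2
```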

<br>
Otherwise, if the element in _captures_ at index equal to the MV of _nn_ minus 1 is *undefined*, the replacement is the empty String.
<br>
Otherwise, the replacement is the element in _captures_ at index equal to the MV of _nn_ minus 1.

@jmdyck

jmdyck Oct 9, 2019

Collaborator

Ditto above comment re using <emu-alg>.

Contributor Author

left a comment

Thanks for the review, @jmdyck. I've updated the PR with some of your suggestions and added explanations for why I didn't accept the others.
