Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: correctly determine flags for static RegExp parsing #1464

Open
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@bakkot
Copy link
Contributor

bakkot commented Mar 2, 2019

Fixes #1343 according to the approach discussed therein; see that issue for full context, but briefly, the current spec text says to do a parse according to "the goal symbol Pattern", which is meaningless in the current spec because Pattern has flags, which must be provided. Determining these flags is a little involved.

As it says there, this is heavily duplicated with the runtime semantics for RegExpInitialize. If the editors think this is worth noting, or avoiding, I can try to come up with something.

I'd particularly appreciate reviews from @jmdyck and @waldemarhorwat.

@ljharb ljharb requested review from jmdyck and waldemarhorwat Mar 3, 2019

@ljharb
Copy link
Member

ljharb left a comment

This seems good to me (pending requested reviews), but it does seem good to maybe create a separate abstract operation IsValidBuiltInRegExpFlags for the repeated gimsuy logic, but I'm not sure how much else could be cleanly deduped.

@ljharb ljharb requested review from zenparsing and tc39/ecma262-editors Mar 3, 2019

1. If FlagText of _literal_ contains `"u"`,
1. If _P_ cannot be recognized using the goal symbol |Pattern[+U, +N]| of the ECMAScript RegExp grammar specified in <emu-xref href="#sec-patterns"></emu-xref>, return *false*.
1. Otherwise, return *true*.
1. Parse _P_ using the grammars in <emu-xref href="#sec-patterns"></emu-xref>. The goal symbol for the parse is |Pattern[~U, ~N]|. If the result of parsing contains a |GroupName|, reparse with the goal symbol |Pattern[~U, +N]| and use this result instead. If _P_ did not conform to the grammar, if any elements of _P_ were not matched by the parse, or if any Early Error conditions exist, return *false*. Otherwise, return *true*.

This comment has been minimized.

@zenparsing

zenparsing Mar 6, 2019

Member

This is a mouthful, and I can see it's basically duplicated over here. An improvement in the future might be to attempt some common factoring (but it might also be awkard due to runtime/static differences).

This comment has been minimized.

@jmdyck

jmdyck Mar 10, 2019

Collaborator

I've found a way to factor out the common bits. I can submit that as a follow-up.

Show resolved Hide resolved spec.html
Show resolved Hide resolved spec.html
1. If FlagText of _literal_ contains `"u"`,
1. If _P_ cannot be recognized using the goal symbol |Pattern[+U, +N]| of the ECMAScript RegExp grammar specified in <emu-xref href="#sec-patterns"></emu-xref>, return *false*.
1. Otherwise, return *true*.
1. Parse _P_ using the grammars in <emu-xref href="#sec-patterns"></emu-xref>. The goal symbol for the parse is |Pattern[~U, ~N]|. If the result of parsing contains a |GroupName|, reparse with the goal symbol |Pattern[~U, +N]| and use this result instead. If _P_ did not conform to the grammar, if any elements of _P_ were not matched by the parse, or if any Early Error conditions exist, return *false*. Otherwise, return *true*.

This comment has been minimized.

@jmdyck

jmdyck Mar 10, 2019

Collaborator

I've found a way to factor out the common bits. I can submit that as a follow-up.

@bakkot

This comment has been minimized.

Copy link
Contributor Author

bakkot commented Mar 10, 2019

@jmdyck Addressed those two; thanks!

Show resolved Hide resolved spec.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.