Skip to content

Parse regular expressions#4334

Merged
rv-jenkins merged 4 commits intodevelopfrom
parse-regex
May 15, 2024
Merged

Parse regular expressions#4334
rv-jenkins merged 4 commits intodevelopfrom
parse-regex

Conversation

@Scott-Guest
Copy link
Copy Markdown
Contributor

@Scott-Guest Scott-Guest commented May 9, 2024

Closes #4295

  • Implements a parser for the new regex syntax as given by the grammar in Specify and parse permitted regex syntax #4295, then re-emits for Flex.
  • Fix the regression-new test suite to also strip out absolute paths to domains.md from test outputs

All downstream semantics have already been updated to the new syntax, and if anything was missed, the new error messages should make the fix obvious.

In a follow up PR, I still plan to:

  • Update the documentation with the grammar in Specify and parse permitted regex syntax #4295.
  • Validate that all named lexical elements actually exist in the K definition. We still don't check this, and Flex will just error if the identifier doesn't exist.
  • Properly handle Unicode characters in regex
    • Flex is an 8-bit scanner, so any Unicode codepoint is treated as its sequence of bytes, and some translation is needed to get Unicode-aware semantics
    • Outside of a character class, parenthesize so r"😊*" becomes r"(\xF0\x9F\x98\x8A)*"
    • Inside a character class, convert to an explicit | so r"[😊ab]" becomes r"(\xF0\x9F\x98\x8A)|[ab]"
    • Character ranges and negated character classes don't have a straightforward translation, so just report an error

@Scott-Guest Scott-Guest self-assigned this May 9, 2024
@Scott-Guest Scott-Guest marked this pull request as ready for review May 14, 2024 19:30
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who needs Scala? 😛

Copy link
Copy Markdown
Contributor

@Baltoli Baltoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design here is really nice; great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants