BUGFIX: Fusion parser fix multi line comment #4882

Merged
merged 4 commits into from Feb 11, 2024

@mhsdesign mhsdesign commented Feb 8, 2024

This fixes a bug where the Fusion parser would not correctly parse C-style comments like the following:

/**
comment with multiple stars even
**/

This happened when the number of * characters before the closing / was even. Ending a comment with ***/, for example, already worked before this change.
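
For illustration only, here is a small Python sketch of how such an even/odd dependency can arise. The pattern below is hypothetical and is not taken from the Neos source; it simply consumes each star together with the character after it, which is one way to reproduce the symptom described above.

```python
import re

# Hypothetical pattern (an assumption, NOT the previous Neos lexer regex):
# inside the loop, every '*' is consumed together with the next non-'/'
# character, so stars are effectively eaten in pairs.
PAIRWISE = re.compile(r"/\*[^*]*(?:\*[^/][^*]*)*\*/")


def parses(comment: str) -> bool:
    """Return True if the whole string is matched as a single comment."""
    return PAIRWISE.fullmatch(comment) is not None


for sample in ("/* x */", "/** x **/", "/** x ***/"):
    print(repr(sample), parses(sample))
```

With this pattern, a comment closed by **/ is rejected because the last star before the slash has already been consumed as the "character after a star", while */ and ***/ still match.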

Now we use the "correct" regex from Jeffrey E.F. Friedl's book "Mastering Regular Expressions" (page 272, "Unrolling C Comments").
We already use his regex for string matching, and it is really fast thanks to the unrolled loop. It is faster than the lazy quantifier ~^/\*.*?\*/~s.
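
As a rough sketch of the two approaches, the following Python snippet puts the unrolled pattern as it is commonly quoted from Friedl's book next to the lazy-quantifier pattern mentioned above. Python's `re` module stands in for PHP's PCRE here, and the helper name `lex_comment` is made up for this example; the exact pattern used in the Fusion lexer may differ.

```python
import re

# Unrolled-loop pattern for C comments as commonly quoted from Friedl's book.
UNROLLED = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

# Lazy-quantifier alternative; re.DOTALL plays the role of the PCRE "s" flag.
LAZY = re.compile(r"/\*.*?\*/", re.DOTALL)


def lex_comment(pattern, source):
    """Return the comment at the start of source, or None (hypothetical helper)."""
    match = pattern.match(source)  # match() anchors at the start, like ^
    return match.group(0) if match else None


source = "/**\ncomment with multiple stars even\n**/\ntrailing = 1"
print(repr(lex_comment(UNROLLED, source)))
print(repr(lex_comment(LAZY, source)))
```

Both patterns stop at the first */ after the opener, so they return the same comment text for this sample. They differ only in how the engine gets there.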

I ran a performance test with 1 million iterations on three different comment samples (each sample has a dynamic part to defeat possible caches):

Unrolled (this PR)    Simple lazy quantifier
0.143725s             0.160235s
0.181047s             0.203759s
0.156254s             0.170144s
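
A comparable measurement could be set up roughly like this. This is a Python sketch using `timeit`, not the script behind the numbers above; the sample text, the function name `bench`, and the default iteration count are assumptions, and absolute timings in Python will not match the PHP results.

```python
import re
import timeit

UNROLLED = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")
LAZY = re.compile(r"/\*.*?\*/", re.DOTALL)


def bench(pattern, iterations=100_000):
    """Time `iterations` matches against a subject that changes every run."""
    counter = 0

    def run():
        nonlocal counter
        counter += 1
        # The counter is the dynamic part: no two subjects are identical.
        subject = f"/**\n * sample {counter} with stars\n **/ rest"
        if pattern.match(subject) is None:
            raise AssertionError("comment was not matched")

    return timeit.timeit(run, number=iterations)


if __name__ == "__main__":
    print("unrolled:", bench(UNROLLED))
    print("lazy:    ", bench(LAZY))
```

Building the subject string is part of the timed loop in this sketch, so it measures a fixed overhead on both sides; only the difference between the two results is meaningful.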

Additionally, the error message for comments starting with /** was improved. Previously, $nextLine->char(1) would return ** instead of just one * because it was implemented incorrectly.

Upgrade instructions

Review instructions

Checklist

  • Code follows the PSR-2 coding style
  • Tests have been created, run and adjusted as needed
  • The PR is created against the lowest maintained branch
  • Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
  • Reviewer - The first section explains the change briefly for change-logs
  • Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions

The character class `[^/*]` can be simplified to `[^/]`.

A star `*` can never be encountered at this position because, regardless of the entry point, all stars have already been consumed by the preceding `\*+`.
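
One way to sanity-check this suggestion is to run both variants over a handful of inputs and compare the results. The Python sketch below does that; the sample list and the helper name `first_comment` are made up for this example, and agreement on these samples is a spot check rather than a proof.

```python
import re

# Variant with the original character class [^/*] inside the loop.
STRICT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")
# Variant with the suggested simplification [^/].
SIMPLIFIED = re.compile(r"/\*[^*]*\*+(?:[^/][^*]*\*+)*/")

SAMPLES = [
    "/* x */",
    "/** x **/",
    "/***/",
    "/* a **/ b */",
    "/* a * b ** c ***/ tail",
    "/* unterminated **",
    "/*/",
]


def first_comment(pattern, source):
    """Return the comment matched at the start of source, or None."""
    match = pattern.match(source)
    return match.group(0) if match else None


for sample in SAMPLES:
    strict = first_comment(STRICT, sample)
    simplified = first_comment(SIMPLIFIED, sample)
    print(repr(sample), strict == simplified)
```

Because `\*+` is greedy, the engine first tries the path where all stars are consumed before the character class is tested, which is why the two variants agree on these inputs.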

@markusguenther markusguenther left a comment

I can confirm that it runs and does what the description says. As I am not a Fusion parser expert, I can't judge whether it is the best solution.

@mhsdesign mhsdesign merged commit 92bb4e2 into neos:8.0 Feb 11, 2024
7 checks passed
@mhsdesign mhsdesign deleted the bugfix/fusionParserCommentLexing branch February 11, 2024 14:03

5 participants