refactor(parser): reduce work parsing regexps #1999

overlookmotel · 2024-01-11T23:30:29Z

#1926 produced a small performance regression because when parsing a regexp, some work is repeated.

Hopefully this PR reverses it.

The implementation is perhaps a little messy, but I'm posting it now because I want to see what CodSpeed says.

codspeed-hq · 2024-01-11T23:42:13Z

CodSpeed Performance Report

Merging #1999 will not alter performance

Comparing overlookmotel:parser-regex (efee37e) with main (0a08686)

Summary

✅ 14 untouched benchmarks

overlookmotel · 2024-01-11T23:48:15Z

Damn it! This makes no difference whatsoever!

Any idea why performance dropped slightly on #1926? Or was it just noise in the benchmarks?

Boshen · 2024-01-12T03:36:21Z

> Damn it! This makes no difference whatsoever!
>
> Any idea why performance dropped slightly on #1926? Or was it just noise in the benchmarks?

Just noise.

I forgot that regex has its own standalone code path for getting the token. This PR is still good because it removes the weird re-parsing code from parse_regex.

There is no perf difference because this is such a minor code path, and there are probably only a few regexes in the benchmark.

overlookmotel · 2024-01-12T11:21:57Z

Thanks. One question: You said on #1926:

> In order to make things work nicely, the lexer will no longer recover
> from an invalid regex.

Was that a desired outcome, or a compromise you had to make to get it working?

If the latter, I think the old behavior could now be restored after this PR by changing the breaks back into continues in the 2nd loop in read_regex:

oxc/crates/oxc_parser/src/lexer/mod.rs

Lines 871 to 888 in aa91fde

    
    while let Some(ch @ ('$' | '_' | 'a'..='z' | 'A'..='Z' | '0'..='9')) = self.peek() {
        self.current.chars.next();
        if !ch.is_ascii_lowercase() {
            self.error(diagnostics::RegExpFlag(ch, self.current_offset()));
            break;
        }
        let flag = if let Ok(flag) = RegExpFlags::try_from(ch) {
            flag
        } else {
            self.error(diagnostics::RegExpFlag(ch, self.current_offset()));
            break;
        };
        if flags.contains(flag) {
            self.error(diagnostics::RegExpFlagTwice(ch, self.current_offset()));
            break;
        }
        flags |= flag;
    }
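
To illustrate what "changing the breaks back into continues" would mean, here is a minimal standalone sketch of a recovering flag loop. It is not the oxc implementation: `Flags`, `flag_from_char` and `read_flags_recovering` are hypothetical stand-ins for the lexer's `RegExpFlags` and `read_regex`, errors are collected as plain strings rather than diagnostics, and the reported offset is simply the position of the offending character.

```rust
/// Hypothetical stand-in for the lexer's `RegExpFlags`: a plain bit set.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Flags(u8);

impl Flags {
    /// Returns true if any bit of `other` is already set.
    pub fn contains(self, other: Flags) -> bool {
        self.0 & other.0 != 0
    }

    /// Sets the bits of `other`.
    pub fn insert(&mut self, other: Flags) {
        self.0 |= other.0;
    }

    /// Raw bits, exposed for inspection.
    pub fn bits(self) -> u8 {
        self.0
    }
}

/// Maps a flag character to its bit, or `None` for an unknown flag.
/// Uppercase letters, digits, `$` and `_` all fall into the `None` case.
fn flag_from_char(ch: char) -> Option<Flags> {
    let bit = match ch {
        'g' => 1 << 0,
        'i' => 1 << 1,
        'm' => 1 << 2,
        's' => 1 << 3,
        'u' => 1 << 4,
        'y' => 1 << 5,
        'd' => 1 << 6,
        'v' => 1 << 7,
        _ => return None,
    };
    Some(Flags(bit))
}

/// Reads regex flags from the start of `source`, recording an error for
/// each bad flag but carrying on with the next character (`continue`)
/// instead of abandoning the loop (`break`).
pub fn read_flags_recovering(source: &str) -> (Flags, Vec<String>) {
    let mut flags = Flags::default();
    let mut errors = Vec::new();

    for (offset, ch) in source.char_indices() {
        // Stop at the first character that cannot be part of the flags.
        if !(ch == '$' || ch == '_' || ch.is_ascii_alphanumeric()) {
            break;
        }
        let Some(flag) = flag_from_char(ch) else {
            errors.push(format!("invalid flag `{ch}` at offset {offset}"));
            continue; // recover: skip this character, keep reading
        };
        if flags.contains(flag) {
            errors.push(format!("duplicate flag `{ch}` at offset {offset}"));
            continue; // recover: ignore the repeat, keep reading
        }
        flags.insert(flag);
    }

    (flags, errors)
}
```

With this shape, an input such as `gXig` still yields the valid `g` and `i` flags while reporting two errors (the unknown `X` and the repeated `g`), whereas the `break` version would stop at `X` and never see `i`.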

Boshen · 2024-01-12T12:34:59Z

> Was that a desired outcome, or a compromise you had to make to get it working?

A compromise.

But leaving it as a hard error is also fine; I never had any rules for recoverability 😅


github-actions bot added the A-parser Area - Parser label Jan 11, 2024

refactor(parser): reduce work parsing regexps

efee37e

overlookmotel force-pushed the parser-regex branch from d4cf82f to efee37e Compare January 11, 2024 23:43

Boshen merged commit c731685 into oxc-project:main Jan 12, 2024
17 checks passed

overlookmotel deleted the parser-regex branch January 12, 2024 11:24

overlookmotel mentioned this pull request Jan 12, 2024

fix(parser): restore regex flag parsing #2007

Merged

Boshen pushed a commit that referenced this pull request Jan 12, 2024

fix(parser): restore regex flag parsing (#2007)

712e99c

As discussed in #1999 (comment), this PR restores some of the regex parsing behavior to how it was prior to #1926.

IWANABETHATGUY pushed a commit to IWANABETHATGUY/oxc that referenced this pull request May 29, 2024

refactor(parser): reduce work parsing regexps (oxc-project#1999)

1af92cf

oxc-project#1926 produced a small performance regression because when parsing a regexp, some work is repeated.
