🚀 Rework unesc for a 63+% performance boost to all of postcss. #239

Merged
merged 2 commits into from Apr 19, 2021

Conversation

samccone
Contributor

@samccone samccone commented Apr 9, 2021

While profiling postcss I found that a significant amount of time was being
spent in unesc. This was due to the expensive regex checks that were
performed on the fly for every selector in the codebase, which appear to perform quite poorly in modern Node and V8.

Old

image


As an experiment, and based on some prior experience with this class of slowdown, I migrated the implementation to one that performs a single scan through the string instead of running a regex replace. Testing this on my local application, I immediately saw the time spent in this function drop from > 900ms to ~100ms.

New

image

This implementation passes all of the existing test cases and aims to mirror the prior implementation's behavior :)
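For readers unfamiliar with the technique, here is a minimal sketch of what "scan through the string instead of running a regex replace" can look like. It is not the code merged in this PR: the names `isHexDigit` and `unescScan` are made up for illustration, and spec edge cases (zero values, lone surrogates) are deliberately left out apart from a guard that keeps out-of-range values from throwing.

```javascript
// Illustrative sketch only; not the implementation from this PR.
// Decodes CSS-style backslash escapes in a single left-to-right pass.
function isHexDigit(code) {
    return (
        (code >= 48 && code <= 57) || // 0-9
        (code >= 65 && code <= 70) || // A-F
        (code >= 97 && code <= 102) // a-f
    );
}

function unescScan(str) {
    // Fast path: most selectors contain no escapes at all.
    if (str.indexOf("\\") === -1) {
        return str;
    }
    let out = "";
    let i = 0;
    while (i < str.length) {
        // Copy ordinary characters (and a trailing lone backslash) as-is.
        if (str[i] !== "\\" || i + 1 >= str.length) {
            out += str[i];
            i++;
            continue;
        }
        // Collect up to six hex digits after the backslash.
        let end = i + 1;
        while (end < str.length && end - i <= 6 && isHexDigit(str.charCodeAt(end))) {
            end++;
        }
        if (end > i + 1) {
            const cp = parseInt(str.slice(i + 1, end), 16);
            // Guard so String.fromCodePoint never throws on huge values.
            out += cp > 0x10ffff ? "\uFFFD" : String.fromCodePoint(cp);
            // One space after a hex escape terminates it and is consumed.
            if (str[end] === " ") {
                end++;
            }
            i = end;
        } else {
            // Non-hex escape such as "\:" keeps the escaped character.
            out += str[i + 1];
            i += 2;
        }
    }
    return out;
}
```

The early `indexOf` check means strings without a backslash are returned without any allocation, which is the common case for most selectors.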


In my application, the biggest wins come from purgecss, where this change drops my total application build time by multiple seconds! 🔥


Perf testing

I set up a simple perf test here:
samccone@be99c23 which takes all of tailwind's css selectors and then unescapes them 100 times.

(left old, right new)
image

Previously this simple test took 20.17ms; now it takes 7.1ms, a reduction of roughly 65%.
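A timing loop of the kind described above could be sketched as follows. This is not the linked perf test: `timeRepeated` is a hypothetical helper, and the function and selector list passed to it stand in for the real `unesc` and the selectors extracted from Tailwind.

```javascript
// Hypothetical harness, assuming Node.js (process.hrtime.bigint).
// Calls `fn` on every input, `repeats` times, and returns elapsed milliseconds.
function timeRepeated(fn, inputs, repeats) {
    const start = process.hrtime.bigint();
    for (let r = 0; r < repeats; r++) {
        for (const input of inputs) {
            fn(input);
        }
    }
    const elapsedNs = process.hrtime.bigint() - start;
    return Number(elapsedNs) / 1e6;
}

// Example call with placeholder data (not Tailwind's real selector list).
const sampleSelectors = [".md\\:flex", ".hover\\:underline", ".w-1\\/2"];
const elapsedMs = timeRepeated((s) => s.length, sampleSelectors, 100);
console.log(`100 passes took ${elapsedMs.toFixed(3)}ms`);
```

A single timed run like this is noisy, so it is best read as a rough indication rather than a precise measurement.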

Finally, based on feedback, I compared the new implementation, the old implementation, and Chrome's implementation:

original 
        avg: 4.765ms
        std: 2.5159ms
        max: 35ms
        min: 3ms
        
new 
        avg: 1.76ms
        std: 0.7915ms
        max: 12ms
        min: 1ms
        
chrome 
        avg: 4.383ms
        std: 1.7602ms
        max: 34ms
        min: 3ms

You can run the test case here: https://github.com/samccone/postcss-selector-parser/blob/sjs/perf/perf/index.js. The testing is similar to what I did before, but now I run each test case 1000 times.
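The avg/std/max/min figures above can be derived from a list of per-run timings roughly like this. The helper names are hypothetical, and the use of the population standard deviation (dividing by n) is an assumption; the linked script may compute it differently.

```javascript
// Sketch: summarize an array of per-run timings in milliseconds.
// Assumption: population standard deviation (divide by n, not n - 1).
function summarize(samples) {
    const n = samples.length;
    const avg = samples.reduce((sum, x) => sum + x, 0) / n;
    const variance = samples.reduce((sum, x) => sum + (x - avg) ** 2, 0) / n;
    return {
        avg,
        std: Math.sqrt(variance),
        max: Math.max(...samples),
        min: Math.min(...samples),
    };
}

// Relative time saved when going from `before` to `after`, in percent.
function percentReduction(before, after) {
    return ((before - after) / before) * 100;
}

// Using the averages reported above: original 4.765ms, new 1.76ms.
console.log(percentReduction(4.765, 1.76).toFixed(1) + "%");
```

With the reported averages, the reduction works out to about 63%, which lines up with the "63+%" in the PR title.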

Finally I tested this on tailwind and saw the wins propagating to the ecosystem with this patch!
https://twitter.com/samccone/status/1381423236253032454

@coveralls

coveralls commented Apr 9, 2021

Coverage Status

Coverage increased (+0.04%) to 95.418% when pulling 5c6c988 on samccone:master into 96a85e3 on postcss:master.


@mathiasbynens mathiasbynens left a comment


Impressive investigation + speed-up! I hope you don’t mind I took a look and left some comments with further ideas — WDYT?

@alexander-akait
Collaborator

Great, let's address the feedback above and then do a release.

@samccone samccone force-pushed the master branch 2 times, most recently from 0e00b18 to a95c285 Compare April 9, 2021 21:36
@samccone samccone changed the title 🚀 Rework unesc for a 80+% performance boost to all of postcss. 🚀 Rework unesc for a 95+% performance boost to all of postcss. Apr 9, 2021
@samccone
Contributor Author

samccone commented Apr 9, 2021

Thanks @mathiasbynens and @alexander-akait for the review! All updated and sped up another 10% or so!

@alexander-akait
Collaborator

@samccone @mathiasbynens Can we run some small benchmarks? It would be interesting to see how much faster or slower it is.

@samccone
Contributor Author

Hi @alexander-akait, yes: at the bottom of the original commit message I added a perf section comparing the old and new implementations.

As for reusing Chrome's regex-based implementation: this is possible, but we would need to update this project's license to include Chrome's license as well.

Finally, we can certainly run a performance test between Chrome's implementation and this one. At first glance, Chrome's implementation handles a nice edge case that we do not (invalid Unicode), but it does not handle a few other decoding cases that the existing test cases require. (Changing this, while more correct, would be a backwards-incompatible change to this core part of postcss, which would be a larger undertaking.)

I will do some further investigation today! 🚀

@samccone samccone force-pushed the master branch 3 times, most recently from ea4f37a to 9a358c7 Compare April 11, 2021 18:48
@samccone
Contributor Author

samccone commented Apr 11, 2021

Hi @alexander-akait and @mathiasbynens a few updates

  1. I updated the implementation to handle spec edge cases that the old implementation did not: escapes with a hex value of 0, lone surrogates, and out-of-bounds values (https://drafts.csswg.org/css-syntax/#maximum-allowed-code-point). This is the new commit, and new tests were added to cover these edge cases.
  2. I ran new perf tests against Chrome's implementation and found that the implementation in this change is significantly faster.
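As a rough illustration of the three edge cases in item 1, the CSS Syntax rule for consuming an escaped code point replaces a zero value, a surrogate, or a value above the maximum allowed code point with U+FFFD. The sketch below follows that rule; `decodeHexEscape` is a hypothetical name and the code in this PR may be structured differently.

```javascript
// Sketch of the CSS Syntax edge cases for hex escapes; not the PR's code.
const REPLACEMENT_CHARACTER = "\uFFFD";
const MAX_CODE_POINT = 0x10ffff;

// `hex` is the run of hex digits that followed the backslash, e.g. "1F600".
function decodeHexEscape(hex) {
    const codePoint = parseInt(hex, 16);
    const isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff;
    if (
        Number.isNaN(codePoint) || // not a hex number at all
        codePoint === 0 || // zero value
        isSurrogate || // lone surrogate
        codePoint > MAX_CODE_POINT // out of bounds
    ) {
        return REPLACEMENT_CHARACTER;
    }
    return String.fromCodePoint(codePoint);
}
```

For example, `decodeHexEscape("41")` yields `"A"`, while `"0"`, `"D800"`, and `"110000"` all yield the replacement character.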

The numbers are here:

original 
        avg: 4.765ms
        std: 2.5159ms
        max: 35ms
        min: 3ms
        
new 
        avg: 1.76ms
        std: 0.7915ms
        max: 12ms
        min: 1ms
        
chrome 
        avg: 4.383ms
        std: 1.7602ms
        max: 34ms
        min: 3ms

And the testing methodology can be viewed here https://github.com/samccone/postcss-selector-parser/blob/sjs/perf/perf/index.js#L160-L185

@samccone samccone changed the title 🚀 Rework unesc for a 95+% performance boost to all of postcss. 🚀 Rework unesc for a 63+% performance boost to all of postcss. Apr 11, 2021
Collaborator

@alexander-akait alexander-akait left a comment


/cc @mathiasbynens Can you look too? It would be nice if a couple more eyes looked at the code, thanks

@schuay

schuay commented Apr 13, 2021

To add some history: the slowness of String.prototype.replace with a global regexp and a callable replacer has been known for a long time, and unfortunately I haven't gotten around to fixing it yet. Apologies for that. The previous report was in fact in a very similar context; see nodejs/node#16986 (comment) and crbug.com/v8/7081.

I'm hoping to spend some quality time with regexp later this year. For prioritization, it would help if you filed a V8 bug report for this, just to point out that this is a real issue that still impacts real cases.

In the meantime, it makes sense to use perf workarounds. I know Leszek has been looking at one based on RegExp.p.exec (exec, unlike replace, is actually very fast in the common case), which I suppose he will post here soon.
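To make the two shapes concrete, here is a toy comparison of a global-regexp `replace` with a function replacer against an equivalent `exec` loop. The pattern is a simplified assumption, limited to four hex digits so that `String.fromCharCode` is always safe; it is not the regex used by postcss-selector-parser, and the sketch makes no performance claim of its own.

```javascript
// Toy pattern: backslash, 1-4 hex digits, optional trailing space.
const HEX_ESCAPE = /\\([0-9a-f]{1,4}) ?/gi;

// Shape 1: global regexp plus a callable replacer.
function viaReplace(str) {
    return str.replace(HEX_ESCAPE, (_, hex) =>
        String.fromCharCode(parseInt(hex, 16))
    );
}

// Shape 2: the same work done with an exec loop and manual slicing.
function viaExec(str) {
    HEX_ESCAPE.lastIndex = 0; // global regexps are stateful
    let out = "";
    let prev = 0;
    let match;
    while ((match = HEX_ESCAPE.exec(str)) !== null) {
        out += str.slice(prev, match.index);
        out += String.fromCharCode(parseInt(match[1], 16));
        prev = HEX_ESCAPE.lastIndex;
    }
    return out + str.slice(prev);
}
```

Both functions return the same string for any input; the difference is only in which regexp entry point does the matching.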

@LeszekSwirski

Right, the following:

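// Assumes `unescapeRegExp` is a global (`g`-flagged) escape-matching regexp
// defined elsewhere (not shown here): group 1 captures the escape body and
// group 2 captures an escaped whitespace character.
function regexpexec(str) {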
    let new_str = "";
    let prev_index = 0;
    let match;
    while ((match = unescapeRegExp.exec(str)) != null) {
        let escaped = match[1];
        let escapedWhitespace = match[2];
        if (match.index > prev_index) {
            new_str += str.slice(prev_index, match.index);
        }

        const high = "0x" + escaped - 0x10000;

        // NaN means non-codepoint
        // Workaround erroneous numeric interpretation of +"0x"
        // eslint-disable-next-line no-self-compare
        new_str += high !== high || escapedWhitespace
            ? escaped
            : high < 0
            ? // BMP codepoint
              String.fromCharCode(high + 0x10000)
            : // Supplemental Plane codepoint (surrogate pair)
              String.fromCharCode(
                  (high >> 10) | 0xd800,
                  (high & 0x3ff) | 0xdc00
              );
        prev_index = unescapeRegExp.lastIndex;
    }
    if (str.length > prev_index) {
        new_str += str.slice(prev_index, str.length);
    }
    return new_str;
}

gives me a 50% speedup over this manual string walk:

original
        avg: 2.592ms
        std: 0.5528ms
        max: 8ms
        min: 2ms

new
        avg: 1.578ms
        std: 0.5531ms
        max: 7ms
        min: 1ms

chrome
        avg: 2.872ms
        std: 0.4956ms
        max: 8ms
        min: 2ms

Regexp.exec
        avg: 1.002ms
        std: 0.1095ms
        max: 2ms
        min: 0ms
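The surrogate-pair arithmetic in the snippet above can be checked in isolation. The sketch below pulls it into a hypothetical helper, `toSurrogatePair`, and compares it with the built-in `String.fromCodePoint`; it is only meaningful for supplementary-plane code points (0x10000 through 0x10FFFF).

```javascript
// Same bit arithmetic as the exec-based snippet, extracted for illustration.
// Valid only for code points in the range 0x10000..0x10FFFF.
function toSurrogatePair(codePoint) {
    const high = codePoint - 0x10000;
    return String.fromCharCode(
        (high >> 10) | 0xd800, // top 10 bits -> high (lead) surrogate
        (high & 0x3ff) | 0xdc00 // low 10 bits -> low (trail) surrogate
    );
}

// U+1F600 should come out as the pair D83D DE00.
console.log(toSurrogatePair(0x1f600) === "\uD83D\uDE00");
console.log(toSurrogatePair(0x1f600) === String.fromCodePoint(0x1f600));
```

Subtracting 0x10000 leaves a 20-bit value, which is why it splits cleanly into two 10-bit halves.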

@alexander-akait
Collaborator

/cc @samccone looks like we have new faster solution 😄

@samccone
Contributor Author

samccone commented Apr 16, 2021

@LeszekSwirski unfortunately the solution you listed does not pass all of the test cases added in this change. That does not mean your solution is incorrect; rather, it reflects that the existing implementation had multiple bugs that were uncovered and fixed during this code review.

At this point I see several options

1. ignore the fixes to spec bugs that this change fixes and take the solution you wrote

(pros)

+ much faster than what is in place

(cons)

- carries forward the incorrect behavior of unesc

2. Take the current implementation

(pros)

+ much faster than what is in place 
+ more spec compliant and correct

(cons)

- not as fast as option 1

3. (maybe 2.a) follow option 2 and follow up with further perf improvements

(pros)

+ helpful to the ecosystem
+ much faster than what is currently released
+ more spec compliant

(cons)

- we know there is still room for improvement through further refactoring

@alexander-akait my personal recommendation would be to follow option 3, with continued community investment to keep iterating toward the optimal solution.

Please all let me know how you would like to move forward.

@samccone
Contributor Author

@schuay v8 bug filed! https://bugs.chromium.org/p/v8/issues/detail?id=11664

@alexander-akait alexander-akait merged commit 1012e3a into postcss:master Apr 19, 2021
@alexander-akait
Collaborator

Big thanks

@LeszekSwirski

@samccone Makes sense, although fixing that V8 performance issue of course won't help with the correctness issues you found here.
