Performance on big files #4776
Hi, I took some time to investigate and play with the code, and I've already got something to share. I'm using …

Problem
According to the DevTools Profiler flamechart, …

Proposed solution
Someone might need to dive into …

Problem
According to the DevTools Profiler flamechart, …

Proposed solution
Memoize getStringWidth.

I've implemented that, and during the test file run, I got some counters:

{
getStringWidthMemoizedCallCount: 49583,
getStringWidthMemoizedMapSize: 1352,
getStringWidthMemoizedHitCount: 48231,
getStringWidthMemoizedMissCount: 1352
}

If I got them right, this eliminates about 97.3% of the calls to getStringWidth (48231 hits out of 49583 calls).

I attempted to use a prototype-less object (to cut prototype chain lookup), primed with an empty-string key that is immediately deleted to switch the object to dictionary mode:

const getStringWidth = util.getStringWidth;
let getStringWidthMemoizedCallCount = 0;
let getStringWidthMemoizedMapSize = 0;
let getStringWidthMemoizedHitCount = 0;
let getStringWidthMemoizedMissCount = 0;
const getStringWidthMemoizedMap = Object.create(null, {
  "": {
    value: 0,
    configurable: true
  }
});
// force switch the object to dictionary mode
delete getStringWidthMemoizedMap[""];
function getStringWidthMemoized(str) {
  ++getStringWidthMemoizedCallCount;
  let width = getStringWidthMemoizedMap[str];
  if (width !== undefined) {
    ++getStringWidthMemoizedHitCount;
    return width;
  }
  ++getStringWidthMemoizedMissCount;
  width = getStringWidth(str);
  getStringWidthMemoizedMap[str] = width;
  ++getStringWidthMemoizedMapSize;
  return width;
}
/* eslint-disable no-console,prettier/prettier */
process.on("exit", () => {
  console.log({
    getStringWidthMemoizedCallCount,
    getStringWidthMemoizedMapSize,
    getStringWidthMemoizedHitCount,
    getStringWidthMemoizedMissCount,
  });
});
/* eslint-enable no-console,prettier/prettier */

Problem
During the test file run, no more than 2 arguments are actually passed into FastPath.prototype.map:

{
mapFastPathArgcMax: 2
}

Proposed solution
Refactor to not use `arguments` (or a rest parameter, which Babel compiles down to `arguments` access), and take a fixed set of named parameters instead:

// Similar to FastPath.prototype.each, except that the results of the
// callback function invocations are stored in an array and returned at
// the end of the iteration.
FastPath.prototype.map = function map(callback, name1, name2, name3, name4) {
  // was: function map(callback /*, name1, name2, ... */) {
  const s = this.stack;
  const origLen = s.length;
  let value = s[origLen - 1];
  const argv = [name1, name2, name3, name4]; // was: arguments
  const argc = argv.length;
  for (let i = 0; i < argc; ++i) { // was: let i = 1
    const name = argv[i]; // was: arguments[i]
    if (name === undefined) {
      break;
    }
    value = value[name];
    s.push(name, value);
  }
  // ... rest of map unchanged
One more thing I tried, maybe it's already been tried and is already optimized, please tell me if it is.
I'm a little confused which variant should theoretically be best-performing in this case: with arrays or with classes.
Proposed solution
Refactor to use a constructor function that initializes the properties in a predefined order. Use:

function Cmd(ind, mode, doc) {
  this.ind = ind;
  this.mode = mode;
  this.doc = doc;
}

cmds.push(new Cmd(makeIndent(ind, options), mode, doc.contents));

const cmds = [new Cmd(rootIndent(), MODE_BREAK, doc)];
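On the arrays-vs-classes question above: my understanding (hedged, engine-dependent) is that a constructor assigning the same properties in the same order gives every command object the same hidden class in V8, whereas positional array entries keep the slot meaning implicit. A standalone sketch comparing the two shapes, with made-up placeholder values:

```js
// Array-based entries: slot meaning is positional and implicit.
const cmdsAsArrays = [[2, "MODE_BREAK", { type: "group" }]];
const [arrInd, arrMode, arrDoc] = cmdsAsArrays.pop();
console.log(arrInd, arrMode, arrDoc.type);

// Constructor-based entries: every instance is created with the same property
// order, so property access can stay monomorphic.
function Cmd(ind, mode, doc) {
  this.ind = ind;
  this.mode = mode;
  this.doc = doc;
}
const cmdsAsObjects = [new Cmd(2, "MODE_BREAK", { type: "group" })];
const cmd = cmdsAsObjects.pop();
console.log(cmd.ind, cmd.mode, cmd.doc.type);
```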
Problem
The cache inside printAstToDoc has a very low hit rate during the test file run:

{printAstToDocCacheHitCount: 206, printAstToDocCacheMissCount: 9083}
This is awesome @sompylasar! 😎 A few comments:
- Is there any reason you’re using an object as a map instead of a real `Map`?
- Is using the spread operator (`...`) actually slower here?
- Sounds like it’s not helping much. Remove it 🔥🔥🔥
This is awesome! I can't wait to see if all those leads eventually become performance wins! For the Map inside of printAstToDoc, unfortunately we can't get rid of it. See #2259 for context. The low hit rate is misleading, as each hit prevents re-evaluating a whole sub-tree.
One concern that I still have is that I could not get reproducible time measurements across subsequent runs on my machine, so I need to establish (or someone could help out with this) a way to benchmark multiple runs, get an average run time, and then try to optimize that.
I previously saw some performance benefits from native objects' index operators compared to `Map`'s `get` and `set`.
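For the record, the `Map`-based variant being asked about would look something like this; a sketch, not measured against the object version, reusing the getStringWidth binding from the snippet above:

```js
// Memoize getStringWidth with a real Map instead of a dictionary-mode object.
const getStringWidthCache = new Map();

function getStringWidthMemoizedWithMap(str) {
  const cached = getStringWidthCache.get(str);
  if (cached !== undefined) {
    return cached;
  }
  const width = getStringWidth(str);
  getStringWidthCache.set(str, width);
  return width;
}
```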
Babel transpiles the spread operator for a "rest" argument into `arguments` access. This source:

function x(...args) {
  const x = args[0];
  for (let i = 1, ic = args.length; i < ic; ++i) {}
} "use strict";
function x() {
  var x = arguments.length <= 0 ? undefined : arguments[0];
  for (var i = 1, ic = arguments.length; i < ic; ++i) {}
}

But I recently saw an article that invalidates the recommendation to not use `arguments` …
👍
Btw, the naive memoization of …
The only hot place for getStringWidth is in a single function: https://github.com/prettier/prettier/blob/dcf44ffbdc2f403de02f12516b0c6d5d5813b16f/src/doc/doc-printer.js
If you look through the getStringWidth function, it calls through multiple packages that each create new intermediate strings. Most of the strings that are fed through the printer will be pure ASCII, so that's a lot of overhead. I don't believe that adding a cache is the right strategy here. I recommend we do the following: …
If it still shows up in the profile, we may want to do more radical things, like adding two kinds of strings: one which contains syntax (and is therefore pure ASCII and doesn't need this check) and another which is user-defined and can contain Unicode characters with varying widths.
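A rough sketch of how that split could look; entirely hypothetical, just to make the idea concrete — the doc builder would tag strings it knows are printer-generated syntax, and only user-provided strings would pay for the full width check:

```js
// Hypothetical tagged-string sketch: syntax strings are known-ASCII, so their
// width is just .length; only user strings go through the unicode-aware check.
const stringWidth = require("string-width");

function syntaxString(text) {
  return { text, isSyntax: true };
}

function userString(text) {
  return { text, isSyntax: false };
}

function taggedStringWidth(tagged) {
  return tagged.isSyntax ? tagged.text.length : stringWidth(tagged.text);
}

console.log(taggedStringWidth(syntaxString("=> {"))); // 4, no scan needed
console.log(taggedStringWidth(userString("👍 ok"))); // full width-aware path
```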
Yes, that's what I noticed, too.
👍
BTW, for those who are interested why …
@vjeux I gave your recommendation about the ASCII shortcut a try.
This fast path check is still a full scan on every string that comes in (possibly with a …). What do you think?
Most strings are tiny, so big-O complexity doesn’t really apply here. Scanning a string to see if all characters are ASCII is one of the fastest operations a CPU can do, faster than allocating memory to put into a hash map and doing a lookup. But I don’t really trust my instinct on those things; just build it and see if it changes anything in the perf. As for your comment about startup time: we keep the node process alive, so I’m not super interested in this.
- `--debug-benchmark` uses the `benchmark` module to produce statistically significant time measurements.
- `--debug-repeat` uses a naive loop and measures just the average run time, but is useful for profiling to highlight hot functions.
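For example (mirroring the command used for the measurements later in this thread; the input file name is just a placeholder for whichever big file you're profiling):

```
cat big-file.js | NODE_ENV=production node ./dist/bin-prettier.js --stdin-filepath big-file.js --loglevel debug --debug-repeat 1000 > /dev/null
```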
Yes, I hear you. Although I believe it should not copy the string into the hashmap if the key string is already in the JS heap (but no hard evidence of that).
Yes, that's why I started from measuring, and then tweaked the code to find opportunities to optimize.
Good to know; this means the benchmark options I added are relevant for measuring just the function you're interested in optimizing. Please have a look (tell me if you want to see it in a PR, and whether the PR should include the benchmark options commit):

// assumed imports for this snippet
const stringWidth = require("string-width");
const emojiRegex = require("emoji-regex")();

// eslint-disable-next-line no-control-regex
const notAsciiRegex = /[^\x20-\x7F]/;
function getStringWidth(text) {
  if (!text) {
    return 0;
  }

  // shortcut to avoid needless string `RegExp`s, replacements, and allocations within `string-width`
  if (!notAsciiRegex.test(text)) {
    return text.length;
  }

  // emojis are considered 2-char width for consistency
  // see https://github.com/sindresorhus/string-width/issues/11
  // for the reason why not implemented in `string-width`
  return stringWidth(text.replace(emojiRegex, "  "));
}
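A couple of hypothetical spot checks of the shortcut's behavior (not from the test suite):

```js
console.log(getStringWidth("const x = 1;")); // 12: pure ASCII, fast path returns text.length
console.log(getStringWidth("👍")); // 2: emoji normalized before string-width, per the comment above
console.log(getStringWidth("")); // 0: falsy shortcut
```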
Before: total run time for 1000 repetitions for …

After: total run time for 1000 repetitions for …
Wow! 20% faster by adding a trivial test :) Ship it!
So why wasn't this change incorporated in #4779? I'm measuring it now (see Before and After below), and it gives a minor performance improvement on an unbundled (development) run.

Before:

function concat(parts) {
  if (process.env.NODE_ENV !== "production") {
After:

const isProduction = process.env.NODE_ENV === "production";
const isNotProduction = !isProduction;
function concat(parts) {
  if (isNotProduction) {

(Note: I hoisted the negation as well.)
Okay, I take this back – the bundled version starts having this …
Also, (IMO) the performance of the unbundled/development versions shouldn’t be as important as the performance of the bundled/production version.
Agree, but for the sake of quick iteration on the code, it's nice to have NODE_ENV=production source runs be not much worse than bundled runs. The build takes quite some time, and I don't currently know of a way to build just one variant. Of course, it's more correct to measure the build rather than the source as I do, since there's an optimizing compiler in place (rollup). Maybe someone could split and speed up the build to improve development iteration times.
@sompylasar the build system has a cache now. It should only recompile an artifact if:

- one of the source files included in the artifact was changed;
- …
We probably need to update CONTRIBUTING.md to reflect this.
@duailibe "one of the source files included in the artifact was changed;" – right, and this is what I'm doing by working on this ticket, changing the source files. 😄
@sompylasar I'm aware, but you're likely to change files that will only affect …
In addition to a tiny performance improvement outlined below, the CPU profile of traverseDoc is now more readable. Also, anonymous arrow functions were changed to named regular functions so that they are properly displayed in the CPU profile, and moved to the outer scope where there's no closure so that they aren't re-created (the performance effect of this change depends on JS engine implementation and optimization details).

Before (profile):

```
7129.9 ms   5.43 %   13349.9 ms   10.18 %   traverseDocRec
7067.4 ms   5.39 %   11285.5 ms    8.60 %   traverseDocRec
  31.5 ms   0.02 %    1031.9 ms    0.79 %   traverseDoc
  23.6 ms   0.02 %   12313.4 ms    9.39 %   traverseDoc
   2.6 ms   0.00 %       0.3 ms    0.00 %   (anonymous)
   1.7 ms   0.00 %       1.7 ms    0.00 %   call
   1.6 ms   0.00 %       1.6 ms    0.00 %   call
   0.5 ms   0.00 %       0.5 ms    0.00 %   conditionalGroup
   0.4 ms   0.00 %       0.4 ms    0.00 %   printDocToString$1
   0.1 ms   0.00 %       0.1 ms    0.00 %   printGenerically
   0.1 ms   0.00 %       0.1 ms    0.00 %   t
   0.1 ms   0.00 %       0.1 ms    0.00 %   ifBreak
   0.1 ms   0.00 %       0.1 ms    0.00 %   (anonymous)
     0 ms      0 %       0.1 ms    0.00 %   forEach
```

After (profile):

```
6937.9 ms   5.37 %   12872.5 ms    9.97 %   traverseDoc
5944.0 ms   4.60 %   11047.3 ms    8.55 %   propagateBreaks
 735.7 ms   0.57 %    1358.3 ms    1.05 %   findInDoc
 257.9 ms   0.20 %     466.7 ms    0.36 %   findInDoc
   0.1 ms   0.00 %       0.1 ms    0.00 %   has
   0.1 ms   0.00 %       0.1 ms    0.00 %   printArgumentsList
```

Before (performance):

```
cat ../LspLanguageService.js | NODE_ENV=production node --inspect-brk ./dist/bin-prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/4b52c027-ef62-49d6-8770-179e805a0f43
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":1000,"plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier-2/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug] "repeat": 1000,
[debug] "hz": 7.774598830700336,
[debug] "ms": 128.624
[debug] }
```

After (performance):

```
cat ../LspLanguageService.js | NODE_ENV=production node --inspect-brk ./dist/bin-prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/aa76e134-a68c-44ed-89a8-efb68bc46baa
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":1000,"plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug] "repeat": 1000,
[debug] "hz": 7.888114977163907,
[debug] "ms": 126.773
[debug] }
```
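The pattern described above (arrow callbacks created per call vs. named functions hoisted out of the closure) looks roughly like this; a generic sketch with hypothetical names, not the actual traverseDoc diff:

```js
// Before-style: anonymous arrow callbacks are re-created on every call and
// show up as "(anonymous)" in the CPU profile.
function propagateBreaksBefore(docs) {
  docs.forEach((doc) => {
    doc.visited = true;
  });
}

// After-style: a named function at module scope; the profiler can attribute
// time to "markVisited", and the callback isn't re-allocated per call.
function markVisited(doc) {
  doc.visited = true;
}
function propagateBreaksAfter(docs) {
  docs.forEach(markVisited);
}

propagateBreaksBefore([{ type: "group" }]);
propagateBreaksAfter([{ type: "group" }]);
```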
New improvement 👉 #4848
New npm scripts and documentation for performance benchmarks 👉 #4846
@vjeux What's the status of this work in 2020? Prettier is still very slow for me on big HTML files. Multiple seconds per formatting round.
@sanmai-NL would you be able to share this file by any chance? It may be an issue with a specific pattern in your file that we may be able to address.
@vjeux On https://han-aim.gitlab.io/dt-sd-asd/materials/ADP/ you'll find the rendered HTML, but the problem occurs with the source as well. (Latest stable VS Code and Prettier.)
I had a quick look; this HTML is slow because of …
Rome has been released with formatting performance many times better than Prettier's. Maybe it's time to make performance more of a priority.
Formatting 100 MB of JSON with prettier takes 100+ seconds and 16 GB of RAM, while …
I just found the performance metrics of prettier inside of Nuclide at Facebook:
It's not awesome but totally reasonable. However, for big files like this one (3k lines) it consistently takes multiple seconds.
Now that prettier is in good shape (not changing as often) and massively successful, it would be nice to start spending time on performance improvements. As far as I know, outside of fixing exponential complexity, we haven't spent any time trying to optimize it.
This issue is a call for help. I'm sure that there's a ton of optimizations that can be done to make prettier faster, and now would be an awesome time to start working on them.
Here's the challenge: what can we do to make the following file take less time to print?
Profiling what is taking time would be an awesome first step.
How to play:
Setup:
Run: