
Performance on big files #4776

Open
vjeux opened this issue Jun 30, 2018 · 33 comments
Labels
help wanted (We're a small group who can't get to every issue promptly. We'd appreciate help fixing this issue!) · type:perf (Issue with performance of Prettier)

Comments

@vjeux
Contributor

vjeux commented Jun 30, 2018

I just looked up the performance metrics of prettier inside of Nuclide at Facebook:

  • p50: 122ms
  • p75: 203ms
  • p90: 377ms

It's not awesome but totally reasonable. However, for big files like this one (3k lines) it consistently takes multiple seconds.

Now that prettier is in good shape (not changing as often) and massively successful, it would be nice to start spending time on performance improvements. As far as I know, outside of fixing exponential complexity, we haven't spent any time trying to optimize it.

This issue is a call for help. I'm sure there are a ton of optimizations that can be done to make prettier faster, and now would be an awesome time to start working on them.

Here's the challenge: what can we do to make the following file take less time to print?

Profiling what is taking time would be an awesome first step.

How to play:

Setup:

git clone https://github.com/prettier/prettier.git
cd prettier
yarn
curl -O https://raw.githubusercontent.com/facebook/nuclide/master/pkg/nuclide-vscode-language-service-rpc/lib/LspLanguageService.js

Run:

time ./bin/prettier.js LspLanguageService.js > /dev/null
real	0m1.586s
lydell added the help wanted label Jun 30, 2018
ikatyang added the type:perf label Jun 30, 2018
@sompylasar
Contributor

sompylasar commented Jun 30, 2018

Hi, I took some time to investigate and play with the code, and I've already got something to share.

I'm using node --inspect-brk with the DevTools Profiler, and node --prof / node --prof-process, to get some internal profiling.
I haven't tried node --trace-deopt --trace-opt yet, but that'd be nice to check out, too.


Problem

According to the DevTools Profiler flamechart, Module._compile via Module._extensions takes a noticeable chunk of startup time (245ms of about 800ms total). Of that, about 143ms is taken by common/load-plugins, of which about 66ms is taken by third-party, where cosmiconfig's loadRc requires js-yaml, which requires js-yaml/schema, which requires js-yaml/type/js/function, which requires a webpack-packed esprima.

Proposed solution

Someone might need to dive into cosmiconfig to understand why it initializes js-yaml if it does not need it, and into js-yaml to optimize its module parse time.
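To sketch the cosmiconfig side of this (a hypothetical example, not cosmiconfig's actual code; the function name is made up), the idea would be to defer requiring js-yaml until a YAML config is actually parsed:

let yaml = null;

function parseYamlConfig(text) {
  if (yaml === null) {
    // Loaded lazily on first use instead of at startup,
    // so its module parse time is only paid when needed.
    yaml = require("js-yaml");
  }
  return yaml.safeLoad(text);
}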


Problem

According to the DevTools Profiler flamechart, getStringWidth (string-width) takes a noticeable chunk of run time because of strip-ansi's ansi-regex.

Proposed solution

Memoize getStringWidth so that the same string is not processed multiple times.

I've implemented that, and during the test file run, I got some counters:

{
  getStringWidthMemoizedCallCount: 49583,
  getStringWidthMemoizedMapSize: 1352,
  getStringWidthMemoizedHitCount: 48231,
  getStringWidthMemoizedMissCount: 1352
}

If I read them right, this eliminates about 97% of the calls to the underlying getStringWidth, at the tradeoff of some memory for the hashmap.

I attempted to use a prototype-less object (to cut the prototype chain lookup), primed with a delete which, if I get this article right, should switch the object to dictionary mode so it doesn't use hidden classes. Please tell me if I'm crazy.

const getStringWidth = util.getStringWidth;
let getStringWidthMemoizedCallCount = 0;
let getStringWidthMemoizedMapSize = 0;
let getStringWidthMemoizedHitCount = 0;
let getStringWidthMemoizedMissCount = 0;
const getStringWidthMemoizedMap = Object.create(null, {
  "": {
    value: 0,
    configurable: true
  }
});
// force switch the object to dictionary mode
delete getStringWidthMemoizedMap[""];
function getStringWidthMemoized(str) {
  ++getStringWidthMemoizedCallCount;
  let width = getStringWidthMemoizedMap[str];
  if (width !== undefined) {
    ++getStringWidthMemoizedHitCount;
    return width;
  }
  ++getStringWidthMemoizedMissCount;
  width = getStringWidth(str);
  getStringWidthMemoizedMap[str] = width;
  ++getStringWidthMemoizedMapSize;
  return width;
}
/* eslint-disable no-console,prettier/prettier */
process.on("exit", () => {
  console.log({
    getStringWidthMemoizedCallCount,
    getStringWidthMemoizedMapSize,
    getStringWidthMemoizedHitCount,
    getStringWidthMemoizedMissCount,
  });
});
/* eslint-enable no-console,prettier/prettier */

Problem

FastPath (common/fast-path) uses the arguments object, which, according to my (maybe stale) knowledge, may prevent function optimization. Please tell me if I'm wrong.

During the test file run, no more than 2 arguments are actually passed into FastPath methods; I counted:

{
  mapFastPathArgcMax: 2
}

Proposed solution

Refactor to not use the arguments object. This still needs more measurements to figure out whether refactoring all the FastPath methods to not use arguments is actually an optimization; I only refactored one so far.

// Similar to FastPath.prototype.each, except that the results of the
// callback function invocations are stored in an array and returned at
// the end of the iteration.
FastPath.prototype.map = function map(callback, name1, name2, name3, name4) {
  // /*, name1, name2, ... */) {
  const s = this.stack;
  const origLen = s.length;
  let value = s[origLen - 1];

  const argv = [name1, name2, name3, name4];  // arguments
  const argc = argv.length;
  for (let i = 0; i < argc; ++i) {  // let i = 1
    const name = argv[i];  // arguments[i]
    if (name === undefined) {
      break;
    }
    value = value[name];
    s.push(name, value);
  }

@sompylasar
Contributor

One more thing I tried. Maybe it's already been tried and is already optimized; please tell me if it is.

fits and printDocToString use a 3-item array to store commands in the queue. This does not take advantage of hidden classes to reuse object memory layouts.

I'm a little confused about which variant should theoretically perform best in this case: arrays or classes.

objects created by the same constructor and have the same set of properties assigned in the same order
[...]
use literal initializer for Arrays with mixed values

http://thlorenz.com/talks/demystifying-v8/talk.pdf

Proposed solution

Refactor to use a constructor function that initializes the properties in a predefined order. Use new to create the objects.

function Cmd(ind, mode, doc) {
  this.ind = ind;
  this.mode = mode;
  this.doc = doc;
}

// e.g. initializing the queue in printDocToString:
const cmds = [new Cmd(rootIndent(), MODE_BREAK, doc)];
// ...and wherever commands are queued:
cmds.push(new Cmd(makeIndent(ind, options), mode, doc.contents));

@sompylasar
Contributor

Problem

printAstToDoc uses a Map for its cache, keyed by nodes.
On the test file, the vast majority of lookups on this cache are misses (about 97.8% of lookups miss):

{printAstToDocCacheHitCount: 206, printAstToDocCacheMissCount: 9083}
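Counters like these can be collected by wrapping the cache lookup; a minimal sketch (illustrative names, not the actual printAstToDoc code):

const cache = new Map();
let printAstToDocCacheHitCount = 0;
let printAstToDocCacheMissCount = 0;

function printCached(node, printNode) {
  if (cache.has(node)) {
    ++printAstToDocCacheHitCount;
    return cache.get(node);
  }
  ++printAstToDocCacheMissCount;
  const doc = printNode(node);
  cache.set(node, doc);
  return doc;
}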

@j-f1
Member

j-f1 commented Jun 30, 2018

This is awesome @sompylasar! 😎

A few comments:

Memoize getStringWidth so that the same string is not processed multiple times.

Is there any reason you’re using an object as a map instead of a real Map? Is the object more performant?

Refactor [FastPath] to not use arguments object.

Is using the spread operator (...names) just as bad as using arguments? If it’s faster, we might want to use that to preserve the original behavior of the function.

printAstToDoc uses Map for its cache keyed by nodes.

Sounds like it’s not helping much. Remove it 🔥🔥🔥

@vjeux
Contributor Author

vjeux commented Jun 30, 2018

This is awesome! I can't wait to see if all those leads eventually become performance wins!

For the Map inside of printAstToDoc, unfortunately we can't get rid of it. See #2259 for context. The low hit rate is misleading, since each hit prevents re-evaluating an entire sub-tree.

@sompylasar
Contributor

One concern I still have is that I could not get consistent time measurements across subsequent runs on my machine, so I need to establish (or someone could help out with) a way to benchmark multiple runs to get an average run time, and then try to optimize that.

Is there any reason you’re using an object as a map instead of a real Map? Is the object more performant?

I previously saw some performance benefits from native objects' index operators compared to Map, which requires method calls. This needs more isolated measurements on this particular code base.
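For what it's worth, an isolated micro-benchmark could look something like this (a rough sketch using perf_hooks, not part of the prettier code base; results will vary by Node version):

const { performance } = require("perf_hooks");

const keys = Array.from({ length: 1000 }, (unused, i) => "key_" + i);
const obj = Object.create(null);
const map = new Map();
for (const key of keys) {
  obj[key] = key.length;
  map.set(key, key.length);
}

function bench(label, lookup) {
  const start = performance.now();
  let sink = 0;
  for (let i = 0; i < 1e6; ++i) {
    sink += lookup(keys[i % keys.length]);
  }
  // `sink` is logged so the lookups aren't optimized away
  console.log(label, (performance.now() - start).toFixed(1) + "ms", sink);
}

bench("object index", key => obj[key]);
bench("Map.get", key => map.get(key));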

Is using the spread operator (...names) just as bad as using arguments? If it’s faster, we might want to use that to preserve the original behavior of the function.

Babel transpiles a rest parameter (...args) into arguments access, so it should be the same; the second snippet below is the transpiled output:

function x(...args) {
  const x = args[0];
  for (let i = 1, ic = args.length; i < ic; ++i) {}
}
"use strict";

function x() {
  var x = arguments.length <= 0 ? undefined : arguments[0];
  for (var i = 1, ic = arguments.length; i < ic; ++i) {}
}

But I recently saw an article that invalidates the recommendation against using arguments, among other old advice, since Node 8+ and TurboFan. This needs confirmation from folks who know Node internals, like @bmeurer.

For the Map inside of printAstToDoc, unfortunately we can't get rid of it. See #2259 for context. The low hit rate is misleading, since each hit prevents re-evaluating an entire sub-tree.

👍

@sompylasar
Contributor

Btw, the naive memoization of getStringWidth that I wrote will definitely leak memory since it's module-level; there needs to be a more robust LRU cache or something, and a way to scope it or reset it after formatting is complete.
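For example, a small LRU can be built on top of Map's insertion order; a minimal sketch of the idea (not a drop-in for prettier):

class LruCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) {
      return undefined;
    }
    const value = this.map.get(key);
    // re-insert to mark the entry as most recently used
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.maxSize) {
      // evict the least recently used entry (first in insertion order)
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, value);
  }
}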

@vjeux
Contributor Author

vjeux commented Jun 30, 2018

The only hot place for getStringWidth is in a single function: https://github.com/prettier/prettier/blob/dcf44ffbdc2f403de02f12516b0c6d5d5813b16f/src/doc/doc-printer.js

@vjeux
Contributor Author

vjeux commented Jun 30, 2018

If you look through the getStringWidth function, it calls through multiple packages that each create new intermediate strings. Most of the strings that are fed through the printer will be pure ASCII, so it's a lot of overhead.

I don't believe that adding a cache is the right strategy here. I recommend we do the following:

  • Add a fast path that checks if there are only ASCII characters and, if that's the case, returns the length of the string

If it still shows up in the profile, we may want to do more radical things, such as adding two kinds of strings: one that contains syntax (and therefore is pure ASCII and doesn't need this check) and another that is user-defined and can contain unicode characters with varying widths.

@sompylasar
Contributor

If you look through the getStringWidth function, it calls through multiple packages that each create new intermediate strings. Most of the strings that are fed through the printer will be pure ASCII, so it's a lot of overhead.

Yes, that's what I noticed, too.

I don't believe that adding a cache is the right strategy here. I recommend we do the following:

  • Add a fast path that checks if there are only ASCII characters and, if that's the case, returns the length of the string

👍

@sompylasar
Contributor

BTW, for those who are interested in why an Object and not a Map for caches: https://community.risingstack.com/the-worlds-fastest-javascript-memoization-library/
Surprisingly, in this article's benchmark, an Object with a null prototype performs slightly worse than a plain Object, which should do prototype chain lookups. No idea why.

@sompylasar
Contributor

sompylasar commented Jul 1, 2018

I added an option to the CLI to run a benchmark around the format function on the stdin input; here are the Heavy (Bottom-Up) results from node --inspect-brk as seen in DevTools.

RegExp and replace within getStringWidth really seem to be the first contenders for optimization (below is the profile for 10000 repetitions).

[screenshots: DevTools profiler, Heavy (Bottom-Up) view]

Also, probably thanks to V8 runtime optimization and/or CPU caches and everything else that helps process the same code and data faster, the average time is down to 164ms over 1000 repetitions, 176ms over 100, and 250ms over 10.

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 10 > /dev/null
{"benchmark":"10","totalMs":2507.7312729358673,"averageMs":250.77312729358673}

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 100 > /dev/null
{"benchmark":"100","totalMs":17653.580070972443,"averageMs":176.5358007097244}

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 1000 > /dev/null
{"benchmark":"1000","totalMs":164849.50500822067,"averageMs":164.84950500822066}

Another observation is that printGenerically is reported as "Not optimized: Deoptimized too many times" (below is the profile for 100 repetitions):
[screenshot: DevTools profiler showing the printGenerically deoptimization]

@sompylasar
Contributor

Below is the profile for 10 repetitions:
[screenshot: DevTools profiler, Heavy (Bottom-Up) view, 10 repetitions]

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 1 > /dev/null
{"benchmark":"1","totalMs":461.6846880912781,"averageMs":461.6846880912781}

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 10 > /dev/null
{"benchmark":"10","totalMs":2507.7312729358673,"averageMs":250.77312729358673}

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 100 > /dev/null
{"benchmark":"100","totalMs":17653.580070972443,"averageMs":176.5358007097244}

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --benchmark 1000 > /dev/null
{"benchmark":"1000","totalMs":164849.50500822067,"averageMs":164.84950500822066}

The timing displayed is for the format function only, not the full startup time.

function formatStdin(context) {
  // ...

      let formatted;
      const benchmark = context.argv["benchmark"];
      if (benchmark > 0) {
        const performance = require("perf_hooks").performance;
        const startMs = performance.now();
        for (let i = 0; i < benchmark; ++i) {
          formatted = format(context, input, options);
        }
        const totalMs = performance.now() - startMs;
        process.stderr.write(
          JSON.stringify({
            benchmark: benchmark,
            totalMs: totalMs,
            averageMs: totalMs / benchmark
          }) + "\n"
        );
      } else {
        formatted = format(context, input, options);
      }

      writeOutput(context, formatted, options);

@sompylasar
Contributor

@vjeux I gave your recommendation about getStringWidth another thought.

I don't believe that adding a cache is the right strategy here. I recommend we should do the following:

  • Add a fast path that checks if there are only ascii characters and if that's the case, return the length of the string

This fast-path check is still a full scan of every string that comes in (possibly with a RegExp), only to discover that nothing needed to be done with the string. The cache, on the other hand, should be a single lookup (O(1) if the measured string is interned) that gives the previously calculated width right away, regardless of the contents of the string.

What do you think?

@vjeux
Contributor Author

vjeux commented Jul 1, 2018

Most strings are tiny, so O() complexity doesn't really apply here. Scanning a string to see if all characters are ASCII is one of the fastest operations a CPU can do, faster than allocating memory to put the string in a hash map and doing a lookup.

But I don’t really trust my instinct on those things; just build it and see if it changes anything in the perf.

For your comment about startup time: we keep the node process alive, so I'm not super interested in this.

sompylasar added a commit to sompylasar/prettier that referenced this issue Jul 1, 2018
- `--debug-benchmark` uses `benchmark` module to produce statistically significant time measurements.
- `--debug-repeat` uses a naive loop and measures just the average run time, but useful for profiling to highlight hot functions.
@sompylasar
Contributor

Most strings are tiny, so O() complexity doesn't really apply here. Scanning a string to see if all characters are ASCII is one of the fastest operations a CPU can do, faster than allocating memory to put the string in a hash map and doing a lookup.

Yes, I hear you. Although I believe it should not copy the string into the hashmap if the key string is already in the JS heap (but I have no hard evidence of that).

But I don’t really trust my instinct on those things; just build it and see if it changes anything in the perf.

Yes, that's why I started from measuring, and then tweaked the code to find opportunities to optimize.

For your comment about startup time, we keep the node process alive so I’m not super interested in this.

Good to know; this means the benchmark options I added are relevant, since they measure just the function you're interested in optimizing.

Please have a look (tell me if you want to see it in a PR, and whether the PR should include the benchmark-options commit):
sompylasar@d43e022

// assumes the module's existing imports, e.g.:
// const stringWidth = require("string-width");
// const emojiRegex = require("emoji-regex")();

// eslint-disable-next-line no-control-regex
const notAsciiRegex = /[^\x20-\x7F]/;

function getStringWidth(text) {
  if (!text) {
    return 0;
  }

  // shortcut to avoid needless string `RegExp`s, replacements, and allocations within `string-width`
  if (!notAsciiRegex.test(text)) {
    return text.length;
  }

  // emojis are considered 2-char width for consistency
  // see https://github.com/sindresorhus/string-width/issues/11
  // for the reason why not implemented in `string-width`
  return stringWidth(text.replace(emojiRegex, "  "));
}

getStringWidth dropped significantly in the Profiler Heavy (Bottom-Up).

Before

Total run time for 1000 repetitions for LspLanguageService.js is 182276ms, average run is 181.80ms.

$ cat LspLanguageService.js | NODE_ENV=production node --inspect-brk ./bin/prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/d085f49d-a12c-432e-ad7e-acf322152bd3
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":"1000","plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 5.500298017789999,
[debug]   "ms": 181.80833052420616
[debug] }
Waiting for the debugger to disconnect...

[screenshot: profile before the change]

After

Total run time for 1000 repetitions for LspLanguageService.js is 145246ms, average run is 144.85ms (20.32% speed-up).

$ cat LspLanguageService.js | NODE_ENV=production node --inspect-brk ./bin/prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/1a338c30-c52c-4ae7-8e11-9aac0a2b736a
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":"1000","plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 6.903349947385214,
[debug]   "ms": 144.85720811223985
[debug] }
Waiting for the debugger to disconnect...

[screenshots: profile after the change]

@vjeux
Contributor Author

vjeux commented Jul 2, 2018

Wow! 20% faster by adding a trivial test :) Ship it!

sompylasar added a commit to sompylasar/prettier that referenced this issue Jul 2, 2018
sompylasar added a commit to sompylasar/prettier that referenced this issue Jul 2, 2018
sompylasar added a commit to sompylasar/prettier that referenced this issue Jul 2, 2018
@sompylasar
Contributor

@vjeux 👉 #4789 , #4790

I need #4789 merged to continue working on more improvements.

vjeux pushed a commit that referenced this issue Jul 2, 2018
@sompylasar
Contributor

So why wasn't this change (#4779) incorporated? I'm measuring it now (see Before and After below), and it gives a minor performance improvement on the unbundled ./bin/prettier.js. There should be no difference once bundled and the check is stripped out, but still.

Before

function concat(parts) {
  if (process.env.NODE_ENV !== "production") {
$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":"1000","plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 7.491085608126331,
[debug]   "ms": 133.492
[debug] }

After

const isProduction = process.env.NODE_ENV === "production";
const isNotProduction = !isProduction;

function concat(parts) {
  if (isNotProduction) {

(note: I hoisted the negation as well)

$ cat LspLanguageService.js | NODE_ENV=production node ./bin/prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":"1000","plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 8.571257146285646,
[debug]   "ms": 116.669
[debug] }

@sompylasar
Contributor

Okay, I take this back: the bundled version ends up containing the if (isNotProduction) { statement if I use a variable. 😞

@j-f1
Member

j-f1 commented Jul 2, 2018

Also, (IMO) the performance of the unbundled/development versions shouldn’t be as important as the performance of the bundled/production version.

@sompylasar
Contributor

Also, (IMO) the performance of the unbundled/development versions shouldn’t be as important as the performance of the bundled/production version.

Agree, but for the sake of quick iterations on the code, it's nice to have the NODE_ENV=production runs at least not much worse than the bundled-version runs. The build takes quite some time, and there's currently no way I know of to build just one variant. Of course, it's more correct to measure the build rather than the source as I do, since there's an optimizing compiler in place (rollup). Maybe someone could split and speed up the build to improve development iteration times.

@duailibe
Member

duailibe commented Jul 2, 2018

@sompylasar the build system has a cache now. It should only recompile an artifact if:

  • one of the source files included in the artifact was changed;
  • yarn.lock changed (which will rebuild everything)
  • or you bump the version prefix.

We probably need to update CONTRIBUTING.md to reflect this

@sompylasar
Contributor

@duailibe "one of the source files included in the artifact was changed;" – right, and this is what I'm doing by working on this ticket, changing the source files. 😄

@duailibe
Member

duailibe commented Jul 2, 2018

@sompylasar I'm aware, but you're likely to change files that only affect index.js, bin-prettier.js and standalone.js, which build reasonably fast. The parsers (especially Flow and TypeScript) are the culprits that take too long, so you should be fine :P

sompylasar added a commit to sompylasar/prettier that referenced this issue Jul 16, 2018
In addition to a tiny performance improvement outlined below,
the CPU profile of traverseDoc is now more readable.

Also, anonymous arrow functions were changed to named regular functions
so that they are properly displayed in the CPU profile,
and moved to the outer scope, where there's no closure,
so that they aren't re-created (the performance impact of this change
depends on JS engine implementation and optimization details).

Before (profile):
```
7129.9 ms       5.43 %    13349.9 ms     10.18 %    traverseDocRec
7067.4 ms       5.39 %    11285.5 ms      8.60 %      traverseDocRec
  31.5 ms       0.02 %     1031.9 ms      0.79 %      traverseDoc
  23.6 ms       0.02 %    12313.4 ms      9.39 %      traverseDoc
   2.6 ms       0.00 %       0.3 ms       0.00 %      (anonymous)
   1.7 ms       0.00 %       1.7 ms       0.00 %      call
   1.6 ms       0.00 %       1.6 ms       0.00 %      call
   0.5 ms       0.00 %       0.5 ms       0.00 %      conditionalGroup
   0.4 ms       0.00 %       0.4 ms       0.00 %      printDocToString$1
   0.1 ms       0.00 %       0.1 ms       0.00 %      printGenerically
   0.1 ms       0.00 %       0.1 ms       0.00 %      t
   0.1 ms       0.00 %       0.1 ms       0.00 %      ifBreak
   0.1 ms       0.00 %       0.1 ms       0.00 %      (anonymous)
     0 ms          0 %       0.1 ms       0.00 %      forEach

```

After (profile):
```
6937.9 ms       5.37 %    12872.5 ms      9.97 %    traverseDoc
5944.0 ms       4.60 %    11047.3 ms      8.55 %      propagateBreaks
735.7 ms        0.57 %     1358.3 ms      1.05 %      findInDoc
257.9 ms        0.20 %      466.7 ms      0.36 %      findInDoc
0.1 ms          0.00 %        0.1 ms      0.00 %      has
0.1 ms          0.00 %        0.1 ms      0.00 %      printArgumentsList
```

Before (performance):
```
cat ../LspLanguageService.js | NODE_ENV=production node --inspect-brk ./dist/bin-prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/4b52c027-ef62-49d6-8770-179e805a0f43
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":1000,"plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier-2/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 7.774598830700336,
[debug]   "ms": 128.624
[debug] }
```

After (performance):
```
cat ../LspLanguageService.js | NODE_ENV=production node --inspect-brk ./dist/bin-prettier.js --stdin-filepath LspLanguageService.js --loglevel debug --debug-repeat 1000 > /dev/null
Debugger listening on ws://127.0.0.1:9229/aa76e134-a68c-44ed-89a8-efb68bc46baa
For help see https://nodejs.org/en/docs/inspector
Debugger attached.
[debug] normalized argv: {"color":true,"editorconfig":true,"stdin-filepath":"LspLanguageService.js","loglevel":"debug","debug-repeat":1000,"plugin-search-dir":[],"plugin":[],"ignore-path":".prettierignore","config-precedence":"cli-override","_":[]}
[debug] resolve config from '/Users/ivanbabak/_sompylasar/_github/prettier/LspLanguageService.js'
[debug] loaded options `null`
[debug] applied config-precedence (cli-override): {"filepath":"LspLanguageService.js"}
[debug] '--debug-repeat' option found, running formatWithCursor 1000 times.
[debug] '--debug-repeat' measurements for formatWithCursor: {
[debug]   "repeat": 1000,
[debug]   "hz": 7.888114977163907,
[debug]   "ms": 126.773
[debug] }
```
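The kind of change the commit describes looks roughly like this (a simplified, self-contained sketch; the traversal helper and doc shape are illustrative, not the actual prettier code):

// Simplified stand-in for doc traversal.
function traverseDoc(doc, onEnter) {
  onEnter(doc);
  if (Array.isArray(doc.parts)) {
    doc.parts.forEach(part => traverseDoc(part, onEnter));
  }
}

// Before: an anonymous arrow, re-created on every call and shown as
// "(anonymous)" in the CPU profile.
function breakGroupsBefore(doc) {
  traverseDoc(doc, d => {
    if (d.type === "group") {
      d.break = true;
    }
  });
}

// After: a named function moved to the outer scope (no closure), so it is
// created once and the profiler can attribute its time by name.
function breakGroupOnEnter(d) {
  if (d.type === "group") {
    d.break = true;
  }
}

function breakGroupsAfter(doc) {
  traverseDoc(doc, breakGroupOnEnter);
}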
@sompylasar
Contributor

New improvement 👉 #4848

vjeux pushed a commit that referenced this issue Jul 16, 2018
@sompylasar
Contributor

New npm scripts and documentation for performance benchmarks 👉 #4846

@sanmai-NL
Contributor

sanmai-NL commented Dec 8, 2020

@vjeux What's the status of this work in 2020? Prettier is still very slow for me on big HTML files. Multiple seconds per formatting round.

@vjeux
Contributor Author

vjeux commented Dec 8, 2020

@sanmai-NL would you be able to share this file by any chance? It may be an issue with a specific pattern in your file that we may be able to address.

@sanmai-NL
Contributor

sanmai-NL commented Dec 8, 2020

@vjeux On https://han-aim.gitlab.io/dt-sd-asd/materials/ADP/ you'll find the rendered HTML, but the problem occurs with the source as well. (Latest stable VS Code and Prettier.)

@fisker
Member

fisker commented Dec 9, 2020

I had a quick look. This HTML is slow because of

function preprocess(ast, options) {

We should optimize this function: it maps the AST 10 times, and each map costs about 1s.
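A sketch of that direction, assuming the per-node transforms can be made independent (names here are hypothetical, not prettier's HTML printer API): the ten separate map passes could be fused into a single traversal.

// Instead of walking the whole AST once per transform, compose the
// per-node transforms and walk the tree a single time.
function composeNodeTransforms(transforms) {
  return node => transforms.reduce((acc, transform) => transform(acc), node);
}

function mapAstOnce(ast, transforms) {
  const transformNode = composeNodeTransforms(transforms);

  function walk(node) {
    const mapped = transformNode(node);
    if (Array.isArray(mapped.children)) {
      return Object.assign({}, mapped, { children: mapped.children.map(walk) });
    }
    return mapped;
  }

  return walk(ast);
}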

@jpike88

jpike88 commented Apr 14, 2022

Rome has been released with formatting performance many times better than Prettier's:
https://rome.tools/blog/2022/04/05/rome-formatter-release?ref=console.dev

Maybe it's time to make performance more of a priority.

@sdwvit

sdwvit commented Aug 24, 2024

Formatting a 100 MB JSON file with prettier takes 100+ seconds and 16 GB of RAM, while fs.writeFileSync(..., JSON.stringify(json, null, 2)) takes only 2 seconds and 700 MB of RAM...
I know it is only one use case, but I expect prettier to be no slower than plain JSON.stringify.
