Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: module token factory fast path #11023

Merged

Conversation

H4ad
Copy link
Contributor

@H4ad H4ad commented Feb 3, 2023

PR Checklist

Please check if your PR fulfills the following requirements:

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Other... Please describe:

What is the current behavior?

This PR aims to improve the speed of ModuleTokenFactory#create, today, here's the profiler information of the initialization (when we initialize 10k times):

 [Shared libraries]:
   ticks  total  nonlib   name
  36828   80.3%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    614    1.3%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      7    0.0%          [vdso]
      1    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28

 [JavaScript]:
   ticks  total  nonlib   name
    562    1.2%    6.7%  LazyCompile: *_object /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:192:22
    475    1.0%    5.6%  LazyCompile: *InstanceWrapper /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/injector/instance-wrapper.js:17:16
    343    0.7%    4.1%  LazyCompile: *applyDefaults /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:61:23
    330    0.7%    3.9%  LazyCompile: *getAllFilteredMethodNames /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/metadata-scanner.js:14:31
    314    0.7%    3.7%  LazyCompile: *reflectKeyMetadata /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/scanner.js:156:23
    304    0.7%    3.6%  LazyCompile: *_string /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:304:22

perf-startup-10000_old_module_token_factory.js.txt

The main issues are with methods that belong to object-hash.

Issue Number: #10844

What is the new behavior?

Now, I made two optimizations:

  • Add a fast path when don't have metadata.
  • Remove object-hash and perform what they will do manually.

The first optimization is the simpler one, we don't need to create an object and pass it to the object-hash, is way simpler to just get the main information and then generate a hash from it.

The second optimization was based on: the usage of object-hash is redundant, we don't need a library to traverse all the properties of the object and then generate a hash, we just need a hash, so I just serialize the entire token and then create a hash from it.

The results are these:

ModuleTokenFactory#create x 31,722 ops/sec ±2.45% (84 runs sampled)
BetterModuleTokenFactory#create x 356,129 ops/sec ±1.94% (84 runs sampled)

ModuleTokenFactory#create with metadata x 26,051 ops/sec ±2.36% (85 runs sampled)
BetterModuleTokenFactory#create with metadata x 126,754 ops/sec ±1.82% (89 runs sampled)

Fastest is BetterModuleTokenFactory#create

Benchmark

When we don't have metadata, the improvement is 11x, with metadata the improvement is almost 5x.

Now, the profiler information looks like:

 [Shared libraries]:
   ticks  total  nonlib   name
  25708   80.0%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    308    1.0%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      4    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
      1    0.0%          [vdso]

 [JavaScript]:
   ticks  total  nonlib   name
    384    1.2%    6.3%  LazyCompile: *InstanceWrapper /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/injector/instance-wrapper.js:17:16
    328    1.0%    5.3%  LazyCompile: *getAllFilteredMethodNames /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/metadata-scanner.js:14:31
    322    1.0%    5.3%  LazyCompile: *reflectKeyMetadata /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/scanner.js:156:23
    312    1.0%    5.1%  LazyCompile: *OrdinaryGetMetadata /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/reflect-metadata/Reflect.js:600:37
    239    0.7%    3.9%  LazyCompile: *next /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/iterare/lib/iterate.js:20:9
    197    0.6%    3.2%  LazyCompile: *isMethod /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/metadata-scanner.js:15:26

perf-startup-10000_better_module_token_factory.js.txt

We have more space to optimize it, like changing how the hash is generated but for now I keep it simple.

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

@@ -3,33 +3,51 @@ import { Type } from '@nestjs/common/interfaces/type.interface';
import { randomStringGenerator } from '@nestjs/common/utils/random-string-generator.util';
import { isFunction, isSymbol } from '@nestjs/common/utils/shared.utils';
import stringify from 'fast-safe-stringify';
import * as hash from 'object-hash';
import { createHash } from 'crypto';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering how fast it is compared to @napi-rs/blake-hash 🤔

Also, on a side note, did you have a chance to compare https://github.com/Brooooooklyn/uuid to @lukeed/uuid (for generating uuids)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never tried but I run some benchmarks:

uuid v4 x 17,312,526 ops/sec ±1.17% (91 runs sampled)
uuid v5 x 234,243 ops/sec ±1.20% (87 runs sampled)
sha1 x 502,120 ops/sec ±2.31% (79 runs sampled)
napi-rs/blake x 495,174 ops/sec ±2.27% (71 runs sampled)
nanoid x 3,418,245 ops/sec ±1.43% (90 runs sampled)
uid x 48,565,650 ops/sec ±1.07% (91 runs sampled)
luuked x 5,012,523 ops/sec ±1.36% (90 runs sampled)
napi-rs/uuid x 5,499,031 ops/sec ±0.92% (89 runs sampled)
Fastest is uid

Don't seem to be faster in our scenario.

Benchmark: perf-nanoid-uuid.benchmark.ts.zip

@H4ad H4ad force-pushed the perf/module-token-factory-fast-path branch from 1951f31 to 2d7703a Compare February 3, 2023 11:04
@H4ad
Copy link
Contributor Author

H4ad commented Feb 4, 2023

Hey, reading this article I found a hash function that is faster than using sha1 called xxhash and I found a node-js binding made in rust @node-rs/xxhash, see the results:

ModuleTokenFactory#create x 32,258 ops/sec ±1.88% (85 runs sampled)
BetterModuleTokenFactory#create x 369,860 ops/sec ±1.99% (84 runs sampled)
BetterModuleTokenFactoryWithNewHash#create x 728,901 ops/sec ±2.14% (84 runs sampled)
ModuleTokenFactory#create with metadata x 27,770 ops/sec ±1.35% (90 runs sampled)
BetterModuleTokenFactory#create with metadata x 125,311 ops/sec ±1.31% (88 runs sampled)
BetterModuleTokenFactoryWithNewHash#create with metadata x 159,555 ops/sec ±1.33% (87 runs sampled)
Fastest is BetterModuleTokenFactoryWithNewHash#create

I also update the benchmark.ts on gist, you can test for yourself.

The output of xxhash function is a number, like 3687026082.

The profiler information looks like:

Baseline:

 [Shared libraries]:
   ticks  total  nonlib   name
  37338   80.5%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    643    1.4%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      7    0.0%          [vdso]
      7    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28

perf-startup-10000_baseline.txt

With new Hash:

 [Shared libraries]:
   ticks  total  nonlib   name
  25158   79.7%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    169    0.5%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      8    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
      3    0.0%          [vdso]

perf-startup-10000_module_token_factory_with_xxhash.txt

Compared to the sha1, it didn't show too much difference in profiler information, just a little bit of improvement, but I think that's a great improvement based on the benchmark.ts.

I push the changes, so try and see if it's okay to introduce another dependency.

@kamilmysliwiec
Copy link
Member

LGTM

@kamilmysliwiec kamilmysliwiec merged commit 76b7185 into nestjs:master Feb 6, 2023
@H4ad H4ad deleted the perf/module-token-factory-fast-path branch February 6, 2023 10:40
@silentroach
Copy link

silentroach commented Feb 7, 2023

upd: everything is fine, found problem on my side


looks like it breaks runtime in alpine with error

/usr/app/node_modules/@node-rs/xxhash/index.js:226
    throw loadError
    ^

Error: Cannot find module '@node-rs/xxhash-linux-arm64-musl'
Require stack:
- /usr/app/node_modules/@node-rs/xxhash/index.js
- /usr/app/node_modules/@nestjs/core/inspector/deterministic-uuid-registry.js
- /usr/app/node_modules/@nestjs/core/inspector/uuid-factory.js
- /usr/app/node_modules/@nestjs/core/injector/instance-wrapper.js

@kamilmysliwiec
Copy link
Member

What image do you use @silentroach?

@silentroach
Copy link

silentroach commented Feb 7, 2023

I use official node:16.19.0-alpine.

Looks like when I install @nestjs/core on macbook (apple silicon, maybe it is important) it also installs xxhash for darwin platform in package-lock.json and when I run npm ci it docker will install the same package for darwin instead of new one for alpine :(

Seems like now package-lock is different for different platforms, it can be a breaking change for somebody

@Scrib3r
Copy link

Scrib3r commented Feb 7, 2023

@silentroach
I use node:18.14.0-alpine with latest version of npm v9.3.1 and it's fine for me. What've set up in your package-lock.json?

@silentroach
Copy link

My fault, sorry. After searching in commit history I found that package-lock was corrupted after conflict resolving in different branch and all optional dependencies other that one for my platform was removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants