
perf: ModuleTokenFactory #10844

Closed · 1 task done
H4ad opened this issue Jan 10, 2023 · 17 comments
Labels
needs triage (this issue has not been looked into) · type: enhancement 🐺

Comments

H4ad (Contributor) commented Jan 10, 2023

Is there an existing issue that is already proposing this?

  • I have searched the existing issues

NestJS version

9.2.1

Is your performance suggestion related to a problem? Please describe it

I want to improve the initialization time of NestJS, so I started to do some profiling in NestFactory#create.

Today, the most expensive calls in the initialization are:

 [JavaScript]:
   ticks  total  nonlib   name
    529    1.1%    6.1%  LazyCompile: *_object /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:192:22
    348    0.7%    4.0%  LazyCompile: *getAllFilteredMethodNames /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/metadata-scanner.js:14:31
    323    0.7%    3.7%  LazyCompile: *OrdinaryGetMetadata /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/reflect-metadata/Reflect.js:600:37
    298    0.6%    3.4%  LazyCompile: *reflectKeyMetadata /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/scanner.js:150:23
    298    0.6%    3.4%  LazyCompile: *randomUUID node:internal/crypto/random:396:20
    289    0.6%    3.3%  LazyCompile: *_string /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:304:22
    278    0.6%    3.2%  LazyCompile: *applyDefaults /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/object-hash/index.js:61:23
    221    0.5%    2.5%  LazyCompile: *next /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/iterare/lib/iterate.js:20:9
    184    0.4%    2.1%  LazyCompile: *reflectInjectables /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/scanner.js:134:23
    181    0.4%    2.1%  LazyCompile: *InstanceWrapper /home/h4ad/Projects/opensource/performance-analysis/test-performance-nestjs/node_modules/@nestjs/core/injector/instance-wrapper.js:16:16

_object and _string belong to the object-hash library, which is called from these lines:

const moduleId = this.getModuleId(metatype);
const opaqueToken = {
  id: moduleId,
  module: this.getModuleName(metatype),
  dynamic: this.getDynamicMetadataToken(dynamicModuleMetadata),
};
return hash(opaqueToken, { ignoreUnknown: true });
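To make the cost concrete, here is a minimal sketch of what this call path does on every invocation; JSON.stringify stands in for object-hash, and the function name is hypothetical:

```typescript
import { createHash } from 'node:crypto';

// Rough sketch (not Nest's actual code) of the work being profiled: every
// call serializes the whole opaque token and hashes the result, so the cost
// grows with both module count and the size of the dynamic metadata.
function createTokenSketch(
  moduleId: string,
  moduleName: string,
  dynamicMetadata?: object,
): string {
  const opaqueToken = { id: moduleId, module: moduleName, dynamic: dynamicMetadata };
  return createHash('sha1').update(JSON.stringify(opaqueToken)).digest('hex');
}
```

The payoff of this design is determinism: the same metadata always yields the same token, which is exactly the property being questioned below.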

Describe the performance enhancement you are proposing and how we can try it out

I didn't understand why the same metadata needs to produce the same token, so as a test I commented out all those lines and just returned a random token from random-string-generator:

import { DynamicModule, Type } from '@nestjs/common';
import { randomStringGenerator } from '@nestjs/common/utils/random-string-generator.util';

export class ModuleTokenFactory {
  private readonly moduleIdsCache = new WeakMap<Type<unknown>, string>();

  public create(
    metatype: Type<unknown>,
    dynamicModuleMetadata?: Partial<DynamicModule> | undefined,
  ): string {
    // Skip hashing entirely and return a random token on every call.
    return randomStringGenerator();
  }
}

And when I ran the test suite, the tests passed!

Benchmarks result or another proof (eg: POC)

About the improvements, we drop from:

Statistical profiling result from perf-startup-1000_baseline.log, (48915 ticks, 77 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
  39486   80.7%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    709    1.4%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      8    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28

To:

Statistical profiling result from perf-startup-10000_randomStringGenerator.log, (30896 ticks, 72 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
  24583   79.6%          /home/h4ad/.asdf/installs/nodejs/16.16.0/bin/node
    179    0.6%          /usr/lib/x86_64-linux-gnu/libc-2.31.so
      7    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
      4    0.0%          [vdso]

This represents a drop of 18,019 CPU ticks, a reduction of 4.2 ms (from 29.65 to 25.45) in initialization time according to the CPU profile.

Baseline CPU Profile: perf-one-02_baseline.cpuprofile.zip
Baseline Profiler Information perf-startup-10000_baseline.txt

Faster CPU profile: perf-one-02_fast.cpuprofile.zip
Faster Profiler Information: perf-startup-10000_randomStringGenerator.txt

Branch with the optimizations: https://github.com/H4ad/nest/blob/6eac7fec12b9b9cae65a17cf331ab1ee7a93a27d/packages/core/injector/module-token-factory.ts#L11-L16

So my question is: do we really need that metadata to be serialized into the same hash every time, or can we apply this optimization?

If the answer is that we don't need it, I can open a pull request. Along with #10825 and the other PRs I've opened, we could see significant startup-time improvements, not just in the #create method but also in #createApplicationContext and createMicroservice, because ModuleTokenFactory is used by NestContainer, which is used by all of these methods.

H4ad added the needs triage and type: enhancement 🐺 labels on Jan 10, 2023
kamilmysliwiec (Member)

do we really need that metadata to be serialized in the same hash every time or could we do this optimization?

The quick answer is: yes.

Thanks for investigating this issue though.

MartinSchere

This adds substantial overhead to applications with more than 20 modules, to the point where it should be considered an issue. In our app with ~30 modules, this adds ~12 seconds of extra cold boot latency...

micalevisk (Member) commented Jan 14, 2023

I appreciate the effort on this.

I have a project (nestjs v9) with 37 modules and .create() took ~0.6s.
In another one (nestjs v8) with 63 modules it took ~3s.

Not sure why, though.


I was not able to start the app after replacing hash with randomStringGenerator.

kamilmysliwiec (Member)

Perhaps there's room for optimizations in the object-hash library? I just saw an issue created by @H4ad with some great ideas on how to make things somewhat better.

H4ad (Contributor, author) commented Jan 15, 2023

@kamilmysliwiec Yes, there are a lot of things I could improve in object-hash. My idea is to optimize that library first, then open a PR bumping NestJS to the new version and also removing fast-json-stringify; I don't think we need it, since object-hash can handle that work itself.

Optimizing fast-json-stringify is much harder, which is why I'm thinking of removing that library.

pocesar commented Jan 15, 2023

@H4ad I did just that, take a look. I just can't run the tests locally; they fail with missing typings from axios.

H4ad (Contributor, author) commented Jan 19, 2023

Hi everyone, just an update on this topic.

I made the updates I described in my last comment, but in the performance tests I didn't see any improvement; in fact, I saw a performance degradation in real projects, except in the project I initially used for testing (one without typeorm).

I still believe @pocesar's solution won't fit here, even though it works for him. I talked to @micalevisk, and he suggested that the behavior of ModuleTokenFactory could be made user-configurable instead of fixed by us, so users could decide which strategy is best from built-in or community options.

I did some research in the Angular codebase to see how they do this kind of injection, and from what I've seen they do this heavy stuff at compile time, which I don't think is an option for us because we don't have control over the build configuration.

One option I thought might work is a helper that makes dynamic modules unique, so no serialization is needed because the user guarantees that the module maps to a unique ID. I ran some experiments, and it worked and shaved a bit off the startup time.

I'll continue my research, and as soon as I find something, I'll come back and share it with you.

@pocesar If you have time, could you share what you are doing that makes your initialization take 20s? I could not reproduce this behavior.

pocesar commented Jan 19, 2023

@H4ad This happens when running on Cloud Run on GCP. Cloud Run overhead in an empty project is around 5-6s, so 14s is only NestJS doing serialization. I used clinic flame to see that the biggest culprits are the fast-safe-stringify and decirc calls. Remember that I'm currently on Nest 8.x.

H4ad (Contributor, author) commented Jan 21, 2023

Hey @kamilmysliwiec, I did some experiments and I found two possible solutions for this problem.

Module Token

Based on the earlier idea of making dynamic modules unique via a UID, I created a proof of concept that works in my private API; see the commit with the changes.

I introduce ModuleToken: it wraps the dynamic module and generates a UID. Then, when we generate the module token, we simply check whether the metadata is an instance of ModuleToken and, if so, return its moduleId:

if (dynamicModuleMetadata instanceof ModuleToken)
  return dynamicModuleMetadata.moduleId;

To opt into this new behavior, you just wrap your module with ModuleToken, like this:

[screenshot: the dynamic module wrapped with new ModuleToken(...) in the imports array]

It's a little verbose, but this could be implemented by libraries instead, to make adoption easier.
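For illustration, here is a minimal reconstruction of the wrapper idea described above; the real implementation lives in the linked commit, and all names here are assumptions:

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical reconstruction of the ModuleToken wrapper: it pins a unique
// id to a dynamic module object at creation time, so the token factory can
// return that id without serializing the metadata at all.
class ModuleToken {
  public readonly moduleId: string = randomUUID();
  constructor(public readonly module: object) {}
}

// Sketch of the fast path inside the factory: wrapped modules skip hashing.
function createToken(metadata: object): string {
  if (metadata instanceof ModuleToken) return metadata.moduleId;
  return 'hash-of-' + JSON.stringify(metadata); // stand-in for the object-hash path
}
```

The design trades a small amount of user-side ceremony (wrapping the module) for skipping the serialize-and-hash work entirely.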

I also add a faster path when the module is static:

public getFastModuleToken(moduleId: string, moduleName: string): string {
  const key = `${moduleId}_${moduleName}`;
  if (this.moduleTokensCache.has(key)) return this.moduleTokensCache.get(key);

  const hash = createHash('sha1').update(key).digest('hex');
  this.moduleTokensCache.set(key, hash);
  return hash;
}

Basically, I skip what object-hash would do under the hood and perform the operation as quickly as possible.

All these changes were only possible because I changed these lines:

[screenshot: diff that stops cloning the user-provided dynamic module object]

Basically, I stopped creating a new object instance and instead use exactly what the user sends us; without this modification, my ModuleToken class would be flattened into a plain object.

But with this final change, I realized that we could have an even better option.

Using object reference as an ID

See the commit with the changes

First, why do we need ModuleTokenFactory at all?

From what I discovered, it's because we associate a UID with each module to make it easier to know which module a service or any other injectable belongs to. The problem is that with dynamic modules we don't know exactly how to generate an ID, because it is not simple to distinguish between two dynamic modules.

But why is it not simple? The answer I found is in these lines:

const { module: type, ...dynamicMetadata } = metatype;
return { type, dynamicMetadata };

In those lines, we perform a shallow clone of the dynamic metadata, so we lose the ability to distinguish between two dynamic modules by identity: every time the same object is passed to this function, a new object with a different reference is created.
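A tiny sketch, with made-up names, shows why this clone defeats any identity-based caching:

```typescript
// Because the rest-destructuring builds a fresh object on every call, a
// cache keyed on the result (e.g. a WeakMap) can never get a hit, even for
// the exact same input module. All names below are illustrative.
interface DynamicModuleLike {
  module: string;
  providers?: string[];
}

function extractMetadata(metatype: DynamicModuleLike) {
  const { module: type, ...dynamicMetadata } = metatype;
  return { type, dynamicMetadata }; // dynamicMetadata is a new reference each time
}

const mod: DynamicModuleLike = { module: 'CatsModule', providers: ['CatsService'] };
const a = extractMetadata(mod);
const b = extractMetadata(mod);
// Same input, but the extracted metadata objects are distinct references:
console.log(a.dynamicMetadata === b.dynamicMetadata); // false
```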

The main point of having a UID in the DynamicModule was to work around this issue. But to fix it directly, we just need to change the code to:

return { type: metatype.module, dynamicMetadata: metatype }

With that fixed, generating the ID became as simple as:

if (!dynamicModuleMetadata)
  return this.getFastModuleToken(moduleId, this.getModuleName(metatype));

if (this.dynamicallyModuleMetadataCache.has(dynamicModuleMetadata))
  return this.dynamicallyModuleMetadataCache.get(dynamicModuleMetadata);

this.dynamicallyModuleMetadataCache.set(
  dynamicModuleMetadata,
  randomStringGenerator(),
);
return this.dynamicallyModuleMetadataCache.get(dynamicModuleMetadata);

Basically, I create a WeakMap to store dynamicModuleMetadata and I use randomStringGenerator to generate the module ID.

I also keep the fast path for when dynamicModuleMetadata doesn't exist, so static modules still get the same UID in that case too.

Using this approach, we eliminate the usage of object-hash and also fast-safe-stringify at the same time, and we don't introduce any API change, so this fix could be back-ported to other NestJS versions.

The only drawback of this approach: two dynamic modules with the same name and properties will now be evaluated twice, whereas the old version identified them as the same and returned the same ID for both. To be honest, I think this is a user problem rather than something the framework should worry about.

Summary

Both options should be heavily tested, so here are some instructions on how you can test them in your own private API.

@pocesar If you have time, please test the second option (it requires less effort); it would be great to see whether it works in your case.

First option

To test the first option, you should override three files and create one file in the root:

Then, just build your API and run with node perf-startup.js.

Second Option

To test the second option, you should override two files and create one file in the root:

Then, just build your API and run with node perf-startup.js.

micalevisk (Member)

I tested the second option in a nestjs v8 app with 67 [eager] modules (and a bunch of circular imports) but didn't get any relevant improvement :/ NestFactory.create still took ~1s in the hot path. The app works as expected.

But if it means nestjs will no longer need those two dependencies, then it sounds good.

kamilmysliwiec (Member)

@H4ad

Basically, I create a WeakMap to store dynamicModuleMetadata and I use randomStringGenerator to generate the module ID.

I also keep the faster path when dynamicModuleMetadata doesn't exist to be able to generate the same UID in this case too.

Using this approach, we eliminate the usage of object-hash and also fast-safe-stringify at the same time, and we don't introduce any API change, so this fix could be back-ported to other NestJS versions.

Well, not exactly. Since you're using has() to determine whether the dynamic module metadata already exists in the collection, has() would return false even for two deeply equal objects. That means, for example, TypeOrmModule.forFeature([User]) used twice in two different modules would currently result in a single module, whereas in your implementation we'd end up with 2.
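The behavioral difference can be shown with a small sketch of identity-keyed tokens (the names and token format are hypothetical):

```typescript
// Tokens keyed on object identity: two deeply equal metadata objects,
// as produced by two separate forFeature()-style calls, get different
// tokens, so they would be treated as two distinct modules.
const tokens = new WeakMap<object, string>();
let nextId = 0;

function tokenFor(metadata: object): string {
  let token = tokens.get(metadata);
  if (!token) {
    token = `token-${++nextId}`;
    tokens.set(metadata, token);
  }
  return token;
}

const a = { module: 'TypeOrmModule', entities: ['User'] };
const b = { module: 'TypeOrmModule', entities: ['User'] }; // deeply equal, new reference

console.log(tokenFor(a) === tokenFor(a)); // true: same reference, cached token
console.log(tokenFor(a) === tokenFor(b)); // false: identity differs, so two modules
```

A content hash, by contrast, maps both objects to the same token, which is the deduplication behavior the current implementation relies on.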

pocesar commented Jan 23, 2023

I don't know if it's possible, but a last resort would be to use Rust and serde to serialize those big objects and do all the calculations in optimized binary code. The only issue is that it would require a build step just for this.

H4ad (Contributor, author) commented Jan 23, 2023

@kamilmysliwiec So the second option will not be feasible to back-port (if back-porting is something you want at all).

In this case, what about just shipping this as a breaking change in the next NestJS version?

As for the first option, we could ship it as a feature in the current version since it doesn't break compatibility. What do you think?

@pocesar I don't know if it's possible, but a last resort would to be to use Rust and serde to serialize those big objects, and do all the calculations on optimized binary code. the only issue is that it will require a building step just for this

Your line of thinking is similar to what Angular does, but the main problem is: modules could be dynamic, so it's not possible to put the initialization of all modules in the build step because that information could be different for each initialization of the API.

Also, I don't think moving the computation to Rust will bring much speed improvement; I already tried running object-hash, and even plain crypto.createHash, inside a Worker, and it didn't make initialization faster (in my tests).

pocesar commented Jan 23, 2023

@H4ad The idea is to get a stable hash from an object declaration; it doesn't really matter whether it's dynamic or not, since it's only used for hashing. WASM is very good at hashing, but the same can be done using native addons as well.

H4ad (Contributor, author) commented Jan 23, 2023

@pocesar We can achieve the same behavior with the ModuleToken I introduced: in the build step, you could create a plugin that scans imports for dynamically loaded modules (by searching for non-classes inside the imports array) and wraps them with ModuleToken.

But good to know; I'll keep WASM in mind. It could be useful for large strings, but for the size of strings NestJS generates, the overhead of transferring the data outweighs the benefits.

kamilmysliwiec (Member)

Build-time optimizations are off the table (anything that involves pre-processing TS, including NestJS CLI plugins, proved to be rather unstable and hard to maintain).

In this case, what about just launching this as a breaking-change in the next NestJS version?

@H4ad I'm OK with breaking changes in this case, BUT we still need to generate a consistent hash per module, and this hash shouldn't change each time the application is run (it should be identical). This is required for some other features that I'm currently working on.

H4ad (Contributor, author) commented Feb 3, 2023

Based on Kamil's requirements, I pushed some optimizations for ModuleTokenFactory.

For now, I don't think it will solve @pocesar's problem but it will improve boot speed a bit.

Also, based on the requirements, I don't think it's possible to solve @pocesar's problem without introducing a new API to skip the #create method for that specific module.

But I hope what we find in this issue can guide someone to a better version in the future.

Thank you all for your patience and help with this issue.
