-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] dedupe hoister levelQueue items for performance gains #6083
Conversation
That's weird, it doesn't really make sense to me - I don't see how |
src/package-hoister.js
Outdated
@@ -243,8 +243,12 @@ export default class PackageHoister { | |||
this.taintKey(key, info); | |||
|
|||
// | |||
const pushed = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use a Set
here
It does indeed fix my issue. Here is the output of the first few seconds of a yarn install (without my fix and with a console.log statement showing what As you can see, there are tons of dupes in there. This is only the first few seconds of output and it's already up to 210,403 items pushed. When I filter out duplicates, this drops to 1680. |
Impressive, then. Can you share the dependencies in your |
That's a little hard to do because I have a private monorepo with ~150 packages. That's a lot of dependencies. I'm going to spend some time trying to reduce it down to something smaller that I can share. Stay tuned. |
I think this issue is related to having multiple packages depending on the same 3rd party library. It seems to reprocess the 3rd party lib each time I depend on it, and appends it's dependencies to the same shared array each time (hence the dupes). Based on that, here's an alternative solution to fix this issue. The dependencies array is defined in package-reference.js. This should probably be a Set instead of an array, so that we can ensure it has no dupes. I proved this out by changing Line 83 of package-reference.js, from this: this.dependencies = this.dependencies.concat(deps); to this: this.dependencies = [...new Set([...this.dependencies, ...deps])]; This fixes my issue by ensuring that when dependencies are added, they are also deduped. It should also be pretty easy to throw a unit test around addDependencies that ensures that it filters out duplicates. Still working to get a small repo you can reproduce this on. |
Here's a reproduction of the issue. It actually looks like this issue is specific to peer dependencies. In my reproduction, I have a package called This reproduction does not cause yarn to hang for seemingly-forever because there are only a handful of packages, but by putting a console.log in a strategic location, you can verify that the dependency lists are growing rapidly and they end up with lots of dupes, which causes severe performance issues in large monorepos. The following output was generated using the reproduction mentioned above, and by adding outputyarn install v1.9.0-0 [1/4] Resolving packages... [2/4] Fetching packages... [3/4] Linking dependencies... workspace-aggregator-335394b6-e949-4a7b-907c-5ac249984f59 dependencies: [ 'pkg-a@1.0.0', 'pkg-b@1.0.0', 'pkg-c@1.0.0', 'pkg-d@1.0.0', 'pkg-e@1.0.0', 'pkg-f@1.0.0', 'pkg-has-peer-dependency@1.0.0', 'pkg-peer-dependency@1.0.0' ] pkg-a dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-b dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-c dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-d dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-e dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-f dependencies: [ 'pkg-peer-dependency@^1.0.0', 'pkg-has-peer-dependency@^1.0.0' ] pkg-peer-dependency dependencies: [] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-has-peer-dependency dependencies: [ 'pkg-peer-dependency@1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0', 'pkg-peer-dependency@^1.0.0' ] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] pkg-peer-dependency dependencies: [] [4/4] Building fresh packages... success Saved lockfile. Done in 0.29s. |
Thanks a lot! I booked time to check your repro on monday, this sounds really great |
@arcanis, had any time to review this yet? |
Yes! I just finished my own testing - I didn't quite saw the freeze you witnessed (I would be super interested if you could share the package.json that caused it - possibly in private in you prefer, my email is in my profile), but I did see a slight improvement when coupling your fix with a feature I'm working on, which is great. I think it's mergeable - it should ship in the 1.9.0, scheduled today. |
Summary
I'm submitting this PR early so I can get as much feedback as possible. I am not at all familiar with the yarn codebase, so I'm not sure if I'm even making this change in the right place.
I have a rather large monorepo (~150 packages) and when running
yarn install
from yarn>=1.4.0, it just hangs during the "Linking Dependencies" step. For example, today I let it run for 3.5 hours before I finally gave in and killed it. For the past several months I've just used v1.3.2 as a temporary workaround. Today I spent several hours in the yarn code addingconsole.log
statements all over the place and I think I've determined thatthis.levelQueue
is growing exponentially at package-hoister.js:247. The same dependencies are being added to the levelQueue hundreds of times (if not thousands). My levelQueue had nearly three million items in it, and it takes a long time to iterate over all those dependencies, which made it look like yarn had just crashed.I added a few lines of code to de-dupe the items being pushed onto the queue and that seems to solve my issue.
Because I'm not familiar with the codebase, I don't know what unintended effect this might have, but it seemed to solve all my issues and the node modules appear to have installed correctly (and rather quickly too!).
Please let me know if you have any concerns. I'll gladly finish out this pull request with tests, etc. but I want to make sure I'm going in the right direction first.
Test plan
I'll write some unit tests for this if it is an acceptable change.