Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster GetDependencyInformation rule #1359

Closed
pepeiborra opened this issue Feb 12, 2021 · 3 comments
Closed

Faster GetDependencyInformation rule #1359

pepeiborra opened this issue Feb 12, 2021 · 3 comments
Labels
performance Issues about memory consumption, responsiveness, etc. type: enhancement New feature or request

Comments

@pepeiborra
Copy link
Collaborator

pepeiborra commented Feb 12, 2021

The current implementation walks all the source files to parse the module header and then proceeds to locate the imports. In large repos (>10000 modules) this results in very long startup times:

  • It is sequential
  • It is not cached
  • UPDATE: It's quadratic

The obvious solution is to cache it so that the cost is only paid once.

Open questions:

  • How would we cache it in a way that doesn't become stale?
  • Can we reuse one of the existing caches (.hi files, .hie files and hiedb)?

UPDATE: the reason it's quadratic is that we traverse the module graph once from each node, which is O(N^2). I wonder if there's a reason for this, in that Shake would crash if there were cyclic module dependencies and the error reporting would be crappy. /cc @cocreature @ndmitchell any insights?

/cc @wz1000 @mpickering

@wz1000
Copy link
Collaborator

wz1000 commented Feb 12, 2021

It is possible to cache almost anything in hiedb, given that you can answer the following questions:

  1. What schema do you intend to use that will let you answer your queries?
  2. Cache invalidation: How will the cache be checked for freshness and consistency?

Currently, for 2. each entry currently in hiedb is associated with a particular .hie file, which is identified by its filepath and hash. If the hash of the .hie file changes, we know to invalidate all the information associated with that file in the database.

@jneira jneira added performance Issues about memory consumption, responsiveness, etc. type: refactor type: enhancement New feature or request labels Feb 24, 2021
@cocreature
Copy link
Contributor

I wonder if there's a reason for this, in that Shake would crash if there were cyclic module dependencies and the error reporting would be crappy. /cc @cocreature @ndmitchell any insights?

That’s exactly it, shake gets very unhappy about cyclic deps.

@michaelpj
Copy link
Collaborator

This got fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues about memory consumption, responsiveness, etc. type: enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants