optimize import desugaring for full builds #3768

Merged

Contributor

colinwahl commented Jan 10, 2020

TLDR: This PR introduces an optimization to the import desugaring phase that eliminates duplicated work and leads to drastically lower build times when you have a large number of modules.

@natefaubion noticed that clean builds at Awake were starting to take quite a while, so I went and profiled one of them. I ran the results through profiteur and noticed that about 41% of the time (and memory) spent on our build was from the function externsEnv within desugarImportsWithEnv.

Screen Shot 2020-01-10 at 12 41 35 AM

Some spelunking revealed that externsEnv was called once per (transitive) dependency of each module that was being compiled. This resulted in quite a bit of repeated work. For example, if you have modules A, B, C, and D, where the dependency graph looks like A -> B -> C -> D (B depends on A, C on B, and D on C), then you call externsEnv on the following lists of modules: [A], [A, B], [A, B, C], so A is processed three times and B twice. Similarly, if every module depends on Prelude, then you are computing externsEnv on Prelude once per module.
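To make the repeated work concrete, here is a small Python model of the old behaviour (a sketch only, not the compiler's Haskell code). The `deps` map and both function names are hypothetical; the model just counts how often each module's externs would be processed if every module rebuilt its Env from its transitive dependencies.

```python
# Hypothetical model: `deps` maps each module to its direct dependencies.
def transitive_deps(deps, module):
    """Return the transitive dependencies of `module`, dependencies first."""
    seen, order = set(), []

    def visit(m):
        for d in deps[m]:
            if d not in seen:
                seen.add(d)
                visit(d)
                order.append(d)

    visit(module)
    return order


def naive_externs_work(deps):
    """Count how often each module's externs are processed when every
    module rebuilds its Env from scratch."""
    counts = {m: 0 for m in deps}
    for module in deps:
        for dep in transitive_deps(deps, module):
            counts[dep] += 1
    return counts


# The chain from the example: B depends on A, C on B, D on C.
chain = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
work = naive_externs_work(chain)
```

In this model the total work grows roughly quadratically with the length of a dependency chain, which matches the intuition that a widely used module such as Prelude gets reprocessed once per dependent module.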

externsEnv can be decently expensive, and it is easy to imagine that this duplicated work could get out of hand with more modules. We have definitely passed that threshold at Awake; I think we have 2000+ modules.

I realized that we could use the fact that we are already compiling in topological order along with the fact that Envs will be disjoint for "non-transitively-dependent" pairs of modules to instead keep a running Env which is updated as externs files become available, and which never computes externsEnv on an extern file more than once. In doing this, a tremendous amount of redundant computation is avoided, resulting in a much "healthier" looking profile:
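The running-Env idea can be sketched in the same spirit. This is a simplified, sequential Python model under stated assumptions: `build_with_running_env` and the dictionary standing in for the Env are hypothetical, and the real change coordinates this inside make rather than in a plain loop.

```python
def build_with_running_env(deps, topo_order):
    """Compile modules in topological order, folding each module's
    externs into one shared Env exactly once."""
    env = {}                          # stand-in for the running Env
    counts = {m: 0 for m in deps}     # how often each module's externs are processed
    for module in topo_order:
        # Topological order guarantees every dependency is already in env.
        missing = [d for d in deps[module] if d not in env]
        if missing:
            raise ValueError(f"{module} compiled before {missing}")
        # "Compile" the module, then add its externs to the running Env once.
        counts[module] += 1
        env[module] = f"externs of {module}"
    return counts


chain = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
running_counts = build_with_running_env(chain, ["A", "B", "C", "D"])
```

Here each module's externs are processed once, so the work is linear in the number of modules instead of depending on how many modules sit downstream of each one.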

Screen Shot 2020-01-10 at 1 07 40 AM

Screen Shot 2020-01-10 at 1 07 49 AM

Calls to externsEnv now account for 0.2% of the time of a full build, and parsing (~50%), typechecking (~26%) and codegen (~15%) are the dominating items of the profile.

In practice, on the Awake codebase, this reduces the time needed for a full build from ~2m20s to ~1m20s, a reduction of about 43%.
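For reference, the arithmetic behind that figure, reading the approximate timings as 2m20s before and 1m20s after:

```python
before_s = 2 * 60 + 20   # ~2m20s, in seconds
after_s = 1 * 60 + 20    # ~1m20s, in seconds

# Fraction of build time saved by the change.
reduction = (before_s - after_s) / before_s

# The same improvement expressed as a ratio of old time to new time.
ratio = before_s / after_s
```

So the 43% describes the share of wall-clock time removed; expressed as a ratio, the new build is about 1.75 times as fast.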

The bulk of this change is quite manual and not very invasive. The important bits are in the changes to buildModule, which will need some eyes on them :) The main change extends the coordination done in make to collect dependencies, and then ensures that all dependencies' externs are added to the Env in the correct topological order. Since the coordination to collect dependencies was already done, this ended up being a pretty easy modification.

Notes:

  • I have only applied the optimization when compiling through make; all other workflows such as psci, purs-ide, etc. should not be affected.
  • I see that a decent amount of time (~6%) is spent computing applyExternsFileToEnvironment within rebuildModule. This may be able to benefit from the same sort of optimization.
Member

kritzcreek left a comment

This looks like a great find! The IDE rebuilds probably don't care as we're never building more than one module there which means we'll need to build the full Env for that module in one go either way.

Member

hdgarrood left a comment

Really nice work. Thank you very much!

@natefaubion natefaubion merged commit 25afff7 into purescript:master Jan 11, 2020
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Member

garyb commented Jan 11, 2020

🎉

Contributor Author

colinwahl commented Jan 11, 2020

Awesome! Thanks all for the reviews 👍

@colinwahl colinwahl deleted the colinwahl:colin/name-desugaring-optimization branch Jan 15, 2020
5 participants