Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster dev builds; tooling for measuring build times and app performance. #3491

Merged
merged 10 commits into from
Jun 2, 2022

Conversation

kazcw
Copy link
Contributor

@kazcw kazcw commented May 27, 2022

Pull Request Description

  • Change dev profile settings. Improves build performance; will not affect anything else. Details below.
  • Introduce script for benchmarking various incremental builds. Usage is explained in the script comments.
  • Add a line to intervals showing total main-thread CPU work logged in a profile; this can be used to compare the results of optimizations (I'll be starting a discussion informed by that data separately; this change just enables the tooling to report it).

Dev profile settings

Let's start with a before-and-after. Here are the dev-mode build times for various incremental changes (data collected with the new bench-build.sh), without and with this PR:
debug1

The impact is greatest for far-reaching incremental changes--rebuild after editing a file in enso-prelude now takes less than half the time.

The fix: more debug info is not better

I noticed that the file enso_gui.wasm, which is the output of Rust compilation and the input of wasm-pack and wasm-bindgen, was 3.3 GB. Since the ide.wasm file produced by wasm-pack is about 80 MB and is not compressed, I suspected rustc was putting data in there that wasn't being used. wasm-objdump revealed that the file consisted almost entirely of debug info. The problem turned out to be a setting in the workspace's Cargo.toml: for the dev profile, we had a setting to enable debug info: debug = true. As it turns out, debug = true enables extremely verbose debug info that wasm-pack can't use (much of it is not usually used by anything even on x86); the subtly different setting debug = 1 enables enough debug info for informative stack traces, which is all we need debug info for, and it does it with gigabytes less compiler output.

Could the extra information ever be useful?

It's possible that some day, wasm-pack will support emitting this data in wasm files, and DevTools will support using it; but WASM doesn't even support encoding this information yet, so let's revisit this in the future when the tooling supports it.

Why does the rustwasm book recommend debug=true?

Good question. Let's see if they know about some reason for it.

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH
      ./run.sh ide dist and ./run.sh ide watch.

[ci no changelog needed]

…mance.

- Change debug profile settings to avoid generating multiple GB of debug
  data that wasm-pack discards anyway; improves build performance,
  especially for mid-to-large incremental changes.
- Introduce script for benchmarking various incremental builds.
- Add a line to `intervals` showing total main-thread CPU work logged in
  a profile; this can be used to compare the results of optimizations.
@kazcw
Copy link
Contributor Author

kazcw commented May 30, 2022

@wdanilo Re: our call this morning, I just compared debug=0; the difference in compile time vs debug=1 is less than 10% (it reduces the size of the intermediate wasm file by 80%, but it seems this debug info is not as expensive to process as the type of info produced by debug=true). So, I think the additional gain there is probably not worth the complexity of separate stacktrace/no-stacktrace build modes.

@wdanilo
Copy link
Member

wdanilo commented May 31, 2022

@wdanilo Re: our call this morning, I just compared debug=0; the difference in compile time vs debug=1 is less than 10% (it reduces the size of the intermediate wasm file by 80%, but it seems this debug info is not as expensive to process as the type of info produced by debug=true). So, I think the additional gain there is probably not worth the complexity of separate stacktrace/no-stacktrace build modes.

Thanks for the info! I'm not sure if we should not do it though. If you are working on a visual component, any speedup is important. After pressing save, waiting 9 instead of 10 seconds is a small, but observable difference. If you are modifying colors and changing padding, you're compiling GUI hundreds of times per day and this makes the process more smooth. If this is one-shot job (defining the build mode) I think we should just do it in order to optimize these builds for the interactive development as much as possible. What do you think of it?

Copy link
Member

@wdanilo wdanilo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing.

@mwu-tow
Copy link
Contributor

mwu-tow commented May 31, 2022

@kazcw
I'm not sure if we want to have this for the native builds. Full debug symbols are useful there, e.g. when debugging non-WASM tests. Is it possible to have this only for WASM?

@kazcw
Copy link
Contributor Author

kazcw commented May 31, 2022

Re: @mwu-tow
I don't think we can set profile options per-target, but tests are always built with the test profile. Is there anything we build for testing that isn't a test as far as Cargo is concerned? If so, the build script could use --profile=test when building it.

Re: @wdanilo
Makes sense. I've changed dev to debug=0, and introduced dev-debug, which inherits everything from dev except it sets debug=1. @mwu, This will need build support--a new dev-debug choice for --wasm-profile. We will also need to accept production for #3498.

@mwu-tow
Copy link
Contributor

mwu-tow commented Jun 1, 2022

@kazcw

I don't think we can set profile options per-target, but tests are always built with the test profile. Is there anything we build for testing that isn't a test as far as Cargo is concerned? If so, the build script could use --profile=test when building it.

Ah, it is far less troublesome then. I have no issue with this.

This will need build support--a new dev-debug choice for --wasm-profile. We will also need to accept production for #3498.

I don't think that this would work, AFAIR wasm-pack supports only a predefined set of profiles. See: rustwasm/wasm-pack#1111
Have you tried calling it manually with such profile? If you have a working command example, let me know and I'll generate it.

@kazcw
Copy link
Contributor Author

kazcw commented Jun 1, 2022

Wow, I can't believe they don't support that. Tomorrow I will look into a workaround.

@kazcw
Copy link
Contributor Author

kazcw commented Jun 1, 2022

@mwu-tow
We can define our own "profiles" at the build-script level, by overriding profile settings through rustc flags (unfortunately, cargo doesn't seem to support any cleaner way to do that than RUSTFLAGS / CARGO_ENCODED_RUSTFLAGS). The profiles we'd want are:

  • dev / profile / release: The standard wasm-pack profiles, as currently supported.
  • debug: Uses the dev wasm-pack profile, but overrides debug level by setting RUSTFLAGS=-Cdebuginfo=1.

@mwu-tow
Copy link
Contributor

mwu-tow commented Jun 1, 2022

@kazcw Perhaps we should then just add additional parameter for "rust debug level"? Then we'd able to adjust it for all wasm-pack profiles.

@kazcw
Copy link
Contributor Author

kazcw commented Jun 1, 2022

@mwu-tow That works.

@kazcw
Copy link
Contributor Author

kazcw commented Jun 1, 2022

Because I put the new build benchmarking script in the tools/ directory, I need review from @4e6, sole codeowner of that subtree. The CODEOWNERS file lists that directory in the section Engine (old). Should I put the script somewhere else? This seems like a logical place for it.

@kazcw kazcw added the CI: Ready to merge This PR is eligible for automatic merge label Jun 1, 2022
@kazcw
Copy link
Contributor Author

kazcw commented Jun 2, 2022

I don't know the plan for /tools in general, but I'm claiming /tools/build-performance. Set codeowners for it (everyone in this discussion), and while I'm updating codeowners I added a line for /lib/rust/profiler.

@mergify mergify bot merged commit 8a50e8f into develop Jun 2, 2022
@mergify mergify bot deleted the wip/kw/build-faster-dev branch June 2, 2022 19:24
mwu-tow pushed a commit that referenced this pull request Jun 10, 2022
…file. (#3498)

### Pull Request Description

Using the new tooling (#3491), I investigated the **performance / compile-time tradeoff** of different codegen options for release mode builds. By scripting the testing procedure, I was able to explore many possible combinations of options, which is important because their interactions (on both application performance and build time) are complex. I found **two candidate profiles** that offer specific advantages over the current `release` settings (`baseline`):
- `thin16`: Supports incremental compiles in 1/3 the time of `baseline` in common cases. Application runs about 2% slower than `baseline`.
- `fat1-O4`: Application performs 13% better than `baseline`. Compile time is almost 3x `baseline`, and non-incremental.  
(See key in first chart for the settings defining these profiles.)

We can build faster or run faster, though not in the same build. Because the effect sizes are large enough to be impactful to developer and user experience, respectively, I think we should consider having it both ways. We could **split the `release` profile** into two profiles to serve different purposes:
- `release`: A profile that supports fast developer iteration, while offering realistic performance.
- `production`: A maximally-optimized profile, for nightly builds and actual releases.

Since `wasm-pack` doesn't currently support custom profiles (rustwasm/wasm-pack#1111), we can't use a Cargo profile for `production`; however, we can implement our own profile by overriding rustc flags.

### Performance details

![perf](https://user-images.githubusercontent.com/1047859/170788530-ab6d7910-5253-4a2b-b432-8bfa0b4735ba.png)

As you can see, `thin16` is slightly slower than `baseline`; `fat1-O4` is dramatically faster.

<details>
  <summary>Methodology (click to show)</summary>

I developed a procedure for benchmarking "whole application" performance, using the new "open project" workflow (which opens the IDE and loads a complex project), and some statistical analysis to account for variance. To gather this data:

Build the application with profiling:
`./run.sh ide build --profiling-level=debug`

Run the `open_project` workflow repeatedly:
`for i in $(seq 0 9); do dist/ide/linux-unpacked/enso --entry-point profile --workflow open_project --save-profile open_project_thin16_${i}.json; done`

For each profile recorded, take the new `total_self_time` output of the `intervals` tool; gather into CSV:
`echo $(for i in $(seq 0 9); do target/rust/debug/intervals < open_project_thin16_${i}.json | tail -n1 | awk '{print $2}'; do`
(Note that the output of intervals should not be considered stable; this command may need modification in the future. Eventually it would be nice to support formatted outputs...)

The data is ready to graph. I used the `boxplot` method of the [seaborn](https://seaborn.pydata.org/index.html) package, in order to show the distribution of data.
</details>

#### Build times
![thin16](https://user-images.githubusercontent.com/1047859/170788539-1578e41b-bc30-4f30-9b71-0b0181322fa5.png)

In the case of changing a file in `enso-prelude`, with the current `baseline` settings rebuilding takes over 3 minutes. With the `thin16` settings, the same rebuild completes in 40 seconds.

(To gather this data on different hardware or in the future, just run the new `bench-build.sh` script for each case to be measured.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants