Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firefox compile time regression #43211

Open
glandium opened this Issue Jul 13, 2017 · 34 comments

Comments

Projects
None yet
@glandium
Copy link
Contributor

glandium commented Jul 13, 2017

Preliminary note: this may or may not be related to #43141, but just in case, I file this separately instead of piling up there.

I'm sorry I don't have much more fine-grained data at the moment, but -Ztime-passes, when multiple things are built in parallel (while building Firefox), is not entirely helpful, because it doesn't tell to which crate the interleaving numbers belong to.

Rust build times on Firefox are already bad with 1.18. Some of the most time consuming crates have build times that look like the following:

  • 74.992975322 webrender
  • 99.960699075 geckoservo
  • 259.393332042 gkrust
  • 259.639402886 gkrust_gtest
  • 363.456949809 style

With the latest nightly as of writing (1.20.0-nightly (f85579d 2017-07-12)), those times become:

  • 85.71714646 webrender
  • 124.097890719 geckoservo
  • 384.619852505 gkrust_gtest
  • 396.106775997 gkrust
  • 455.281318295 style

I'm trying to narrow it down to older nightlies, but it started badly, because I got an ICE building the style crate on the last 1.18.0 nightly (2017-04-29)

Those are times for --release with lto enabled (I have a separate bug to file about lto, but I don't think that's relevant to this regression).

CC: @froydnj, @rillian

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 13, 2017

Mmmmm with 1.19.0-nightly (75b0568 2017-05-15):

  • 60.430645418 webrender
  • 116.540179527 0 geckoservo
  • 370.565763687 0 gkrust_gtest
  • 372.559032193 0 gkrust
  • 433.088609909 0 style

With 1.19.0-nightly (d3abc80 2017-05-09):

  • 63.26174462 webrender
  • 117.30302075 geckoservo
  • 319.482715603 gkrust
  • 320.997348082 gkrust_gtest
  • 387.128239057 style

There would seem to be different regression points for different things, and early in the 1.19 development, at that (I tried 75b0568 because of range given in #43141 )

Also note that the order in which the mentioned crates are built is webrender, style, geckoservo, gkrust, gkrust_gtest (the last two are interchangeable depending on timing), and are all happening sequentially because, to simplify things, they are a dependency chain (except webrender, which may build in parallel of style, but it's not like it would matter much).

Also note that this long tail of builds takes so long that by the time geckoservo starts to build after style, there's only that happening on the build machine (everything C++ that doesn't depend on rust is done), and in fact, style has been the only thing happening on the build machine for a while too...

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 13, 2017

Note, I'm not excluding that as a counterpart, the resulting builds might have better code, but I haven't verified that (and I don't know which specific tests in the Firefox test suites would tell me that about the rust code)

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 13, 2017

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Jul 13, 2017

@glandium

I remember that LLVM 4.0 introduced a 10% regression in LLVM passes time, but that was #40123 (merged Apr 24).

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Jul 13, 2017

P-?

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

It would be interesting to know if turning off LTO makes a difference here.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jul 17, 2017

More data points would also be useful -- how hard are they to gather?

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 17, 2017

What kind of data points? Different versions? Different configurations?

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

@glandium Different Rust versions, yes, and if possible, timings for specific passes.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

Some more questions:

  • Can you give some instructions on how to build Firefox with Stylo enabled? Is it just ./mach build?
  • Does FF use the system provided Rust? I.e. can I just switch the Rust version via rustup and expect the FF build system to pick that up?
  • Where does the Rust code in FF live? (I tried changing files in the servo subdirectory but that did not seem to do anything)
@jdm

This comment has been minimized.

Copy link
Contributor

jdm commented Jul 17, 2017

  • To build with stylo, create a .mozconfig file in the root directory and add ac_add_options --enable-stylo.
  • The build system uses what ever rustc is in your PATH.
  • servo/components/{style,selectors} and servo/ports/geckolib are the main places.
@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

Thanks, @jdm! Do you know about the other questions?

@jdm

This comment has been minimized.

Copy link
Contributor

jdm commented Jul 17, 2017

I edited my previous comment to add answers.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

Thanks, @jdm! I'll try to reproduce those build times locally and collect some data.

Will the build system respect CARGO_BUILD_JOBS=1? I seem to remember that that didn't work when building servo, but it might help getting more readable -Ztime-passes output.

@jdm

This comment has been minimized.

Copy link
Contributor

jdm commented Jul 17, 2017

I do not know the answer to that, unfortunately.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 17, 2017

./mach -j1 seems to do the trick.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 18, 2017

So, I've gathered some data with the latest nightly (rustc 1.20.0-nightly (2652ce677 2017-07-17)):

                TOTAL        LLVM
--------------------------------------                
webrender         63s         53s
style            262s        196s
geckoservo       117s        106s
gkrust           241s        237s

This is with LTO enabled. The full output can be found here:
https://gist.github.com/michaelwoerister/c0e34a3f69221dcad6581ddedec79e3a

None of these crates spends a lot of time before LLVM. style is a bit of an exception here (with a surprising amount of time spent in name resolution) but overall (unless I messed up in my measurements somewhere) we can rule out that #43141 plays a role here.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 18, 2017

@glandium It's interesting that your timings of style and gkrust seem a lot more extreme than mine. Is it possible that this is a memory/swapping issue too?

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 18, 2017

Some more data: Turning off LTO makes the compile time of the final crate practically disappear. This is to be expected as that crate is almost empty. The rest stays pretty much unchanged.

                TOTAL        LLVM
--------------------------------------                
webrender         59s         51s
style            251s        187s
geckoservo       118s        107s
gkrust           < 1s        < 1s
@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 18, 2017

Here are my timings with the nightly of the original post (1.20.0-nightly (f85579d 2017-07-12)):

                TOTAL        LLVM
--------------------------------------                
webrender         61s         51s
style            262s        198s
geckoservo       114s        103s
gkrust           239s        234s

They don't look quite as bad.

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 18, 2017

Do you have numbers with 1.18?

My numbers come from Firefox CI, which happens on Amazon EC2, where the CPUs are Xeons that probably are slower than your CPU?

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 19, 2017

Here are timings with 1.18 stable:

                TOTAL        LLVM
--------------------------------------                
webrender         43s         36s
style            178s        140s
geckoservo        87s         79s
gkrust           160s        156s

Things are quite a bit slower compared to the latest nightly. But it's not an entirely valid comparison, since nightlies have debug assertions (and maybe even LLVM assertions) enabled in the compiler, so they are expected to be slower. I'll try to find a nightly that roughly corresponds to 1.18 stable, then we'll see how much that accounts for.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jul 19, 2017

Rust Nightly does enable LLVM assertions, whereas other channels do not. The impact on compile times is significant, especially when doing optimizations.

Rust CI does make builds available without LLVM assertions, but they’re not easy to use: #39754. (More discussion of this: https://internals.rust-lang.org/t/disabling-llvm-assertions-in-nightly-builds/5388)

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 19, 2017

See here are timings from rustc 1.18.0-nightly (63c77214c 2017-04-24), i.e. a nightly version roughly equivalent to stable 1.18, but with all assertions enabled:

                TOTAL        LLVM
--------------------------------------                
webrender         60s         52s
style            262s        217s
geckoservo       120s        111s
gkrust           206s        201s

These times are roughly the same as the current nightly, except for gkrust, where the current nightly spends 31 seconds more in LLVM.

The full -Z time-passes output of everything above can be found at:
https://gist.github.com/michaelwoerister/c0e34a3f69221dcad6581ddedec79e3a

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 25, 2017

So, I did some builds with 1.18, 1.19 (now released) and 1.20 beta (presumably without the LLVM asserts). I forced them to be -j1 both for make and cargo, avoiding things running in parallel and altering the times of each other (which, they appear to do, a lot).

As in the first comment, I'm going to list the times taken by the slowest crates:

                1.18     1.19     1.20
webrender      40.06s   39.61s   44.87s
clap           40.08s   53.27s   57.75s
bindgen        44.31s   44.47s   44.46s
geckoservo     77.14s   77.32s   84.46s
syntex_syntax  105.02s  103.19s  96.80s
gkrust         128.17s  132.96s  140.84s
gkrust_gtest   128.72s  134.81s  141.59s
style          187.41s  188.26s  206.54s

So while not as bad as in the first comment, it's not entirely good either. At least 1.19 is not too bad, except, somehow, for clap.

@glandium

This comment has been minimized.

Copy link
Contributor Author

glandium commented Jul 27, 2017

https://bugzilla.mozilla.org/show_bug.cgi?id=1382743#c11 suggests something did go bad when upgrading from 1.18 to 1.19. -j1 might be hiding some issues.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor

michaelwoerister commented Jul 27, 2017

-j1 might be hiding some issues.

If the issues are related to the job server then that's quite likely.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Aug 9, 2017

1.20 is now beta

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Aug 10, 2017

There appears to be a 5% "unexplained" regression between 1.19 and 1.20. Allocators are the prime suspect - is anyone looking at it?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 11, 2017

@arielb1 I haven't had much time personally to look into the allocator-related regressions yet. Where'd you find the 5% number from, though? Is there perhaps a suspect commit on perf.rust-lang.org?

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Aug 13, 2017

@alexcrichton

That's from @glandium's numbers

                1.18     1.19     1.20
webrender      40.06s   39.61s   44.87s
clap           40.08s   53.27s   57.75s
bindgen        44.31s   44.47s   44.46s
geckoservo     77.14s   77.32s   84.46s
syntex_syntax  105.02s  103.19s  96.80s
gkrust         128.17s  132.96s  140.84s
gkrust_gtest   128.72s  134.81s  141.59s
style          187.41s  188.26s  206.54s

style was affected by #43572, but for the other crates it looks like about 5%

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Aug 17, 2017

triage: P-medium

Discussed this in the @rust-lang/compiler meeting. We decided to downgrade this to P-medium, since the immediate problem has been resolved, though we will want to revisit (and perhaps upgrade priority again) in the future.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 28, 2017

1.20 will soon be stable

@michaelwoerister michaelwoerister removed their assignment Oct 31, 2017

@cyplo

This comment has been minimized.

Copy link
Contributor

cyplo commented Apr 15, 2018

Heya !
Some perf improvements landed in 2018, including incremental compilation, I wonder if they helped in this case ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.