Add CodeGen options to optimize for size. #32386

Merged
merged 4 commits into from
May 3, 2016


brandonedens
Contributor

Add CodeGen options to annotate functions with the attributes OptimizeSize and/or MinSize used by LLVM to reduce .text size.
Closes #32296

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @arielb1 (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@huonw
Member

huonw commented Mar 21, 2016

Are these orthogonal to the opt-level? I.e. does it make sense to have, say, -C opt-level=1 -C opt-size=true?

Also, I wonder if this could be a level thing too, e.g. -C opt-size=0 (default), -C opt-size=1 (OptimizeSize) -C opt-size=2 (MinSize).

@alexcrichton
Member

I agree with @huonw that we likely want to fold this all into one parameter so we can play around with various values in the future.

cc @rust-lang/tools, @rust-lang/compiler

@sanxiyn
Member

sanxiyn commented Mar 21, 2016

Shouldn't this also change inline threshold? See #29943.

@brandonedens
Contributor Author

So, clang specifies -O1, -O2, -O3, -Os, and -Oz.
-Os sets the function attribute OptimizeSize, and -Oz sets the function attributes MinSize and OptimizeSize.

Internally, clang's -Os and -Oz set an optimize-size variable to 1 and 2 respectively, which in turn
sets those function attributes, and likely more LLVM attributes and the like
in the future (compressed .data anyone?)
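
The mapping described above can be sketched in Rust as follows. This is only an illustration, not the code from this PR: the `SizeLevel` and `FnAttr` enums and the `size_attributes` function are hypothetical names chosen for the example, and only the attribute names OptimizeSize and MinSize come from the discussion.

```rust
/// Hypothetical size-optimization request: 0, 1 (-Os) or 2 (-Oz).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SizeLevel {
    None,
    Size,    // corresponds to -Os
    MinSize, // corresponds to -Oz
}

/// Hypothetical stand-in for the LLVM function attributes named in the thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FnAttr {
    OptimizeSize,
    MinSize,
}

/// Returns the attributes a function would be annotated with at each level.
fn size_attributes(level: SizeLevel) -> Vec<FnAttr> {
    match level {
        SizeLevel::None => vec![],
        SizeLevel::Size => vec![FnAttr::OptimizeSize],
        // -Oz sets both attributes, as described for clang.
        SizeLevel::MinSize => vec![FnAttr::MinSize, FnAttr::OptimizeSize],
    }
}

fn main() {
    for level in &[SizeLevel::None, SizeLevel::Size, SizeLevel::MinSize] {
        println!("{:?} -> {:?}", level, size_attributes(*level));
    }
}
```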

I think the opt_size integer is an excellent idea, or we could encode the
opt_level Os and Oz.

It's up to you all.
I'll submit another patch this evening after work for opt_size integer
(quick and cheap) and if desired I'll also try the Os and Oz implementation.

How do I gate these options on nightly? There was discussion on IRC about
doing so. Maybe I'll investigate that too.

Thanks!
Brandon

@alexcrichton
Member

It looks like clang's other special treatment of these function attributes is to optimize any #[cold] function for size, but that's orthogonal to this patch I think. Other than that I believe we mirror clang by just providing the options?

@alexcrichton
Member

Ah yes and I agree with @sanxiyn that we should mirror the inline threshold decisions in clang for now, and if the optimize-for-size flag factors into that we should likely do so too.

@brandonedens
Contributor Author

I also noticed in clang lib/Frontend/CompilerInvocation.cpp +442

    Opts.UnrollLoops =
        Args.hasFlag(OPT_funroll_loops, OPT_fno_unroll_loops,
                     (Opts.OptimizationLevel > 1 && !Opts.OptimizeSize));

Looks like it also might be appropriate to disable loop unrolling when opt_size > 0. Still poking around in the Rust internals to see how that would be appropriate to do.
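
As a rough sketch of that default, assuming the explicit unroll flags are left out and only the fallback expression from the clang snippet is modelled (the function name `unroll_loops_default` is hypothetical):

```rust
/// Default for loop unrolling when no explicit flag is given:
/// enabled above optimization level 1, unless optimizing for size.
/// `opt_size` is 0 (off), 1 (Os) or 2 (Oz).
fn unroll_loops_default(opt_level: u32, opt_size: u32) -> bool {
    opt_level > 1 && opt_size == 0
}

fn main() {
    println!("O2:      {}", unroll_loops_default(2, 0));
    println!("O2 + Os: {}", unroll_loops_default(2, 1));
    println!("O1:      {}", unroll_loops_default(1, 0));
}
```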

@sanxiyn
Member

sanxiyn commented Mar 22, 2016

Looking at lib/Analysis/InlineCost.cpp, inline threshold is set to 75 for Os and 25 for Oz.

@arielb1
Contributor

arielb1 commented Mar 22, 2016

r? @alexcrichton

@alexcrichton
Member

This is looking great to me, thanks @brandonedens! We talked in person about perhaps adding a disabling of loop unrolling as well, but otherwise could you just add a test or two to the PR? This could both test acceptance of the option and also perhaps add a src/test/codegen test to ensure that we emit the right LLVM attributes.

@alexcrichton
Member

Ok, we discussed this at the tools triage meeting yesterday and here were some of our thoughts:

  • We think that having -C opt-size separate from -C opt-level may not actually be the course of action we wish to take. If you're optimizing for size, for example, you basically have to turn optimizations on to get inlining/constant propagation to eliminate huge swaths of otherwise dead code. In that sense -C opt-level=0 -C opt-size=2 doesn't actually make that much sense, so we may as well fold them into the same -C opt-level flag. This also mirrors what gcc/clang do, I believe?
  • We probably want two optimization levels for size (as you've got here), and we just need to be sure to pick their names carefully as they'll be instantly stable.
  • Do you have any numbers indicating what the impact of -C opt-size=2 is? We were curious just to see, for example, the impact on the compiler or other various bits of code.

@brandonedens
Contributor Author

Excellent to hear of the conversation and it was a pleasure to chat the
other night.

Super excited about this infrastructure work but bear with me; just
switched over to Rust on armv8 rather than x86 and most of my free time for
these things is nights and weekends.

As I mentioned before it might be prudent to follow a strategy similar to
GCC and/or Clang. In those compilers the O? flags are meta flags that in
turn enable or disable a collection of optimizations and those
optimizations fluctuate based upon the target too.

So via GCC you can execute
    gcc -Q -Os --help=optimizers
to get a listing of what optimizations will be enabled/disabled for Os.
As a side note,
    gcc -Q --help=target
also presents interesting information.

So we have the following meta flags.
GCC:
O0, O1, O2, O3, Og, Ofast, Os
Og = compile with optimizations but maximum debugging capability
Ofast = disregard standard compliance
Os = optimize for size

Clang:
O0, O1, O2, O3, Ofast, Os, Oz, -O, -O4
Oz = optimize for size at the expense of everything else
O4 and up = currently just O3

I'd probably switch opt-level to mirror the naming in GCC / Clang, a
string. I'd probably then begin to expose the underlying optimizations
being enabled as adjustable parameters and then encode those flags being
enabled / disabled using the meta options while also presenting the user a
method of inspecting what was actually enabled / disabled in a manner
similar to GCC.

Lately I've been integrating GCC output with other compilers and it's really
important to know and control how the compiler is aligning the data, what
calling convention is being used, etc.

Can we gate this stuff on nightly only? I know you said it'd immediately be
stable but it'd be real nice if we could tuck it away until we know the
ergonomics are good.

Regardless, I'll try to build up something that looks more akin to Clang on
this branch; basically some functionality via Os and Oz.

Brandon

@alexcrichton
Member

Yeah, we certainly have the ability to make things nightly only if necessary. For example we can have nightly-only compiler flags (which require an unstable opt-in) or we could gate specific values of specific flags (if we really want). It's basically up to us to do whatever.

We do have some -C flags to control optimization passes in LLVM, although they're not necessarily super comprehensive.

@bors
Contributor

bors commented Mar 27, 2016

☔ The latest upstream changes (presumably #32432) made this pull request unmergeable. Please resolve the merge conflicts.

@trufae

trufae commented Mar 27, 2016

Can you provide some size comparisons, for example from compiling Servo?

brandonedens force-pushed the llvm_min_size branch 2 times, most recently from 219410b to 8230673, on March 28, 2016 06:12

@brandonedens
Contributor Author

I've yet to be able to compile Servo with my latest update to the branch. I'll see if I can figure out how to get Servo's mach to use my local Rust build.

I did, however, compile the time example from Iron with -C opt-level=2, 3, s, and z on x86_64 as --release after I saw your request.
Here are the sizes:

$ size time.O*
   text    data   bss      dec     hex  filename
3332499  454392  3952  3790843  39d7fb  time.O2
3332135  454392  3952  3790479  39d68f  time.O3
3304303  454432  3952  3762687  3969ff  time.Os
3300695  455008  3952  3759655  395e27  time.Oz

I started looking at the symbol size differences between O3 and Oz which is kinda interesting. Can produce diffs and/or the ELF files if you'd like to look at them yourself.

@alexcrichton
Member

Interesting! Looks like it's a <1% difference between O3 and Oz?
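
That estimate can be checked against the text column reported for time.O3 and time.Oz. A minimal sketch of the arithmetic (the helper name `percent_reduction` is made up for this example):

```rust
/// Percentage by which `after` is smaller than `before`.
fn percent_reduction(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    // .text sizes in bytes, taken from the `size time.O*` output in the thread.
    let text_o3: u64 = 3_332_135;
    let text_oz: u64 = 3_300_695;

    println!("bytes saved: {}", text_o3 - text_oz);
    println!("reduction:   {:.2}%", percent_reduction(text_o3, text_oz));
}
```

The difference is 31440 bytes, a little over 0.94% of the O3 text size, which is consistent with the "<1%" reading.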

@brandonedens
Contributor Author

Here are the differences from O3 to Oz, sorted by size and alphabetically. Note the elements that came from the Rust compile.

time.O3_to_Oz_ordered_alphabetic.patch.txt

time.O3_to_Oz_ordered_by_size.patch.txt

llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, t as u32);
}
(llvm::CodeGenLevelNone, _) => {
(_, llvm::CodeGenOptSizeDefault, _) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, 75);
Member

It looks like this doesn't quite mirror the standard LLVM logic?

If O3 is enabled, the threshold seems to be 275, then there's size/min (75/25), and finally there's the default 225.
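
The ordering described in this comment can be sketched as a small function. This is an illustration of the stated logic only, using the thresholds quoted in the thread (275, 75, 25, 225); the function name and the plain integer parameters are assumptions for the example, not the PR's actual code.

```rust
/// Inline threshold selection in the order described above:
/// O3 first, then the size levels, then the default.
/// `size_level` is 0 (off), 1 (Os) or 2 (Oz).
fn inline_threshold(opt_level: u32, size_level: u32) -> u32 {
    if opt_level > 2 {
        return 275; // O3
    }
    match size_level {
        1 => 75,  // Os
        2 => 25,  // Oz
        _ => 225, // default
    }
}

fn main() {
    println!("O3: {}", inline_threshold(3, 0));
    println!("Os: {}", inline_threshold(2, 1));
    println!("Oz: {}", inline_threshold(2, 2));
    println!("O2: {}", inline_threshold(2, 0));
}
```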

Member

There are 225 and 275 below. Yes, the test is in different order, but opt_level and opt_size are not independent. opt_level is O2 if opt_size is Os or Oz.

Member

What I'm pointing out is that this is different from clang, which we are intentionally mirroring for now.

@alexcrichton
Member

@brson how do you feel about -C opt-level=z and -C opt-level=s?

…attempts

to mimic the behavior of clang's options Os and Oz.
… nightly

compiler check to prevent their inclusion in the stable / beta compilers.
@brandonedens
Contributor Author

Bump Alex.
In my latest commit, is that an appropriate method of restricting -Os and -Oz to nightly only? In doing so should I also remove the tests?

@brandonedens
Contributor Author

Here's a comparison between two release builds of the iron library with and without optimize for minsize. Might be interesting to see what final binaries look like after building rust proper minsize optimized. Still haven't succeeded in getting rust built from a local rust; advice appreciated.

▶ size target/release-minsize/libiron.rlib
 text  data  bss    dec   hex  filename
14832   592   16  15440  3c50  iron.0.o (ex target/release-minsize/libiron.rlib)
 9163   264    0   9427  24d3  iron.1.o (ex target/release-minsize/libiron.rlib)
46861   656    0  47517  b99d  iron.2.o (ex target/release-minsize/libiron.rlib)
11943   376    0  12319  301f  iron.3.o (ex target/release-minsize/libiron.rlib)

▶ size target/release/libiron.rlib
 text  data  bss    dec   hex  filename
19074   592   16  19682  4ce2  iron.0.o (ex target/release/libiron.rlib)
14896   264    0  15160  3b38  iron.1.o (ex target/release/libiron.rlib)
59684   656    0  60340  ebb4  iron.2.o (ex target/release/libiron.rlib)
13383   376    0  13759  35bf  iron.3.o (ex target/release/libiron.rlib)
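
To make the two listings easier to compare, the text columns can be summed. A small sketch using only the numbers above (the constant and function names are made up for the example):

```rust
/// .text sizes in bytes of iron.0.o through iron.3.o, from the listings above.
const MINSIZE_TEXT: [u64; 4] = [14832, 9163, 46861, 11943];
const RELEASE_TEXT: [u64; 4] = [19074, 14896, 59684, 13383];

/// Sum of the per-object .text sizes.
fn total(sizes: &[u64]) -> u64 {
    sizes.iter().sum()
}

fn main() {
    let minsize = total(&MINSIZE_TEXT);
    let release = total(&RELEASE_TEXT);
    let saved = release - minsize;
    println!("release text: {} bytes", release);
    println!("minsize text: {} bytes", minsize);
    println!(
        "saved: {} bytes ({:.1}%)",
        saved,
        saved as f64 / release as f64 * 100.0
    );
}
```

The totals are 107037 bytes for the regular release build and 82799 bytes for the minsize build, a text reduction of roughly 22.6% for this library.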

@brson
Contributor

brson commented Apr 29, 2016

@alexcrichton +1

(Some("2"), _) => OptLevel::Default,
(Some("3"), _) => OptLevel::Aggressive,
(Some("s"), true) => OptLevel::Size,
(Some("z"), true) => OptLevel::SizeMin,
Member

Could you tweak the error message here to indicate that nightly is required if s or z is passed on the stable/beta builds?
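
One way such gating could look is sketched below. It is a standalone illustration, not the PR's code: the `parse_opt_level` function, the `No` and `Less` variants, and both error messages are assumptions made for the example; only the `Default`, `Aggressive`, `Size` and `SizeMin` variants appear in the diff under review.

```rust
#[derive(Debug, PartialEq, Eq)]
enum OptLevel {
    No,         // assumed name for level 0
    Less,       // assumed name for level 1
    Default,    // level 2
    Aggressive, // level 3
    Size,       // level s
    SizeMin,    // level z
}

/// Parses the value of `-C opt-level`, accepting `s` and `z` only on nightly.
fn parse_opt_level(arg: &str, nightly: bool) -> Result<OptLevel, String> {
    match (arg, nightly) {
        ("0", _) => Ok(OptLevel::No),
        ("1", _) => Ok(OptLevel::Less),
        ("2", _) => Ok(OptLevel::Default),
        ("3", _) => Ok(OptLevel::Aggressive),
        ("s", true) => Ok(OptLevel::Size),
        ("z", true) => Ok(OptLevel::SizeMin),
        // Hypothetical wording: point the user at nightly explicitly.
        ("s", false) | ("z", false) => Err(format!(
            "optimization level `{}` is only available on nightly builds",
            arg
        )),
        _ => Err(format!("unknown optimization level `{}`", arg)),
    }
}

fn main() {
    println!("{:?}", parse_opt_level("s", true));
    println!("{:?}", parse_opt_level("s", false));
    println!("{:?}", parse_opt_level("9", true));
}
```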

Contributor Author

Added below and tested locally.

Contributor Author

Yes. I added an early exit for attempts to optimize for size on non-nightly.

@alexcrichton
Member

@bors: r+ 16eaecb

@bors
Contributor

bors commented May 2, 2016

⌛ Testing commit 16eaecb with merge 9a003b0...

bors added a commit that referenced this pull request May 2, 2016
Add CodeGen options to optimize for size.

Add CodeGen options to annotate functions with the attributes OptimizeSize and/or MinSize used by LLVM to reduce .text size.
Closes #32296
@bors
Contributor

bors commented May 2, 2016

💔 Test failed - auto-win-gnu-64-nopt-t

@alexcrichton
Member

@bors: retry

On Mon, May 2, 2016 at 1:13 PM, bors notifications@github.com wrote:

[image: 💔] Test failed - auto-win-gnu-64-nopt-t
http://buildbot.rust-lang.org/builders/auto-win-gnu-64-nopt-t/builds/4063



@spacekookie
Member

I'm sorry to maybe be asking a stupid question here, but how would I go about using opt-level={s,z}? There isn't actually any documentation anywhere on that. Like...I assumed they would be flags in my Cargo.toml file but that didn't work :P Is it something I somehow need to provide to rustc?

@steveklabnik
Member

@spacekookie it's all good! However, we try to keep the issue tracker solely for bugs; getting help on a PR isn't the best way to go about it. Could you post your question to https://users.rust-lang.org/ instead? It's the best place to ask questions and get help. Thanks!
