The binary size - performance tradeoff #69

japaric · 2018-03-22T12:53:01Z

As of the latest LLVM upgrade (4.0 -> 6.0 on 2018-02-11), LLVM seems to perform loop unrolling more aggressively; this increased the binary size of a minimal program that only zeroes .bss and initializes .data from 130 bytes of .text (nightly-2018-02-10) to 1114 bytes (nightly-2018-03-20) when using opt-level=3 + LTO. FWIW, I highly doubt the loop unrolling actually improves performance at all. Original report: rust-lang/rust#49260

This puts us in a bad spot because by default we'll end up with large optimized (--release) binaries. I can already foresee future comparisons between C and Rust pointing out that the smallest embedded C program is only a hundred bytes in size whereas the smallest embedded Rust program is 1 KB.

So we should make sure we clearly document why Rust programs are so large by default and how to make Rust programs small. Using opt-level=s + LTO on the minimal program mentioned above brings the size back to 130 bytes .text.
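As a minimal sketch of the setting described above, assuming a Cargo project, the release profile in `Cargo.toml` could look like the following. The keys are standard Cargo profile settings; the resulting size will depend on the program and toolchain, so the 130-byte figure is not guaranteed to reproduce.

```toml
# Cargo.toml
[profile.release]
opt-level = "s"  # optimize for size instead of the release default, opt-level = 3
lto = true       # enable link-time optimization across crates
```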

cc @jamesmunns ^ that should be included in the book

There are other possibilities to explore here, like having something like C's / clang's #pragma nounroll to prevent LLVM from unrolling loops marked with that attribute, but I doubt we'll get any of that into the 2018 edition release; it's too late, I think.

therealprof · 2018-03-23T14:41:26Z

FWIW: I think funky stuff like unrolling (which is really worthless on embedded architectures) is to be expected at higher optimisation levels, and I've seen all kinds of funky size regressions. I'm always using (and recommending) -s. Potentially -z could also be tried, but -O3 is a big no-no...

whitequark · 2018-03-23T14:44:01Z

-O3 is pretty much defined as "-O2 with optimizations that cause code bloat", so I'm not sure why you'd go higher than -O2 on embedded devices.

therealprof · 2018-03-23T15:17:28Z

> the smallest embedded C program is only a hundred bytes in size whereas the smallest embedded Rust program is 1 KB.

NB: I highly doubt that. As soon as one uses some of the initialisation code from some of the typical SDKs, the code will be well into the kBs already. To even stay in Rust's range you'll have to manually bang the memory-mapped registers and write your own linker scripts.

Case in point, this is the smallest possible binary for a main { while(1) {} } loop for the STM32F051 I could achieve based on STM32Cube initialisation:

# arm-none-eabi-size .pioenvs/disco_f051r8/firmware.elf
   text	   data	    bss	    dec	    hex	filename
    892	   1080	   1600	   3572	    df4	.pioenvs/disco_f051r8/firmware.elf

Emilgardis · 2018-03-23T15:19:43Z

Could we get an RFC for something like #[no_unroll]/#[unroll(disable)]?

whitequark · 2018-03-23T15:20:43Z

@Emilgardis No need for an RFC, marking the function with the loop as #[cold] should suffice.
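A minimal sketch of that suggestion, assuming the loop is moved into its own function. `zero_words` is a hypothetical name used only for illustration, and whether `#[cold]` actually keeps LLVM from unrolling the loop is not verified here; it likely depends on the LLVM version and is best checked by inspecting the generated code.

```rust
/// Hypothetical helper: zeroes a region word by word, like a `.bss` init loop.
///
/// `#[cold]` tells the compiler this function is unlikely to be called, so it
/// may favour size over speed when optimizing it. `#[inline(never)]` asks the
/// compiler not to inline the loop into its caller.
#[cold]
#[inline(never)]
pub fn zero_words(region: &mut [u32]) {
    for word in region.iter_mut() {
        *word = 0;
    }
}
```

Only the functions holding size-sensitive loops need the attribute; the rest of the crate is compiled as usual.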

jonas-schievink · 2018-03-23T15:22:58Z

@japaric also wrote in rust-lang/rust#49260:

> My experience with opt-level={s,z}, at least when LLVM 4 was around, is that they produce bigger binaries than opt-level=3

If this is still the case with LLVM 6, this definitely wants to be investigated and fixed on the LLVM side.

He also wrote:

> iirc, opt-level={s,z} also reduces the inlining threshold which prevents LLVM from optimizing dead branches when using RTFM's claim mechanism.

This might be the cause for some amount of bloat due to unnecessary branches, but shouldn't #[inline] be a strong enough hint to LLVM to still inline the function?
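For reference, a sketch of the kind of pattern under discussion. `with_ceiling` and `task` are hypothetical stand-ins, not RTFM's actual claim API. Both `#[inline]` and `#[inline(always)]` are hints to the compiler, the second being the stronger one. Once the call is inlined with constant arguments, the comparison can be folded away; without inlining, the branch may stay in the binary.

```rust
/// Hypothetical stand-in for a priority-ceiling check (not RTFM's real API).
/// Returns whether the priority would have been raised, plus the closure result.
#[inline(always)] // stronger hint than plain `#[inline]`
pub fn with_ceiling<R>(current: u8, ceiling: u8, f: impl FnOnce() -> R) -> (bool, R) {
    if ceiling > current {
        // A real implementation would raise the priority around `f`.
        (true, f())
    } else {
        // Already at or above the ceiling: nothing to do, just run `f`.
        (false, f())
    }
}

/// Both priorities are constants here, so after inlining the compiler can
/// drop the branch that is never taken.
pub fn task() -> u32 {
    with_ceiling(2, 1, || 42).1
}
```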

therealprof · 2018-03-23T15:23:17Z

Why not get the default flags changed instead? It'd be very annoying to put annotations in every source file just in case someone might accidentally not change the compiler flags...

whitequark · 2018-03-23T15:24:22Z

> This might be the cause for some amount of bloat due to unnecessary branches, but shouldn't #[inline] be a strong enough hint to LLVM to still inline the function?

The inlining thresholds in LLVM are tailored for C, which produces functions with relatively compact IR, and likely aren't well suited for Rust. In our in-house language we had to raise them significantly to get decent reductions in code size.

Emilgardis · 2018-03-23T15:26:00Z

@whitequark I've never heard of that attribute, but it seems like it should work.

therealprof · 2018-03-23T15:28:57Z

@jonas-schievink I can not confirm that it produces larger files with opt-level=s, at least not in general. This all has quite a bit of premature optimisation smell to it, same as with the #[inline(always)] we had sprinkled all over the map...

durka · 2018-03-23T16:23:16Z

There has been a small amount of discussion about unrolling attributes: rust-lang/rfcs#2219

RandomInsano · 2018-03-24T13:42:19Z

The choice to unroll or not should be at the LLVM layer. With the Linux HAL I could very well want loop unrolling if my x86_64 machine was using a driver over the SMBus. Is it possible for the LLVM backends to automatically opt-out when appropriate?

japaric · 2018-08-10T02:44:06Z

This issue was moved to rust-embedded/book#11

japaric added the docs label Mar 22, 2018

est31 mentioned this issue Apr 6, 2018

Document guidance on optimizing for size rustwasm/team#109

Closed

japaric mentioned this issue Aug 10, 2018

The binary size - performance tradeoff rust-embedded/book#11

Closed

japaric closed this as completed Aug 10, 2018

jonas-schievink mentioned this issue Sep 7, 2018

opt-level: z often worse than s, sometimes worse than 3 on small files rust-lang/rust#54026

Open
