
Arm thumbv6m binary size increase with dead code change (or from 1.50 to 1.51) #82748

Open
davidlattimore opened this issue Mar 4, 2021 · 0 comments
Labels
I-heavy Issue: Problems and improvements with respect to binary size of generated code. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state

Comments

@davidlattimore
Contributor

I've observed an unexpected increase in binary size in response to a change in a crate that we use. The change only adds new public methods, which we don't call, so all the changed code is effectively dead code, but still it results in a significant increase in our binary size. My guess is that the presence of this new code causes LLVM to make different inlining decisions, even though the new code isn't actually called anywhere.

This happens on 1.50.0. The increase (for a minimal binary included below) is from 932 bytes to 2164 bytes.

Switching from 1.50 to 1.51 (currently in beta) without the above change causes the same increase from 932 bytes to 2164 bytes.

I was going to mark this as a stable to beta regression, but TBH, I think it's probably a pre-existing issue that just triggers in response to legitimate changes in library code. I expect that whatever changed between 1.50 and 1.51 is similar in nature to the code change above.

I've tarred up a moderately minimal bit of code that reproduces this:

binary-size-increase.tar.gz

To reproduce, run the ./check-size script contained within the tarball. You might need to run rustup target install thumbv6m-none-eabi first.

For me, with current stable 1.50, this shows a change in binary size from 932 bytes to 2164 bytes:

Size prior to commit 77dace37908f281feb9432fc13874475d9dc0765
-rwxr-xr-x 1 dml eng 932 Mar  4 16:39 a.bin
Size after commit 77dace37908f281feb9432fc13874475d9dc0765
-rwxr-xr-x 1 dml eng 2164 Mar  4 16:39 b.bin

If I adjust the script to use 1.51, then I get 2164 bytes for both.

Looking at the disassembly of each binary, the larger binary includes compiler_builtins::int::specialized_div_rem::u64_div_rem, whereas the smaller binary doesn't. u64_div_rem is called from __udivmoddi4, which is called from __aeabi_uldivmod. Both of these are also absent from the smaller binary; in the larger binary they are present and reached from MicroSecond::cycles / Delay::delay.
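To illustrate the kind of code that can pull in these routines, here is a minimal sketch. It is not the actual MicroSecond::cycles implementation from the crate; the function names, signatures, and the whole-MHz assumption are hypothetical. The first function uses a 64-bit intermediate, so on a target without hardware 64-bit division (such as thumbv6m) the division may be lowered to a call to __aeabi_uldivmod unless the optimizer can fold it away. The second stays in 32-bit arithmetic.

```rust
/// Hypothetical stand-in for a `MicroSecond::cycles`-style conversion.
/// The 64-bit intermediate avoids overflow, but the `u64` division may
/// become a call to `__aeabi_uldivmod` on thumbv6m if it is not
/// constant-folded or otherwise optimized out.
pub fn cycles_u64(us: u32, clock_hz: u32) -> u32 {
    ((us as u64 * clock_hz as u64) / 1_000_000) as u32
}

/// Hypothetical 32-bit-only variant. Assumes `clock_hz` is a whole
/// number of MHz, so dividing first loses no precision. The multiply
/// wraps on overflow, which a real implementation would need to handle.
/// Note: thumbv6m has no hardware divide at all, so a non-constant
/// `u32` division still calls a (much smaller) helper routine.
pub fn cycles_u32(us: u32, clock_hz: u32) -> u32 {
    us.wrapping_mul(clock_hz / 1_000_000)
}
```

Whether the 64-bit division survives into the final binary depends on inlining and constant propagation, which is consistent with the size difference appearing or disappearing in response to unrelated changes.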

Cargo.toml sets opt-level = "s". Similar results are observed with opt-level = "z".

Given that LTO is enabled, I'd have expected that dead code would be removed before inlining decisions were made, so I'm surprised that a change to code that isn't called would have this effect.

If there's anything we can do to help LLVM make better decisions when optimizing for binary size, that'd be awesome, although I'm sure it's a pretty difficult problem.

@JohnTitor JohnTitor added I-heavy Issue: Problems and improvements with respect to binary size of generated code. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state labels Mar 4, 2021