Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid panic_bounds_check in fmt::write. #78122

Merged

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Oct 19, 2020

Writing any fmt::Arguments would trigger the inclusion of usize formatting and padding code in the resulting binary, because indexing used in fmt::write would generate code using panic_bounds_check, which prints the index and length.

These bounds checks are not necessary, as fmt::Arguments never contains any out-of-bounds indexes.

This change replaces them with unsafe get_unchecked, to reduce the amount of generated code, which is especially important for embedded targets.


Demonstration of the size of and the symbols in a 'hello world' no_std binary:

Source code
#![feature(lang_items)]
#![feature(start)]
#![no_std]

use core::fmt;
use core::fmt::Write;

#[link(name = "c")]
extern "C" {
    #[allow(improper_ctypes)]
    fn write(fd: i32, s: &str) -> isize;
    fn exit(code: i32) -> !;
}

struct Stdout;

impl fmt::Write for Stdout {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        unsafe { write(1, s) };
        Ok(())
    }
}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
    let _ = writeln!(Stdout, "Hello World");
    0
}

#[lang = "eh_personality"]
fn eh_personality() {}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    unsafe { exit(1) };
}

Before:

   text	   data	    bss	    dec	    hex	filename
   6059	    736	      8	   6803	   1a93	before
0000000000001e00 T <T as core::any::Any>::type_id
0000000000003dd0 D core::fmt::num::DEC_DIGITS_LUT
0000000000001ce0 T core::fmt::num::imp::<impl core::fmt::Display for u64>::fmt
0000000000001ce0 T core::fmt::num::imp::<impl core::fmt::Display for usize>::fmt
0000000000001370 T core::fmt::write
0000000000001b30 t core::fmt::Formatter::pad_integral::write_prefix
0000000000001660 T core::fmt::Formatter::pad_integral
0000000000001350 T core::ops::function::FnOnce::call_once
0000000000001b80 t core::ptr::drop_in_place
0000000000001120 t core::ptr::drop_in_place
0000000000001c50 t core::iter::adapters::zip::Zip<A,B>::new
0000000000001c90 t core::iter::adapters::zip::Zip<A,B>::new
0000000000001b90 T core::panicking::panic_bounds_check
0000000000001c10 T core::panicking::panic_fmt
0000000000001130 t <&mut W as core::fmt::Write>::write_char
0000000000001200 t <&mut W as core::fmt::Write>::write_fmt
0000000000001250 t <&mut W as core::fmt::Write>::write_str

After:

   text	   data	    bss	    dec	    hex	filename
   3068	    600	      8	   3676	    e5c	after
0000000000001360 T core::fmt::write
0000000000001340 T core::ops::function::FnOnce::call_once
0000000000001120 t core::ptr::drop_in_place
0000000000001620 t core::iter::adapters::zip::Zip<A,B>::new
0000000000001660 t core::iter::adapters::zip::Zip<A,B>::new
0000000000001130 t <&mut W as core::fmt::Write>::write_char
0000000000001200 t <&mut W as core::fmt::Write>::write_fmt
0000000000001250 t <&mut W as core::fmt::Write>::write_str

Writing any fmt::Arguments would trigger the inclusion of usize
formatting and padding code in the resulting binary, because indexing
used in fmt::write would generate code using panic_bounds_check, which
prints the index and length.

These bounds checks are not necessary, as fmt::Arguments never contains
any out-of-bounds indexes.

This change replaces them with unsafe get_unchecked, to reduce the
amount of generated code, which is especially important for embedded
targets.
@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 19, 2020
@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 19, 2020

Related optimization:

// Use Arguments::new_v1 instead of format_args!("{}", expr) to potentially
// reduce size overhead. The format_args! macro uses str's Display trait to
// write expr, which calls Formatter::pad, which must accommodate string
// truncation and padding (even though none is used here). Using
// Arguments::new_v1 may allow the compiler to omit Formatter::pad from the
// output binary, saving up to a few kilobytes.
panic_fmt(fmt::Arguments::new_v1(&[expr], &[]));

That optimzation would not be very effective if even writing the most trivial fmt::Arguments pulls in formatting code for usizes, like it does now.

Not sure if this can (or should) be tested for. Checking if some code results in a binary smaller than some threshold would not be a good test. Coming up with a somewhat complete list of symbols that 'should not be there' is also not easy.

@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 19, 2020

@rustbot modify labels: +A-fmt +I-heavy +libs-impl

@rustbot
Copy link
Collaborator

rustbot commented Oct 19, 2020

Error: Label libs-impl can only be set by Rust team members

Please let @rust-lang/release know if you're having trouble with this bot.

@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 19, 2020

@rustbot modify labels: +A-fmt +I-heavy +T-libs

@rustbot rustbot added A-fmt Area: std::fmt I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 19, 2020
@jonas-schievink jonas-schievink added the WG-embedded Working group: Embedded systems label Oct 19, 2020
@Mark-Simulacrum
Copy link
Member

I think it would be good to see if we can add a simple codegen test to assert that there's no panic/bounds check here, since it seems easy for it to sneak back in.

I would also like debug asserts here - the compiler should be correct, but I'm not sure we're testing that anywhere else (other than kinda via these bounds checks and such).

@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 19, 2020

@Mark-Simulacrum sounds good. Will add that when I have time.

@rustbot modify labels: +S-waiting-on-author -S-waiting-on-review

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 19, 2020
@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 20, 2020

Added the debug_asserts and a test.

The test is a run-make test, because it needs to check the result after linking. A codegen or assembly test doesn't check the parts that will be pulled in from core by the linker.

@rustbot modify labels: -S-waiting-on-author +S-waiting-on-review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 20, 2020
@dtolnay
Copy link
Member

dtolnay commented Oct 20, 2020

src/test/run-make/fmt-write-bloat/main doesn't need to be checked in, right?

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me -- r=me with the binary file removed.

src/test/run-make/fmt-write-bloat/main.rs Outdated Show resolved Hide resolved
@m-ou-se m-ou-se force-pushed the fmt-write-bounds-check branch 2 times, most recently from 1158ec5 to 356d5b5 Compare October 20, 2020 18:36
@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 20, 2020

Oh oops. Accidentally committed some things I shouldn't have. Thanks. Updated.

@dtolnay
Copy link
Member

dtolnay commented Oct 20, 2020

@bors r=Mark-Simulacrum

I confirmed that the new test fails without the changes in the PR.
---- [run-make] run-make/fmt-write-bloat stdout ----

error: make failed
status: exit code: 2
command: "make"
stdout:
------------------------------------------
LD_LIBRARY_PATH="/git/rust/build/x86_64-unknown-linux-gnu/test/run-make/fmt-write-bloat/fmt-write-bloat:/git/rust/build/x86_64-unknown-linux-gnu/stage1/lib:/git/rust/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/git/rust/build/x86_64-unknown-linux-gnu/stage0/lib" '/git/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc' --out-dir /git/rust/build/x86_64-unknown-linux-gnu/test/run-make/fmt-write-bloat/fmt-write-bloat -L /git/rust/build/x86_64-unknown-linux-gnu/test/run-make/fmt-write-bloat/fmt-write-bloat  main.rs -O
nm /git/rust/build/x86_64-unknown-linux-gnu/test/run-make/fmt-write-bloat/fmt-write-bloat/main | "/git/rust/src/etc/cat-and-grep.sh" -v panicking panic_fmt panic_bounds_check pad_integral Display Debug
[[[ begin stdout ]]]
00000000000020b2 R anon.ad56e829de00bb5e05e79e94a78aaeca.0.llvm.11246502207772803501
0000000000004008 B __bss_start
0000000000004008 b completed.8060
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000001070 t deregister_tm_clones
00000000000010e0 t __do_global_dtors_aux
0000000000003d20 d __do_global_dtors_aux_fini_array_entry
0000000000004000 D __dso_handle
0000000000003dd8 d _DYNAMIC
0000000000004008 D _edata
0000000000004010 B _end
0000000000001d6c T _fini
0000000000001120 t frame_dummy
0000000000003d18 d __frame_dummy_init_array_entry
0000000000002524 r __FRAME_END__
0000000000003f98 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
000000000000217c r __GNU_EH_FRAME_HDR
0000000000001000 t _init
0000000000003d20 d __init_array_end
0000000000003d18 d __init_array_start
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000001290 T __libc_csu_fini
0000000000001220 T __libc_csu_init
                 U __libc_start_main@@GLIBC_2.2.5
00000000000011c0 T main
00000000000010a0 t register_tm_clones
00000000000011b0 T rust_begin_unwind
0000000000001040 T _start
0000000000004008 D __TMC_END__
0000000000001d60 T _ZN36_$LT$T$u20$as$u20$core..any..Any$GT$7type_id17hec09389da73b5e7cE
0000000000001c30 T _ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u64$GT$3fmt17h54377f781f7a5c8fE
0000000000001c30 T _ZN4core3fmt3num3imp54_$LT$impl$u20$core..fmt..Display$u20$for$u20$usize$GT$3fmt17h4fa1341a9dac6525E
00000000000012c0 T _ZN4core3fmt5write17h100dc89493224d93E
0000000000001a80 t _ZN4core3fmt9Formatter12pad_integral12write_prefix17h204c271bcefaaa8cE
00000000000015b0 T _ZN4core3fmt9Formatter12pad_integral17hb075f5592a5d8a70E
00000000000012a0 T _ZN4core3ops8function6FnOnce9call_once17hd1e4d320fcc785faE.llvm.14856878211942904523
0000000000001130 t _ZN4core3ptr13drop_in_place17h8d4e61ec59d31a0eE
0000000000001ad0 t _ZN4core3ptr13drop_in_place17hd5cc673701fb5dfbE
0000000000001ba0 t _ZN4core4iter8adapters3zip16Zip$LT$A$C$B$GT$3new17h5f4b1cffaf34d111E
0000000000001be0 t _ZN4core4iter8adapters3zip16Zip$LT$A$C$B$GT$3new17hf19b0aefdc63476aE
0000000000001ae0 T _ZN4core9panicking18panic_bounds_check17ha31d9d2fa83bb84dE
0000000000001b60 T _ZN4core9panicking9panic_fmt17he9332856f0d02b11E
0000000000001140 t _ZN50_$LT$$RF$mut$u20$W$u20$as$u20$core..fmt..Write$GT$10write_char17h3bc2d3f42fbd7666E
0000000000001150 t _ZN50_$LT$$RF$mut$u20$W$u20$as$u20$core..fmt..Write$GT$9write_fmt17h0fd2e1dfd6fb1e2aE
00000000000011a0 t _ZN50_$LT$$RF$mut$u20$W$u20$as$u20$core..fmt..Write$GT$9write_str17h46dc1af754709fa5E

[[[ end stdout ]]]
Error: should not match: panicking
Error: should not match: panic_fmt
Error: should not match: panic_bounds_check
Error: should not match: pad_integral
Error: should not match: Display

@bors
Copy link
Contributor

bors commented Oct 20, 2020

📌 Commit 356d5b5 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 20, 2020
@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 5, 2020
@Mark-Simulacrum
Copy link
Member

@m-ou-se I believe this is waiting on you to implement the fix suggested in #78122 (comment), which would permanently designate the llvm-8 (now 9, IIRC) builder to be debug-assert free and test this test only when debug assertions are off. Do you need something from me to coordinate that?

@m-ou-se
Copy link
Member Author

m-ou-se commented Nov 22, 2020

Oh, thanks for the reminder! I'll probably have time to do this in the next few days.

@m-ou-se
Copy link
Member Author

m-ou-se commented Nov 28, 2020

Looks like there's a lot more runners on which debug assertions are disabled than I originally thought. (i686-gnu, x86_64-gnu-llvm-8, x86_64-gnu-distcheck, i686-gnu-nopt, ..)

I've changed the test to only check for the panicking and panic_fmt symbols if NO_DEBUG_ASSERTIONS is set. If not set, it still checks for the other symbols.

I think there's a only-hosts comment you can put to achieve the same thing.

I only found a force-host, but I'm not sure exactly what it does, or if it'd be the right thing.

I saw one other test that greps through $(RUSTC) -vV to determine the host. I've used that now to check for cross compiling, and moved it back to run-make.

@Mark-Simulacrum
Copy link
Member

This looks good to me. Could you squash the test adding commits? r=me with that done.

(FWIW I'm happy on smaller PRs to just always squash commits, and I think generally our policy of "add new commits" does more harm than good on average).

@m-ou-se
Copy link
Member Author

m-ou-se commented Nov 28, 2020

For most PRs I personally find it hard to review them with force-pushes between the reviews, which is why I usually use new commits after a review.

Squashed the test commits into one.

@bors r=Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Nov 28, 2020

📌 Commit 85b217d1d66d2cb61cf4b6d9cf5c0fc7a28c7831 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 28, 2020
@bors
Copy link
Contributor

bors commented Nov 29, 2020

⌛ Testing commit 85b217d1d66d2cb61cf4b6d9cf5c0fc7a28c7831 with merge 614ccbb44cca6b1bee828395ef4cebe629b27c92...

@bors
Copy link
Contributor

bors commented Nov 29, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 29, 2020
It checks that fmt::write by itself doesn't pull in any panicking or
or display code.
@m-ou-se
Copy link
Member Author

m-ou-se commented Nov 29, 2020

This failed on Windows because the no_std test (using #[link(name = "c")]) didn't link there. Added # ignore-windows.

@Mark-Simulacrum
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Nov 29, 2020

📌 Commit 1da5780 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 29, 2020
@@ -1096,25 +1098,37 @@ pub fn write(output: &mut dyn Write, args: Arguments<'_>) -> Result {
Ok(())
}

fn run(fmt: &mut Formatter<'_>, arg: &rt::v1::Argument, args: &[ArgumentV1<'_>]) -> Result {
unsafe fn run(fmt: &mut Formatter<'_>, arg: &rt::v1::Argument, args: &[ArgumentV1<'_>]) -> Result {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd add a // SAFETY: arg and args must come from the same Arguments comment to the newly unsafe functions as well to indicate the necessary preconditions when calling them.

@bors
Copy link
Contributor

bors commented Nov 29, 2020

⌛ Testing commit 1da5780 with merge cf9bfdb...

@bors
Copy link
Contributor

bors commented Nov 30, 2020

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing cf9bfdb to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 30, 2020
@bors bors merged commit cf9bfdb into rust-lang:master Nov 30, 2020
@rustbot rustbot added this to the 1.50.0 milestone Nov 30, 2020
@m-ou-se m-ou-se deleted the fmt-write-bounds-check branch November 30, 2020 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-fmt Area: std::fmt I-heavy Issue: Problems and improvements with respect to binary size of generated code. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants