New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace stack overflow checking with stack probes #16012

Closed
Zoxc opened this Issue Jul 26, 2014 · 76 comments

Comments

Projects
None yet
@Zoxc
Contributor

Zoxc commented Jul 26, 2014

We currently abuse LLVM segmented stack support to check for stack overflows. It would be more efficient to use guard pages to detect these. We already have guard pages on all the stacks. However to ensure that the code doesn't skip the guard pages, we need to insert stack probes. LLVM already can generate code for that on x86 and ARM for Windows. We'd just need to expose that as an option on other platforms.

It would be nice if support for stack probes could be added to MIPS in LLVM too, so we can get rid of the runtime support for the stack overflow checking.

Using stack probes is also easy and desirable to support in freestanding mode.

@sfackler sfackler added the A-codegen label Jul 26, 2014

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jul 27, 2014

Here's a comment in the past about this. Sadly I don't think "just use a guard page will cut it" for the reasons outlined in that comment. That being said, I'd love to stop using segmented stacks!

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Jul 27, 2014

I do think printing out a error message is a bad thing. On Windows it means interfering with the dialog which informs the user about an error and allows developers to debug the application. On Linux it interferes with debugging. If such a message desired for some reason, we can support it, but the POSIX solution probably isn't very pretty.

I was wondering what the impact of this would be, so I commented out the code generating the split stack attributes and did some simple benchmarks:

2.35% reduced size of librustc
4.11% less time compiling libcore
19.3% less time running shootout-fibo

@thestinger thestinger added the I-slow label Jul 27, 2014

@thestinger

This comment has been minimized.

Contributor

thestinger commented Jul 27, 2014

Printing an error message on stack overflow would be trivial if there weren't green threads. It's as simple as installing a signal handler and print out an error if the address is located in the guard page range. It's yet another case where green threads make the language significantly worse than C++. I don't think there's a sane way to do it without horrific spin locks in today's Rust unless printing a special error message is not a requirement.

@thestinger

This comment has been minimized.

Contributor

thestinger commented Jul 27, 2014

Anyway... the segmented stack support only catches overflow when it happens to occur on the Rust side rather than in C code that's being called. It doesn't actually work. For example, an infinitely recursive function may be allocating memory and there's a good chance it will be jemalloc triggering the overflow.

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Jul 27, 2014

We can give error messages for libgreen by having it inform libnative about the active guard page. Foreign code might skip guard pages on non-Windows platforms, so we won't catch all stack overflows.

@thestinger

This comment has been minimized.

Contributor

thestinger commented Jul 27, 2014

It's the platform's problem if it doesn't build with -fcheck-stack. Stack frames that large aren't common in C code anyway.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jul 28, 2014

@thestinger Knock off the general comments about how Rust is a better or worse language than C++ please.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jul 28, 2014

Anyway, I totally agree with the thrust of using guard pages here, and would like us to move away from segmented stacks as soon as possible. I personally don't think good error messages in libgreen should block us. The fibo benchmark is particularly compelling.

I don't see why libgreen is relevant anyway; if there's a guard page either way, just have libgreen handle it in the same way as libnative. It should be easy to run enough of Rust from the signal handler to print out an error message before dying.

@huonw

This comment has been minimized.

Member

huonw commented Aug 4, 2014

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Aug 4, 2014

That may have been me.

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Aug 5, 2014

I do have an implementation of the LLVM part for x86 and perhaps ARM (not sure how comprehensive that support is).
Zoxc/llvm@llvm-mirror:master...stprobe

I don't know enough about MIPS to implement that in LLVM or enough about MIPS and ARM to implement the __probestack support function.

@huonw

This comment has been minimized.

Member

huonw commented Aug 5, 2014

http://togototo.wordpress.com/2014/08/05/fibonacci-numbers-on-the-galaxy-s3-arm-benchmarks-of-rust-ocaml-haskell-go-racket-lua-c-and-java/

That fibonacci implementation is tail recursive, and optimises to a loop:

else-block.i.i.i:                                 ; preds = %else-block.i.i.i.preheader, %else-block.i.i.i
  %.tr710.i.i.i = phi i64 [ %107, %else-block.i.i.i ], [ %104, %else-block.i.i.i.preheader ]
  %.tr69.i.i.i = phi i64 [ %106, %else-block.i.i.i ], [ 1, %else-block.i.i.i.preheader ]
  %.tr8.i.i.i = phi i64 [ %.tr69.i.i.i, %else-block.i.i.i ], [ 0, %else-block.i.i.i.preheader ]
  %106 = add i64 %.tr8.i.i.i, %.tr69.i.i.i
  %107 = add i64 %.tr710.i.i.i, -1
  %108 = icmp eq i64 %107, 0
  br i1 %108, label %_ZN3fib20hcf5cee5c8487747eOaaE.exit.i.loopexit, label %else-block.i.i.i
@thestinger

This comment has been minimized.

Contributor

thestinger commented Aug 5, 2014

@huonw: It's probably just the difference in compilers then, never mind. Rust is actually a bit faster than the C code on x86_64 when using 32-bit integers.

@huonw

This comment has been minimized.

Member

huonw commented Aug 5, 2014

On x86-64 actually seems to be using a 64-bit int in Rust vs. the 32-bit one in C. Changing to i32 throws Rust in a much better light:

75699
LANGUAGE Rust 2167
75699
LANGUAGE C 2220
75699
LANGUAGE C 2271

(Anyway, this is off-topic for this bug, although @pcwalton may be interested in it.)

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Aug 5, 2014

Would not having safe stacks on MIPS block landing this?

Do we definitely want a signal handler/exception handler to print out that a stack overflow happened?

bors added a commit that referenced this issue Oct 23, 2014

auto merge of #16388 : Zoxc/rust/stmesg, r=pcwalton
This installs signal handlers to print out stack overflow messages on Linux. It also ensures the main thread has a guard page.

This will catch stack overflows in external code. It's done in preparation of switching to stack probes (#16012).

I've done some simple tests with overflowing the main thread, native threads and green threads (with and without UV) on x86-64.
This might work on ARM, MIPS and x86-32.

I've been unable to run the test suite on this because of #16305.

bors added a commit that referenced this issue Oct 24, 2014

auto merge of #16388 : Zoxc/rust/stmesg, r=alexcrichton
This installs signal handlers to print out stack overflow messages on Linux. It also ensures the main thread has a guard page.

This will catch stack overflows in external code. It's done in preparation of switching to stack probes (#16012).

I've done some simple tests with overflowing the main thread, native threads and green threads (with and without UV) on x86-64.
This might work on ARM, MIPS and x86-32.

I've been unable to run the test suite on this because of #16305.

bors added a commit that referenced this issue Oct 24, 2014

auto merge of #16388 : Zoxc/rust/stmesg, r=alexcrichton
This installs signal handlers to print out stack overflow messages on Linux. It also ensures the main thread has a guard page.

This will catch stack overflows in external code. It's done in preparation of switching to stack probes (#16012).

I've done some simple tests with overflowing the main thread, native threads and green threads (with and without UV) on x86-64.
This might work on ARM, MIPS and x86-32.

I've been unable to run the test suite on this because of #16305.
@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 19, 2017

@pcwalton See https://reviews.llvm.org/D9653#206892. The only remaining objection the upstream had to the overall design is the inability to configure the name of the stack probe function. I don't know why @Zoxc repeatedly refused to do this trivial change.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 19, 2017

@whitequark Sounds like one of us needs to do it. Should you or should I?

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 19, 2017

@pcwalton Please do it, I was going to do it for months but I'm really overloaded.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 20, 2017

On it.

@brson

This comment has been minimized.

Contributor

brson commented Jun 20, 2017

Thanks @pcwalton.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 20, 2017

Updated the patches. cc @whitequark

https://reviews.llvm.org/D34386
https://reviews.llvm.org/D34387

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 21, 2017

https://reviews.llvm.org/D34386 has been approved.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 21, 2017

Still waiting on the second one. @whitequark If the upstream review continues to go slowly, can we pull this into our fork?

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 21, 2017

@pcwalton It's pretty quick so far... There's nothing inherently troublesome with having this in our own fork but I'm very worried about this major feature never reaching upstream. For one, this makes it significantly harder to maintain rustc support for out-of-tree LLVM backends. Another reason is that this is something that really ought to benefit the entire LLVM community, not just rustc but at least also clang (-fstack-check currently does nothing).

FWIW I am watching the progress of those two patches closely. In fact let me commit D34386 for you.

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 21, 2017

@pcwalton Looks like the only remaining upstream concern with D34387 is the spilling of R11, and I am unfortunately as lost on this ABI issue as you. I can try looking into it...

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 21, 2017

@pcwalton Commented on D34387; I don't think R11 should be spilled.

@pcwalton

This comment has been minimized.

Contributor

pcwalton commented Jun 21, 2017

@whitequark Note that compiler-rt has to be updated in order for clang to be able to make use of stack probes. Someone can do that, but that person won't be me. Working on getting through the LLVM review process has already taken up more of my time than I can afford to spare at present.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jun 21, 2017

Once the patches are upstream I don't mind doing the legwork to integrate it into the compiler, I'm already experimenting locally with the current state of the patches. (I'll take care of compiler-rt and whatnot)

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 21, 2017

Note that compiler-rt has to be updated in order for clang to be able to make use of stack probes. Someone can do that, but that person won't be me.

That's fine; it's far easier, required skills wise, to get something into compiler-rt than LLVM itself.

And, thank you for your time!

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jun 22, 2017

I've opened #42816 for the integration into the compiler.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jun 22, 2017

@whitequark looks like D34387 has been approved, mind landing that upstream as well?

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jun 22, 2017

@alexcrichton Yeah, did it as soon as I woke up and read mail.

@briansmith

This comment has been minimized.

briansmith commented Jul 7, 2017

Great work! Could somebody given a summary of what would need to also do the stack probes (or other stack overflow checking) on ARM & AAarch64, and then also what it would takes for MIPS and other platforms?

@jonas-schievink

This comment has been minimized.

Contributor

jonas-schievink commented Jul 7, 2017

Implement in LLVM, cherry-pick the commits into Rust's LLVM, reimplement #42816 for the other platforms.

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Jul 7, 2017

Looks like the calling convention used in LLVM master doesn't match these, but only on Windows where we don't use them. This seems to be because unlike my patches, the probe-stack attribute takes preference over the Windows functions.

probestack.rs also lacks unwinding information. My compiler-rt patch does have this.

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jul 7, 2017

This seems to be because unlike my patches, the probe-stack attribute takes preference over the Windows functions.

Only on x86_32 Windows, right? The current semantics isn't an accident, I considered it cleaner.

probestack.rs also lacks unwinding information.

It is not needed. __rust_probestack is a leaf function that never unwinds, and the debugging information is generated by rustc, since the assembly is wrapped in a naked function.

@Zoxc

This comment has been minimized.

Contributor

Zoxc commented Jul 7, 2017

Only on x86_32 Windows, right? The current semantics isn't an accident, I considered it cleaner.

So if you use "probe-stack"="__probe_stack" you need 2 different functions on x86_32 depending on the platform, how is that cleaner?

It is not needed. __rust_probestack is a leaf function that never unwinds, and the debugging information is generated by rustc, since the assembly is wrapped in a naked function.

I wouldn't rely on LLVM generating correct debugging information for inline assembly / naked functions nor debuggers having suitable heuristics to be able to give correct stack traces.

@whitequark

This comment has been minimized.

Contributor

whitequark commented Jul 7, 2017

So if you use "probe-stack"="__probe_stack" you need 2 different functions on x86_32 depending on the platform, how is that cleaner?

Omitting "probe-stack" results in the unmodified platform ABI being used, including "probe-stack" results in LLVM's probe function ABI being used.

I wouldn't rely on LLVM generating correct debugging information for inline assembly / naked functions nor debuggers having suitable heuristics to be able to give correct stack traces.

That's a good point.

@bstrie

This comment has been minimized.

Contributor

bstrie commented Jul 7, 2017

Now that all the foundational work has been done and this has been implemented for all tier-1 platforms, I'm tempted to close this bug and open individual bugs for further platforms. Though given that we do want ARM to become tier-1 soon-ish I'm fine with leaving it open until that work is done, but overall I think this needs to be part of a broader discussion of how we judge severity of a security issue when that issue hinges on LLVM support for a less-supported platform.

@bstrie

This comment has been minimized.

Contributor

bstrie commented Jul 14, 2017

Taking the initiative and closing this in favor of #43241 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment