Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: Implement stack probes for x86 #42816

Merged
merged 1 commit into from Jul 6, 2017
Merged

Conversation

alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Jun 22, 2017

This commit implements stack probes on x86/x86_64 using the freshly landed
support upstream in LLVM. The purpose of stack probes here are to guarantee a
segfault on stack overflow rather than having a chance of running over the guard
page already present on all threads by accident.

At this time there's no support for any other architecture because LLVM itself
does not have support for other architectures.

@rust-highfive
Copy link
Collaborator

rust-highfive commented Jun 22, 2017

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

alexcrichton commented Jun 22, 2017

This intends to address #16012 for x86/x86_64, and it also shouldn't land until the upstream diff has landed (although this seems likely to do so, so I figured I may as well go ahead and get this reviewed). Note that this is using a fork of our fork temporarily until the patches are upstreamed, this will not land pointing at a different fork of Rust.

@bors
Copy link
Contributor

bors commented Jun 22, 2017

The latest upstream changes (presumably #42682) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton alexcrichton force-pushed the probestack branch 3 times, most recently from 9727e7a to 0c33a2e Compare Jun 22, 2017
@alexcrichton alexcrichton added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 22, 2017
// our goal here is to take %eax and add it to %rsp, but we're also going to
// touch each page between %rsp+8 and %rsp+8-%rax
//
// The ABI here is that the stack frame size is located in `%eax`. Upon
Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%rax (also on the line below)

Copy link
Member Author

@alexcrichton alexcrichton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think technically it's %eax because I'm seeing this before calls to probestack:

mov    $0x2008,%eax

Although it's essentiall %rax as that zeroes out the top bits anyway.

//
// The ABI here is that the stack frame size is located in `%eax`. Upon
// return we're not supposed to modify `%esp` or `%eax`. It's unclear
// whether we can modify caller-saved registers, but to be on the safe side
Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to SystemV x86_64 ABI, %rcx is the 4th function argument, so you'd better save it, no doubt here.

Copy link
Member Author

@alexcrichton alexcrichton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I think the comment still applies. We could pick any caller-saved register for that scratch space (I think) but I'm not sure what LLVM guarantees here. Figured it'd be good to just stay conservative.

Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could pick any caller-saved register for that scratch space (I think) but I'm not sure what LLVM guarantees here.

I don't think we're on the same page. The probestack function does not get invoked via the standard function call sequence; the prologue code inserted by LLVM consists of just a call instruction. So anything that's not defined by the ABI as a scratch register has to be saved in it.

Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I might be wrong, let me double-check it.

Copy link
Member Author

@alexcrichton alexcrichton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry what I mean here is:

  • At the start of a function, all caller-saved registers are scratch space
  • Some unknown set of instructions happen
  • Then there's call __rust_probestack

From what I've seen the "unknown set of instructions" is the empty set, but I'm not sure if that's true 100% of the time with LLVM. If that set of instructions is always empty then all caller-saved registers should be scratch space here as well.

Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? %rcx is used for passing the 4th argument in SysV ABI. If it is not saved by the unknown set of instructions nor __rust_probestack then how will the callee know the value of its 4th argument?

Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To illustrate. Let's say we translate this LLVM IR:

declare void @use([40096 x i8]*)

define i32 @test(i32 %a, i32 %b, i32 %c, i32 %d) "probe-stack"="__probestack" {
  %array = alloca [40096 x i8], align 16
  call void @use([40096 x i8]* %array)
  ret i32 %d
}

This results in the following assembly:

test:                                   # @test
        pushq   %rbx
        movl    $40096, %eax            # imm = 0x9CA0
        callq   __probestack
        subq    %rax, %rsp
        movl    %ecx, %ebx
        movq    %rsp, %rdi
        callq   use
        movl    %ebx, %eax
        addq    $40096, %rsp            # imm = 0x9CA0
        popq    %rbx
        retq

As you can see %ecx is not saved around the __probestack call. You have to save it there.

Copy link
Member Author

@alexcrichton alexcrichton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is clearly just confusing, I'm going to delete it.

@alexcrichton
Copy link
Member Author

alexcrichton commented Jun 22, 2017

Ah also, I've updated our llvm fork now that patches are upstream and that's here, this should be ready to land.

#[cfg(target_arch = "x86_64")]
pub unsafe extern fn __rust_probestack() {
// our goal here is to take %eax and add it to %rsp, but we're also going to
// touch each page between %rsp+8 and %rsp+8-%rax
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment isn't accurate anymore — you don't actually modify rsp in this function.

#[no_mangle]
#[cfg(all(target_arch = "x86", windows))]
pub unsafe extern fn __rust_probestack() {
// This is similar to 32-bit unix but we're going to actually perform the
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure you don't need this function: just omit the probe-stack attribute on Windows and LLVM will automatically add a call to Microsoft's implementation. It's mandated by the Windows ABI, so LLVM has to do that.

@@ -16,6 +16,7 @@ pub fn target() -> TargetResult {
base.cpu = "x86-64".to_string();
base.pre_link_args.get_mut(&LinkerFlavor::Gcc).unwrap().push("-m64".to_string());
base.max_atomic_width = Some(64);
base.stack_probes = true;
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think you need this.

@@ -15,6 +15,7 @@ pub fn target() -> TargetResult {
let mut base = super::windows_msvc_base::opts();
base.cpu = "x86-64".to_string();
base.max_atomic_width = Some(64);
base.stack_probes = true;
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think you need this.

@@ -15,6 +15,7 @@ pub fn target() -> TargetResult {
let mut base = super::windows_msvc_base::opts();
base.cpu = "pentium4".to_string();
base.max_atomic_width = Some(64);
base.stack_probes = true;
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think you need this.

@@ -21,6 +21,7 @@ pub fn target() -> TargetResult {
// space available to x86 Windows binaries on x86_64.
base.pre_link_args
.get_mut(&LinkerFlavor::Gcc).unwrap().push("-Wl,--large-address-aware".to_string());
base.stack_probes = true;
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think you need this.

//! guard page. If a function did not have a stack probe then there's a risk of
//! having a stack frame *larger* than the guard page, so a function call could
//! skip over the guard page entirely and then later hit maybe the heap,
//! possibly leading to a security vulnerability.
Copy link
Member

@steveklabnik steveklabnik Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth maybe doing

//! possibly leading to security vulnerabilities such as [The Stack Clash], for example.
//!
//! [The Stack Clash]: https://blog.qualys.com/securitylabs/2017/06/19/the-stack-clash

// The ABI here is that the stack frame size is located in `%eax`. Upon
// return we're not supposed to modify `%esp` or `%eax`.
asm!("
push %rcx
Copy link
Contributor

@pcwalton pcwalton Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you can use r11 here and avoid this spill, since r11 is a designated scratch register.

//! thread has a guard page then a stack overflow is guaranteed to hit that
//! guard page. If a function did not have a stack probe then there's a risk of
//! having a stack frame *larger* than the guard page, so a function call could
//! skip over the guard page entirely and then later hit maybe the heap,
Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just this issue. The Stack Clash vulnerability is not necessarily easy to trigger, especially on amd64, but if you have secondary threads, their stacks is just anonymous memory that is very likely to be close to heap in the first place. Even worse if the threads have small stacks.

//! Note that `#[naked]` is typically used here for the stack probe because the
//! ABI corresponds to no actual ABI. Additionally it means that on Windows we
//! don't have to maintain assembly for MSVC and MinGW, we can just have one
//! blob of assembly.
Copy link
Member

@whitequark whitequark Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Windows remark is no longer relevant.

cmp $$0x1000,%rax
ja 2b

// Finish up the last remainings stack space requested, getting the last
Copy link
Contributor

@pcwalton pcwalton Jun 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uber-nit: "remainings" -> "remaining"

@bors
Copy link
Contributor

bors commented Jun 23, 2017

The latest upstream changes (presumably #42828) made this pull request unmergeable. Please resolve the merge conflicts.

@whitequark
Copy link
Member

whitequark commented Jun 23, 2017

Heads-up, you need to cherry-pick r306142.

@pcwalton Here goes a better part of my evening... that STI.isTargetWin32() definition is downright evil.

@pftbest
Copy link
Contributor

pftbest commented Jun 26, 2017

Sorry for interrupting, I would like to submit a patch for MSP430, but it sits on top of the changes from this pull request. Do I need to wait until it's merged?

@alexcrichton
Copy link
Member Author

alexcrichton commented Jul 6, 2017

@bors: r=nikomatsakis

@bors
Copy link
Contributor

bors commented Jul 6, 2017

📌 Commit 51b54ca has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Jul 6, 2017

🔒 Merge conflict

@bors
Copy link
Contributor

bors commented Jul 6, 2017

The latest upstream changes (presumably #42899) made this pull request unmergeable. Please resolve the merge conflicts.

alexcrichton added a commit to alexcrichton/compiler-builtins that referenced this pull request Jul 6, 2017
bors added a commit to rust-lang/compiler-builtins that referenced this pull request Jul 6, 2017
Add `__rust_probestack` intrinsic

Will be required for rust-lang/rust#42816
alexcrichton added a commit to alexcrichton/compiler-builtins that referenced this pull request Jul 6, 2017
bors added a commit to rust-lang/compiler-builtins that referenced this pull request Jul 6, 2017
Add `__rust_probestack` intrinsic

Will be required for rust-lang/rust#42816
This commit implements stack probes on x86/x86_64 using the freshly landed
support upstream in LLVM. The purpose of stack probes here are to guarantee a
segfault on stack overflow rather than having a chance of running over the guard
page already present on all threads by accident.

At this time there's no support for any other architecture because LLVM itself
does not have support for other architectures.
@alexcrichton
Copy link
Member Author

alexcrichton commented Jul 6, 2017

@bors: r=nikomatsakis

@bors
Copy link
Contributor

bors commented Jul 6, 2017

📌 Commit 5dbd97d has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Jul 6, 2017

Testing commit 5dbd97d with merge cd72f2e...

bors added a commit that referenced this pull request Jul 6, 2017
rustc: Implement stack probes for x86

This commit implements stack probes on x86/x86_64 using the freshly landed
support upstream in LLVM. The purpose of stack probes here are to guarantee a
segfault on stack overflow rather than having a chance of running over the guard
page already present on all threads by accident.

At this time there's no support for any other architecture because LLVM itself
does not have support for other architectures.
@bors
Copy link
Contributor

bors commented Jul 6, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing cd72f2e to master...

@bors bors merged commit 5dbd97d into rust-lang:master Jul 6, 2017
@alexcrichton alexcrichton deleted the probestack branch Jul 6, 2017
bors added a commit that referenced this pull request Jul 6, 2017
@colin-kiegel colin-kiegel mentioned this pull request Jul 12, 2017
10 tasks
@killercup killercup mentioned this pull request Aug 11, 2017
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet