Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify unwinding #638

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
9 participants
@Aatch
Copy link
Contributor

Aatch commented Jan 21, 2015

Specifies behaviour around unwinding. Especially with respect to non-Rust code.


Rendered

unsafe when used incorrectly) can be used to mark functions as specifically not unwinding. This is
primarily for performance reasons in cases where the function cannot be inferred to not unwind. The
primary use case for this attribute is the `oom` function that aborts the process in the case of a
out-of-memory condition.

This comment has been minimized.

@alexcrichton

alexcrichton Jan 21, 2015

Member

Out of curiosity, I think I remember hearing that code size benefited from this change, do you have some examples or numbers that this could cite to show the benefits that it reaps?

the only thing can reasonably be done if we attempt to unwind into code that likely does not know
how to handle it.

## Edge Cases and Other Miscellany

This comment has been minimized.

@alexcrichton

alexcrichton Jan 21, 2015

Member

Could this section also include the feature gate for the new attributes?

safe to use nested invocations of the `try` function. The first part of the implementation would be
to improve the unwinding system to allow for nested calls to the try function.

The `try` function itself doesn't need changing. The current signature is `fn try<F:FnOnce()>(f: F)

This comment has been minimized.

@alexcrichton

alexcrichton Jan 21, 2015

Member

Should this signature be unsafe fn try?

from non-Rust code. As such, the task ends when control returns to original non-Rust code.

This extension essentially produces "nested tasks", where the thread is one task, then each transfer
of control from non-Rust code to Rust code is another. Unwinding out of a task is undefined

This comment has been minimized.

@alexcrichton

alexcrichton Jan 21, 2015

Member

Another possible use case is for a transition to look like Rust -> C -> Rust where Rust gives a callback to a C function which it then calls. Just to clarify, this means that the callback is itself a "nested task"?

This comment has been minimized.

@Aatch

Aatch Jan 21, 2015

Author Contributor

Yes, basically if you enter Rust code from somewhere that is not Rust code, then that is a new task.


The `try` function itself doesn't need changing. The current signature is `fn try<F:FnOnce()>(f: F)
-> Result<(), Box<Any + Send>>`, which provides means for the caller to know if the invocation was
successful. The function would remain unsafe.

This comment has been minimized.

@bill-myers

bill-myers Jan 21, 2015

Isn't it possible to make it safe by adding a Send bound to F, and allowing it to be called at any point? (and accepting that TLS access needs to be transactional)

This comment has been minimized.

@thestinger

thestinger Jan 21, 2015

Isn't it possible to make it safe by adding a Send bound to F, and allowing it to be called at any point? (and accepting that TLS access needs to be transactional)

So no fast thread-local storage? Or do you want it to do poisoning too?

This comment has been minimized.

@bill-myers

bill-myers Jan 22, 2015

I think I didn't explain properly, I meant accepting the fact that if you have a function that does two writes to TLS, it may panic in the middle resulting in only a single write, and that you thus need to have a RAII struct implementing Drop if you want transactional atomicity (a simple "scoped TLS assignment guard" could be provided in the standard library).

It's of course also possible to have other schemes with more overhead but I think reaching any agreement on that would be quite hard.

BTW, this means that TLS APIs must not allow borrowing values, just moving them in/out and swapping.

This comment has been minimized.

@dgrunwald

dgrunwald Jan 26, 2015

Contributor

Let's say I'm writing an FFI binding around an C library that makes use of callbacks. I want my binding to be usable by safe rust code.
Currently, the only way to prevent unwinding through the C code is to abort the process when unwinding out of rust code.
But how would an unsafe recursive try help me? I can't prevent the users of my binding from using TLS. So unless I can get a safe try (with all the changes to TLS that entails), I can't use it in my FFI binding, and have to keep aborting the process.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 22, 2015

cc me

@dgrunwald

This comment has been minimized.

Copy link
Contributor

dgrunwald commented Jan 26, 2015

Instead of #[unsafe_no_unwind], why not #[no_panic] with the similar semantics as C++ noexcept? (see also: http://discuss.rust-lang.org/t/no-panic/1356)
I think you'd still get most of the optimization benefit, but don't risk undefined behavior.

The semantics of #[no_panic] would be:

  • A caller of a no_panic function is guaranteed that we won't attempt to unwind out of the function.
  • If a panic does happen within a no_panic function, the process is aborted.
  • There is no guarantee how many Drop impls will run when unwinding into a no_panic function (but if a Drop impl does run, all other Drop impls that should have run prior are guaranteed to do so, in the correct order)

Because #[no_panic] allows the compiler to see whether a function might panic in cross-crate calls, it could skip emitting unwinding infos for more functions because we're guaranteed not to unwind into them.
The last point allows the compiler to reuse the same aborting code for all no_panic functions, as it doesn't need to call drop for any locals used by the functions. But it goes further: it allows the compiler to automatically mark any functions as no_panic if they are only reachable from other no_panic functions in the call graph. And any explicit panic!() calls in no_panic functions can be directly turned into an abort() call.
Thus, a program that doesn't use try or threads, and marks the main function as #[no_panic] would always abort() instead of panicking, and wouldn't have to link against the rust unwinding logic at all! (though you'd need LTO for the the compiler to be able to figure out the complete call graph)

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 6, 2015

@dgrunwald I think that the #[no_panic] attribute you're describing may be able to work in tandem with the #[unsafe_no_unwind] attribute being proposed here. The compiler would have to emit extra code for a safe version of #[no_panic] (force abort on unwind), but the unsafe version of the attribute has a clear analog in LLVM which may allow for optimizations it may not otherwise be able to deduce.

In general the semantics of #[no_panic] that you're describing sound very similar to the nounwind attribute in LLVM as I believe it already performs much of the propagating analysis.


@Aatch could you update this RFC with some of the feedback in this thread? There hasn't been much discussion on this but I think that it may be ready for merging soon.

@dgrunwald

This comment has been minimized.

Copy link
Contributor

dgrunwald commented Feb 7, 2015

AFAIK the only optimizations that #[unsafe_no_unwind] can do and #[no_panic] can't are those where the undefined behavior "travels back in time" and causes the compiler to eliminate whole code paths.

For example, I'd expect that #[unsafe_no_unwind] will optimize away all bounds checking (which should probably be mentioned in the RFC, since it makes the attribute more dangerous than one might naively expect from the current description)

With my no_panic, there's really two directions of propagation:
a) Functions that only calls no_panic functions and does not cause panics itself, can never be observed to unwind, and thus be marked no_panic. (propagation: callees -> caller)
b) Functions that are only called by no_panic functions will always unwind into a no_panic function. Because no_panic doesn't have to run drop impls, it's impossible to observe when exactly during unwinding the process was aborted, and thus we can mark the function no_panic (propagation: callers -> callee)

I was under the impression that this corresponds to the noexcept semantics in C++ -- looking more closely, C++ just says that unwinding into noexcept functions is implementation-defined. g++ seems to handle noexcept the way as I'm proposing; but clang still unwinds within noexcept functions, and LLVM nounwind seems to do only the callee->caller propagation :(

Example:

fn set_index<T>(arr: &mut [T], idx: usize, val: T) {
    arr[idx] = val
}
#[no_panic] // or #[no_unwind], not sure how we should call this
fn main() {
   let arr : &mut[Box<int>] = get_array();
   set_index(arr, 5, Box::new(42));
}

When the index 5 is out-of-range, by my no_panic rules, the compiler does not have to call the Box destructor in set_index. This is especially important when set_index gets inlined into main: I don't want inlining to cause main to have unwinding tables despite being marked #[no_panic], just so that a box can get destroyed before the process is aborted.

@dgrunwald

This comment has been minimized.

Copy link
Contributor

dgrunwald commented Feb 7, 2015

Back to the main part of the RFC: catching panics and converting them to error codes.

The mechanism suggested by this RFC is unsafe, because catching panics is unsafe in the presence of TLS. Moreover, one cannot easily wrap the mechanism behind a safe API, because reasoning about the safety requires global knowledge about the usage of TLS in the program.

In my mind, this makes the try mechanism useless where it is needed most: FFI binding libraries that attempt to provide a safe Rust API around a C library that involves callbacks. Aborting the whole process remains the only valid choice for those libraries.

@carllerche

This comment has been minimized.

Copy link
Member

carllerche commented Feb 8, 2015

@dgrunwald Why would #[unsafe_no_unwind] optimize bound checks away? Bound checking guards against more than segv crashes, for example, it prevents accessing memory that isn't owned by the container (and could contain secrets). Optimizing bound checks away seems more dangerous than one would otherwise expect.

@jmesmon

This comment has been minimized.

Copy link

jmesmon commented Feb 8, 2015

@carllerche if you can assume a lack of panic, all panics become unreachable markers. And bounds check failures panic. As a result, the compiler is able to assume that all of the bounds checks succeed.

As to #[unsafe_no_unwind]'s safety: it's got unsafe in the name to warn users that they have to be very careful about using it, just like the unsafe keyword we already have.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 11, 2015

@dgrunwald

If a panic does happen within a no_panic function, the process is aborted.

This was one of the points you mentioned for #[no_panic]. I'm curious, how do you envision this being implemented? I would personally implement this via a "try/catch" block, but that would incur a cost whereas #[unsafe_no_unwind] would not incur any cost (it's just an LLVM attribute).

@dgrunwald

This comment has been minimized.

Copy link
Contributor

dgrunwald commented Feb 11, 2015

I'm curious, how do you envision this being implemented?

I had hoped that it would be possible to get the unwinding implementation to abort the process when a function is marked in some way in the unwinding tables (or even simpler: if the function is not present in the unwinding tables).

Unfortunately it looks like LLVM has no good way to do this, and indeed needs a try/catch block: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-May/036960.html
:-(

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Mar 5, 2015

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented May 28, 2015

Unassigning myself as I believe this has now moved under the T-lang umbrella

@alexcrichton alexcrichton removed their assignment May 28, 2015

@aturon aturon self-assigned this Jul 16, 2015

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 4, 2015

We discussed this RFC today in the language subteam meeting, and decided that it ought to be closed in favor of RFC #1236, which covered roughly the same ground. Thanks @Aatch nonetheless. There are also a few bits in here that may be worth pulling out for a later RFC, such as the annotations for functions that govern whether landing pads are created or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.