Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize intrinsics::abort in core #2512

Open
japaric opened this issue Aug 2, 2018 · 36 comments
Open

Stabilize intrinsics::abort in core #2512

japaric opened this issue Aug 2, 2018 · 36 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@japaric
Copy link
Member

japaric commented Aug 2, 2018

I would like to see intrinsics::abort stabilized in some form as it's useful in some cases to have
panicking behavior (and alloc_error) = abort (minimizes code size and the abort exception / signal
can be handled elsewhere or in a upper abstraction layer).

As intrinsics::abort lowers to a trap instruction on most architectures (*) I propose we add a
safe wrapper around it in core::arch::$ARCH for the architectures where it does.

(*) On most but not on all. For example, On MSP430 it lowers to a call to the abort function
which results in a linker error unless that symbol is defined somewhere in the dependency graph.

(Related: There's std::process::abort which claims to be a safe wrapper around intrinsics::abort
but if you at its implementation it is, in fact, not just a wrapper around intrinsics::abort but
something completely different.)

@rust-lang/libs what would be needed to move this forward? An full RFC, or just a PR + FCP?

cc @Amanieu @glandium

@steveklabnik
Copy link
Member

I've also found it strange that this doesn't exist in core; big 👍 from me.

@Mark-Simulacrum Mark-Simulacrum added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Aug 2, 2018
@glandium
Copy link

glandium commented Aug 2, 2018

I'm all for it, but I'm wary about core::arch. AIUI, there is, purposefully, no platform-specific code in libcore at the moment. A specific module for that seems like opening the door to more.

@japaric
Copy link
Member Author

japaric commented Aug 2, 2018

@glandium core::arch already exists on stable. There are architecture specific APIs in it for SIMD instructions.

@Amanieu
Copy link
Member

Amanieu commented Aug 3, 2018

I'm not a big fan of the way intrinsics::abort is used at the moment: it just crashes the program with an obscure error message (Illegal instruction). IMO the proper way to handle this is to use something like #[abort_implementation] to allow customizing the behavior of aborts per platform.

Then again, this is a lot of work for something that is very rarely used (compared to normal panics). It might not be worth the trouble.

@alexcrichton
Copy link
Member

Sorry for the delay in getting to this @japaric, but FWIW I'd personally be ok with a PR to add this to an unstable (eventually to-be-stable) location in libstd and then later have a stabilization PR. This doesn't seem like it's RFC-grade to me

@japaric
Copy link
Member Author

japaric commented Sep 5, 2018

@Amanieu this proposal is about stabilizing a way to generate a trap / abort instruction. Your comment seems to be about a user customizable process::abort that's available in core.

@alexcrichton sounds good! The main question here is where in core to put this. IMO, it makes sense to put this in coresimd / core::arch::$ARCH and name it trap, instead of abort, as this is a stable wrapper for the llvm.trap intrinsic that only produces a trap instruction for some architectures (e.g. it doesn't do that for MSP430). Could I get yours and @gnzlbg's opinions on that location?

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 5, 2018

@japaric I'd put it in core::intrinsics first since the llvm intrinsic it forwards to is "portable", as in, it can be called on all architectures, although it might not do what you expect on all of them. We can just point people here to the llvm docs.

I propose we add a safe wrapper around it in core::arch::$ARCH for the architectures where it does.

So suppose we would put it in core::intrinsics and stabilize it. Would we still need to add those safe wrappers to core::arch or could one be built in a stable third-party crate ? Even if it can be built in a third party crate, it might still make sense to expose a safe wrapper to it somewhere in core, but we don't have to solve that problem right now, and maybe with some experience about how to write that in a crate and use it, we can put a better solution in core later.

@japaric
Copy link
Member Author

japaric commented Sep 5, 2018

I'd put it in core::intrinsics first since the llvm intrinsic it forwards to is "portable"

It's already in intrinsics: core::intrinsics::abort but there's no precedent for stabilizing any API in there ... Hold on, the core::intrinsics module is unstable but e.g. core::intrinsics::transmute and a few other intrinsics are stable -- that's weird: core::intrinsics::transmute(..) is stable but the more idiomatic use core::intrinsics; intrinsics::transmute(..) requires a feature gate.

Yeah, I'm not sure stuff in core::intrinsics is supposed to stabilized. core::intrinsics::transmute may only be stable as a byproduct of stabilizing it as a re-export in core::mem.

I propose we add a safe wrapper around it

This was a typo, i think. I probably meant a stable wrapper around the llvm.trap intrinsic. Whether it's safe or not is not too important (though I don't see a reason why abort should be unsafe); the important part is that it becomes a stable API.

So suppose we would put it in core::intrinsics and stabilize it.

If that's OK then, yeah, we don't need anything else in core::arch, but like I said above I don't think we are supposed to tell people to use core::intrinsics on stable. My proposal is assuming that core::intrinsics is not the right place to stabilize this API in.

@japaric
Copy link
Member Author

japaric commented Sep 5, 2018

And as a data point: for these architectures llvm.trap lowers to an
instruction: ARM{,BE}, AArch64, i686, mips{,64,64el}, powerpc{,64,64le},
nvptx64, sparc64, x86_64 and wasm32.

For MSP430 and RISCV32 llvm.trap lowers to a call to the abort function,
which is not defined in core or compiler_builtins so it can cause a linker
error.

AFAIK, there's no precedent for a stable API in core that can produce linker
errors when compiled for some architectures. This is way I'd prefer not to make
a stable llvm.trap available to all targets.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 5, 2018

Do clang and GCC export these somehow? I've used a __builtin_trap intrinsic in C and C++, but in some platforms that actually lowered to 1 / 0...

We could re-export it in core::arch for all architectures in which it does something meaningful too. But @alexcrichton maybe the libs team should discuss what's the plan for core::intrinsics, whether its for unstable stuff only, or whether parts of it could be stabilized. If not, where could we put this?

cc @eddyb @sunfishcode does Cretonne have a trap intrinsic? If so, does it do the same thing as llvm.trap? Also, where would you put this?

@SimonSapin
Copy link
Contributor

Libs team hat on: I’d rather not stabilize the intrinsics module. I view it as an implementation detail that (ideally) could be removed entirely.

In the past, when stabilizing an intrinsic, we’ve added a re-export or wrapper in another (stable) module.

That core::intrinsics::transmute(..) works on stable is a bug: rust-lang/rust#15702. We clarified in #2405 that the stability promise does not extend to features that are accidentally available on the stable channel.

@nagisa
Copy link
Member

nagisa commented Sep 5, 2018

For MSP430 and RISCV32 llvm.trap lowers to a call to the abort function, which is not defined in core or compiler_builtins so it can cause a linker error.

First, lowering into abort() function is just a "default" that generic lowering code in LLVM performs for targets that do not implement lowering of their own. At least RISCV has a dedicated trap instruction in its ISA, so that’s just a shortcoming of the backend that can readily be fixed.

Additionally, there is a function attribute called trap-func-name that when specified will cause all @llvm.trap()s in that function to lower into that specific function call instead, making it possible for us to easily make an override for targets that for some reason do not implement lowering themselves.

Libs team hat on: I’d rather not stabilize the intrinsics module. I view it as an implementation detail that (ideally) could be removed entirely.

Think of intrinsics of a name that could potentially be given up for functions like this (and implementation details becoming private/moving elsewhere). For unreachable specifically, we ended up deciding to create a std::hint module, despite my foresight that it will want to live alongside abort/trap. Since trap does not fit into hint, we can put this function somewhere else, sure. Except there are few or no places where it would fit in thematically, so we might end up creating yet another module that will likely contain just this function.

@japaric
Copy link
Member Author

japaric commented Sep 5, 2018

First, lowering into abort() function is just a "default" that generic lowering code in LLVM performs for targets that do not implement lowering of their own. (..)

Are you suggesting that we "polyfill" llvm.trap in rustc for targets that don't have a dedicated trap instruction? What should such function do when targeting MSP430? It seems to (but I'm no expert) that 16-bit and 8-bit architecture use all their opcode space so there's no undefined opcode that we could use here.

Also, with this approach we would have to take care not only of MSP430 but also of every new architecture that we add in the future and that has this problem (possibly, AVR).

@alexcrichton
Copy link
Member

I don't personally have a lot of thoughts about where this should go or what to name it, but I agree with @SimonSapin that we should avoid the intrinsics module

@eddyb
Copy link
Member

eddyb commented Sep 6, 2018

does Cretonne Cranelift have a trap intrinsic? If so, does it do the same thing as llvm.trap? Also, where would you put this?

It has trap as an instruction, including conditional forms.

As for where to place this, definitely not intrinsics.
I would also maybe call it trap instead of abort to distinguish it from libc abort.

@cr1901
Copy link

cr1901 commented Sep 6, 2018

Re: msp430 and llvm.trap: @awygle and I spent an hour or so discussing how we could implement llvm.trap for msp430 on IRC. We couldn't agree on reasonable semantics/implementation, so currently there are no plans to implement llvm.trap." There are multiple reasonable semantics, but their implementation is ugly (short of deferring to a function provided by msp430-rt that the user implements on a per-application basis).

Therefore I'd prefer that we'd not have a stable trap API on msp430; I would prefer something like core::arch::$ARCH::trap.

@eddyb
Copy link
Member

eddyb commented Sep 6, 2018

Oh, using core::arch is definitely one way to go about doing this that I'd be fine with.

@japaric
Copy link
Member Author

japaric commented Sep 9, 2018

@gnzlbg given the last comments, do you think we could move forward with core::arch::$ARCH::trap?

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 9, 2018

do you think we could move forward with core::arch::$ARCH::trap?

Yes, this looks fine to me. Will you send a PR for adding it to the architecture-specific submodules in stdsimd?

Does anyone feel strongly about this intrinsic needing an RFC to stabilize or would an FCP be enough to stabilize it for some of the archs at least?

@parched
Copy link

parched commented Sep 10, 2018

Perhaps if it's going in core::arch::$ARCH::trap it should have an architecture specific name e.g. UDF for arm.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 10, 2018

Since I expect this functionality (independent of the name) to be provided at least for the current architectures that stdsimd, I'd expect us to use assert_instr here to guarantee a specific instruction sequence that we reliably generate (and document).

So I checked out the instructions generated by llvm.trap for the following architectures supported by stdsimd (https://gcc.godbolt.org/z/PG2zmQ):

  • x86 and x86_64: ud2
  • aarch64: brk #0x1
  • arm: mov pc, lr @ trap or nothing? (i tried here for many cpus and mattrs combinations, and this often generates nothing here)
  • ppc32, ppc64, and ppc64le: trap
  • mips32, mips64, mips64el, mips64r6: break
  • nvptx: trap
  • wasm32: unreachable

And things are IMO looking good. All architectures that we support (one exception) have a native trap instruction sequence that llvm.trap generates reliably.

The only exception appears to be arm, but maybe I just did not pass the right command flags to llvm? (I only tried -march=arm -mcpu={cpu name here} for many cpus)

@parched
Copy link

parched commented Sep 10, 2018

@gnzlbg on arm it generates an undefined instruction which your disassembler may shown like .inst and won't be there if you filter out directives.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 10, 2018

@parched indeed, on arm the assembly contains an .inst 0xe7ffdefe directive.

@japaric
Copy link
Member Author

japaric commented Sep 15, 2018

Yes, this looks fine to me. Will you send a PR for adding it to the architecture-specific submodules in stdsimd?

Done in rust-lang/stdarch#571

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 12, 2018

Heads up: I've merged @japaric 's implementation of the platform specific "abort" intrinsics to stdsimd here: rust-lang/stdarch#571

Note that this does not mean that these are in the path towards stabilization. This issue is about a portable way to abort, which had some unresolved questions that made the discussion derail a bit. These questions are:

  • How portable would such an intrinsic be? all targets? or only some that support it?
  • If it should be available on all targets, what to do in platforms that don't have a way to trap? e.g. loop {} ?
  • where in libcore does this intrinsic belong to?

I think these questions are worth resolving, and once we have consensus on these, I can submit a PR implementing a portable/semi-portable way to trap.

In the meantime, what has been merged into nightly is intrinsics for platform-specific trap instructions. These should be usable from libcore soon, and allow emulating portable ways to trap in external crates. The implementation provides was good, and exposing these for experimentation purposes had achieved enough consensus for them to be usable in nightly.

It is still unclear at this point whether we only want to stabilize a portable way to trap, or whether we want to stabilize both, but I think it is worth holding the stabilization of the platform-specific trap instructions (which would need an RFC) until the unresolved questions about the portable trap intrinsic are resolved.

Afterwards, I would welcome a PR to stdsimd with an implementation (or might submit one myself).

At that point, we will probably have two ways to trap in nightly, a platform-specific way, and a portable/semi-portable way, and will have hopefully gathered enough experience with both that the next step would be to write a small RFC about whatever solution "feels right".

@Amanieu
Copy link
Member

Amanieu commented Nov 12, 2018

IMO we should just always call the abort C function instead of messing around with arch-specific instructions. In terms of code size this is practically identical on most platforms (except x86 where you save 4 bytes), and you have the advantage of having a proper abort error message instead of a confusing "Illegal Instruction" error. In environments without std it is trivial to provide an implementation of this function.

@SimonSapin
Copy link
Contributor

In environments without std it is trivial to provide an implementation of this function.

That’s what this issue is about. I don’t know how to provide that implementation without a libc and without messing with architecture-specific instructions though, could you give an example?

@Amanieu
Copy link
Member

Amanieu commented Nov 12, 2018

I meant that we should just leave this symbol undefined and let the user define it (somewhat like panic_handler).

@SimonSapin
Copy link
Contributor

Could you give an example of how you expect a "typical" embedded application to define this symbol? Especially the divergence part. Do we want to recommend loop {}?

Do you mean that defining this symbol would be mandatory for applications that do not link to std? I don’t think we can do that, since https://blog.rust-lang.org/2018/10/25/Rust-1.30.0.html#no_std-applications

@Amanieu
Copy link
Member

Amanieu commented Nov 12, 2018

An alternative would be to do something like this:

#[unwind(aborts)]
pub fn abort() -> ! {
    panic!("abort")
}

This reuses the panic mechanism to print an abort message, and forces an abort during unwinding.

@SimonSapin
Copy link
Contributor

Today, "panic=abort" means that std’s panic handler calls libpanic_abort::__rust_start_panic instead of libpanic_unwind::__rust_start_panic. The former calls libc::abort or intrinsics::abort depending on the platform.

When std is not linked none of the above is relevant and applications need to defined their own panic handler. This is where intrinsics::abort is typically used, after the message-printing is already taken care of (or deliberately skipped).

So adding "per-function panic=abort" doesn’t help with the fact that no_std applications need a way to diverge in their panic handler.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 11, 2018
@SimonSapin
Copy link
Contributor

@jhpratt How is this related to abort?

@elichai
Copy link

elichai commented Aug 19, 2020

Any updates here? I have a use for this in a no-std no libc library that gets called from C, right now we panic but that will unwind through C and cause UB.

@Amanieu
Copy link
Member

Amanieu commented Aug 19, 2020

Why would panic be UB? If you're using #[no_std] then you're already required to define a panic handler with #[panic_handler] which can call the C abort function.

@Lokathor
Copy link
Contributor

i think they mean that unwinding as a result of panic would be the UB.

@elichai
Copy link

elichai commented Aug 19, 2020

I added an emphasis on library :)
A user might use whatever panic handler/strategy they want and/or even use regular std binary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests