Default behavior of unwinding in FFI functions #58794
Mark-Simulacrum added I-nominated, T-lang, C-tracking-issue labels Feb 28, 2019
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Feb 28, 2019
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Feb 28, 2019
Mark-Simulacrum referenced this issue Feb 28, 2019: Tracking issue for unwind allowed/abort #58760 (Open)
It would be helpful for the discussion if someone knowledgeable could write a summary covering the following:
When I say major platforms, I mean GNU libunwind, Windows SEH, possibly others. Is it possible to have different default behavior depending on which unwinder you're using? Say, unwind normally on major platforms, abort on others?
In another thread, @alexcrichton wrote:
As for the current implementation, my understanding is: on Unix it works; on Windows it mostly works, with some issues that could be solved. See my comment in that thread for more details.
On Unix, longjmp just resets the stack pointer and ignores unwinding. On Windows, longjmp triggers SEH unwinding and so will run Rust destructors, AFAIK.*
* (I said in other threads that it didn't, because I misread the description of this PR and thought that it changed things so destructors wouldn't run when unwinding via longjmp; in reality, it only did that to the abort-on-unwind handler itself.)
As far as the last part of that is concerned, that is implementation-specific and unspecified behaviour.
Centril referenced this issue Mar 10, 2019: Abort instead of unwinding past FFI functions #52652 (Open)
Centril removed the I-nominated label Mar 10, 2019
The current behavior on stable amounts to a soundness hole. For example, based on #52652 (comment), we can write (playground):

    extern "C" fn bad() {
        panic!()
    }
    fn main() {
        bad()
    }

The behavior of this program is undefined on stable because we attach the
Soundness is non-negotiable and as such we landed #55982 to close this soundness hole. However, since there was no explicit confirmation of this step by the language team the change was reverted on 1.33 pending confirmation. The change is still seen in beta and nightly compilers. Based on notes by @alexcrichton in #52652 (comment), #55982 (comment), #55982 (comment), and #55982 (comment), I propose that we go ahead with and confirm the change in #55982.
@rfcbot merge
rfcbot commented Mar 10, 2019
Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
rfcbot added proposed-final-comment-period, disposition-merge labels Mar 10, 2019
This change was tried twice, and twice it was reverted because important parts of the ecosystem broke. Do we really want to try to merge it again without any changes? The discussion on this topic is pretty fragmented; the internals thread mentioned in the top post has been quite active as well. I'm still waiting for the summary I requested in #58794 (comment). I thought the scope of this issue was much bigger than what @Centril just mentioned. If you only care about the soundness issue at the IR level, another way to fix it is to never emit the nounwind attribute.
This is untrue. The second time, it was reverted only because of the lack of a completed T-Lang FCP (which we are doing now).
I don't think so. If no one in the community would've complained about the change, I don't think this would've been reverted a day before the stable release even though no lang team discussion had happened yet. That might've been used as justification to actually do the revert, but it's certainly not the only reason.
@jethrogb It most definitely was the only reason; the release team cannot undo language team decisions and had there been one we would not have reverted.
@Centril I'm saying that if there hadn't been any backlash no one would've even proposed to undo the change.
@jethrogb Yes, there was backlash, but that was irrelevant to the acceptance or non-acceptance of the undo-PR itself. The sole reason for accepting the undo-PR was the lack of a completed T-Lang FCP.
I don't have enough information to argue about procedural details and people's rationale for r+'ing this or that PR, and I wouldn't be very interested in doing so anyway. I just want to say that in light of the ongoing discussions and the continued lack of consensus on how to address the legitimate needs of some projects to unwind through FFI, it seems premature to me to take this step now, just as it was premature the last times. Soundness is ultimately not negotiable, but there can absolutely be bad times and ways to roll out soundness fixes.
It was suggested to me in a private conversation that this might lead to performance loss. I'd like to see some numbers on that. Because things "work" most of the time right now, it seems to me that LLVM currently generates code that would be similar to the code it would generate without nounwind. I wholeheartedly agree with @rkruppe. I feel like not emitting nounwind is a good alternative to fix the unsoundness now (although not solving UB in general), while keeping users happy, and it gives us time to search for a real solution. For this real solution, I'd like to see an RFC-style discussion with a solid motivation and discussion of alternatives.
As long as the soundness hole is closed one way or the other (aborting, not emitting However, I think we should separate discussion about new mechanisms like |
I don't feel it is helpful to draw such an antagonistic picture. There literally is no way to do FFI unwinding safely in Rust currently, and some people got frustrated enough by that that they went with something that "happens to work". I have hacked around limitations in ugly ways often enough that I can totally sympathize. Sure, they should instead have written an RFC to provide a defined way to do what they needed to do, but that's a lot of work and not everyone is up for that kind of contribution. The |
@Centril It is indeed a possible outcome that the relevant teams ultimately decide "damn those programs and use cases, we won't provide a way to unwind through FFI". However, that would be a decision with severe downsides (more social ones than technical ones) which I don't think should be taken lightly, and IMO not at this very moment but rather after the other options have been explored and rejected -- as @RalfJung said, the trajectory of While it's all good and well to say "this is UB, we've always said so, and programs with UB are completely invalid", the Rust project really made its bed itself here by not acting on the subject for years and in particular not providing an alternative way to address the very reasonable needs that cause users to write programs with this UB. We now have the situation that people trying to do certain (fairly reasonable!) things with Rust not only have no way to achieve it without writing programs that have UB, they do not even have an alternative in sight that they could switch to when those programs break. Rust is well within its rights to break those programs, and I am definitely not arguing that the de facto behavior of today should be ad-hoc blessed as defined behavior, but it will cause users serious problems to not provide some alternative way to do what they need to do. We should not cause users such problems if we can reasonably avoid it, even if it means delaying a soundness fix. For comparison, some type system soundness bugs get a long grace period during which the compiler warns instead of erroring on wrong programs, to help people fix them before they get broken. Such a warning is not possible in this case (as it's about runtime behavior), but we should similarly do our best to ease the pain. Holding off pulling the trigger for another couple of months (peanuts compared to how long the soundness issue has been open!) while waiting for other issues to get worked out is quite an easy way to do that.
I think there are also severe social downsides to not going ahead with this. Namely, we legitimize "There's no way to do X currently, so we'll do something that happens to work".
I first heard of the existence of
None of this suggests that "The
Yes, I'm quite unhappy about the inaction here. I think the reason for the inaction has precisely been that we didn't want to break anyone. In the future I hope that we set deadlines for and better track soundness holes and C-future-compatibility issues.
I think it is entirely reasonable that people use nightly until such time and help test the
I'm well aware of C-future-compatibility issues, but I think we let them sit around for far too long without actionable and well-triaged plans to address them. I think we are in need of schedules and deadlines.
On a philosophical level, I disagree. It's not a question of "legitimizing". It's a fact of life that people will rely on implementation details whether they're supposed to or not, unless you actively prevent them from doing so. Ideally you do prevent them, like rustc does with On a more practical note, among the links in the original post, I think this (from here) is a key quote:
My thoughts:
Of course, this needs to go through an RFC. I don't think it needs to be a particularly "hard" RFC, at least if we're just stabilizing unwinding across C; it could be accepted quickly enough that there would be basically no benefit in changing the implementation to abort by default in the meantime. But this seems to have been rather controversial so far, so who knows... For that to happen, someone needs to write the RFC. Does anyone want to volunteer to do that? Should I? I also think it would be useful to fix and stabilize unwinding across C++, as a feature which might have an even narrower set of supported platforms, but which is not at all hard to implement on most existing platforms and could be quite useful for mixed C++-Rust codebases. But that comes later. |
Oh, one more thing (I'd edit this in, but that doesn't help people reading via email): If the unwind attribute is stabilized, rather than |
@rfcbot concern need-documented-replacement In the absence of a documented replacement for how people should handle errors in C libraries that only support handling through unwinding, closing this would break a common use case. Another way to fix the UB might be to drop the LLVM "nounwind" attribute. We could also add I'm happy to support this as a sensible default after we document exactly what we expect people to do when interacting with inflexible C libraries that expect unwind-based error handling. |
@joshtriplett Yes, if they want to be portable. If you need to "unwind through C code", you either need to compile the C code yourself with a compiler that supports a form of unwinding as an extension, or try to find another way, I believe that's one of the things that's impossible/UB in the general case. ( |
Why not? This is precisely what those C libraries that only support unwind-based error handling expect callbacks to do. In any case, unwinding works in Rust today for this particular use case (Rust -> C -> Rust callback)
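For the Rust -> C -> Rust callback shape, one pattern that avoids unwinding through the foreign frames is to catch the panic inside the callback, stash the payload, and re-raise it once the C call has returned. The following is only a minimal sketch: `c_style_for_each`, `CallbackState`, `trampoline` and `for_each` are hypothetical names, and the "C" routine is written in Rust purely so the example is self-contained. It assumes the library lets a callback report failure through a return value, so it does not cover libraries whose error handlers must not return.

```rust
use std::any::Any;
use std::os::raw::c_void;
use std::panic::{self, AssertUnwindSafe};

/// Stand-in for a C library routine (hypothetical, written in Rust here):
/// calls `callback` for each index and stops at the first nonzero status.
extern "C" fn c_style_for_each(
    count: u32,
    callback: extern "C" fn(u32, *mut c_void) -> i32,
    user_data: *mut c_void,
) -> i32 {
    for index in 0..count {
        let status = callback(index, user_data);
        if status != 0 {
            return status;
        }
    }
    0
}

/// State shared between the safe wrapper and the trampoline.
struct CallbackState<F> {
    closure: F,
    panic_payload: Option<Box<dyn Any + Send + 'static>>,
}

/// Callback handed to the foreign code. A panic never leaves this function:
/// it is caught, stored, and reported as a nonzero status instead.
extern "C" fn trampoline<F: FnMut(u32)>(index: u32, user_data: *mut c_void) -> i32 {
    // SAFETY: `user_data` points at the `CallbackState<F>` owned by `for_each`,
    // which outlives the foreign call.
    let state = unsafe { &mut *(user_data as *mut CallbackState<F>) };
    match panic::catch_unwind(AssertUnwindSafe(|| (state.closure)(index))) {
        Ok(()) => 0,
        Err(payload) => {
            state.panic_payload = Some(payload);
            1
        }
    }
}

/// Safe wrapper: runs `closure` for each index via the foreign routine and
/// resumes any stashed panic only after the foreign frames have returned.
pub fn for_each<F: FnMut(u32)>(count: u32, closure: F) {
    let mut state = CallbackState { closure, panic_payload: None };
    let user_data = &mut state as *mut CallbackState<F> as *mut c_void;
    c_style_for_each(count, trampoline::<F>, user_data);
    if let Some(payload) = state.panic_payload.take() {
        panic::resume_unwind(payload);
    }
}
```

From the caller's point of view the panic still propagates out of `for_each`, but by then only Rust frames remain on the stack between the panic site and the caller.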
Crates can reasonably do that by doing the build as part of |
@joshtriplett One thing off the top of my head (but there might be more reasons): libraries written without unwinding support in mind and/or not compiled with |
MOZGIII commented Mar 13, 2019
I'd say in general it's really UB; however, we can have defined behavior under certain known conditions. For example, in the Elbrus architecture, frames are explicit at the CPU level, and stack unwinding is guaranteed to actually jump through the stack frames (no longjmp there). So, on Elbrus, it seems reasonable to assume for the whole platform that the behavior is known, and that unwinding works for everything by default. I'm not an expert in Elbrus though, I just heard it somewhere. (Please validate this before actually relying on it!) The point is, maybe it can be different per-platform?
Certainly unsafe code is allowed to assume its destructors won't be skipped when unrolling up the stack, otherwise things like the crossbeam scoped API would be completely unsound. I think the only thing being discussed here is whether it's okay to unwind past specific, known chunks of Rust code.
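To make the destructor assumption concrete, here is a small sketch of the guarantee that guard-based code relies on: when a panic unwinds through a Rust frame, the destructors of that frame's locals run. `Guard` and `cleanup_ran_during_unwind` are hypothetical names, and the sketch assumes the default `panic=unwind` strategy (with `panic=abort` the process would simply terminate).

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical guard that records whether its destructor ran.
struct Guard<'a> {
    ran: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.ran.set(true);
    }
}

/// Panics while a `Guard` is alive, catches the panic, and reports whether
/// the guard's destructor ran while the stack was being unwound.
pub fn cleanup_ran_during_unwind() -> bool {
    let ran = Cell::new(false);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = Guard { ran: &ran };
        panic!("simulated failure");
    }));
    result.is_err() && ran.get()
}
```

A mechanism that resets the stack pointer without running this cleanup, as described earlier in the thread for longjmp on Unix, is exactly what would break code written against this assumption.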
We are mostly worried about unwinding past C code.
MOZGIII commented Mar 13, 2019
I'm confused. Unwinding Rust code can be made safe within a sequence of Rust frames, whether that sequence is the top-level app stack or a callback inside C/C++/whatever code that passed control to the Rust code, right? Problems and uncertainties arise when we unwind across Rust and non-Rust stack frames, like from Rust code (running in a callback), to C++ code (that invoked the callback), running in Rust code (in the Rust app, which uses a C++ library that has a call that takes a callback). In this case it might be tricky to guarantee that unwinding will correctly pass through the C++ layer (Rust -> C++ -> Rust). And so on with other combinations. My point about Elbrus is that all such cases behave very similarly on their architecture with multiple stacks (they have separate frame and data stacks).
C code compiled with C++ exceptions enabled is not C. ABI-wise it is probably more C++ than C. If the aim is to convert Rust panics to C++ exceptions, and then catch C++ exceptions and convert them to Rust panics, supporting that via |
What's the status on this? We reverted the abort in #58795 for the stable branch, but AFAICS it's still there on beta and master. If the decision is still pending, we should also revert on beta so it doesn't land in 1.34 next week!
Rust programs have UB without the PR, we want to exploit that UB to improve the performance of Rust panics in the near future (e.g. CraneStation/cranelift#553), and the PR makes those programs have defined behavior by guaranteeing an The revert was intended to buy us more time to explore some solutions and we have done so. Somebody needs to put in the work to write RFCs, implement them, etc. Programs affected by this, like Therefore I think we should revert the revert. While it is sad that Rust cannot directly interoperate with C++ in C FFI (automatically inserting shims to convert from Rust panics to C++ exceptions and vice-versa), the right way to solve that problem is to submit an RFC with a solution. |
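The abort-at-the-boundary behavior under discussion can also be written by hand on stable, which may help in seeing what the change amounts to for a single function. This is a sketch under stated assumptions, not a description of how the compiler implements it: `abort_on_panic` and `add_one` are hypothetical names, and only `std::panic::catch_unwind` and `std::process::abort` are real standard library calls.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::process;

/// Hypothetical helper: runs `body` and aborts the process if it panics,
/// so no unwinding ever crosses the caller's boundary.
pub fn abort_on_panic<T>(body: impl FnOnce() -> T) -> T {
    match panic::catch_unwind(AssertUnwindSafe(body)) {
        Ok(value) => value,
        Err(_) => process::abort(),
    }
}

/// Hypothetical exported function: an overflow panics inside the closure,
/// and the helper turns that panic into an abort instead of an unwind.
pub extern "C" fn add_one(input: i32) -> i32 {
    abort_on_panic(|| input.checked_add(1).expect("overflow"))
}
```

Calling `add_one(41)` returns 42 as usual; only the panicking path differs, ending in an abort rather than an unwind into the foreign caller.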
The revert was only on stable, so if we do nothing right now, the unwind-abort will be in 1.34. Maybe that's fine, but it doesn't seem like there's consensus per rfcbot #58794 (comment). |
I agree that we should propagate this to stable to avoid a stable regression, yes. This should never have been broken in the first place without discussion before changing the behavior. This worked in prior stable versions, and that it happens to not be well defined doesn't change that it worked in many cases and people were able to successfully use it in those cases. I'd be all for seeing RFCs, to propose alternatives that would allow optimizations like the proposed one in cranelift. And in the meantime, let's not break stable users. I'm aware that "this is unsound" is a permissible justification for breaking stable. However, there's a difference between "this is unsound" and "this is undefined (but people know how it works)". By all means, let's find a solution for this, and until then let's not stick a crowbar through the engine of a running vehicle to stop it for maintenance. ;) |
cuviper added a commit to cuviper/rust that referenced this issue Apr 2, 2019
cuviper referenced this issue Apr 2, 2019: [beta] Permit unwinding through FFI by default #59640 (Merged)
I opened #59640 on beta to preserve the current stable behavior.
It is not that the behavior "wasn't well defined" - the behavior is undefined by design. The main reason being that we can change the implementation to alert users that they are invoking undefined behavior, as well as to enable optimizations in FFI code and in the Rust panic implementation.
There is already a straightforward solution to this. People arguing that it's not what they wish it would be does not change that fact.
LLVM is allowed to optimize all this code under the assumption that it won't I'd rather have users complaints of the form "You changed the implementation of some code that had undefined behavior and now I have to fix my code" than complaints of the form "You knew my code had undefined behavior, had an implementation of a way to alert me, yet decided not to do so, which resulted in my software having a security vulnerability".
The current stable C FFI already allows these optimizations. If people want language features to more ergonomically interface with the unwinding strategies of other programming languages, like C++, they should open RFCs to do that. |
joshtriplett added the I-nominated label Apr 2, 2019
Process question: Should we revert this on nightly too so we don't have to keep reverting this until the final decision has been made?
That link does not show that we want to exploit it "in the near future", given that Cranelift does not perform optimizations and so any small performance benefit from changing the calling convention would be purely academic – not to mention that rustc doesn't even support Cranelift yet. Arguably it shows that we want to exploit it in the far future; that gives us plenty of time to add FFI unwinding attributes first, rather than rushing to break things and, at best, making crates like
Then remove |
bors added a commit that referenced this issue Apr 3, 2019
As discussed yesterday on discord (cc @Centril @joshtriplett ), I believe that while these libraries have knowingly decided to rely on a particular implementation of undefined behavior, we might have failed to communicate during the last cycle that it's up to them to put in the work. Iff these libraries show willingness to fix their code, and require more time to do that, I'll be fine with delaying the landing of Penalizing correct C FFI code (e.g. by removing
No other programming language allows unwinding through C FFI. The only standard way of propagating exceptions in C++ and D through FFI boundaries is catching all exceptions at the FFI boundary, and passing error codes instead - depending on the language at the other side, those error codes can be re-raised as panics/exceptions/sjlj/etc. The behavior of throwing from In the last 6 weeks, no pre-RFC or RFC has been filed to extend the language to support this use case, this issue has received very little attention, the design work required to extend the language with a more ergonomic solution is IMO significant (should we have So I am not as certain as you are about "how soon" these workarounds will land. There have been cases where there just was no way to do something in Rust without invoking UB (e.g. taking the address of a packed struct field), but this is not one of those cases. These libraries had a perfectly valid stable Rust alternative available, yet decided to pursue the UB route instead. I think that's unfortunate, but I think it would be more unfortunate to penalize those using C FFI correctly, as well as leaving vulnerable those who are invoking undefined behavior by accident instead of by design.
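The catch-at-the-boundary approach described above can be sketched in stable Rust roughly as follows. The status constants, `checked_halve` and `halve_or_status` are hypothetical names chosen for illustration; the only real API used is `std::panic::catch_unwind`. A real export would also carry a symbol-name attribute, omitted here to keep the sketch edition-independent.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical status codes agreed on with the foreign caller.
pub const STATUS_OK: i32 = 0;
pub const STATUS_PANICKED: i32 = -1;

/// Hypothetical Rust logic that may panic.
fn checked_halve(input: i32) -> i32 {
    assert!(input % 2 == 0, "input must be even");
    input / 2
}

/// Exported entry point: every panic is caught here and turned into an
/// error code, so nothing unwinds into the foreign caller.
pub extern "C" fn halve_or_status(input: i32, out: *mut i32) -> i32 {
    match panic::catch_unwind(AssertUnwindSafe(|| checked_halve(input))) {
        Ok(value) => {
            if !out.is_null() {
                // SAFETY: the caller promises `out` is valid for writes
                // whenever it is non-null.
                unsafe { *out = value };
            }
            STATUS_OK
        }
        Err(_) => STATUS_PANICKED,
    }
}
```

The caller on the other side checks the returned status and, if it wants, re-raises the failure in its own error mechanism; `out` is left untouched when the status is `STATUS_PANICKED`.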
I agree in principle. However, notice that the same is true for the kind of UB that we will eventually introduce with whatever derivative of or alternative to Stacked Borrows becomes "the real thing": we currently definitely do not exploit many kinds of UB in this space (we only emit I appreciate that unwinding across FFI might be different, but "it is UB just on paper, not in practice" is a slippery slope. |
I started one (which is largely based on jcranmer's post). If jcranmer's ideas sound reasonable and interesting to people, I can try expediting writing the RFC (at the cost of interacting with my family...). I have a crate (objrs) that makes Rust code adhere to the Objective-C ABI, allowing Rust and Objective-C to interop. Throwing exceptions across Rust+Objective-C boundaries is a critical part of my crate (libobjc even has a C FFI function ( In short, there's a good argument to make for exceptions across FFI boundaries. It definitely needs an RFC to iron out the details and make clear what is and is not permissible. I just hope people are at least open to the idea.
Focusing on a "standard way" for a language is a red herring. This kind of stuff is in the land of implementation-defined behavior, not standard-defined behavior. There's a reason GCC has |
@mjbshaw jcranmer's post is one of the many directions in which we could go. I think the direction is worth exploring, but not at the cost of interacting with your family. Before writing an RFC, it would probably make sense to first open an internals thread to collect all the use cases we want to support and all the constraints that we have, and try to reach a consensus on that, since those are going to restrict the design space for the RFC. On discord we also discussed that maybe we could add a
I take this as an ultimatum for the lang team -- from the release team perspective, IMO we should keep maintaining the status quo on stable until this issue is decided.
Would you make this flag available to stable users? |
That's a good question. |
cc @cuviper @kornelski @alexcrichton Based on discussion in the language team meeting: We'd like to address the undefined behavior, and in the process of doing so we want to provide a stable Rust alternative for this. We'd like to set a reasonable deadline for a stable release making undeclared unwinds through FFI abort, something in the ~12 week range. In order to make a plan that seems likely to succeed, we'd like to have the folks who need this feature (either unwind-through-FFI or some manner of well-defined setjmp/longjmp) involved in the conversation and specifying the replacement feature. Could we please get some positive confirmation, from the Rust-bindings-to-mozjpeg folks or others who need this, that it seems reasonable to develop a replacement for this in the near-term future?
Mark-Simulacrum commented Feb 28, 2019
This is the tracking issue for the behavior of unwinding through FFI functions.
There are two choices here: we can abort if unwinding occurs through an extern "C" boundary. We abort on beta 1.34 and nightly 1.35, but will permit unwinding in stable 1.33.
We previously attempted this change in 1.24 and reverted in 1.24.1. We attempted to do so again in 1.33, but reverted once again pending lang team discussion on the topic.
There has been discussion on this topic in #52652, #58760, and #55982.
The stable behavior of permitting unwinding is UB, and can be triggered in safe code (#52652 (comment)). Notably, mozjpeg depends on this behavior and seems to have no good stable alternatives; there's been some discussion on internals.