Treat as errors the uncaught exceptions in finalisers #8873
This old patch fixes the bug whereby catchable exceptions (e.g. Not_found) can in theory appear out of thin air, due to exceptions being raised at allocation points from finalisers. After discussion with @lpw25 and @stedolan, I decided to resurrect it and propose it as a PR.
Warning: this was my first modification to the runtime, initially meant as an exercise to find my way around the compiler. I cannot guarantee that I understand everything I wrote, so please review carefully. And no hard feelings if rejected!
The strategy is the same as for
Right now, there is a consensus that uncaught exceptions in finalisers should be ignored or terminate the program (probably with a customisable dbuenzli-hook). I too think this is a good general direction, and the current PR does not aim to change what this ideal solution should be. There are a few comparison points where one might still want the current PR as a solution available right here and now:
The current PR fixes the bug now, and up until our multicore people ask that finalisers run in their own threads.
As an alternative design to the current
Failing tests: currently, I have the following kind of test failure, e.g.
Is this a bug? Can you give me clues to fix it?
Did you consider simply not letting uncaught exceptions in finalizers subvert the control flow of the program, while still letting the program know that something bad happened?
I always thought this kind of problem would be better solved by a trap (again sorry...) rather than by effectively letting them turn any kind of exception into an asynchronous one (before this PR) or introducing a new, but single, one to represent them (this PR).
Basically the idea would be to either use or extend (my preference) the uncaught exception handler and simply pass these uncaught exceptions there, thereby letting the program decide which action to perform according to where they occur. Something like:
type ctx =
  [ `Signal_handler
  | `Program
  | `Finalizer
  | `Self (* the uncaught handler itself raised *) ]

val set_uncaught_exception_handler :
  (ctx -> exn -> raw_backtrace -> unit) -> unit

val handle_uncaught_exception : ctx -> exn -> raw_backtrace -> unit
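For illustration, here is a minimal sketch of how such a trap could be emulated in user code today. The `Trap` module and its `guard` and `finalise` helpers are hypothetical names, not part of the standard library, and the proposal itself would presumably be implemented in the runtime rather than by wrapping each finaliser by hand.

```ocaml
(* Hypothetical user-level emulation of the proposed trap; not a stdlib API. *)
module Trap = struct
  type ctx = [ `Signal_handler | `Program | `Finalizer | `Self ]

  (* Current handler. The default silently swallows the exception. *)
  let handler : (ctx -> exn -> Printexc.raw_backtrace -> unit) ref =
    ref (fun _ _ _ -> ())

  let set_uncaught_exception_handler h = handler := h

  let handle_uncaught_exception ctx exn bt =
    try !handler ctx exn bt
    with e ->
      (* The handler itself raised: report once under `Self, then give up. *)
      (try !handler `Self e (Printexc.get_raw_backtrace ()) with _ -> ())

  (* Run [f x]; route any exception to the handler instead of letting it
     escape into whatever code happened to be interrupted. *)
  let guard ctx f x =
    try f x
    with e -> handle_uncaught_exception ctx e (Printexc.get_raw_backtrace ())

  (* Register a finaliser whose exceptions are trapped under `Finalizer. *)
  let finalise f v = Gc.finalise (guard `Finalizer f) v
end
```

With this sketch, a program that wants to log and continue would install a handler once at start-up, and a program that prefers to abort could call `exit` from the handler when the context is `` `Finalizer ``.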
Yes, I explained in detail the challenges of this alternative approach. The point is that this simple fix is available right now, does not raise as many questions, and does not preclude the solution you suggest from being implemented once OCaml is ready for it.
(No need to apologize for being insistent :)
This is what I meant with a dbuenzli-hook. In the handler, there is indeed not much else you can do besides aborting or swallowing the exception (and logging). Maybe you were thinking of being able to do something else?
Right, but somehow I don't see it fixing much. I mean, the only thing you get is more information, right? That is: rather than a puzzling exception out of the blue, you still get it, but wrapped in another one. Agreed, it's better, but still... I'm not sure the problem is common enough to warrant introducing a half-fix in the stdlib which will have to be deprecated.
Unless there's something I deeply misunderstand, I don't really buy the "backwards-compatible" argument you are making in the first point above: these errors are non-deterministic, and if they happen to occur outside an exn handler your program already aborts now.
So why not introduce the trap immediately? You could even elect not to abort on
No that pretty much sums it...
Okay sorry of course not if you have a global
No, you get the (theoretical) assurance that when you catch for instance Not_found, it means what you expect. And it does not claim to achieve anything else.
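To make this guarantee concrete, here is a minimal sketch of the wrapping idea written in user code. The names `Finaliser_raised`, `wrap_finaliser` and `lookup` are hypothetical and chosen only for illustration; they are not the names used by the PR, and the `on_alloc` callback merely stands in for an allocation point at which a finaliser happens to run.

```ocaml
(* Hypothetical wrapper exception, for illustration only. *)
exception Finaliser_raised of exn

(* Turn any exception escaping [f] into the single wrapper exception. *)
let wrap_finaliser f x =
  try f x with e -> raise (Finaliser_raised e)

(* A lookup whose handler is meant to see only its own Not_found.
   [on_alloc] simulates an allocation point that runs a finaliser. *)
let lookup tbl key ~on_alloc =
  try
    on_alloc ();
    Some (Hashtbl.find tbl key)
  with Not_found -> None
```

If `on_alloc` raises a bare `Not_found`, `lookup` returns `None` even when the key is present, which is the bug under discussion. If the same function is passed through `wrap_finaliser`, the handler no longer matches and the failure surfaces as `Finaliser_raised Not_found` instead.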
I agree that it is quite theoretical, but it's pretty bad and gets mentioned often when discussing asynchronous exceptions. I do not think it is a half-fix, but a proper fix for OCaml as it currently stands (where you've currently got OOM, Sys.Break, and where you have to be careful about breaking changes).
I'd be in favour of an alternative PR which introduces a trap right away, but note that this currently needs to at the very least implement masking first, and to investigate the consequences of the change.
In general I am not sure that ignoring bugs amounts to a healthy default, which is part of the complications. But I agree it might end up the best choice, i.e. not necessarily consider uncaught exceptions as a bug, e.g. if you have a mental model where a finalizer runs in an isolated thread.
Note that once you make ignoring the default, then it's hard to go back. And if you use libraries written when ignoring is the default, they might not be well-tested against changing this behaviour.
With the current PR we have the time to wait and see where multicore is headed before settling on a design.
Yes, I used to think the same, which is why I did not think of opening a PR until recently, when I was encouraged to do so.
Developers have been speaking about sanitizing uncaught exceptions in finalisers for a while now, everybody agrees it's bad, but I have not seen much progress.
Sorry if it looks like a lot of discussion around tiny details, but it's important.
I did not see your edit.
Not all exceptions out of finalisers are programming errors (currently).
Also, note the proposition of introducing a common exception
Continuing with the idea of
I am curious to audit the uses of Sys.Break in the opam repository. Has anybody some guidance on where to start?
There's a challenging problem for backwards compatibility. Maybe it is necessary to introduce deprecation warnings for patterns like
I'll be around at the OCaml workshop in the morning if people want to discuss this.
You mention in
This PR does not go in a direction I'm interested in, but it doesn't seem to break anything I rely on.
@lthls Thanks for your input. One aspect is that you do not want to catch without a catch-all, but you are allowed to match for display or logging purposes for instance (hence the "general rule").
This is in theory. I am now closing this PR, for reasons I am going to explain.
@lpw25 @stedolan @gasche (and other people who have shown interest or displayed opinions about asynchronous exceptions), this is for the record. I did an audit of the use of asynchronous exceptions in OCaml packages published on opam. I performed the following searches.
Then I curated the results by hand, assessing 1) whether they purposefully raise exceptions in signal handlers or finalisers, and 2) whether they explicitly match on this exception. Note that the goal was not to be exhaustive but to gather some concrete usage examples, hence the coarse search patterns and the search limited to opam. Nobody has the time nor the possibility to do an exhaustive audit.
Here is a list of packages that rely on asynchronous exceptions by criterion 1):
To the best of my understanding, here is the sublist of packages that subscribe to the discipline of not discriminating the asynchronous exception from other kinds of unexpected exceptions (criterion 2):
The most prevalent use of async exceptions is raising an exception on SIGINT in interactive programs. Quite importantly, in many cases a specific action is performed, from a specific message ("Interrupted"), to a specific user interaction ("press Ctrl-C two more times to exit"), even to a distinct behaviour (taking the decision to respawn a task instead of aborting). It appears important to be able to discriminate
As for finalisers, the first remark is that
The prize for the most innovative use of asynchronous exceptions goes to
Note again the empirical evidence in favour of the need to compute until some space or time bound is reached.
So, contrary to what I understood of the maintainer folklore, the established OCaml codebase is not ready even for the mildest proposed change (wrap exceptions in
As for multicore, I see new challenges. I clearly see a design for asynchronous exceptions in multicore, but there are two new questions: 1) is it possible to discriminate signals from other unexpected exceptions? (there are a few obvious ideas to explore), 2) for any such design, is there a clear evolution path from the established code base to the new design? (that sounds more challenging to me).
Hope to see you in Berlin!
I understand you went quickly over this but I'd just like to mention that:
does not as far as I understand your definitions belong to 1: it only matches over known asynchronous OCaml runtime system exceptions to let them flow up and traps any other one. It does not raise exceptions from signal handlers or use finalizers. So I think it's rather in 2.
Ah, yes. I remember it as a special case because I remember even looking up details on opam and seeing you were the author. There are a few other packages I marked as "?" because they matched on
When in doubt, in case I missed something, I registered the intent to match on Sys.Break.
For completeness, here's the list of ones I have marked as "bad" (1 but not 2 nor above):
Edit: Here are my raw notes, for the record: https://gitlab.com/gadmm/stdlib-experiment/blob/master/other/async_audit/break_or_catch_break
Before this patch, a Memprof callback can potentially raise any exception, including for instance Not_found out of thin air. This means that "try ... with Not_found -> ..." might not always mean what the programmer intends. This is a well-known theoretical bug affecting the raising behaviour of finalisers (ocaml#8873). In general, any exception raised by Memprof is an asynchronous exception subject to the discipline of ML interrupts (see e.g. http://isabelle.in.tum.de/dist/library/Doc/Implementation/ML.html): one must not discriminate interrupts, and an interrupt must always be re-raised promptly except at isolation boundaries. This patch avoids this sort of bug, and it encourages and facilitates following the discipline of interrupts for catching exceptions arising from Memprof callbacks. Note that some code in the wild has the form `try ... with Sys.Break -> ...`. Such code does not follow the interrupt discipline, as it discriminates on interrupts. With this patch this practice becomes incompatible with Memprof (as it already is with Fun.protect). Users of such code who want to use Memprof must fix it to properly catch all exceptions, and can record the cancellation state into a boolean flag inside the signal handler of SIGINT for the current thread.
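As a rough illustration of that last recommendation, the sketch below records SIGINT in a flag instead of matching on `Sys.Break`, and cleans up on any exception without discriminating before re-raising it. The names `interrupted`, `on_sigint`, `install`, `run_steps` and `with_cleanup` are hypothetical; the sketch uses a single process-wide flag and assumes a program whose work can be split into steps that poll it, which is only one possible design.

```ocaml
(* Set by the SIGINT handler; the handler itself never raises. *)
let interrupted = Atomic.make false

let on_sigint (_ : int) = Atomic.set interrupted true

(* Install the flag-setting handler in place of the default SIGINT action. *)
let install () = Sys.set_signal Sys.sigint (Sys.Signal_handle on_sigint)

(* Run the steps in order, polling the flag between steps instead of
   matching on Sys.Break. Returns how many steps were completed. *)
let run_steps steps =
  let rec go completed = function
    | [] -> `Finished completed
    | _ when Atomic.get interrupted -> `Cancelled completed
    | step :: rest ->
      step ();
      go (completed + 1) rest
  in
  go 0 steps

(* Interrupt discipline: on any exception, clean up without inspecting
   the exception, then re-raise it promptly with its backtrace. *)
let with_cleanup ~cleanup f =
  match f () with
  | v -> cleanup (); v
  | exception e ->
    let bt = Printexc.get_raw_backtrace () in
    cleanup ();
    Printexc.raise_with_backtrace e bt
```

A caller that previously printed "Interrupted" from a `Sys.Break` handler would instead inspect the result of `run_steps`, or read the flag at its isolation boundary after catching all exceptions.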