Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use reraise_raw_backtrace primitive in the compiler #374

Merged
merged 3 commits into from Jul 26, 2018

Conversation

Projects
None yet
6 participants
@bobot
Copy link
Contributor

commented Dec 24, 2015

(splitted after creation to #378)

The function Misc.try_finally use reraise_raw_backtrace function for keeping correct backtrace even if the cleaning function use exceptions.

This PR also add a function Misc.try_finally' that superseed Misc.try_finally, and it uses it for fixing backtrace in the compiler. So it should superseed half of #358 by using Misc.try_finally' when Misc.remove_file is used for cleaning.

val try_finally' :
  ?always:(unit -> unit) ->
  ?exceptionally:(unit -> unit) ->
  (unit -> 'a) -> 'a
(** [try_finally' ~always ~exceptionally f] always executes the
    function [always] when the function [f] return normally or raise an
    exception. Executes the function [exceptionally] if the
    function [f] or [always] raise an exception.
*)

This PR goes a little further and convert many other try with to Misc.try_finally', even in case where the raise to reraise heuristic gives correct backtrace, since that reduce code duplication, and that is clearer,

EDIT: remove what is in #378

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 25, 2015

I like the general work very much and I would be in favor of merging it for 4.03. The thing that needs discussing is the reraise_raw_backtrace primitive, as the Misc changes and compiler error-handling overhaul are internal changes that are rather orthogonal to release management choices (except maybe in terms of regression risks).

I have three reservations: the API of try_finally', the call-site style that you use, and the code of try_finally' itself.

Api of try_finally'

First, there should be no function named try_finally'. Either it is try_finally and it should be named try_finally, or it is something else and it needs a better name. I think this function is try_finally, which means that it should replace the current try_finally function and you should update the legacy callsites (I count 4 of them).

Second, I would have naively expected the exceptionally case to receive the exn argument. Is there a particular reason for taking a unit argument instead?

Style of callsites

begin fun is an acquired taste. Nicolas experimented a lot with it in the ocamlbuild codebase, and people wince at it. I don't know whether it's for good or bad reasons (I personally think it is fine when used at the right place), but I would recommend just avoiding it, to avoid this discussion altogether.

In any case (whether you use parents or not), I think it is wrong to start a multi-line argument on the line of the function call if there are extra arguments following it. You should never have

try_finally' begin fun () ->
   ...
end
   ~always:(....)

but rather

try_finally'
  (fun () -> ...)
  ~always:(....)

Note that

try_finally' (fun () ->
   ...
   ...)
   ~always:(....)

is acceptable because the first argument's end is not less indented than the second argument's start.

Code of try_finally'

I feel that the current code of try_finally' (quoted below for ease of discussion) is too complex for its own good. At least there should be comments to explain what is happening.

let try_finally' ?(always=(fun () -> ())) ?(exceptionally=(fun () -> ())) work =
  let always () =
    try
      always ()
    with e ->
      let bt = Printexc.get_raw_backtrace () in
      exceptionally ();
      Printexc.reraise_raw_backtrace e bt
  in
  let result =
    try work ()
    with e ->
      let bt = Printexc.get_raw_backtrace () in
      always ();
      exceptionally ();
      Printexc.reraise_raw_backtrace e bt
  in
  always ();
  result

The control flow is unclear, there are two calls to get_raw_backtrace and I'm not sure whether each will correspond to what the user expects, and it looks like reviewing the function for correctness would be exhausting.

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 25, 2015

Here would be one attempt to de-spaghettify the try_finally' code. I think it should be equivalent to the code you wrote, and there is a design question arising in the case where both work and always failed (I would rather be of the opinion to prioritize the work failure):

let try_finally ?(always=(fun () -> ())) ?(exceptionally=(fun _exn -> ())) work =
  match work () with
    | result -> 
      begin match always () with
        | () -> result
        | exception always_exn ->
          let always_bt = Printexc.get_raw_backtrace () in
          exceptionally always_exn;
          Printexc.reraise_raw_backtrace always_exn always_bt
      end
    | exception work_exn ->
      let work_bt = Printexc.get_raw_backtrace () in
      begin match always () with
        | () ->
          exceptionally work_exn;
          Printexc.reraise_raw_backtrace work_exn work_bt
        | exception always_exn ->
          let always_bt = Printexc.get_raw_backtrace () in
          exceptionally always_exn;
          (* why are we not raising work_exn, work_bt in this case? *)
          Printexc.reraise_raw_backtrace always_exn always_bt
      end
@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Dec 25, 2015

try_finally' replaces now try_finally.

For the addition of the exception I'm less convinced, it was not useful in the cases I translated and since the caller doesn't know if the exception comes from work or from always I don't think it is useful. So I prefer simplicity.

If we always raise work_exn when work raises, it should be also the case even if exceptionally raises. Which will complicate the code. Moreover the exceptions from always or exceptionally will be forever lost since after fixing work they will perhaps not appear anymore.

For the style of callsites, my only goal was to have something reasonable that doesn't change the indentation (using ocp-indent) so that the commit is simple to review. Now that it is done, I can use your third style if you prefer and re-indent.

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 25, 2015

For the addition of the exception I'm less convinced, it was not useful in the cases I translated and since the caller doesn't know if the exception comes from work or from always I don't think it is useful. So I prefer simplicity.

Ok.

If we always raise work_exn when work raises, it should be also the case even if exceptionally raises. Which will complicate the code. Moreover the exceptions from always or exceptionally will be forever lost since after fixing work they will perhaps not appear anymore.

After thinking about the issue more, I understand it less. In particular, why is your implementation not catching exceptions thrown by exceptionally? My understanding of #358 was precisely that the cause of the problem was backtrace thrown inside exception-handling code, with Misc.remove_file (which is commonly called inside your exceptionally clause) the common suspect of backtrace trashing.

If this is correct, to correctly preserve backtraces a try_finally code should always catch (and ignore, or warn about, but not (re)raise) exceptions raised during exceptionally execution. (Whether or not we decide to raise work_exn or always_exn).

I do see your point about errors being lost. Maybe we should warn explicitly about them (have a runtime warning when an exception is ignored in try_finally). Another approach would be to build a "slightly wrong" backtrace by concatenating the traces of the various handlers, which corresponds to what happens when reraise is used too liberally, and that I think @def-lkb previously described as "more wrong and more useful".

I still suspect, from the examples in the compiler callsites, that the always and exceptionally work will contain less important logic, and that the one backtrace we really want to see by default is the one coming from work.

For the style of callsites, my only goal was to have something reasonable that doesn't change the indentation (using ocp-indent) so that the commit is simple to review. Now that it is done, I can use your third style if you prefer and re-indent.

That's a good point. Another thing I was thinking is that sometimes placing the always and exceptionally first would actually make clearer code (because then the resource cleanup logic would be placed close to the resource acquisition logic), but that's a thing to decide locally at the callsite.

@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Dec 26, 2015

My understanding of #358 was precisely that the cause of the problem was backtrace thrown inside exception-handling code, with Misc.remove_file (which is commonly called inside your exceptionally clause) the common suspect of backtrace trashing.

The root of the problems is not exceptions that are raised outside exceptionally because in this case the backtrace corresponds to the exception. But exceptions that are raised and catched inside exceptionally.

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 26, 2015

After thinking more about this issue, I wonder if a much simpler specification wouldn't be better for try_finally: we could only run exceptionally if work fails, and not catch always failure at all (so propagate any error, without running exceptionally). Consider the following use-cases found in the PR:

-    close_out outchan
-  with x ->
-    close_out outchan;
-    remove_file lib_name;
-    remove_file archive_name;
-    raise x
+  end
+    ~always:(fun () -> close_out outchan)
+    ~exceptionally:(fun () ->
+        remove_file lib_name;
+        remove_file archive_name)

Note that in the original code, if close_out fails, then the remove_file functions are not called. If flushing outchan failed, it might be that the corresponding file is incomplete (so removing it sorts of makes sense), but the other file does not need to be removed as the code producing it ran to completion.

Also in the following case:

Misc.try_finally begin fun () ->
  comp (Pparse.parse_implementation ~tool_name ppf sourcefile)
end
  ~always:(fun () -> Stypes.dump (Some (outputprefix ^ ".annot")))
  ~exceptionally:(fun () -> remove_file objfile; remove_file cmxfile)

removing object files and cmx files if the production of the .annot file failed does not make much sense, as they should be complete and correct.

Finally, in

let read_ast magic fn =
  let ic = open_in_bin fn in
  Misc.try_finally begin fun () ->
    let buffer = really_input_string ic (String.length magic) in
    assert(buffer = magic); (* already checked by apply_rewriter *)
    Location.input_name := input_value ic;
    input_value ic
  end
   ~always:(fun () -> close_in ic; Misc.remove_file fn)

there is no exceptionally case, and one can see that if close_in fails it will prevent the remove_file call from running -- this is different from the current semantics of the exceptionally block.

I would therefore propose the following, much simplified form:

(** [try_finally work ~always ~exceptionally] is designed to run code
    in [work] that may fail with an exception, and has two kind of
    cleanup routines: [always], that must be run after any execution
    of the function (typically, freeing system resources), and
    [exceptionally], that should be run only if [work] failed with an
    exception (typically, undoing user-visible state changes that
    would only make sense if the function completes correctly). For
    example:

    {|
      let objfile = outputprefix ^ ".cmo" in
      let oc = open_out_bin objfile in
      Misc.try_finally
        (fun () ->
           bytecode
           ++ Timings.(accumulate_time (Generate sourcefile))
               (Emitcode.to_file oc modulename objfile);
           Warnings.check_fatal ())
        ~always:(fun () -> close_out oc)
        ~exceptionally:(fun _exn -> remove_file objfile);
    |}

    If [always] or [exceptionally] fail with an exception, it is
    propagated as usual. In particular, if [always] fails with an
    exception, then [exceptionally] is *not* called.

    On the other hand, if they use exceptions internally for
    control-flow but do not raise, then [try_finally] is careful to
    preserve any exception backtrace coming from [work] for easier
    debugging.
*)
let try_finally ?(always=(fun () -> ())) ?(exceptionally=(fun _exn -> ())) work =
  match work () with
    | result -> always (); result
    | exception exn ->
      let bt = Printexc.get_raw_backtrace () in
      always ();
      exceptionally exn;
      Printexc.reraise_raw_backtrace exn bt

With this specification, exceptionally only runs when work failed, so which exception value is passed to it is clearly defined (it is always work's): I reinstated the exn argument.

@bobot , it seems to me that your PR is in two parts, the primitive implementation and related Printexc change, and the try_finally function and its use in the compiler. Maybe it would make sense to split those in two PRs (keeping this one for try_finally, but with a new PR with only a prefix of the commits for the primitive only)? Currently the primitive-related changes are a bit lost in the try_finally diffs if people use "Files changed" to review the patch, and the discussions on these two aspects are of rather different nature.

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 26, 2015

The root of the problems is not exceptions that are raised outside exceptionally because in this case the backtrace corresponds to the exception. But exceptions that are raised and catched inside exceptionally.

Indeed, I had misunderstood this previously, and only got it when looking at it again this morning. This supports my intuition that maybe not trying to catch failures in always and exceptionally at all is the best semantics.

@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Dec 26, 2015

OK I'm going to split the PR as you proposed. Your last try_finally is
sensible. I wait for comments of others before using it. I prefer that
4.03 have nice backtraces for better bug report, but this splitted PR can
wait.

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 26, 2015

I don't think the split would necessarily delay the PR, on the contrary it may be a way to converge more quickly. The try_finally stuff is relatively low-risk, so if we can both agree on something I would be confident in merging it. On the contrary, the implementation details for reraise are more sensitive and I would like to get acknowledgements from others before deciding a merge; it is easier to ask them to review a smaller PR.

@bobot bobot force-pushed the bobot:feature/reraise_raw_backtrace branch from 3540eb2 to 94844f5 Dec 27, 2015

@bobot bobot changed the title Add reraise_raw_backtrace primitive Use reraise_raw_backtrace primitive in the compiler Dec 27, 2015

@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Dec 27, 2015

Split done (#378). In the end I disagree with exceptionally should only be triggered if work raise. I agree that was the previous behavior but I think we should improve it. With your example:

-    close_out outchan
-  with x ->
-    close_out outchan;
-    remove_file lib_name;
-    remove_file archive_name;
-    raise x
+  end
+    ~always:(fun () -> close_out outchan)
+    ~exceptionally:(fun () ->
+        remove_file lib_name;
+        remove_file archive_name)

outchan is in this case let outchan = open_out_bin lib_name in, so I think that if they is a problem during the closing of outchan (ex: problem during flushing) we want to remove lib_name because the information stored in it could be partial. exceptionally is like rolling back a database transaction, and the transaction is work and always.

@damiendoligez damiendoligez added this to the 4.04-or-later milestone Jan 22, 2016

@bobot bobot force-pushed the bobot:feature/reraise_raw_backtrace branch 2 times, most recently from b0de4a1 to aec9983 Jul 7, 2016

@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Jul 8, 2016

@gasche I rebased and replaced my version of try_finally by the first of yours. So the semantic is that exceptionally is run whenever an exception is raised by workor always.

@bobot bobot force-pushed the bobot:feature/reraise_raw_backtrace branch from aec9983 to 767263d Jul 27, 2016

@mshinwell mshinwell removed this from the 4.04 milestone Sep 7, 2016

@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Feb 17, 2017

Going through pull requests that have been quiet for a while, I find this one which looks pretty good to me. Any remaining problem? The 4.04 and 4.05 deadlines are passed, but it could go in master and then in 4.06 if there is agreement.

@gasche

This comment has been minimized.

Copy link
Member

commented Feb 17, 2017

Indeed, thanks! We should check that it rebases as expected (the primitive itself was only merged recently due to my own slowness), and I think it would be fine in trunk. @bobot, would you have time to work on the rebase?

@damiendoligez damiendoligez added this to the 4.07-or-later milestone Sep 27, 2017

@damiendoligez damiendoligez removed this from the consider-for-4.07 milestone Jun 1, 2018

@gasche

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

I regret that we dropped the ball on this one, and I hope that someone (not me right now) will be able to move it forward. Having useful backtraces in the compiler sounds like a nice property to gain.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

@gasche I rebased this PR on trunk:

https://github.com/ocaml/ocaml/compare/trunk...nojb:feature/reraise_raw_backtrace?w=1

Apart from the mechanical adaptations, I removed the bootstrap commit and added a commit replacing begin/end with parentheses (and reindenting as necessary - one can still easily review the diff putting w=1 in the github diff url in order to ignore whitespace changes, as in the above link).

@bobot I could force push into your branch if you agree.

@bobot

This comment has been minimized.

Copy link
Contributor Author

commented Jul 25, 2018

@nojb you can. Thank you for the effort.

@nojb nojb force-pushed the bobot:feature/reraise_raw_backtrace branch from 767263d to 0defbbe Jul 25, 2018

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Thanks, done!

As mentioned, for reviewing add w=1 to the diff URL: https://github.com/ocaml/ocaml/pull/374/files?w=1

Changes Outdated
has changed by using optional arguments. Cleanup in exceptional case
have been added (`~exceptionally`). It uses
`Printexc.raise_with_backtrace` for preserving backtraces.
(François Bobot and Gabriel Scherer)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

This entry should be moved up, and @nojb's contribution added to the credit line.

then raise(Error(Assembler_error asm_filename));
if create_asm && not keep_asm then remove_file asm_filename
)
~exceptionally:(fun () -> remove_file obj_filename)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

The code of this whole block could be simplified.

  • the inner exceptionally block is code duplicated with the end of the outer action, you should remove the duplication by moving the code to an outer always block.
  • the outer exceptionally code is much shorter than the outer action, I would rather have it come first in this case.

This comment has been minimized.

Copy link
@nojb

nojb Jul 25, 2018

Contributor

In the case where the call to Proc.assemble_file fails, then the file is not removed, so we can't just move the inner exceptionally block to the outer scope.

then raise(Error(Archiver_error archive_name));
)
~always:(fun () -> close_out outchan)
~exceptionally:(fun () ->

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Here as well, always and exceptionally are short, could they be placed before the action? As a side-effect, they would be closer to the resource allocation code, which makes the whole thing easier to read.

output_binary_int outchan pos_toc;
)
~always:(fun () -> close_out outchan)
~exceptionally:(fun () -> remove_file lib_name)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

The short named arguments could be passed first.

Bytesections.write_toc_and_trailer outchan;
)
~always:(fun () -> close_out outchan)
~exceptionally:(fun () -> remove_file exec_name)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Named arguments could go first.

in
cmi, cmt
)
~always:(fun () -> close_in ic)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

could go first

umode := old_unification_mode;
generate_equations := old_gen;
assume_injective := old_inj
)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Misc.protect_refs

try let res = f t1 t2 in univar_pairs := old_univars; res
with exn -> univar_pairs := old_univars; raise exn
Misc.try_finally (fun () -> f t1 t2)
~always:(fun () -> univar_pairs := old_univars)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Misc.protect_refs.

This comment has been minimized.

Copy link
@nojb

nojb Jul 25, 2018

Contributor

Not sure about this one. A priori, the code between the definition of old_univars and the call to f may be mutating univar_pairs, in which the case using Misc.protect_refs would not be correct.

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

That's a fair point. Two remarks:

  1. We could use univar_paris := old_univars and then call Misc.protect_refs, but that would be evil.
  2. I don't know the type-checker better than I do, so I can't say whether the code above can reach this point with !univar_paris <> old_univars, but I can say with confidence that this code would be extremely fishy if it could, and probably wrong. So maybe it's worth asking someone that does do type-checker stuff to confirm here. (@trefis?)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

(I won't let this question delay the PR; it's fine to merge if this point remains unresolved.)

Misc.try_finally (fun () ->
eqtype_list rename type_pairs subst env tl1 tl2
)
~always:(fun () -> backtrack snap)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

could go first

gadt_equations_level := None;
r
)
~always:(fun () -> gadt_equations_level := None)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Misc.protect_refs.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Thanks for the review @gasche! I think I addressed all the suggestions (except for the univar_pairs one).

@gasche

gasche approved these changes Jul 25, 2018

Copy link
Member

left a comment

I believe this is good to go if CI passes, but see minor Changes remark and small possibly-fixup-inspiring question.

Changes Outdated
arguments. Cleanup in exceptional case have been added (`~exceptionally`). It
uses `Printexc.raise_with_backtrace` for preserving backtraces.
(François Bobot, Gabriel Scherer, and Nicolás Ojeda Bär, review by Gabriel
Scherer)

This comment has been minimized.

Copy link
@gasche

gasche Jul 25, 2018

Member

Reading this change entry again, I believe that it doesn't reflect the content of the PR anymore. Here would be a suggestion for a replacement:

- GPR#374: use Misc.try_finally for resource cleanup in the compiler codebase. This should
  fix the problem of catch-and-reraise `try .. with` blocks destroying backtrace information.
  (credits)

(Semi-related question: does the use of protect_refs drops backtraces? Its implementation is not using try_finally itself, it probably should, but it may be that the "detect raises that are re-raises" logic suffice in this particular case, as the exception-cleanup work does not involved raising exceptions.)

This comment has been minimized.

Copy link
@nojb

nojb Jul 25, 2018

Contributor

Thanks, amended as suggested. And yes, I believe Misc.protect_refs preserves backtraces.

@nojb nojb force-pushed the bobot:feature/reraise_raw_backtrace branch from e910cfc to 685d898 Jul 25, 2018

@gasche

This comment has been minimized.

Copy link
Member

commented Jul 25, 2018

(Personally I would squash {[ and "Cosmetics" into François' commit, kill his entirely-subsumed "Changes entry" commit, and preserve the rest of the history.)

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Yes, will do that before merging.

bobot and others added some commits Dec 23, 2015

Use reraise_raw_backtrace in Misc.try_finally
    And add labels ~always for previous cleanup function and
    ~exceptionally for new cleanup function in exceptional case

@nojb nojb force-pushed the bobot:feature/reraise_raw_backtrace branch from 685d898 to 40bab2d Jul 25, 2018

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Squashed & rebased. Will merge once CI passes to avoid new conflicts cropping up.

@gasche gasche merged commit b1ee013 into ocaml:trunk Jul 26, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.