Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unfortunate backtrace clearing by with_return #44

Closed
jberdine opened this issue May 2, 2018 · 18 comments · Fixed by ocaml/opam-repository#12339
Closed

Unfortunate backtrace clearing by with_return #44

jberdine opened this issue May 2, 2018 · 18 comments · Fixed by ocaml/opam-repository#12339

Comments

@jberdine
Copy link

jberdine commented May 2, 2018

The use of With_return.with_return in e.g. Container.fold_result leads to a call to Exn.raise_without_backtrace, which calls clear_backtrace. Perhaps in other contexts it makes sense for Exn.raise_without_backtrace to clear the backtrace, but it makes debugging any code using with_return rather harder.

@ghost
Copy link

ghost commented May 2, 2018

I looked in my mail archives and the rationale was that if an exception raised with raise_notrace is not caught, then we end up capturing a completely unrelated backtrace.

However, I agree that this case is annoying. Maybe we should revisit this decision.

I always found backtrace management to be messy in OCaml. It would be much simpler if we just stored bakctraces in exceptions directly.

@ghost
Copy link

ghost commented May 3, 2018

Do you have an example where this is problematic? The ony case I can think of is if you write something like this:

try
  ...
with exn ->
  <code that ends up calling clear_backtrace>
  reraise exn

But even without the explicit call to clear_backtrace this code could clear the backtrace.

@jberdine
Copy link
Author

jberdine commented May 3, 2018

One case (that is probably an ill-advised approach, I admit) is something along the lines of:

match Array.fold_result ~f ... with
| Ok _ -> ...
| Error _ -> Printexc.get_raw_backtrace ...

But looking again, I may have mis-blamed the clear_backtrace call. Instead, consider the implementation of with_return:

  try
    let a = f { return } in
    is_alive := false;
    a
  with exn ->
    is_alive := false;
    match exn with
    | M.Return a -> a
    | _ -> raise exn

Perhaps raise exn should instead be reraise or Printexc.raise_with_backtrace. What I observe is that in code like the fold_result snippet above, any exceptions raise from f have backtraces truncated below the fold_result call.

@ghost
Copy link

ghost commented May 8, 2018

Ah, yes that makes sense. I guess we didn't catch this internally as we just patch the compiler so that raise is in fact reraise.

ghost pushed a commit that referenced this issue May 8, 2018
@ghost
Copy link

ghost commented May 8, 2018

I submitted the change for review internally and made the change in the v0.11 branch

@jberdine
Copy link
Author

jberdine commented May 9, 2018

Aside: I think it would be great if the patch to change raise to be reraise made it into the standard compiler. Cc @jvillard who has fought a lot with getting reasonable backtraces.

@ghost
Copy link

ghost commented May 9, 2018

Agreed

@trefis
Copy link
Contributor

trefis commented May 14, 2018

I'm a bit surprised by this, the compiler should already be detecting cases like the one discussed here and turn them into a reraise, and indeed if I define /tmp/with_return.ml as:

type 'a return = { return : 'b. 'a -> 'b }

let with_return (type a) f =
  let module M = struct
    (* Raised to indicate ~return was called.  Local so that the exception is tied to a
       particular call of [with_return]. *)
    exception Return of a
  end in
  let is_alive = ref true in
  let return a =
    if not !is_alive
    then failwith "use of [return] from a [with_return] that already returned";
    raise_notrace (M.Return a);
  in
  try
    let a = f { return } in
    is_alive := false;
    a
  with exn ->
    is_alive := false;
    match exn with
    | M.Return a -> a
    | _ -> raise exn

Then

$ which ocamlc
/home/trefis/.opam/4.06.1/bin/ocamlc
$ ocamlc -c -dlambda /tmp/with_return.ml
(setglobal With_return!
  (let
    (with_return/1004 =
       (function f/1006
         (let
           (M/1008 =
              (module-defn(M/1008) /tmp/with_return.ml(4):86-87
                (let
                  (Return/1007 =
                     (makeblock 248 "Return" (caml_fresh_oo_id 0)))
                  (makeblock 0 Return/1007)))
            is_alive/1009 = (makemutable 0 1a)
            return/1206 =
              (function a/1207
                (seq
                  (if (not (field 0 is_alive/1009))
                    (apply (field 1 (global Pervasives!))
                      "use of [return] from a [with_return] that already returned")
                    0a)
                  (raise_notrace (makeblock 0 (field 0 M/1008) a/1207)))))
           (try
             (let (a/1208 = (apply f/1006 (makeblock 0 return/1206)))
               (seq (setfield_imm 0 is_alive/1009 0a) a/1208))
            with exn/1209
             (seq (setfield_imm 0 is_alive/1009 0a)
               (if (== (field 0 exn/1209) (field 0 M/1008))
                 (field 1 exn/1209) (reraise exn/1209)))))))
    (makeblock 0 with_return/1004)))

And I don't think this is a new feature of 4.06.

@lpw25
Copy link
Member

lpw25 commented May 15, 2018

I suspect that the automatic converting of raise to reraise only works if raise is defined as the external %raise. This probably isn't the case with the raise available in Base since we build it internally with raise defined as %reraise.

@ghost
Copy link

ghost commented May 15, 2018

It might be because of the let raise = Caml.raise in import0.ml.

I remember that the heuristics for turning raise into reraise improved some time ago, but they were still not good enough so that we could remove the hack we have inside Jane Street.

@jberdine
Copy link
Author

I will check -dlambda the next time I hit one of these issues and report back.

But a higher-level point, which is perhaps not really about Base, is that the difference between raise and reraise is much too important to leave to a compiler heuristic. It is a bad day when a binary where the compiler heuristic happened to be defeated gets pushed to production. The current situation is the for code we control, it is essentially always necessary from the predictability and maintainability perspective to explicitly call Printexc.get_raw_backtrace and Printexc.raise_with_backtrace.

@ghost
Copy link

ghost commented May 15, 2018

Agreed. We were talking about it the other day with @mshinwell and we want to experiment with the idea of storing backtraces inside exceptions, since it would solve all backtrace related problems.

The only issue is that it causes constant exceptions to allocate again. However, we have an idea to preserve this optimization when the backtrace is not needed. We are planning to do this as an intern project.

@jberdine
Copy link
Author

jberdine commented May 15, 2018 via email

@ghost
Copy link

ghost commented May 15, 2018

I think we would be happy with that too at Jane Street. However I remember that this was one of the argument for changing the representation of exceptions some time ago, so some people might rely on it.

The analysis we are thinking of should be simple. The idea is the following: if the backtrace is attached to the exception, then the function to retrieve the backtrace should have type:

val get_raw_backtrace : exn -> raw_backtrace

So for a given exception X constant or not, if a handler matches all possible values for X and doesn't bind the exception to a variable, then there is no way for the user to get the backtrace. This property could be stored in the binary and at runtime raise would check whether the current handler can get the backtrace or not for the exception being raised.

I think this should catch all the relevant use cases. Moreover this would benefit functions such as Map.find since they would now capture the backtrace in less cases.

@trefis
Copy link
Contributor

trefis commented May 15, 2018

I suspect that the automatic converting of raise to reraise only works if raise is defined as the external %raise. This probably isn't the case with the raise available in Base since we build it internally with raise defined as %reraise.

Ah yes indeed, that makes sense.

It might be because of the let raise = Caml.raise in import0.ml.

And we don't want to define it as external raise : exn -> _ = "%raise" because that would defeat the change we have internally where raise is defined as reraise?
Perhaps we could do with some preprocessing before releasing externally to change that particular point, rather than fa42117 which seems very adhoc to me.

Another option would be to explicitely reraise everywhere in base (but that become tedious).

@ghost
Copy link

ghost commented May 17, 2018

Well, we never really discussed it. Initially it must have been me or Stephen who wrote let raise = Caml.raise as it seemed like the most natural way to rebind Caml.raise. It's easy to forget that doing so breaks compiler optimizations.

Regarding pre-processing before pushing to github, it seems to me that the issue is the same for everyone so the solution should probably be the same as well. If the compiler heuristics are now good enough, then we should just define it as %raise for everyone, and if not then define it as %reraise for everyone.

@ghost
Copy link

ghost commented Jun 6, 2018

I looked more into this. The heuristic should work in all cases in 4.07, so we just changed the declaration of raise in Base to be external raise : exn -> _ = "%raise". I'll backport the patch once it reaches github. This should fix this issue.

@ghost
Copy link

ghost commented Jul 12, 2018

This is fixed in both master and v0.11.1

@ghost ghost closed this as completed Jul 12, 2018
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants