Print function names in backtraces #9096

stedolan · 2019-11-06T12:07:31Z

This PR adds function names to OCaml backtraces. As an example, here's what I see when I add a 'assert false' randomly somewhere in the middle of ctype.ml:

Fatal error: exception File "typing/ctype.ml", line 728, characters 5-11: Assertion failed
Raised at Ctype.generalize_structure in file "typing/ctype.ml", line 728, characters 5-17
Called from Typecore.type_expect_ in file "typing/typecore.ml", line 2859, characters 6-29
Called from Builtin_attributes.warning_scope in file "parsing/builtin_attributes.ml", line 236, characters 14-18
Re-raised at Builtin_attributes.warning_scope in file "parsing/builtin_attributes.ml", line 241, characters 10-13
Called from Typecore.type_expect in file "typing/typecore.ml", line 2327, characters 4-156
Called from Typecore.type_expect_ in file "typing/typecore.ml", line 3305, characters 17-39
[...]

I have two motivations:

I want function names in backtraces
In Statmemprof, it's natural to group backtraces of allocations that share a suffix (statmemprof-emacs does this). But the backtraces have precise locations, so two callsites in the same function are distinct backtrace entries, making the grouping too fine-grained to be useful.

Spacetime has some code to deal with 2 (which lives in spacetime_lib), which associates backtrace entries (i.e. callsites) with functions by parsing the ELF symbol table and looking up function names via DWARF (using owee). This works, but with a couple of disadvantages: it's fairly complicated, ELF-specific, and exposes the user to unpalatable generated names. (It also doesn't help with 1).

Instead, this patch extends the existing debug info with function names, which are looked up using the same code that currently finds source locations.

Names and scopes

In OCaml, not all functions have names, and not all names are for functions.

This PR assigns names by dividing the source file into "scopes", which may nest but not overlap. Locations are named according to their stack of enclosing scopes, and new scopes are introduced by:

named functions
anonymous functions
modules
values bindings in modules
classes
methods

(See the type Lambda.scoped_location for how these are represented)

As an example, here's a simple program and the backtrace it prints:

let f () =
  List.iter (fun _ -> assert false) [1;2;3];
  print_string "done!"
let () = f ()

Fatal error: exception Assert_failure("raise_iter.ml", 2, 22)
Raised at Raise_iter.f.(fun) in file "raise_iter.ml", line 2, characters 22-34
Called from Stdlib__list.iter in file "list.ml", line 110, characters 12-15
Called from Raise_iter.f in file "raise_iter.ml", line 2, characters 2-43
Called from Raise_iter in file "raise_iter.ml", line 4, characters 9-13

The names printed above (intentionally) look like module paths. However, these are not in general paths: for instance, some mention (fun), and it is possible to have locations reported as e.g. Foo.f.Bar.g (by using let module to nest a module inside an expression), which is not a valid path.

For a more comprehensive example, see names.ml in the testsuite

Since patterns need not bind only a single name, it can be ambiguous what the name of a scope is. In such cases, this patch arbitrarily picks one of the names: let (f, g) = assert false will be reported as an exception in f. If no names are bound by a pattern, then no scope is introduced. So, if a module M contains let () = assert false, the exception is reported as simply coming from M.

Binary size impact

In native code, the extra debug information makes ocamlopt.opt about 2.5% bigger than current trunk, and similar for Flambda. Strings are shared when they occur multiple times. (An earlier version of this patch included a rather complicated compression scheme for sharing prefixes. I can revisit it later if the debug info size becomes a problem).

Bytecode uses a simpler format, and the impact there is much larger (about 10%). I note that debug info is already very large on bytecode (it includes a typing environment for ocamldebug's benefit) and
binary sizes are larger than native (bytecode ocamlopt is about 2x bigger than ocamlopt.opt). From this I conclude that nobody cares very much about bytecode binary size, so I did the simplest possible thing there and wrote strings directly. (More optimised versions are of course possible).

Reviewing notes

This is a large patch, but I've arranged the commit history to make it easier to review. The commits are:

  1. 888ce57634 Introduce the Lambda.scoped_location type
  2. 541d829457 Replace Location.t with Lambda.scoped_location in Lambda code
  3. 07da0a18e4 Print function names (derived from Lambda.scoped_location) in backtraces
  4. b4550dd1b3 Generate function names during translation from Typedtree to Lambda
  5. dde9b1e471 Update testsuite with new backtrace format, and add test for names
  6. 88f157703a  Bootstrap

Commits 1, 3 and 4 contain the actual work, and require review. They're relatively small.

Commit 2 is very large but very simple. The only thing it does is thread a new ~scopes parameter around all of the transl_foo functions that translate Typedtree to Lambda, so that the raw locations of Typedtree can be turned into scoped_locations in Lambda.

Commit 5 is also large, because it updates all of the backtraces in the testsuite. (Commit 6 is a bootstrap).

gasche · 2019-11-06T12:48:59Z

Note to reviewers: this patch looks large only because it's on top of the not-yet-merged #8805. There's only one new commit in this PR, currently at +275 -140 lines.

I solved that for you.

I haven't looked at the code. The description of the design is sensible and I agree it's a general feature. I would be interested in rephrasing the current wording of the backtrace lines, Called from <function> at <location>, to suggest less strongly that <function> is the caller, given the approximations introduced by the naming scheme. What about Called within <function> at <location>, for example?

stedolan · 2019-11-06T14:10:25Z

I solved that for you.

Thanks!

I haven't looked at the code. The description of the design is sensible and I agree it's a general feature. I would be interested in rephrasing the current wording of the backtrace lines, Called from <function> at <location>, to suggest less strongly that <function> is the caller, given the approximations introduced by the naming scheme. What about Called within <function> at <location>, for example?

I'd claim that it's not an approximation, but that Called from <foo> ... means "called from within the definition of foo" rather than "called from within the body of the function named foo" :)

Pedantry aside, I'm not attached to the current wording (I didn't put much thought into it), but I'm not really seeing the same subtle distinction between "called from" and "called within" - they mean the same to me, but the latter sounds a bit more awkward.

The functions vs. scopes thing is a bit subtle, though (cf. the wall of text above), and I'm not really sure the best way to get it across here.

lpw25 · 2019-11-06T15:49:20Z

I think that using the statically enclosing bindings like this is likely to be very confusing for users. It seems almost certain to be misread by people trying to track down a failure and to lead to wasted time looking in the wrong function.

I've tackled this issue before as part of work (not yet upstreamed) to improve flambda's inlining reports. I found that the nicest "name" for a function was along the lines of:

type parent_ =
  | Unit of string
  | Submodule of parent_ * string
  | Class of parent_ * string
  | Function of function_ * string

and function_name =
  | Name of string
  | Anonymous of Location.t

and function_ = parent_ * function_name

Which allows you to print things like: the anonymous function at foo.ml:3,1-10 within Foo.bar or Bar.Baz.foo.

That might be a bit too detailed for use in backtraces, but at the very least I think you can get them to distinguish: an anonymous function within Foo.bar from Foo.bar.

alainfrisch · 2019-11-06T16:08:26Z

If we have a nice symbolic path to point to a location, it would be nice to include it in the payload for Assert_failure (for cases where the stacktrace is not available, or not stored in logs, etc). Unfortunately, this exception constructor takes a tuple argument, and extending it could break some code (do people explicitly actually pattern-match on it, in practice?). One could always add the symbolic path to the filename component of the payload, but it's a bit ugly. Switching to an (inline?) record would be better, if we can convince ourselves the amount of code breakage is small enough.

alainfrisch · 2019-11-06T16:15:41Z

A new scope is introduced by a module binding or a module-level let binding (but not an expression-level let ... in ... binding)

What's the argument for keeping local module binding names but not local value names? If the location is within a local function (or even within a non-functional local binding) with a clear name, it seems useful to keep that name in the path.

stedolan · 2019-11-11T08:55:44Z

@lpw25

I found that the nicest "name" for a function was along the lines of: [...]

This sounds like a solution to a slightly different problem.

In an inlining report you need to be able to refer to particular functions, since inlining is something that happens to functions. The problem being solved is "name this function", and writing M.foo for two separate functions (the one bound to M.foo and an anonymous function inside it) would indeed be confusing.

In a backtrace, the problem being solved is "describe this location", where locations in question are not functions but callsites (or raises, etc.). In particular, text like the anonymous function at foo.ml:3,1-10 is wrong: there's no function at the described location.

The goal here is to describe locations approximately. Half the point is that multiple nearby locations get the same description. I'm open to suggestions about what should contribute to a description (classes should definitely be in, as @alainfrisch points out that it's weird to include local modules but not local functions, and I'm still on the fence about marking anonymous functions). But I think it's important to realise that this is a different problem from naming functions for an inlining report.

lpw25 · 2019-11-11T09:09:33Z

That difference is why I suggested that the full description might be too detailed.

In particular, text like the anonymous function at foo.ml:3,1-10 is wrong: there's no function at the described location.

To clarify, the full backtrace slot with the full description would be something like:

Called from the anonymous function at foo.ml:3,1-10 within Foo.bar in file "list.ml", line 110, characters 12-15

i.e. the location I was mentioning above was an additional location, not to be confused with the location of the call site.

I'm open to suggestions about what should contribute to a description

I think the distinction between:

Called from Typecore.type_expect in file "typing/typecore.ml", line 2327, characters 4-156

and

Called from an anonymous function within Typecore.type_expect in file "typing/typecore.ml", line 2327, characters 4-156

is the most important thing to distinguish.

stedolan · 2019-11-27T16:22:00Z

Update: This patch got rewritten, which made it both longer and less hacky.

Instead of adding Levent nodes to Lambda which get converted to debuginfo separately by each of the backends (native, bytecode, flambda), the updated PR replaces the Location.t type in Lambda with a new scoped_location (which is a pair of a Location.t and some information about scopes). This means the scoping information is only computed in one place.

To the reviewer: read this commit-by-commit. Commits 1 and 4 are large and boring (1 threads a scopes parameter through transl{obj,class,mod} and 4 updates tests), while 2 and 3 are small and interesting (2 is the backend support for printing names, 3 specifies how the names are generated)

stedolan · 2020-01-09T11:13:18Z

ping?

- Introduce the Lambda.scoped_location type - Replace Location.t with Lambda.scoped_location in Lambda code - Print function names (derived from Lambda.scoped_location) in backtraces - Generate function names during translation from Typedtree to Lambda - Update testsuite with new backtrace format, and add test for names

shindere · 2020-04-09T17:13:27Z

I'm not able to review the code, sorry about that. I just wanted to show
appreciation for the work accomplished here and all the efforts put in
making it understandable even to a non-expert like me.

Not sure this will receive enough reviews to be ready before the 4.11
freeze, but hoping for the best.

avsm · 2020-04-16T14:03:11Z

Developer meeting discussion: @gasche: is there consensus on the interface here? @lpw25 / @stedolan both agree on the interface now (but PR description has not been updated). Decision: consider this for merge into 4.11.

gasche · 2020-04-16T15:29:59Z

I think the most practical thing to do for this PR is to accept to merge it after the branch/freeze date, once it is reviewed. I'm not in favor of forcing people to work at full speed before next Monday, because this risks decreasing the quality of the result, increasing stress, and I don't see any advantage.

stedolan · 2020-04-16T17:04:54Z

(but PR description has not been updated)

This is done now.

gasche · 2020-04-16T17:18:13Z

For me the missing piece was that @lpw25 is happy with the new messages that are proposed. He said so during the meeting, but there is no sign of this in the PR that I can find. I shamefully admit that this decreased my own motivation/effectiveness to do a review: it's a lot of work to get into these low-level details, and it's less exciting to do so when one has the feeling that the PR may linger forever because people disagree (for reasonable reasons) about how to print the result.

P.S.: thinking about this again, you mentioned during the meeting that you have tools that depend on this PR. Again, maybe this could have been emphasized in the PR discussion, to help prioritize review work. (Also, are the tools available somewhere for the curious?)

lpw25

I've reviewed the compiler parts up to cmm and they look pretty good. A few parts could be tidied up. In particular, I think that the scope types should drop the lambda_ from their name and move to Debuginfo, and I would like the locations generated in matching.ml to also have scopes.

lambda/debuginfo.mli

middle_end/flambda/closure_conversion.ml

asmcomp/cmm_helpers.ml

lambda/debuginfo.ml

lambda/lambda.ml

lambda/lambda.mli

lambda/matching.ml

lambda/translprim.ml

lpw25 · 2020-04-18T11:45:35Z

bytecomp/instruct.mli

 type debug_event =
  { mutable ev_pos: int;                (* Position in bytecode *)
    ev_module: string;                  (* Name of defining module *)
    ev_loc: Location.t;                 (* Location in source file *)
    ev_kind: debug_event_kind;          (* Before/after event *)
+    ev_defname: string;                 (* Enclosing definition *)
    ev_info: debug_event_info;          (* Extra information *)


It doesn't need to be in this PR, but it might be worth considering how this information could be used in ocamldebug. For example, I think updating the code in Show_information.show_one_frame would allow changing the result of ocamldebug's backtrace command to match the new backtrace printing.

gasche · 2020-04-18T12:21:45Z

Thanks @lpw25 for your review; I like the style of your review comments, some I could have written myself. (Yay for complaining about optional arguments.) Let me know I can help for the Matching/Switch part. I worked on having more locations in Switch in trunk...gasche:switch-loc when reviewing #2294. There the purpose was to have more precise locations for the tests; I'm not sure you need this here (I haven't looked at the code), @lpw25 suggests that just parametrization is enough.

mshinwell · 2020-04-20T10:20:26Z

@lpw25 I will read the backend parts, probably today -- can you confirm that I need to read the runtime changes, Cmm_helpers and Emitaux?

The regression was introduced by ocaml#9096 in this part of the diff: | Tcf_method (name, _, Tcfk_concrete (_, exp)) -> - let met_code = msubst true (transl_exp exp) in + let scopes = enter_method_definition ~scopes name.txt in + let met_code = + msubst true (transl_scoped_exp ~scopes exp) in transl_exp would use Translobj.oo_wrap on functions, while the new transl_scoped_expr does not. The regression was first reported by Sacha Ayoun in ocaml#10092, as it results in a large slowdown (2x slower) on the Gillian codebase, which uses an object-oriented visitor in a performance-critical way.

stedolan force-pushed the backtrace-names branch from e74f74f to c11469b Compare November 6, 2019 12:48

stedolan force-pushed the backtrace-names branch from c11469b to 62d05f4 Compare November 6, 2019 15:31

stedolan force-pushed the backtrace-names branch 2 times, most recently from b5632e7 to 52f9743 Compare November 20, 2019 16:31

stedolan force-pushed the backtrace-names branch from 52f9743 to db99621 Compare March 10, 2020 10:17

stedolan force-pushed the backtrace-names branch from db99621 to 63f2d09 Compare March 17, 2020 15:36

stedolan force-pushed the backtrace-names branch from 99668c9 to 7774374 Compare March 30, 2020 09:55

stedolan force-pushed the backtrace-names branch from fad7a3c to 88f1577 Compare April 16, 2020 10:07

avsm added this to the 4.11 milestone Apr 16, 2020

stedolan changed the title ~~[WIP] Print function names in backtraces~~ Print function names in backtraces Apr 16, 2020

lpw25 reviewed Apr 18, 2020

View reviewed changes

stedolan force-pushed the backtrace-names branch 2 times, most recently from bd7a725 to a348b2d Compare April 20, 2020 14:29

Octachron mentioned this pull request Jun 30, 2020

OCaml 4.11 failure in tests: Should return a position. gildor478/ounit#17

Closed

xclerc mentioned this pull request Jul 9, 2020

Share the strings representing scopes #9748

Closed

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 16, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

4934aa2

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

854028c

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 20, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

e426610

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

a77e82f

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 21, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

b13d89f

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

d542712

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 30, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

1f7e4b0

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 3, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

1d63f62

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 4, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

e4e0149

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 5, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

2d4e1cf

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 7, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

f34fc0b

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

fe982ff

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 10, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

90cdd42

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 17, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

6f0febe

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 18, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

4eba91d

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 19, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

afba025

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 20, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

9f4627e

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Aug 28, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

e3053c3

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

b0fc162

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

66aa73e

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 2, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

d6f71d7

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Sep 7, 2020

ocaml#9096 from stedolan/backtrace-names (Review)

815cca6

This was referenced Dec 22, 2020

fix a caching problem in OO code causing a 4.11 performance regression #10096

Merged

Big slow down when upgrading from 4.10 to 4.11 #10092

Closed

Octachron mentioned this pull request Jan 14, 2021

Ocaml 4.12.0, first beta ocaml/opam-repository#17978

Merged

dbuenzli mentioned this pull request Aug 24, 2021

Compilation mode regexps ocaml/caml-mode#5

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Print function names in backtraces #9096

Print function names in backtraces #9096

stedolan commented Nov 6, 2019 •

edited

gasche commented Nov 6, 2019

stedolan commented Nov 6, 2019

lpw25 commented Nov 6, 2019 •

edited

alainfrisch commented Nov 6, 2019

alainfrisch commented Nov 6, 2019

stedolan commented Nov 11, 2019

lpw25 commented Nov 11, 2019 •

edited

stedolan commented Nov 27, 2019

stedolan commented Jan 9, 2020

shindere commented Apr 9, 2020

avsm commented Apr 16, 2020

gasche commented Apr 16, 2020

stedolan commented Apr 16, 2020

gasche commented Apr 16, 2020

lpw25 left a comment

lpw25 Apr 18, 2020

gasche commented Apr 18, 2020

mshinwell commented Apr 20, 2020

Print function names in backtraces #9096

Print function names in backtraces #9096

Conversation

stedolan commented Nov 6, 2019 • edited

Names and scopes

Binary size impact

Reviewing notes

gasche commented Nov 6, 2019

stedolan commented Nov 6, 2019

lpw25 commented Nov 6, 2019 • edited

alainfrisch commented Nov 6, 2019

alainfrisch commented Nov 6, 2019

stedolan commented Nov 11, 2019

lpw25 commented Nov 11, 2019 • edited

stedolan commented Nov 27, 2019

stedolan commented Jan 9, 2020

shindere commented Apr 9, 2020

avsm commented Apr 16, 2020

gasche commented Apr 16, 2020

stedolan commented Apr 16, 2020

gasche commented Apr 16, 2020

lpw25 left a comment

Choose a reason for hiding this comment

lpw25 Apr 18, 2020

Choose a reason for hiding this comment

gasche commented Apr 18, 2020

mshinwell commented Apr 20, 2020

stedolan commented Nov 6, 2019 •

edited

lpw25 commented Nov 6, 2019 •

edited

lpw25 commented Nov 11, 2019 •

edited