New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print function names in backtraces #9096
Conversation
e74f74f
to
c11469b
Compare
I solved that for you. I haven't looked at the code. The description of the design is sensible and I agree it's a general feature. I would be interested in rephrasing the current wording of the backtrace lines, |
Thanks!
I'd claim that it's not an approximation, but that Pedantry aside, I'm not attached to the current wording (I didn't put much thought into it), but I'm not really seeing the same subtle distinction between "called from" and "called within" - they mean the same to me, but the latter sounds a bit more awkward. The functions vs. scopes thing is a bit subtle, though (cf. the wall of text above), and I'm not really sure the best way to get it across here. |
c11469b
to
62d05f4
Compare
I think that using the statically enclosing bindings like this is likely to be very confusing for users. It seems almost certain to be misread by people trying to track down a failure and to lead to wasted time looking in the wrong function. I've tackled this issue before as part of work (not yet upstreamed) to improve flambda's inlining reports. I found that the nicest "name" for a function was along the lines of:
Which allows you to print things like: That might be a bit too detailed for use in backtraces, but at the very least I think you can get them to distinguish: |
If we have a nice symbolic path to point to a location, it would be nice to include it in the payload for |
What's the argument for keeping local module binding names but not local value names? If the location is within a local function (or even within a non-functional local binding) with a clear name, it seems useful to keep that name in the path. |
This sounds like a solution to a slightly different problem. In an inlining report you need to be able to refer to particular functions, since inlining is something that happens to functions. The problem being solved is "name this function", and writing In a backtrace, the problem being solved is "describe this location", where locations in question are not functions but callsites (or raises, etc.). In particular, text like The goal here is to describe locations approximately. Half the point is that multiple nearby locations get the same description. I'm open to suggestions about what should contribute to a description (classes should definitely be in, as @alainfrisch points out that it's weird to include local modules but not local functions, and I'm still on the fence about marking anonymous functions). But I think it's important to realise that this is a different problem from naming functions for an inlining report. |
That difference is why I suggested that the full description might be too detailed.
To clarify, the full backtrace slot with the full description would be something like:
i.e. the location I was mentioning above was an additional location, not to be confused with the location of the call site.
I think the distinction between:
and
is the most important thing to distinguish. |
b5632e7
to
52f9743
Compare
Update: This patch got rewritten, which made it both longer and less hacky. Instead of adding Levent nodes to Lambda which get converted to debuginfo separately by each of the backends (native, bytecode, flambda), the updated PR replaces the Location.t type in Lambda with a new To the reviewer: read this commit-by-commit. Commits 1 and 4 are large and boring (1 threads a scopes parameter through transl{obj,class,mod} and 4 updates tests), while 2 and 3 are small and interesting (2 is the backend support for printing names, 3 specifies how the names are generated) |
ping? |
52f9743
to
db99621
Compare
db99621
to
63f2d09
Compare
99668c9
to
7774374
Compare
- Introduce the Lambda.scoped_location type - Replace Location.t with Lambda.scoped_location in Lambda code - Print function names (derived from Lambda.scoped_location) in backtraces - Generate function names during translation from Typedtree to Lambda - Update testsuite with new backtrace format, and add test for names
I'm not able to review the code, sorry about that. I just wanted to show Not sure this will receive enough reviews to be ready before the 4.11 |
fad7a3c
to
88f1577
Compare
I think the most practical thing to do for this PR is to accept to merge it after the branch/freeze date, once it is reviewed. I'm not in favor of forcing people to work at full speed before next Monday, because this risks decreasing the quality of the result, increasing stress, and I don't see any advantage. |
This is done now. |
For me the missing piece was that @lpw25 is happy with the new messages that are proposed. He said so during the meeting, but there is no sign of this in the PR that I can find. I shamefully admit that this decreased my own motivation/effectiveness to do a review: it's a lot of work to get into these low-level details, and it's less exciting to do so when one has the feeling that the PR may linger forever because people disagree (for reasonable reasons) about how to print the result. P.S.: thinking about this again, you mentioned during the meeting that you have tools that depend on this PR. Again, maybe this could have been emphasized in the PR discussion, to help prioritize review work. (Also, are the tools available somewhere for the curious?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed the compiler parts up to cmm and they look pretty good. A few parts could be tidied up. In particular, I think that the scope types should drop the lambda_
from their name and move to Debuginfo
, and I would like the locations generated in matching.ml
to also have scopes.
type debug_event = | ||
{ mutable ev_pos: int; (* Position in bytecode *) | ||
ev_module: string; (* Name of defining module *) | ||
ev_loc: Location.t; (* Location in source file *) | ||
ev_kind: debug_event_kind; (* Before/after event *) | ||
ev_defname: string; (* Enclosing definition *) | ||
ev_info: debug_event_info; (* Extra information *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't need to be in this PR, but it might be worth considering how this information could be used in ocamldebug. For example, I think updating the code in Show_information.show_one_frame
would allow changing the result of ocamldebug's backtrace
command to match the new backtrace printing.
Thanks @lpw25 for your review; I like the style of your review comments, some I could have written myself. (Yay for complaining about optional arguments.) Let me know I can help for the Matching/Switch part. I worked on having more locations in Switch in trunk...gasche:switch-loc when reviewing #2294. There the purpose was to have more precise locations for the tests; I'm not sure you need this here (I haven't looked at the code), @lpw25 suggests that just parametrization is enough. |
@lpw25 I will read the backend parts, probably today -- can you confirm that I need to read the runtime changes, Cmm_helpers and Emitaux? |
bd7a725
to
a348b2d
Compare
The regression was introduced by ocaml#9096 in this part of the diff: | Tcf_method (name, _, Tcfk_concrete (_, exp)) -> - let met_code = msubst true (transl_exp exp) in + let scopes = enter_method_definition ~scopes name.txt in + let met_code = + msubst true (transl_scoped_exp ~scopes exp) in transl_exp would use Translobj.oo_wrap on functions, while the new transl_scoped_expr does not. The regression was first reported by Sacha Ayoun in ocaml#10092, as it results in a large slowdown (2x slower) on the Gillian codebase, which uses an object-oriented visitor in a performance-critical way.
The regression was introduced by ocaml#9096 in this part of the diff: | Tcf_method (name, _, Tcfk_concrete (_, exp)) -> - let met_code = msubst true (transl_exp exp) in + let scopes = enter_method_definition ~scopes name.txt in + let met_code = + msubst true (transl_scoped_exp ~scopes exp) in transl_exp would use Translobj.oo_wrap on functions, while the new transl_scoped_expr does not. The regression was first reported by Sacha Ayoun in ocaml#10092, as it results in a large slowdown (2x slower) on the Gillian codebase, which uses an object-oriented visitor in a performance-critical way.
This PR adds function names to OCaml backtraces. As an example, here's what I see when I add a 'assert false' randomly somewhere in the middle of ctype.ml:
I have two motivations:
Spacetime has some code to deal with 2 (which lives in spacetime_lib), which associates backtrace entries (i.e. callsites) with functions by parsing the ELF symbol table and looking up function names via DWARF (using owee). This works, but with a couple of disadvantages: it's fairly complicated, ELF-specific, and exposes the user to unpalatable generated names. (It also doesn't help with 1).
Instead, this patch extends the existing debug info with function names, which are looked up using the same code that currently finds source locations.
Names and scopes
In OCaml, not all functions have names, and not all names are for functions.
This PR assigns names by dividing the source file into "scopes", which may nest but not overlap. Locations are named according to their stack of enclosing scopes, and new scopes are introduced by:
(See the type
Lambda.scoped_location
for how these are represented)As an example, here's a simple program and the backtrace it prints:
The names printed above (intentionally) look like module paths. However, these are not in general paths: for instance, some mention
(fun)
, and it is possible to have locations reported as e.g.Foo.f.Bar.g
(by usinglet module
to nest a module inside an expression), which is not a valid path.For a more comprehensive example, see names.ml in the testsuite
Since patterns need not bind only a single name, it can be ambiguous what the name of a scope is. In such cases, this patch arbitrarily picks one of the names:
let (f, g) = assert false
will be reported as an exception inf
. If no names are bound by a pattern, then no scope is introduced. So, if a moduleM
containslet () = assert false
, the exception is reported as simply coming fromM
.Binary size impact
In native code, the extra debug information makes ocamlopt.opt about 2.5% bigger than current trunk, and similar for Flambda. Strings are shared when they occur multiple times. (An earlier version of this patch included a rather complicated compression scheme for sharing prefixes. I can revisit it later if the debug info size becomes a problem).
Bytecode uses a simpler format, and the impact there is much larger (about 10%). I note that debug info is already very large on bytecode (it includes a typing environment for ocamldebug's benefit) and
binary sizes are larger than native (bytecode ocamlopt is about 2x bigger than ocamlopt.opt). From this I conclude that nobody cares very much about bytecode binary size, so I did the simplest possible thing there and wrote strings directly. (More optimised versions are of course possible).
Reviewing notes
This is a large patch, but I've arranged the commit history to make it easier to review. The commits are:
Commits 1, 3 and 4 contain the actual work, and require review. They're relatively small.
Commit 2 is very large but very simple. The only thing it does is thread a new
~scopes
parameter around all of thetransl_foo
functions that translate Typedtree to Lambda, so that the raw locations of Typedtree can be turned intoscoped_locations
in Lambda.Commit 5 is also large, because it updates all of the backtraces in the testsuite. (Commit 6 is a bootstrap).