-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowing displaying multi-line backtrace locations in the same way as errors #10111
Conversation
2fc86ed
to
e718e31
Compare
This sounds good! I've definitely grown tired of seeing Is the optimisation of partially packed debuginfos important? Naively I would have thought that since most debuginfos are fully packed, we don't lose much by not bothering to do any bitpacking in the overflow case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks beneficial but technical work. It needs a reviewer. Do we have a volunteer? Would one of the people investing their time in debugging/tracing tools be willing to look at this?
(Naive question from just reading the PR description: have you kept some space in the format for eventual future extension? What do we do for locations, possibly in generated code, that actually have huge line numbers, or large character positions? Should we have a "safe/robust" mode where everything is 64bits, instead of making various assumptions?)
This is also partly waiting for me to update the code to address @stedolan's comment! |
285992a
to
bc71853
Compare
Rebased to get rid of the conflict and also updated to use a pair of @stedolan and I had an |
bc71853
to
a598655
Compare
a598655
to
9fa9e96
Compare
9fa9e96
to
e1b6ee5
Compare
e1b6ee5
to
1b259ac
Compare
d20ad0b
to
f35155f
Compare
f35155f
to
1a4665f
Compare
1a4665f
to
c9dbffb
Compare
c9dbffb
to
5dea248
Compare
5dea248
to
6abc613
Compare
6abc613
to
98935bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at the review, I am not extremely familiar with this part of the codebase, but it does look correct to me.
@stedolan did you get a chance to take another look at it?
da02bb2
to
85141ec
Compare
Thanks, @Engil! I updated the |
85141ec
to
9a24195
Compare
9a24195
to
bb47354
Compare
bb47354
to
07edcd7
Compare
This PR paves the way for changing how backtrace locations are printed to be the same as #8541, so with locations spanning multiple lines reporting the line and column of the end location, rather than simply the number of characters offset from the start line. The current format is fine for machines, but a nuisance for the occasions when humans have to look at the output!
As it happens, though, my primary motivation for this is a series of (smaller) PRs which eliminate the special handling needed for CRLF-checkouts of the compiler's codebase. I've quietly maintained the .gitattributes entries for these for a few years (we all have our foibles) with the intention of one day sorting it out properly. That one day arrived in September, because there are recent changes to our codebase which mean that it's not just some testsuite files which need forcing to use LF-endings but some of the compiler's sources too, it's just taken me a few months to get to tidy up the code.
There are three parts:
struct ev_info
to record the extra fields in bytecode (simple change)debuginfo
to record the extra data in native code (less simple - more below)Printexc.location
Actually displaying this information will be in a future PR.
At present, the 64-bit words for debug info allow 20 bits for the line number, 8 bits for the first character location and 10 bits for the end character location. In our tree, that at present leads to just over 1000 locations which overflow the representation (not that they're necessarily ever displayed). Now I steal 1 bit from the line number (so we technically can no longer display locations from lines 524288-1048575 in a file) which is used to indicate two possible formats for the 64-bit word. The first adds no further overhead by using 12 bits for the line number, 6 bits for the character offset of the start, 3 bits to encode the offset to the end-line (so 0 when the location is on the same line), 7 bits to encode the character offset of the end (relative to the end-line) and - in order to keep the information we had before - 9 bits to store the end character offset as before. This encoding captures 92% of the locations in a build of the compiler tree. When that overflows, the representation instead stores an extra 64bits in the name_info struct containing the defname (thus adding 8 bytes overhead), allowing 16bits each for the start and end character offset (the other int is used for the original end character offset relative to the start). I haven't shared
defname
strings as when I checked the tree there were only 50 locations where the strings were being duplicated, so the overhead (at least for the compiler sources) is small (indeed, the extra 4 byte offset in the other locations would dwarf the saving on the strings).With these two representations, no locations in the compiler's tree are now unrepresentable.
Having stored that information, it clearly makes sense to be able to access it in
Printexc
- in this version I've simply added two fields to the end of thePrintexc.location
. The alternative would be an additional type with a different retrieval function.