Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue https://github.com/gimli-rs/addr2line/issues/198 #199

Merged
merged 1 commit into from
Dec 3, 2020

Conversation

jsgf
Copy link
Contributor

@jsgf jsgf commented Dec 2, 2020

Fixes #198 - find_units_range can repeatedly return different subranges of the same unit, so make sure the iteration is constrained to that range. Otherwise it re-iterates the entire range repeatedly.

Minimal perf effect, mostly an improvement:

 name                                        base ns/iter  pr ns/iter  diff ns/iter   diff %  speedup
 context_new_and_query_location_rc           2,395,624     2,320,514        -75,110   -3.14%   x 1.03
 context_new_and_query_location_slice        860,152       889,665           29,513    3.43%   x 0.97
 context_new_and_query_with_functions_rc     2,692,048     2,751,044         58,996    2.19%   x 0.98
 context_new_and_query_with_functions_slice  1,135,428     1,128,007         -7,421   -0.65%   x 1.01
 context_new_parse_functions_rc              24,323,022    23,978,048      -344,974   -1.42%   x 1.01
 context_new_parse_functions_slice           18,395,236    18,714,890       319,654    1.74%   x 0.98
 context_new_parse_inlined_functions_rc      69,419,362    71,851,586     2,432,224    3.50%   x 0.97
 context_new_parse_inlined_functions_slice   55,134,387    53,693,540    -1,440,847   -2.61%   x 1.03
 context_new_parse_lines_rc                  15,798,558    14,513,238    -1,285,320   -8.14%   x 1.09
 context_new_parse_lines_slice               9,334,580     9,179,604       -154,976   -1.66%   x 1.02
 context_new_rc                              2,724,285     2,100,336       -623,949  -22.90%   x 1.30
 context_new_slice                           711,510       710,517             -993   -0.14%   x 1.00
 context_query_location_rc                   1,116,060     1,117,681          1,621    0.15%   x 1.00
 context_query_location_slice                1,129,313     1,093,725        -35,588   -3.15%   x 1.03
 context_query_with_functions_rc             3,426,348     3,270,841       -155,507   -4.54%   x 1.05
 context_query_with_functions_slice          2,975,904     2,981,725          5,821    0.20%   x 1.00

The `find_units_range` iterator can return multiple sub-ranges from the
same unit. Make sure that the probe range is constrained to that
specific subrange rather than the whole unit, or we'll return the same
location info over and over.
@philipc
Copy link
Contributor

philipc commented Dec 2, 2020

The second commit doesn't seem correct to me. At the least, we now have a max_end field that is never used, and that max_end check was required for correctness. Before I spend time trying to figure this out, can you confirm that your changes take overlapping ranges into account?

@jsgf
Copy link
Contributor Author

jsgf commented Dec 2, 2020

Ah, I'd assumed the ranges couldn't overlap. I think it will only need a bit of a tweak to the logic though, and I don't think max_end is needed for forward traversal. Hm, it might need a corresponding min_begin though.

@jsgf
Copy link
Contributor Author

jsgf commented Dec 2, 2020

Looks like #163 is the thing to understand. The tests there pass with this PR as-is, but I'm not sure how comprehensive they are.

FWIW I don't think this kind of sorted array with lowest/highest tracking can possibly handle doing queries over arbitrary overlapping/nested ranges, so either the overlapping is constrained in a way where it can work, or there are cases where it's already failing. I don't really feel like I understand the root problem enough to be sure either way. In the past I've implemented interval skip lists to solve the general problem.

Also if ranges can overlap, it means that returning multiple locations for a given address is inherent (assuming all the ranges are covered properly), but that's OK. I'm not sure if they'll return the same location or not though (is this how inlining is represented?).

@philipc
Copy link
Contributor

philipc commented Dec 2, 2020

Looks like #163 is the thing to understand. The tests there pass with this PR as-is, but I'm not sure how comprehensive they are.

Yep. I can try to reproduce the original failure if I get time.

FWIW I don't think this kind of sorted array with lowest/highest tracking can possibly handle doing queries over arbitrary overlapping/nested ranges, so either the overlapping is constrained in a way where it can work, or there are cases where it's already failing. I don't really feel like I understand the root problem enough to be sure either way. In the past I've implemented interval skip lists to solve the general problem.

Why not? It might be inefficient (O(n)), but it should eventually find the results. However, it may give out of order results, even with the change in this PR.

Also if ranges can overlap, it means that returning multiple locations for a given address is inherent (assuming all the ranges are covered properly), but that's OK. I'm not sure if they'll return the same location or not though (is this how inlining is represented?).

It's not used for inlining. Duplicate locations can occur for weak functions (edit: the correct term might be common functions, it's for duplicate C++ template instantiations). I don't know what exactly caused #163, but I don't think it resulted in duplicate locations (the CUs ranges overlapped, but not the locations within them).

@jsgf
Copy link
Contributor Author

jsgf commented Dec 2, 2020

OK, I think I'll split this aspect into a separate PR, since the fix for #198 is what's really important here.

Copy link
Contributor

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@philipc philipc merged commit 0b6b601 into gimli-rs:master Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

find_location_range returns duplicate addresses
2 participants