Add find_location_range #196

jsgf · 2020-11-13T09:43:20Z

I have a use-case where I want to get the source locations for a range of addresses. The simple approach is to just call Context::find_location on each address in the range - this works, but it is extremely slow since:

it repeats all the setup cost for each byte
the caller has no idea how much space each Location spans, so ends up redundantly probing the same locations over and over

This PR implements find_location_range which returns an iterator of all the locations in the range. It is much more efficient because it amortizes all the setup cost and directly returns the locations in the range. In addition to the Location, it also returns the address and length of the byte range the location describes.

It also implements find_location in terms of returning a single Location from a 1 byte range.

jsgf · 2020-11-14T08:08:36Z

Coverage check failure looks unrelated to this change.

philipc

Thanks for the PR! Just letting you know that I'll probably bit a slow to properly review this, since it's a complex change. It'd be good if we could avoid this iterator complexity somehow, but I'm not sure if that will be possible. Also, it's a significant loss in the context_query_location_slice benchmark:

 context_query_location_rc                   892,787         1,459,657           566,870  63.49%   x 0.61 
 context_query_location_slice                898,126         1,452,600           554,474  61.74%   x 0.62 
 context_query_with_functions_rc             3,093,417       3,340,108           246,691   7.97%   x 0.93 
 context_query_with_functions_slice          2,763,489       2,912,667           149,178   5.40%   x 0.95

philipc · 2020-11-15T05:03:31Z

src/lib.rs

            })
            .filter_map(move |i| {
                // If this CU doesn't actually contain this address, move to the
                // next CU.
-                if probe > i.range.end {
+                if probe_low > i.range.end || probe_high <= i.range.begin {


Should this be probe_low >= i.range.end?

I was replicating the original logic - does that mean the original should be probe >= i.range.end? (I suspect so.)

No I think that's correct, but I got probe_low <= i.max_end wrong a couple of lines above.

A reason I commented is because this check and the one a couple of lines above need to be consistent, unless my understanding is wrong. They are consistent on master, but this PR made them inconsistent, and 6001cec changes them both again so they are still inconsistent. I think probe_low >= i.range.end is correct here, because i.range.end is exclusive (and i.max_end is exclusive too), meaning this was a bug on master. So you need to change the one a couple of lines above again to be probe_low < i.max_end.

jsgf · 2020-11-15T23:41:21Z

@philipc:

It'd be good if we could avoid this iterator complexity somehow, but I'm not sure if that will be possible

In my initial implementation I just made it return impl Iterator<...> and use combinators - but I couldn't work out how to handle errors properly without eating them (albeit unlikely). So I made it return a concrete iterator type so that it could also implement FallibleIterator so errors were there for the use-cases where it matters (it doesn't particularly for mine).

But I wasn't sure what the conventions were for this library, since none of the other iterator-like return values seem to do this.

Iteration is important for me, since quite often I'll not consume the whole range, so there's a benefit to generating the values lazily.

Also, it's a significant loss in the context_query_location_slice benchmark:

I was wondering about that. The new code is definitely does more for the single lookup case. I was mostly focused on removing any code duplication, but I guess it could be factored in a way that allows the single lookup to have a fastpath.

jsgf · 2020-11-16T02:02:25Z

I fixed the performance regression by quite a bit:

 context_query_location_rc           669,495          739,975                      70,480  10.53%   x 0.90 
 context_query_location_slice        665,773          723,207                      57,434   8.63%   x 0.92 
 context_query_with_functions_rc     2,533,924        2,747,176                   213,252   8.42%   x 0.92 
 context_query_with_functions_slice  2,265,792        2,366,429                   100,637   4.44%   x 0.96

philipc · 2020-11-16T09:09:04Z

src/lib.rs

+
+        match unit_iter.next() {
+            Some(unit) => unit.find_location(probe, sections),
+            None => Ok(None),


Can you change this function back to exactly how it was on master? It's important that we loop through the units here if required. Unfortunately we don't seem to have any tests that check this case...

philipc · 2020-11-16T09:13:16Z

I fixed the performance regression by quite a bit:

Great! I guess that is mainly due to avoiding the allocation.

Can you see if it makes sense to apply the changes in philipc@de7aee3? I haven't tested these, but I think they simplify things a little.

jsgf · 2020-11-17T07:28:05Z

OK I incorporated your simplifications with a tweak or two and fixed up the boundary check in find_units_range (I'm pretty sure it's correct now), and folded it all down to one change.

Bench comparison is now

 name                                bench.0 ns/iter  bench.1 ns/iter  diff ns/iter  diff %  speedup 
 context_query_location_rc           680,660          770,646                89,986  13.22%   x 0.88 
 context_query_location_slice        672,500          770,194                97,694  14.53%   x 0.87 
 context_query_with_functions_rc     2,618,079        2,678,960              60,881   2.33%   x 0.98 
 context_query_with_functions_slice  2,237,228        2,421,786             184,558   8.25%   x 0.92

I think it very slightly regressed with the simplifications, but not enough to be confident.

This adds `find_location_range` which is the equivalent of `find_location` except that it takes an lower and upper address bounds and returns an iterator of all source locations found in that range.

philipc · 2020-11-17T23:57:51Z

I think it very slightly regressed with the simplifications, but not enough to be confident.

I think this is due to the correctness fix in Context::find_location.

philipc

Thanks!

philipc · 2020-11-18T00:00:01Z

By the way, what application are you using this for such that you don't want to know the inlined function locations too?

jsgf · 2020-11-19T08:14:32Z

Yeah I've been thinking about inlining. I'm using find_frames which is not too bad since I know the location boundaries but it would be nice to get better perf. For my use case I'm only interested in the top noninlined frame so I'm thinking it should return a double ended iterator.

jsgf force-pushed the master branch 2 times, most recently from 7549f55 to d2678d3 Compare November 13, 2020 10:04

philipc reviewed Nov 15, 2020

View reviewed changes

philipc reviewed Nov 16, 2020

View reviewed changes

jsgf force-pushed the master branch from 6001cec to 701b97d Compare November 17, 2020 07:21

Add find_location_range

3305fde

This adds `find_location_range` which is the equivalent of `find_location` except that it takes an lower and upper address bounds and returns an iterator of all source locations found in that range.

jsgf force-pushed the master branch from 701b97d to 3305fde Compare November 17, 2020 07:53

philipc approved these changes Nov 17, 2020

View reviewed changes

philipc merged commit caa2932 into gimli-rs:master Nov 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add find_location_range #196

Add find_location_range #196

jsgf commented Nov 13, 2020 •

edited

Loading

jsgf commented Nov 14, 2020

philipc left a comment

philipc Nov 15, 2020

jsgf Nov 15, 2020

jsgf Nov 16, 2020

philipc Nov 16, 2020

jsgf commented Nov 15, 2020 •

edited

Loading

jsgf commented Nov 16, 2020

philipc Nov 16, 2020

jsgf Nov 16, 2020

philipc commented Nov 16, 2020

jsgf commented Nov 17, 2020

philipc commented Nov 17, 2020

philipc left a comment

philipc commented Nov 18, 2020

jsgf commented Nov 19, 2020

Add find_location_range #196

Add find_location_range #196

Conversation

jsgf commented Nov 13, 2020 • edited Loading

jsgf commented Nov 14, 2020

philipc left a comment

Choose a reason for hiding this comment

philipc Nov 15, 2020

Choose a reason for hiding this comment

jsgf Nov 15, 2020

Choose a reason for hiding this comment

jsgf Nov 16, 2020

Choose a reason for hiding this comment

philipc Nov 16, 2020

Choose a reason for hiding this comment

jsgf commented Nov 15, 2020 • edited Loading

jsgf commented Nov 16, 2020

philipc Nov 16, 2020

Choose a reason for hiding this comment

jsgf Nov 16, 2020

Choose a reason for hiding this comment

philipc commented Nov 16, 2020

jsgf commented Nov 17, 2020

philipc commented Nov 17, 2020

philipc left a comment

Choose a reason for hiding this comment

philipc commented Nov 18, 2020

jsgf commented Nov 19, 2020

jsgf commented Nov 13, 2020 •

edited

Loading

jsgf commented Nov 15, 2020 •

edited

Loading