Skip to content

Conversation

@jhawthorn
Copy link
Member

@jhawthorn jhawthorn commented Mar 3, 2023

Opening this as draft because I'm not sure of the idea, but wanted to share it early. I haven't benchmarked realistic code or profiled or optimized yet.

This implements rb_shape_get_iv_index_with_hint: a version of rb_shape_get_iv_index which uses a previous shape with a valid index for the given IV in order to speed up searches on a IV inline cache miss.

This works by walking up both the hint shape's and current shape's tree branches in parallel until they meet at a common ancestor. If the common ancestor is deeper than the edge name we're looking for we can skip the remainder of the search. We also check for the edge name as we search.

The thinking behind this is that in many cases we have different, but very similar shapes, who share the common ancestor defining the IV we're looking for. This should also work well for a frozen version of a shape.

This is significantly faster when the divergence between the current and hint shape is shallow, but when it is not (there is no relation between the two shapes) it still resorts to a linear scan. The full scan is slower with this, as more work is being done, but I believe not 2x slower (the LL means we'll be stalling all the time on memory so walking two lists in parallel isn't so bad).

Benchmark: https://gist.github.com/jhawthorn/e9db7a1443eaa957cef772dccdc14d9a

With a shallow tree divergence

before: ruby --disable-gems test_shape_search.rb 1.64s user 0.00s system 99% cpu 1.648 total
after: ./miniruby test_shape_search.rb 0.67s user 0.00s system 99% cpu 0.666 total

With a deep tree divergence (worse, assumed uncommon)

before: ruby --disable-gems test_shape_search_deep.rb 1.64s user 0.00s system 99% cpu 1.643 total
after: ./miniruby test_shape_search_deep.rb 1.88s user 0.01s system 99% cpu 1.889 total

cc @tenderlove @jemmaissroff

This implements +rb_shape_get_iv_index_with_hint: a version of
rb_shape_get_iv_index which uses a previous shape with a valid index for
the given IV in order to speed up searches on a IV inline cache miss.

This works by walking up both the hint shape's and current shape's tree
branches in parallel until they meet at a common ancestor.

The thinking behind this is that in many cases we have different, but
very similar shapes, but who share the common ancestor defining the IV
we're looking for. This should also work well for a frozen version of a
shape.

This is significantly faster when the divergence between the current and
hint shape is shallow, but when it is not (there is no relation between
the two shapes) it still resorts to a linear scan. The full scan is
slower with this, as more work is being done, but I believe not 2x
slower (the LL means we'll be stalling all the time on memory so walking
two lists in parallel isn't _so_ bad).

Benchmark: https://gist.github.com/jhawthorn/e9db7a1443eaa957cef772dccdc14d9a

With a shallow tree divergence

before: ruby --disable-gems test_shape_search.rb  1.64s user 0.00s system 99% cpu 1.648 total
after: ./miniruby test_shape_search.rb  0.67s user 0.00s system 99% cpu 0.666 total

With a deep tree divergence (assumed uncommon)

before: ruby --disable-gems test_shape_search_deep.rb  1.64s user 0.00s system 99% cpu 1.643 total
after: ./miniruby test_shape_search_deep.rb  1.88s user 0.01s system 99% cpu 1.889 total
@casperisfine
Copy link
Contributor

So as discussed on Slack, I came up with a very similar approach, but less efficient and clean than yours: Shopify@6133ca3

One thing from my patch that I think could be interesting to port, is that when I find a common ancestor shape, I update the cache with it. The idea being that it will be slightly less work the next time over, and that it will reduce writes into inline-cache.

If you are interested I can try to integrate this part.

@jhawthorn
Copy link
Member Author

Closing in favour of other approaches

@jhawthorn jhawthorn closed this Nov 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants