Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finer-grained inline constant cache invalidation #5433

Merged
merged 3 commits into from
Mar 24, 2022

Conversation

kddnewton
Copy link
Contributor

@kddnewton kddnewton commented Jan 12, 2022

Current behavior

Caches depend on a global counter. All constant mutations cause caches to be invalidated.

class A
  B = 1
end

def foo
  A::B # inline cache depends on global counter
end

foo # populate inline cache
foo # hit inline cache

C = 1 # global counter increments, all caches are invalidated

foo # misses inline cache due to `C = 1`

Proposed behavior

Caches depend on name components. Only constant mutations with corresponding names will invalidate the cache.

class A
  B = 1
end

def foo
  A::B # inline cache depends constants named "A" and "B"
end

foo # populate inline cache
foo # hit inline cache

C = 1 # caches that depend on the name "C" are invalidated

foo # hits inline cache because IC only depends on "A" and "B"

Examples of breaking the new cache:

module C
  # Breaks `foo` cache because "A" constant is set and the cache in foo depends
  # on "A" and "B"
  class A; end
end

B = 1

We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT.

Benchmarks

The following benchmark (included in this pull request) performs about 2x faster than master.

CONSTANT1 = 1
CONSTANT2 = 1
CONSTANT3 = 1
CONSTANT4 = 1
CONSTANT5 = 1

def constants
  [CONSTANT1, CONSTANT2, CONSTANT3, CONSTANT4, CONSTANT5]
end

500_000.times do
  constants
  INVALIDATE = true
end

In terms of macro benchmarks, I ran with this code on railsbench and there was not a statistically significant different in startup time or overall runtime performance.

In terms of memory, this includes an increase in VM size by about 500KiB when running on railsbench. This is because we're now tracking cache associations ({ ID => IC[] }) on the VM to know how to invalidate specific caches when constants change.

Ticket: [Feature #18589]

vm.c Outdated
Comment on lines 499 to 507
static int
vm_stat_constant_cache_i(st_data_t id, st_data_t table, st_data_t constant_cache)
{
st_index_t size = ((st_table *) table)->num_entries;
rb_hash_aset((VALUE) constant_cache, ID2SYM((ID) id), LONG2NUM(size));
return ST_CONTINUE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm being silly but what does this function do? 😅 could use a comment

@maximecb
Copy link
Contributor

This PR ended up being bigger than I thought 😅 but it looks quite good 👍

Are the failing tests related to the changes made here?

*/
void rb_clear_constant_cache(void);
void rb_clear_constant_cache(ID id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move this declaration to include/ruby/backward.h, and mark as RBIMPL_ATTR_DEPRECATED_INTERNAL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even, let rb_clear_constant_cache be a no-op function and have ID-taking variant in a different name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's internal and we've removed all usage of this function, do we still need to mark it as deprecated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declaration has been visible from extension libraries in the wild. Deleting it could break them. It is not a bad idea to tell them that we want to delete it, before actually doing so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @shyouhei that you should make the old one as deprecated and give the new one which takes an id parameter a different name.

vm.c Outdated Show resolved Hide resolved
@kddnewton kddnewton force-pushed the constant-invalidation branch 2 times, most recently from 572c60b to f8bda4f Compare March 8, 2022 18:57
iseq.c Show resolved Hide resolved
class.c Outdated Show resolved Hide resolved
Copy link

@michalmalecki michalmalecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPU impact - can it regress?

size = body->iseq_size;
code = body->iseq_encoded;

for (index = start_index; index < size;) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to extra memory consumed, is there measurable time penalty for doing more precise cache invalidation? commenting here, since this is the only loop I found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, make sense. Sorry for missing it in the first place

kddnewton and others added 3 commits March 24, 2022 12:13
Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends on global counter
end

foo # populate inline cache
foo # hit inline cache

C = 1 # global counter increments, all caches are invalidated

foo # misses inline cache due to `C = 1`
```

Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends constants named "A" and "B"
end

foo # populate inline cache
foo # hit inline cache

C = 1 # caches that depend on the name "C" are invalidated

foo # hits inline cache because IC only depends on "A" and "B"
```

Examples of breaking the new cache:

```ruby
module C
  # Breaks `foo` cache because "A" constant is set and the cache in foo depends
  # on "A" and "B"
  class A; end
end

B = 1
```

We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT.
Co-authored-by: John Hawthorn <john@hawthorn.email>
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
8 participants