[Bug #20022] Fix GC.verify_compaction_references not moving every object #9041

Merged


KJTsanaktsidis
Contributor

@KJTsanaktsidis KJTsanaktsidis commented Nov 27, 2023

The intention of GC.verify_compaction_references is, I believe, to force every single movable object to be moved, so that it's possible to debug native extensions which are not correctly updating their references to objects they mark as movable.

To do this, it doubles the number of allocated pages for each size pool, and sorts the heap pages so that the free ones are swept first; thus, every object in an old page should be moved into a free slot in one of the new pages.
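As a rough illustration of the "free pages first" ordering, here is a minimal sketch. It assumes that "free ones first" means sorting pages by descending free-slot count; `struct heap_page_model` and both function names are hypothetical and are not the structures or comparators used in gc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap page; not the real struct heap_page. */
struct heap_page_model {
    size_t id;         /* label so the resulting order can be inspected */
    size_t free_slots; /* number of unused slots in the page */
};

/* qsort comparator: pages with more free slots sort earlier. */
static int
compare_free_slots_desc(const void *left, const void *right)
{
    const struct heap_page_model *l = left;
    const struct heap_page_model *r = right;

    if (l->free_slots > r->free_slots) return -1;
    if (l->free_slots < r->free_slots) return 1;
    return 0;
}

/* Reorder pages so the emptiest ones are visited first. */
static void
sort_emptiest_first(struct heap_page_model *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    qsort(pages, n, sizeof(*pages), compare_free_slots_desc);
}
```

With the freshly added (entirely free) pages at the front, objects in the older, fuller pages have somewhere to move to.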

This worked fine until movement of objects between size pools during compaction was implemented. That causes some problems for verify_compaction_references:

  • We were doubling the number of pages in each size pool, but if some objects need to move into a different pool, there's no guarantee that there'll be enough room in that one.
  • It's possible for the sweep & compact cursors to meet in one size pool before all the objects that want to move into that size pool from another are processed by the compaction.
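The first problem can be seen with some simple arithmetic. The sketch below uses a simplified, hypothetical model (`struct pool_model` and its fields are assumptions, not gc.c types): doubling a pool only adds as many fresh slots as the pool already had, regardless of how many objects from other pools want to move in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a size pool; not the structures in gc.c. */
struct pool_model {
    size_t pages;          /* pages currently allocated to the pool */
    size_t slots_per_page; /* object slots in each page */
    size_t wants_to_live;  /* objects whose desired destination is this pool */
};

/* Fresh slots gained by doubling: one new empty page per existing page. */
static size_t
fresh_slots_after_doubling(const struct pool_model *p)
{
    assert(p != NULL);
    return p->pages * p->slots_per_page;
}

/* True when every object destined for the pool fits into the fresh pages. */
static bool
doubling_is_enough(const struct pool_model *p)
{
    return p->wants_to_live <= fresh_slots_after_doubling(p);
}
```

For example, a destination pool with 1 page of 100 slots gains only 100 fresh slots when doubled, so 1,000 objects arriving from a larger pool cannot all be moved in one run.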

You can see these problems by changing some of the movement tests in test_gc_compact.rb to try and move e.g. 50,000 objects instead of 500; the test is not able to actually move all of the objects in a single compaction run.

To fix this, we do two things in verify_compaction_references:

  • Firstly, we add enough pages to every size pool to make them the same size. This ensures that their compact cursors will all have space to move during compaction (even if that means empty pages are pointlessly compacted).
  • Then, we examine every object and determine where it wants to be compacted into. We use this information to add additional pages to each size pool to handle all objects which should live there.
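The two steps can be sketched roughly as follows. This is a simplified model under stated assumptions, not the change made to gc.c: `struct pool_plan`, `slots_to_pages` and `plan_extra_pages` are hypothetical names, and the per-pool count of objects that want to live there is assumed to have been gathered already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pool bookkeeping; not the real rb_size_pool_t. */
struct pool_plan {
    size_t pages;          /* pages the pool has right now */
    size_t slots_per_page; /* object slots in each page */
    size_t wants_to_live;  /* objects that want to be compacted into this pool */
    size_t pages_to_add;   /* output: extra pages to allocate */
};

/* Round a slot count up to whole pages. */
static size_t
slots_to_pages(size_t slots, size_t slots_per_page)
{
    assert(slots_per_page > 0);
    return (slots + slots_per_page - 1) / slots_per_page;
}

static void
plan_extra_pages(struct pool_plan *pools, size_t n)
{
    size_t max_pages = 0;

    /* Find the largest pool so the others can be padded up to it. */
    for (size_t i = 0; i < n; i++) {
        if (pools[i].pages > max_pages) max_pages = pools[i].pages;
    }

    for (size_t i = 0; i < n; i++) {
        /* Step 1: make every pool the same size. */
        size_t equalize = max_pages - pools[i].pages;
        /* Step 2: add room for every object that wants to live here. */
        size_t room = slots_to_pages(pools[i].wants_to_live,
                                     pools[i].slots_per_page);
        pools[i].pages_to_add = equalize + room;
    }
}
```

In this model, a 1-page pool with 50 slots per page that must receive 1,000 objects, next to a 10-page pool, gets 9 pages to equalize plus 20 pages of room.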

With these two changes, we can move an arbitrary number of objects into the correct size pool in a single call to verify_compaction_references.

My motivation for performing this work was to try and fix some test stability issues in test_gc_compact.rb. I now no longer think that we actually see this particular bug in rubyci.org, but I also think verify_compaction_references should do what it says on the tin, so it's worth keeping.

[Bug #20022]

@KJTsanaktsidis KJTsanaktsidis changed the title Fix GC.verify_compaction_references not moving every object [Bug #20022] Fix GC.verify_compaction_references not moving every object Nov 27, 2023
@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/fix_verify_compaction branch 2 times, most recently from 34b9bd0 to 0b5af26 Compare November 28, 2023 10:57
This works like objspace_each_obj, except instead of being called with
the start & end address of each page, it's called with the page
structure itself.

[Bug #20022]
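
A per-page iterator of the kind this commit message describes might look roughly like the sketch below. All names here (`struct page_model`, `each_page`, `count_live_slots_i`) are hypothetical, and the convention that a non-zero return stops the iteration is an assumption of the sketch; the point is only that the callback receives the page structure rather than a start and end address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap page; not the real struct heap_page. */
struct page_model {
    size_t total_slots;
    size_t free_slots;
};

/* Callback invoked once per page; a non-zero return stops the iteration. */
typedef int each_page_callback(struct page_model *page, void *data);

static int
each_page(struct page_model *pages, size_t n, each_page_callback *cb, void *data)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++) {
        int result = cb(&pages[i], data);
        if (result != 0) return result;
    }
    return 0;
}

/* Example callback: accumulate the number of occupied slots. */
static int
count_live_slots_i(struct page_model *page, void *data)
{
    size_t *live = data;
    *live += page->total_slots - page->free_slots;
    return 0;
}
```

Because the callback sees the whole page, it can read per-page metadata (here, the free-slot count) without recomputing it from addresses.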
@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/fix_verify_compaction branch from 0b5af26 to bea23b6 Compare December 3, 2023 02:33
Member

@peterzhu2118 peterzhu2118 left a comment

One small nitpick, otherwise this makes sense to me. Thank you for working on this!

gc.c Outdated

heap_add_pages(objspace, size_pool, heap, pages_to_add);
}
} else if (RTEST(double_heap)) {

Nit: style.

Suggested change
- } else if (RTEST(double_heap)) {
+ }
+ else if (RTEST(double_heap)) {

@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/fix_verify_compaction branch from bea23b6 to 9981254 Compare December 7, 2023 10:45
@peterzhu2118 peterzhu2118 merged commit cbc0e0b into ruby:master Dec 7, 2023
98 checks passed