-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that the mark stack push optimisation handles naked pointers. #9951
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the patch and I believe it is correct.
There are many places in major_gc.c
and weak.c
that are doing this, checking !Is_young
or Is_in_heap
depending on whether naked pointers are allowed:
runtime/major_gc.c-407-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:408: if (Is_block (child) && !Is_young (child)) {
runtime/major_gc.c-409-#else
runtime/major_gc.c-410- if (Is_block (child) && Is_in_heap (child)) {
runtime/major_gc.c-411-#endif
runtime/major_gc.c-468- if ( data != caml_ephe_none &&
runtime/major_gc.c-469- Is_block (data) &&
runtime/major_gc.c-470-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:471: !Is_young(data) &&
runtime/major_gc.c-472-#else
runtime/major_gc.c-473- Is_in_heap (data) &&
runtime/major_gc.c-474-#endif
runtime/major_gc.c-475- Is_white_val (data)){
runtime/major_gc.c-487- if (key != caml_ephe_none &&
runtime/major_gc.c-488- Is_block (key) &&
runtime/major_gc.c-489-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:490: !Is_young(key)
runtime/major_gc.c-491-#else
runtime/major_gc.c-492- Is_in_heap(key)
runtime/major_gc.c-493-#endif
runtime/major_gc.c-494- ){
runtime/weak.c-77-#ifdef NO_NAKED_POINTERS
runtime/weak.c:78: return Is_white_val(x) && !Is_young (x);
runtime/weak.c-79-#else
runtime/weak.c-80- return Is_white_val(x) && Is_in_heap (x);
runtime/weak.c-81-#endif
runtime/weak.c-90-#ifdef NO_NAKED_POINTERS
runtime/weak.c:91: return Is_block (x) && !Is_young (x);
runtime/weak.c-92-#else
runtime/weak.c-93- return Is_block (x) && Is_in_heap (x);
runtime/weak.c-94-#endif
Two comments:
- This is a nitpick, but I have a slight preference for having the ifdef logic only on that test, and not the rest of the conditional, as done on line 470.
- I would prefer if we had a macro or function to check this, instead of having to duplicate the conditional logic in each place. This could be done with
Is_in_heap_or_young(v) && !Is_young(v)
, but it's not particularly readable, it would be nice to have anIs_valid_old_block(v)
macro for this.
In favor of "in heap or young but not young", several places have CAMLassert(Is_in_heap_or_young(v)); ... !Is_young(v)
.
I've fixed the first nitpick. Regarding the second one, I'm struggling to come up with an unambiguous name. |
I think it would be fine to merge the PR as-is (when the CI agrees), it fixes a real bug. Hopefully we would keep discussing the larger refactoring; maybe other people have better proposals. I looked for other suspicious uses of |
On a related note this line looks suspect,
I would think that the order of the tests should be reversed. |
I don't know if you want a Changes entry (to credit the reporter) or prefer not to have any because this is a small bugfix for a PR in the same development version. |
ed2fe28
to
b7d6327
Compare
I've added a Changes entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code is correct, but the comments raise more questions than they explain the code!
Also, note that the cost of the optimization in naked-pointers mode skyrockets, as Is_in_heap
is an expensive test involving a page table lookup. It would make sense to benchmark again in naked-pointers mode, with and without the optimization, possibly with different values of the "8" optimization parameter. Just to make sure that the optimization is still an optimization...
runtime/major_gc.c
Outdated
/* With NO_NAKED_POINTERS, every block has a valid header. */ | ||
!Is_young(v) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code did not ignore pointers to the minor heap. In other words, if a small major heap block contains only pointers to the minor heap, it is now skipped (not pushed on the mark stack), while it was pushed on the mark stack before. That is probably correct, but I'd like to be sure.
Also, the comment doesn't match the test that follows. If "every block has a valid header", why check !Is_young
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code did not ignore pointers to the minor heap. In other words, if a small major heap block contains only pointers to the minor heap, it is now skipped (not pushed on the mark stack), while it was pushed on the mark stack before. That is probably correct, but I'd like to be sure.
This is correct.
runtime/major_gc.c
Outdated
/* In the presence of naked pointers, only blocks in the heap are | ||
* guaranteed to have a valid header. */ | ||
Is_in_heap(v) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, the comment is off. Only blocks that satisfy the Is_in_value_area
test have valid headers, but these include minor heap blocks, major heap blocks, and static data. Here Is_in_heap
is tested, not Is_in_value_area
.
The comments are indeed confusing now as they are attached to the wrong tests after refactoring. I shall remove the comments. Regarding the Let me run the benchmarks on this PR to evaluate whether the optimisation is still useful. |
I think we're both right :-) You're right that fields that are skipped by the optimization will not be tested again, so the Is_in_heap test is performed earlier, but not twice. (I missed that on my first reading of the code.) However the first field that fails the optimization will be tested again when the object is scanned, so two Is_in_heap tests are performed for this particular field. That's probably acceptable, but, yes, it would be nice to check that the optimization is still a win. On the other hand, measuring with other values of the magic "8" constant is probably not useful. |
ugh. off by one error :-) running the benchmarks now. |
b7d6327
to
b0e3499
Compare
I went down the rabbit hole a little bit and evaluated several variants:
Except
These numbers are obtained from the geometric mean of the normalised running times for each benchmark. The normalized (against 4.12.0+trunk variant) running time graph is here, but isn't very useful: My takeaway is that we should change the code to if (Is_block(v) && !Is_young(v)) {
/* found something to mark */
break;
} Not only is this the fastest (though the gains are minuscule), we avoid having the |
I agree with the conclusion. The numbers are noisy, but not_is_young is always better than noopt, and it is the only variant with this property. (It's slightly more code than noopt obviously, but still fairly simple.) |
+1 for the |
b0e3499
to
88a8dad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Nice work! Merging now. I'll think about adding a variant of #9950 to the test suite. |
In the presence of naked pointers, only blocks that are in the heap are guaranteed to have valid headers. Hence, we guard the test to see if the header is black with a test to check that object is in the major heap. This fixes #9950.
The check for whether we should push the object into the mark stack is now more precise. However, this is not additional work as we would have done this check elsewhere during marking.