Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that the mark stack push optimisation handles naked pointers. #9951

Merged
merged 1 commit into from
Oct 2, 2020

Conversation

kayceesrk
Copy link
Contributor

In the presence of naked pointers, only blocks that are in the heap are guaranteed to have valid headers. Hence, we guard the test to see if the header is black with a test to check that object is in the major heap. This fixes #9950.

The check for whether we should push the object into the mark stack is now more precise. However, this is not additional work as we would have done this check elsewhere during marking.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the patch and I believe it is correct.

There are many places in major_gc.c and weak.c that are doing this, checking !Is_young or Is_in_heap depending on whether naked pointers are allowed:

runtime/major_gc.c-407-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:408:  if (Is_block (child) && !Is_young (child)) {
runtime/major_gc.c-409-#else
runtime/major_gc.c-410-  if (Is_block (child) && Is_in_heap (child)) {
runtime/major_gc.c-411-#endif

runtime/major_gc.c-468-  if ( data != caml_ephe_none &&
runtime/major_gc.c-469-       Is_block (data) &&
runtime/major_gc.c-470-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:471:       !Is_young(data) &&
runtime/major_gc.c-472-#else
runtime/major_gc.c-473-       Is_in_heap (data) &&
runtime/major_gc.c-474-#endif
runtime/major_gc.c-475-       Is_white_val (data)){

runtime/major_gc.c-487-      if (key != caml_ephe_none &&
runtime/major_gc.c-488-          Is_block (key) &&
runtime/major_gc.c-489-#ifdef NO_NAKED_POINTERS
runtime/major_gc.c:490:          !Is_young(key)
runtime/major_gc.c-491-#else
runtime/major_gc.c-492-          Is_in_heap(key)
runtime/major_gc.c-493-#endif
runtime/major_gc.c-494-          ){

runtime/weak.c-77-#ifdef NO_NAKED_POINTERS
runtime/weak.c:78:  return Is_white_val(x) && !Is_young (x);
runtime/weak.c-79-#else
runtime/weak.c-80-  return Is_white_val(x) && Is_in_heap (x);
runtime/weak.c-81-#endif

runtime/weak.c-90-#ifdef NO_NAKED_POINTERS
runtime/weak.c:91:  return Is_block (x) && !Is_young (x);
runtime/weak.c-92-#else
runtime/weak.c-93-  return Is_block (x) && Is_in_heap (x);
runtime/weak.c-94-#endif

Two comments:

  • This is a nitpick, but I have a slight preference for having the ifdef logic only on that test, and not the rest of the conditional, as done on line 470.
  • I would prefer if we had a macro or function to check this, instead of having to duplicate the conditional logic in each place. This could be done with Is_in_heap_or_young(v) && !Is_young(v), but it's not particularly readable, it would be nice to have an Is_valid_old_block(v) macro for this.

In favor of "in heap or young but not young", several places have CAMLassert(Is_in_heap_or_young(v)); ... !Is_young(v).

@kayceesrk
Copy link
Contributor Author

I've fixed the first nitpick. Regarding the second one, I'm struggling to come up with an unambiguous name. Is_valid_old_block(v) also would be true for if v is outside the heap and has a valid header in the case of NO_NAKED_POINTERS. Any suggestions? Otherwise, I'm tempted to leave the conditional code as is as we will remove support for naked pointers soon.

@gasche
Copy link
Member

gasche commented Oct 1, 2020

I think it would be fine to merge the PR as-is (when the CI agrees), it fixes a real bug. Hopefully we would keep discussing the larger refactoring; maybe other people have better proposals.

I looked for other suspicious uses of !Is_young in the runtime, but I didn't locate any. (Then I looked fairly quickly.)

@kayceesrk
Copy link
Contributor Author

On a related note this line looks suspect,

runtime/weak.c-80- return Is_white_val(x) && Is_in_heap (x);

I would think that the order of the tests should be reversed.

@gasche gasche added the merge-me label Oct 1, 2020
@gasche
Copy link
Member

gasche commented Oct 1, 2020

I don't know if you want a Changes entry (to credit the reporter) or prefer not to have any because this is a small bugfix for a PR in the same development version.

@kayceesrk
Copy link
Contributor Author

I've added a Changes entry.

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code is correct, but the comments raise more questions than they explain the code!

Also, note that the cost of the optimization in naked-pointers mode skyrockets, as Is_in_heap is an expensive test involving a page table lookup. It would make sense to benchmark again in naked-pointers mode, with and without the optimization, possibly with different values of the "8" optimization parameter. Just to make sure that the optimization is still an optimization...

/* With NO_NAKED_POINTERS, every block has a valid header. */
!Is_young(v) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code did not ignore pointers to the minor heap. In other words, if a small major heap block contains only pointers to the minor heap, it is now skipped (not pushed on the mark stack), while it was pushed on the mark stack before. That is probably correct, but I'd like to be sure.

Also, the comment doesn't match the test that follows. If "every block has a valid header", why check !Is_young ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code did not ignore pointers to the minor heap. In other words, if a small major heap block contains only pointers to the minor heap, it is now skipped (not pushed on the mark stack), while it was pushed on the mark stack before. That is probably correct, but I'd like to be sure.

This is correct.

Comment on lines 249 to 248
/* In the presence of naked pointers, only blocks in the heap are
* guaranteed to have a valid header. */
Is_in_heap(v) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the comment is off. Only blocks that satisfy the Is_in_value_area test have valid headers, but these include minor heap blocks, major heap blocks, and static data. Here Is_in_heap is tested, not Is_in_value_area.

@kayceesrk
Copy link
Contributor Author

The comments are indeed confusing now as they are attached to the wrong tests after refactoring. I shall remove the comments.

Regarding the Is_in_heap test, we're not adding an additional test here but performing this test earlier than it is done now. In trunk, we would have pushed this object into the mark stack and would have performed the Is_in_heap test in mark_slice_darken (https://github.com/ocaml/ocaml/blob/trunk/runtime/major_gc.c#L410).

Let me run the benchmarks on this PR to evaluate whether the optimisation is still useful.

@xavierleroy
Copy link
Contributor

Regarding the Is_in_heap test, we're not adding an additional test here but performing this test earlier than it is done now.

I think we're both right :-)

You're right that fields that are skipped by the optimization will not be tested again, so the Is_in_heap test is performed earlier, but not twice. (I missed that on my first reading of the code.)

However the first field that fails the optimization will be tested again when the object is scanned, so two Is_in_heap tests are performed for this particular field.

That's probably acceptable, but, yes, it would be nice to check that the optimization is still a win. On the other hand, measuring with other values of the magic "8" constant is probably not useful.

@kayceesrk
Copy link
Contributor Author

I think we're both right :-)

ugh. off by one error :-) running the benchmarks now.

@kayceesrk
Copy link
Contributor Author

kayceesrk commented Oct 2, 2020

I went down the rabbit hole a little bit and evaluated several variants:

  • 4.12.0+trunk - trunk
  • 4.12.0+pr9951 - this PR
  • 4.12.0+noopt - Removes this optimisation, and unconditionally pushes into the mark stack
  • 4.12.0+is_block - Optimisation which only tests for whether the fields of the object being pushed are blocks (Is_block(v))
  • 4.12.0+not_is_young - Optimisation which tests whether the fields of the object being pushed are blocks and not young (Is_block(v) && !Is_young(v))

Except 4.12.0+pr9951, none of the other variants perform the page table lookup. The magic constant remains 8. Here is the summary. Compared to trunk:

variant performance
4.12.0+not_is_young 0.11% faster
4.12.0+is_block 0.03% faster
4.12.0+noopt 0.27% slower
4.12.0+pr9951 0.77% slower

These numbers are obtained from the geometric mean of the normalised running times for each benchmark. The normalized (against 4.12.0+trunk variant) running time graph is here, but isn't very useful:

image

My takeaway is that we should change the code to

if (Is_block(v) && !Is_young(v)) {
  /* found something to mark */
  break;
}

Not only is this the fastest (though the gains are minuscule), we avoid having the #ifdef.

@gasche
Copy link
Member

gasche commented Oct 2, 2020

I agree with the conclusion. The numbers are noisy, but not_is_young is always better than noopt, and it is the only variant with this property. (It's slightly more code than noopt obviously, but still fairly simple.)

@xavierleroy
Copy link
Contributor

+1 for the Is_block(v) && !Is_young(v). Not only it avoids page table lookups, but it also avoids dereferencing v to check the color of its header.

Copy link
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xavierleroy
Copy link
Contributor

Nice work! Merging now. I'll think about adding a variant of #9950 to the test suite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Naked pointers crash on trunk
4 participants