Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix custom block promotion #12439

Merged
merged 3 commits into from Oct 12, 2023

Conversation

damiendoligez
Copy link
Member

@damiendoligez damiendoligez commented Jul 28, 2023

Tying a loose end from the multicore work: the minor GC obviously shouldn't promote custom blocks that are dead, but should instead run their finalization functions.

PR based on #12436 (which is the first commit here) because it makes that bug much more likely to occur (triggers with about 10 to 20% probability in tests/lib-marshal/intext_par.ml).

@xavierleroy
Copy link
Contributor

I rebased on trunk to get rid of the conflict with #12445 .

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a GC expert, so this PR deserves a second approval, but I note that the new code is essentially the same that we have in 4.14:

ocaml/runtime/minor_gc.c

Lines 403 to 415 in 6de2914

/* Run custom block finalisation of dead minor values */
for (elt = Caml_state->custom_table->base;
elt < Caml_state->custom_table->ptr; elt++){
value v = elt->block;
if (Hd_val (v) == 0){
/* Block was copied to the major heap: adjust GC speed numbers. */
caml_adjust_gc_speed(elt->mem, elt->max);
}else{
/* Block will be freed: call finalization function, if any. */
void (*final_fun)(value) = Custom_ops_val(v)->finalize;
if (final_fun != NULL) final_fun(v);
}
}

This inspires much confidence. In contrast, the 5.0 code being replaced is... strange! "unconditionally promote custom blocks so accounting is correct" sounds like the wrong thing to do.

@xavierleroy
Copy link
Contributor

@kayceesrk : how can we move forward on this PR? Who could give a second review? To me it looks like an obvious fix to an obvious mistake, so let's give it some consideration.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression is that what is going on here is fairly subtle, and indeed a parallel-minor-GC expert would be helpful to tell whether the change is correct. As far as I can see, the choice of the current behavior for custom finalizers was done by @ctk21 in ocaml-multicore/ocaml-multicore@c56f5d5, and the last person to touch this code was @sadiqj in a8fa947.

To start with, a basic question: where is the bug in the current code? Sure, promoting unconditionally sounds less efficient, but why would it trigger a failure in the testsuite? I haven't understood this point from the PR description or code.

Regarding the promotion logic, the point that is delicate, I think, is when to run the code to finalize custom blocks in the minor GC, if at all. The multicore-GC code started with a minor custom promotion logic very close to the 4.14 code, but at a place where that logic was incorrect: it was inside the part of the STW section where all domains promote their roots in parallel, and so at this point there was no guarantee that all live values had been promoted yet, and a custom block held alive by a different domain could be wrongly finalized. I believe that the "unconditional promotion" code was designed to be (less efficient but) safer / more clearly correct than the original multicore code.

The new code placement chosen by @damiendoligez looks more correct that the initial-multicore-GC placement -- it is after the second barrier of the minor GC, at a point where we know that all domains are done scanning their roots. But is it actually correct? (And why did the multicore people not think of putting it there, in that case?)

@@ -612,6 +597,7 @@ void caml_empty_minor_heap_promote(caml_domain_state* domain,
|| (get_header_val(vnew) != 0 && !Is_young(vnew)));
}

struct caml_custom_elt *elt;
for (elt = self_minor_tables->custom.base;
elt < self_minor_tables->custom.ptr; elt++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the assertions in this debug-only code will not hold anymore, now that promotion of custom blocks has been moved to later in the minor GC logic. (The assertion will still not hold after the new promotion code, as it may now leave header-0 entries in the custom table which is cleared shortly after.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this. Github UI seems to point to the wrong location. The assertions are here:

ocaml/runtime/minor_gc.c

Lines 615 to 620 in bd952ad

for (elt = self_minor_tables->custom.base;
elt < self_minor_tables->custom.ptr; elt++) {
value vnew = elt->block;
CAMLassert (!Is_block(vnew)
|| (get_header_val(vnew) != 0 && !Is_young(vnew)));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may explain the CI failures.

if (final_fun != NULL) final_fun(*v);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: while the code of caml_empty_minor_heap_promote is a sprawling mess, the present function is neatly separated with all sub-phases called as auxiliary functions. Could you move this code to an auxiliary function to follow that style? I would suggest a static function named custom_update_last_minor, by analogy with caml_final_update_last_minor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this one as well.

while the code of caml_empty_minor_heap_promote is a sprawling mess

The whole file needs spring cleaning and better organisation without making semantic changes. The code is hard to get into every time I come back to this file. For example, caml_empty_minor_heap_promote is only called from within this file and should be made static (and caml_ should be dropped by convention). There are a few others such as these.

Copy link
Contributor

@kayceesrk kayceesrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall solution looks good to me. It restores the behaviour that we had in OCaml 4.

This assertion needs to go

ocaml/runtime/minor_gc.c

Lines 615 to 620 in bd952ad

for (elt = self_minor_tables->custom.base;
elt < self_minor_tables->custom.ptr; elt++) {
value vnew = elt->block;
CAMLassert (!Is_block(vnew)
|| (get_header_val(vnew) != 0 && !Is_young(vnew)));
}

@xavierleroy
Copy link
Contributor

To start with, a basic question: where is the bug in the current code?

There is no bug in the sense of "it crashes", but there's a bug in the sense of "the GC is doing useless work and fails to recover memory quickly enough". If memory serves:

  • In the early days of OCaml, custom blocks with finalizers were allocated directly in the major heap, as the minor GC did not know how to handle them. (Solution 1.)
  • Then @damiendoligez added the machinery to quickly find custom blocks in the minor heap, making it possible to allocate them in the minor heap and call their finalization functions when they die while still in the minor heap. (Solution 2.)
  • The current solution in OCaml 5 is worse than either solution above: we pay the cost of tracking custom blocks in minor heaps PLUS the cost of allocating them in the major heap (since both live and dead blocks are promoted by the minor GC).
  • The PR restores solution 2, if I'm not mistaken.

Sure, promoting unconditionally sounds less efficient, but why would it trigger a failure in the testsuite? I haven't understood this point from the PR description or code.

It's the other way around (possibly): with this PR, finalization functions are called much earlier following the death of a custom block, so root registration problems like we had in #12436 may show up more easily. Not sure about this one.

@gasche
Copy link
Member

gasche commented Aug 31, 2023

@xavierleroy ah, I somehow misread the original description of @damiendoligez as assuming this PR fixes an issue with intext_par. Thanks!

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the new commits. I believe the PR is correct and the code looks good, I would be in favor of merging.

There remains a small question about the runtime events related to this specific part of the minor heap code, I will discuss this with @damiendoligez right away.

@gasche gasche dismissed kayceesrk’s stale review October 11, 2023 15:10

suggestions taken into account

@damiendoligez damiendoligez merged commit 210cc86 into ocaml:trunk Oct 12, 2023
10 checks passed
@damiendoligez damiendoligez deleted the fix-custom-block-promotion branch October 12, 2023 12:41
Octachron pushed a commit that referenced this pull request Nov 28, 2023
Dead custom blocks in the minor heap are now finalized and not promoted to the major heap.

(cherry picked from commit 210cc86)
(cherry picked from commit d06f9165738be93ae61fbe75b0eaa9cf8cb60b73)
@Octachron
Copy link
Member

Cherry-picked as a part of the GC performance fix package for 5.1.1 as dcd29ea .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants