Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

callback.c: unrooted implementations of caml_callback*_exn #12121

Merged
merged 8 commits into from Mar 22, 2023

Conversation

gasche
Copy link
Member

@gasche gasche commented Mar 18, 2023

This is a follow-up to a discussion initiated by @kayceesrk in #12112 (comment) : when calling an OCaml callback from C (caml_callback_exn(closure, arg)), we don't want the callback closure and argument to be rooted during the OCaml-side computation, they should be unrooted so that they can be collected by the GC as soon as they are not needed in the computation anymore -- just like function calls written in OCaml itself, thanks to liveness analysis.

The runtime code previously ensured this property (except for caml_callbackN_exn in native code, see below), which I call "unrooted callbacks", but it regressed in 20d107c : the Multicore logic adds an allocation before the callback itself gets called (for correctness reasons in presence of effects performed in the OCaml callback), and everything got rooted at this point for correctness.

The present PR restores the property that callbacks are unrooted in both bytecode and native code, following the suggestion in the discussion #12112 (comment) .

The trickiest change of the PR is the implementation of caml_callbackN_exn in native code. Architecture-specific runtimes only implement calls of arity up to 3, and the N-ary version for N > 3 is implemented by repeated partial application. The liveness property that we should preserve is that each argument is rooted up to just before the corresponding partial application.
This function would in fact not implement unrooted callbacks in its 4.x implementation, see https://github.com/ocaml/ocaml/blob/4.14.0/runtime/callback.c#L128-L157 . In this respect the present PR also improves on 4.x.

CAMLexport value caml_callbackN_exn(value closure, int narg, value args[]) {
switch (narg) {
case 0:
return closure;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the previous implementation gracefully handles the case narg == 0, so I kept this property in the new implementation. I suspect that it is not handled well in the bytecode implementation, so presumably there is an implicit precondition that narg > 0. We could clarify this condition, or ensure that the bytecode version also works well with narg == 0, I went with keeping the behavior unchanged in both versions.

@gasche gasche force-pushed the caml_callback_memory_reachability branch 3 times, most recently from 154bb1c to 1a3da96 Compare March 19, 2023 06:00
@gasche
Copy link
Member Author

gasche commented Mar 19, 2023

Notes:

  • the CI passes with no failure (but then: the native callbackN implementation is never used in the compiler distribution or testsuite)
  • reviewing this requires familiarity with the FFI local-root-registration API
  • cc @kayceesrk @xavierleroy @Engil @TheLortex

@xavierleroy
Copy link
Contributor

Some general comments:

  • I'm not a big fan of the "unrooted" neologism. "not registered as a root" or "not kept alive" is clearer.
  • To register roots with the GC within a C block and not the whole function body, I like the old root registration API (Begin_rootsN / End_roots) better than block-scoped CAMLparam0 / CAMLlocalN / CAMLdrop.
  • For caml_callbackN_exn in native code, you also have the option of registering the args array as roots, like today, then replace entries with Val_unit to hide them from the GC. Something like:
CAMLexport value caml_callbackN_exn(value closure, int narg, value args[])
{
  CAMLparam1 (closure);
  CAMLxparamN (args, narg);
  CAMLlocal1 (res);
  int i;

  res = closure;
  for (i = 0; i < narg; /*nothing*/) {
    /* Pass as many arguments as possible */
    switch (narg - i) {
    case 1:
      res = caml_callback_exn(res, args[i]);
      if (Is_exception_result(res)) CAMLreturn (res);
      i += 1;
      break;
    case 2:
      res = caml_callback2_exn(res, args[i], args[i + 1]);
      if (Is_exception_result(res)) CAMLreturn (res);
      i += 2;
      break;
    default:
      value arg1 = args[i]; args[i] = Val_unit;
      value arg2 = args[i + 1]; args[i + 1] = Val_unit;
      value arg3 = args[i + 2]; args[i + 2] = Val_unit;
      res = caml_callback3_exn(res, arg1, arg2, arg3);
      if (Is_exception_result(res)) CAMLreturn (res);
      i += 3;
      break;
    }
  }
  CAMLreturn (res);
}

@gasche
Copy link
Member Author

gasche commented Mar 19, 2023

"Callbacks whose argument (or closure) are not kept alive" is a bit of a mouthful compared to "unrooted callbacks". Any suggestion? "Non-leaking callbacks"? "Memory-preserving callbacks"? "Liveness-preserving callbacks"?

Begin_roots / End_roots are not used in the runtime anymore, all use-sites were removed in #11002. After looking at them (I had never seen them before) I don't disagree with your comment that they would be slightly simpler here, so I will take the suggestion and move to use them -- reintroducing them in the runtime code.

I will use your suggestion for caml_callbackN_exn. I don't find it much clearer, but it's always better to meet the reviewer halfway. It has the slight advantage of avoiding repeated root registration on recursive calls. (Note that we do repeated stack-parent-allocation still, obviously nobody cares about the performance of this version.)

@xavierleroy
Copy link
Contributor

For the terminology, I'll offer a rewrite of some of your comments later, when we've agreed about the functionality and the code.

Come to think of it, my suggestion for caml_callbackN_exn has the downside of modifying the args array, which is passed by the user. In typical use it should be no problem, but it means we need to mention it in the FFI documentation. So maybe the repeated root registration is more transparent.

It has the slight advantage of avoiding repeated root registration on recursive calls. (Note that we do repeated stack-parent-allocation still, obviously nobody cares about the performance of this version.)

I agree that caml_callbackN_exn could be made more efficient, but there's no clear need yet.

@gasche
Copy link
Member Author

gasche commented Mar 19, 2023

Another approach to callbackN would be to have a recursive function that:

  1. registers the arguments [narg-3; narg-1] as roots,
  2. recursively calls itself to compute the callback with arguments [0; nargs-4],
  3. unregisters the last 3 arguments, and
  4. passes them to the recursively-computed closure.

This would give the property that each argument is registered at most once. But I don't see any performance benefit compared to the version I propose, my understanding is that registering narg-3 arguments or 3 arguments has the same cost. Both perform narg/3 registrations and unregistrations, and this new proposal has the minor downside of not being tail-recursive.

@gasche gasche force-pushed the caml_callback_memory_reachability branch from 1a3da96 to 32cefba Compare March 19, 2023 21:51
@gasche
Copy link
Member Author

gasche commented Mar 19, 2023

@xavierleroy I pushed a further commit to use {Begin,End}_roots.

Copy link
Contributor

@kayceesrk kayceesrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the new functionality and they look correct to me.

closure = caml_callback3_exn(closure, args[0], args[1], args[2]);
End_roots();

return caml_callbackN_exn(closure, remaining_narg, remaining_args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it is better to avoid recursion in C and retain the loop structure (perhaps a while(1) { } loop). Do we introduce the risk of stack overflow by introducing recursion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tail-recursive function, so I expect both gcc and clang to optimize it to take constant stack space.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kayceesrk: I pushed a new commit which rewrites this function to use an explicit loop instead of a recursive function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look correct to me. Thanks.

@gasche gasche force-pushed the caml_callback_memory_reachability branch from e8c996e to 9814d3b Compare March 21, 2023 05:43
@gasche
Copy link
Member Author

gasche commented Mar 21, 2023

@xavierleroy this is good to merge as far as I'm concerned. When would you like to reword the comments?

@gasche gasche force-pushed the caml_callback_memory_reachability branch from 9814d3b to c1b152c Compare March 21, 2023 20:56
@xavierleroy
Copy link
Contributor

When would you like to reword the comments?

I wanted to send a PR over your branch with my proposed rewording, but my Github kung-fu is weak and I could not find how to do it. So I took the liberty to push directly to your branch. Feel free to ignore or reword again to your liking, it's just an example of what can be written without neologisms :-)

@gasche
Copy link
Member Author

gasche commented Mar 22, 2023

@xavierleroy thanks! The new wording is allright with me, so I will go ahead and merge.

In passing: GHC has a formalized system of "notes", see its coding style documentation, which are basically explanatory comments with an explicit name Note [Foo Bar] .... Notes can be referred by name easily (See Note [Foo Bar]) from any other part of the code. I would like to have such a convention for OCaml, but maybe this is a matter of writing style and others would find it too heavy.

@gasche gasche merged commit dab4e3d into ocaml:trunk Mar 22, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants