Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update compatibility.h with Caml_state #9014

Closed
gadmm opened this issue Oct 4, 2019 · 23 comments
Closed

Update compatibility.h with Caml_state #9014

gadmm opened this issue Oct 4, 2019 · 23 comments

Comments

@gadmm
Copy link
Contributor

@gadmm gadmm commented Oct 4, 2019

caml/compatiblity.h is not up-to-date for a backwards-compatible experience.

  • macros refer to come identifiers that are no longer defined
  • some identifiers that have been removed require a compatibility macro

From #8940 :

the following symbols are still mentioned in caml/compatibility.h but not defined anywhere: caml_stack_low, caml_stack_high, caml_stack_threshold, caml_extern_sp, caml_trapsp, caml_trap_barrier, caml_external_raise, caml_backtrace_active, caml_backtrace_pos, caml_backtrace_buffer, caml_backtrace_last_exn, caml_stat_heap_wsz, caml_stat_top_heap_wsz.

You still need #define caml_x (Caml_state->x). See young_limit for example. The other ones that are missing from compatibility.h are ref_table, external_raise, compare_unordered.

caml_ref_table denotes ambiguously a field of Caml_state and a type in minor_gc.h. I am afraid there is no backwards-compatible macro definition possible here, is there?

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Oct 4, 2019

@gasche I think you can put your consider-for-release sticker on this one

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Oct 4, 2019

A small note for reproducing: I came up with the list by looking up occurrences of the previous domain state field names (caml_...) one by one in the codebase with grep. It is be a good idea to do it again as a way to double-check.

@hhugo

This comment has been minimized.

Copy link
Contributor

@hhugo hhugo commented Nov 12, 2019

Would @kayceesrk or someone else have time to look at this.
base is currently broken with trunk because of this.

@nojb

This comment has been minimized.

Copy link
Contributor

@nojb nojb commented Nov 12, 2019

Patch in #9115

@kayceesrk

This comment has been minimized.

Copy link
Contributor

@kayceesrk kayceesrk commented Nov 13, 2019

Thanks @nojb for the patch. @hhugo Does #9115 fix base?

@hhugo

This comment has been minimized.

Copy link
Contributor

@hhugo hhugo commented Nov 13, 2019

I didn't have time to test yet

@xavierleroy

This comment has been minimized.

Copy link
Contributor

@xavierleroy xavierleroy commented Nov 13, 2019

"base" is currently broken with trunk because of this.

Does this mean "base" contains C files that still uses the old function names for the FFI ? It is time to switch to the new names.

@stedolan

This comment has been minimized.

Copy link
Contributor

@stedolan stedolan commented Nov 13, 2019

I don't think base is broken because of compatibility.h. It's broken because it contains this code (in src/exn_stubs.c):

extern int caml_backtrace_pos;

CAMLprim value Base_clear_caml_backtrace_pos () {
  caml_backtrace_pos = 0;
  return Val_unit;
}

This links against an unexported variable inside the OCaml runtime, which no longer exists. Fixing this will require changing base.

Incidentally, this function in base is only used to implement the following:

let raise_without_backtrace e =
  (* We clear the backtrace to reduce confusion, so that people don't think whatever
     is stored corresponds to this raise. *)
  clear_backtrace ();
  Caml.raise_notrace e

This is quite sensible behaviour: perhaps raise_notrace should always work like this?

@gasche

This comment has been minimized.

Copy link
Member

@gasche gasche commented Nov 13, 2019

@stedolan one potential issue is that I may want to use raise_notrace, inside an error handler, for control-flow without corrupting the existing trace for later re-raising (so the reraise should work as expected). I can do this by explicitly saving the backtrace first, but that feature was added after reraise and is less efficient than the doing-nothing scheme.

@gasche

This comment has been minimized.

Copy link
Member

@gasche gasche commented Nov 14, 2019

(It would be nice if advanced users of backtraces would report their wishes and needs upstream, so that we learned about them and could design better solutions before the moment where their code breaks because it uses internal details of an API we changed.)

@gasche

This comment has been minimized.

Copy link
Member

@gasche gasche commented Dec 3, 2019

(copied from #9115 (comment) as the present issue may be a better place for discussion)

So what are we going to do for ref_table?

Could user code be written in a version-portable way by using typeof (caml_ref_table) instead of struct caml_ref_table in their code? If that works, maybe we could provide a macro for the expression form rather than the type?

@stedolan

This comment has been minimized.

Copy link
Contributor

@stedolan stedolan commented Dec 4, 2019

I think we should just drop the ref_table alias?

User code cannot manipulate ref_table in a version-portable way. It's an internal undocumented part of the garbage collector, and its interface and semantics changes from time to time.

The only use of ref_table / caml_ref_table that I could find in OPAM is in the outofheap package of ocamlnet. This package reaches into the implementation of the GC and pokes various internal structures, and already has several codepaths for different OCaml versions (depending on how OCaml's page table works, and how caml_modify is implemented. See configure).

(I don't have any problem with packages reaching into the implementation of the runtime and poking its internals, but I don't think it's our job to support such uses as the runtime evolves)

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 4, 2019

Alternatively one can rename the struct caml_ref_table in minor_gc.h, which is internal.

@kayceesrk

This comment has been minimized.

Copy link
Contributor

@kayceesrk kayceesrk commented Dec 4, 2019

@gadmm IIUC renaming the type struct caml_ref_table will also break external code (ocamlnet) that uses this type. In which case, the proposal to drop the ref_table seems reasonable to me. ocamlnet will have to access ref_table using the Caml_state_field(ref_table) macro.

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 5, 2019

@kayceesrk, sure, but ocamlnet will not have to just use Caml_state_field but to define a macro themselves so that their code compiles with several versions. The role of compatibility.h is to make it easier to write code compatible across several versions, and this is just shifting this burden onto users (in an area which is not going to be documented much outside of compatibility.h). (Apart from this, I tend to agree with @stedolan's analysis.)

@kayceesrk

This comment has been minimized.

Copy link
Contributor

@kayceesrk kayceesrk commented Dec 5, 2019

It is a bit annoying that runtime system development has to take into account undocumented APIs. The manual does not mention minor_gc.h as one of the header files that C code should use. The type definition for struct caml_ref_table comes from minor_gc.h. Given the lack of documented API for these internal structures, are the compiler developers still obligated to ensure that libraries using these are not broken? This is sort of breakage is likely to happen more frequently as more multicore features land.

@xclerc

This comment has been minimized.

Copy link
Contributor

@xclerc xclerc commented Dec 5, 2019

Would the following patch make sense to fix the base issue?

diff --git a/lib/base/src/exn_stubs.c b/lib/base/src/exn_stubs.c
--- 2b950f22b3b6/lib/base/src/exn_stubs.c       2019-12-05 12:38:07.257682751 +0000
+++ working-dir/lib/base/src/exn_stubs.c        2019-12-05 11:57:04.288006823 +0000
@@ -1,8 +1,19 @@
 #include <caml/mlvalues.h>
 
+#if OCAML_VERSION >= 41000
+
+CAMLprim value Base_clear_caml_backtrace_pos () {
+  Caml_state_field(backtrace_pos) = 0;
+  return Val_unit;
+}
+
+#else
+
 extern int caml_backtrace_pos;
 
 CAMLprim value Base_clear_caml_backtrace_pos () {
   caml_backtrace_pos = 0;
   return Val_unit;
 }
+
+#endif

(Apparently, an #include <caml/version.h> should be added.)

@xavierleroy

This comment has been minimized.

Copy link
Contributor

@xavierleroy xavierleroy commented Dec 5, 2019

I agree with @kayceesrk that we don't have to maintain compatibility for libraries that use <caml/*.h> header files not documented in the OCaml manual, or (worse) variables from the OCaml runtime that are not exported. Those libraries should adapt to changes in OCaml's runtime system using #if OCAML_VERSION >= NNNNNN directives. Even better, those libraries should reconsider why they are using undocumented internal details of the OCaml runtime system.

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 6, 2019

I do not see what point this is trying to make: nobody here says the contrary, in fact we already said the same on a PR by Jacques-Henri recently. When I split this issue from the original PR, I did not think that two months later we would be there discussing straw-man arguments.

My point was that somebody for some reason has decided that ref_table deserves to be defined publicly in compatibility.h, and there is nothing inherently wrong with it: there exists a fix that only changes internals of the runtime. You have decided instead that the public definition is a mistake: fine. Stephen's points are convincing, and I can find you other good reasons. I just do not see the necessity of having to gauge a breakage, audit opam packages... in the first place. That was all.

The allusion to other breaking changes coming with multicore is off-topic, but I'll bite: the criteria of APIs being documented or not is not the good one. Indeed, OCaml is rarely documented down to the fine details that matters for some real world programs, and it will not always be as obvious as the current issue. Of the various ideas of breaking change I have heard floating around (I made a list!), there are enough to increment OCaml's major version several times over. One of them is radical enough to result in a split of the language. It was decided that multicore developers would have some leeway for introducing breaking changes, nothing is secret when you get a chance to chat with them, one can find open discussions on the multicore github, but currently there is a lack of visibility. I have more to offer than negative comments on the topic, but currently we miss a place where this can be discussed that lists which changes are seriously being considered and explains the migration strategy. The new RFC repo for instance would be a good place to start a discussion like this, and I warmly encourage it.

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 6, 2019

Closing: #9115 has been merged.

@gadmm gadmm closed this Dec 6, 2019
@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 6, 2019

@xclerc

I believe that if you want to minimise code duplication, you can also simply do:

+#if OCAML_VERSION < 41000
 extern int caml_backtrace_pos;
+#endif

Indeed, caml_backtrace_pos is defined as a compatibility macro.

What you propose makes sense to me as well.

@xclerc

This comment has been minimized.

Copy link
Contributor

@xclerc xclerc commented Dec 6, 2019

Indeed, we will quite probably end up with something as simple
as that. (Our testing is/was based on the 4.10.0+trunk switch,
which -I think- does/did not include the compatibility macro.)

@gadmm

This comment has been minimized.

Copy link
Contributor Author

@gadmm gadmm commented Dec 6, 2019

(Our testing is/was based on the 4.10.0+trunk switch,
which -I think- does/did not include the compatibility macro.)

(Indeed, it has just landed with #9115.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.