Do not rely on accumulators having tag 0 in native compilation. #14048

ppedrot · 2021-04-01T15:33:41Z

Instead, we add a test for accumulator branches. We cannot use Obj.tag because the latter is inefficient, so we roll our own test to check that a value has a closure tag and the right code.

Added / updated test-suite

Corresponding documentation was added / updated (including any warning and error messages added / removed / modified).
Entry added in the changelog (see https://github.com/coq/coq/tree/master/doc/changelog#unreleased-changelog for details).
Overlay pull requests (if this breaks 3rd party developments in CI, see
https://github.com/coq/coq/blob/master/dev/ci/user-overlays/README.md for details)

gares · 2021-04-02T08:17:31Z

kernel/byterun/coq_values.c

+
+value coq_native_is_accu(value v) {
+  if (Tag_val(v) == Closure_tag) {
+    if (Code_val(v) == coq_accumulator_code) return Val_true;


FYI, C has a short-circuiting "logical" conjunction.

Yes, I know :) This code is not correct anyways, this PR is a draft for a good reason.

Anyway, the inner test is useless. If it were to fail, the value would get matched upon (while it has a closure tag) and Coq would segfault.

That is the equivalent of the following macro in the bytecode interpreter:

#define Is_accu(v) (Is_block(v) && Tag_val(v) == Closure_tag)

By the way, I am a bit surprised you are not testing Is_block(v) in your version. Are you really sure the value is a block there?

Anyway, the inner test is useless.

Not for all calls to is_accu. This is only true for inductive accumulators encountered in match nodes. In the reification code kind_of_value, we don't have this invariant since we don't have the type around, and we can produce functions Vfun with a closure tag.

Then I suggest making two functions, one for execution and one for reification. (Especially since I am still under the impression that the execution one is supposed to test for Is_block.)

it will be easier to convince OCaml developers to optimize for this case

I'd rather compile to Malfunction and have a low-level enough language of patterns to jump on the tag directly.

That is my point. The following seems just fine as a low-level language of patterns:

match foo with | x when Obj.is_closure x -> ...

Hm, then we would also need language support for ADT containing functions. If OCaml is clever enough it should rule out this branch statically since no constructor can ever be a function.

(Native compilation is likely to be a huge UB anyways, I really think targeting OCaml and playing fast and loose with the type system is a footgun waiting to go off in our face when we start using flambda.)

If OCaml is clever enough it should rule out this branch statically since no constructor can ever be a function.

Note that OCaml currently allows matching on polymorphic values, so it cannot be too clever. Just because a value is passed to match does not mean that it is a constructor.

let f (g:'a -> bool) x = match x with | x when g x -> true | x -> false

Now, what if the match also contains proper branches? I guess that is one of those things that were never officially stated (and for which flambda might indeed be playing fast and loose).
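A small check of that claim, as a sketch only: the code below applies `f` to an integer and to a function, neither of which is a constructor of a variant type, and the match still type-checks and runs. The bindings `on_int` and `on_fun` are illustrative names, not part of the PR.

```ocaml
(* The function from the comment above: the scrutinee x is fully
   polymorphic, so the compiler cannot assume it is a constructor. *)
let f (g : 'a -> bool) x =
  match x with
  | x when g x -> true
  | _ -> false

(* Scrutinee is an immediate integer. *)
let on_int = f (fun n -> n > 0) 42

(* Scrutinee is itself a closure; the guard simply applies it. *)
let on_fun = f (fun h -> h 1 = 2) (fun n -> n + 1)

let () = Printf.printf "%b %b\n" on_int on_fun
```

This says nothing about what happens once the match also has proper constructor branches, which is the open question raised above.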

kernel/nativevalues.mli

silene · 2021-04-02T13:42:20Z

kernel/nativevalues.ml

+external is_closure : 'a -> bool = "coq_native_is_clos" [@@noalloc]
+
+(** TODO: do this more efficiently in C? *)
+(* Dummy wrapper to ensure the closure is not inlined by the compiler *)


I do not understand why this would prevent any inlining. It seems better to use the old code and put the attribute [@inline never] on it:

let [@inline never] rec accumulate data x =
  if x == ret_accu then Obj.repr data
  else Obj.repr (accumulate { data with acc_arg = x :: data.acc_arg })

Another way to prevent inlining (if you do not trust the attribute) is to simply store the accumulator in a public mutable reference.

This doesn't work because we will get partially applied functions, which do not have the accumulator code. I need to have a function of arity one that is exactly the one performing accumulation for my trick to work.

Indeed, my mistake. Still, I do not see how your code guarantees the uniqueness of the inner function. Perhaps something like the following would help, but I am not even sure OCaml guarantees that the [@inline] attribute is respected for inner functions.

let rec accumulate data =
  let [@inline never] accfun x =
    if x == ret_accu then Obj.repr data
    else
      let data = { data with acc_arg = x :: data.acc_arg } in
      Obj.repr (accumulate data).accfun
  in
  { accfun }

Here is another possibility, which is documented (or at least folklore enough to not be broken that easily):

let [@inline never] rec accumulate data = ();
  fun x ->
    if x == ret_accu then Obj.repr data
    else Obj.repr (accumulate { data with acc_arg = x :: data.acc_arg })

I believe flambda explicitly breaks the second kind of trick. I witnessed that when observing the code generated by the proof monad with flambda.

At least, with 4.12.0+flambda, it is compiled as expected, that is, ();fun forces an explicit closure creation.

Quoting Vincent Laviron:

Flambda doesn’t try to coalesce a function returning a function into a function taking more arguments. Some transformations earlier in the compiler do that, but at that point (); will not have been simplified yet so you should be fine.

Other suggestion by Vincent:

let [@inline never] rec accumulate data =
  Sys.opaque_identity (fun x ->
    if x == ret_accu then Obj.repr data
    else Obj.repr (accumulate { data with acc_arg = x :: data.acc_arg }))
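For reference, here is a self-contained sketch of this last variant. It assumes a simplified payload record: the `acc_head` field and its `string` type are placeholders, and only `acc_arg` and `ret_accu` come from the snippets in this thread. It is not the code in kernel/nativevalues.ml.

```ocaml
(* Hypothetical stand-in for the accumulator payload. *)
type data = { acc_head : string; acc_arg : Obj.t list }

(* Unique sentinel: applying an accumulator to it returns the payload. *)
let ret_accu : Obj.t = Obj.repr (ref ())

(* Arity-one closure that either returns its payload or accumulates
   one more argument. Sys.opaque_identity is meant to keep the
   compiler from merging the two lambdas into one binary function. *)
let[@inline never] rec accumulate (data : data) : Obj.t -> Obj.t =
  Sys.opaque_identity (fun x ->
    if x == ret_accu then Obj.repr data
    else Obj.repr (accumulate { data with acc_arg = x :: data.acc_arg }))

(* Apply the accumulator to 1, then 2, then ask for the payload. *)
let a0 = accumulate { acc_head = "x"; acc_arg = [] }
let a1 : Obj.t -> Obj.t = Obj.obj (a0 (Obj.repr 1))
let a2 : Obj.t -> Obj.t = Obj.obj (a1 (Obj.repr 2))
let result : data = Obj.obj (a2 ret_accu)

(* Arguments are stored most recent first. *)
let args : int list = List.map Obj.obj result.acc_arg
```

Each application allocates a fresh closure over the extended record, so every intermediate accumulator is a block with `Obj.closure_tag`.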

silene · 2021-04-03T07:49:58Z

kernel/nativecode.ml

@@ -1866,7 +1870,6 @@ let pp_global fmt g =
        Format.fprintf fmt "  | %s %s@\n" cstr sig_str
    in
    let pp_const_sigs fmt lar =
-      Format.fprintf fmt "  | %s of Nativevalues.t@\n" (string_of_accu_construct "" ind);


I would suggest keeping the dummy tag-0 constructor at first (with assert false branches), just to leave the array case unchanged. Indeed, while it is quite easy to adapt Nativenorm.nf_val to not care for Varray anymore, it seems a lot more complicated for Vconv.conv_whd, if I am not missing something obvious.

@silene the two code paths for VM and native are completely different, so you can change one without affecting the other (with the notable exception that native uses the same relocation tables for some mysterious reason). I agree that this approach is simpler though, so I've just pushed a version of this PR doing that. Let's see what happens.

Sorry for the confusion, I did not intend to talk about the VM at all. I was referring to Nativeconv.conv_val and I just mixed up both names. (Their code is the same, hence my mistake.) My point was that, if Vblock and Varray share the same tag, I have no idea how to fix that function.

Ah, OK. My plan was to box persistent arrays in a dedicated block with a reserved tag (namely Obj.last_non_constant_constructor_tag). I don't know how bad this would be for performance, since that means we have to reallocate 2 words for every array modification. It's probably not that costly given that persistent arrays already do this kind of allocation behind the user's back.
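A rough sketch of what such boxing could look like, using a plain OCaml array as a stand-in for persistent arrays. The helper names `array_tag`, `box_array`, `is_boxed_array` and `unbox_array` are hypothetical and do not appear in the PR.

```ocaml
(* Reserved tag named in the comment above. *)
let array_tag = Obj.last_non_constant_constructor_tag

(* Wrap an array in a one-field block carrying the reserved tag:
   one header word plus one field, hence two words per boxing. *)
let box_array (a : 'a array) : Obj.t =
  let b = Obj.new_block array_tag 1 in
  Obj.set_field b 0 (Obj.repr a);
  b

(* A boxed array is recognised by its tag alone. *)
let is_boxed_array (v : Obj.t) : bool =
  Obj.is_block v && Obj.tag v = array_tag

(* Recover the underlying array; the caller chooses the element type. *)
let unbox_array (v : Obj.t) : 'a array =
  assert (is_boxed_array v);
  Obj.obj (Obj.field v 0)
```

With this layout a tag-0 block is never an array, so the tag alone tells ordinary constructors and arrays apart.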

Instead, we add a test for accumulator branches. We cannot use Obj.tag because the latter is inefficient, so we roll our own test to check that a value has a closure tag and the right code. We keep the reserved status of blocks with tag 0 for native arrays. To do this, we preserve the dummy accumulator constructor for all inductive types and replace all accumulating branches with an assertion failure.
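To make the scheme concrete, here is a sketch of the shape the generated code might take for a `nat`-like type. The names `nat`, `Accu_nat`, `is_accu` and `describe` are hypothetical. The test shown only checks the closure tag through `Obj` (the inefficient route the description mentions), not the code pointer, so it illustrates the control flow rather than the PR's C implementation.

```ocaml
(* Hypothetical image of a Coq inductive type; Accu_nat is the dummy
   constructor that keeps block tag 0 reserved. *)
type nat = Accu_nat of Obj.t | O | S of nat

(* Accumulator test: a block whose tag is the closure tag. The opaque
   identity keeps the compiler from reasoning about the static type. *)
let is_accu (x : 'a) : bool =
  let v = Obj.repr (Sys.opaque_identity x) in
  Obj.is_block v && Obj.tag v = Obj.closure_tag

let describe (n : nat) : string =
  match n with
  | n when is_accu n -> "accumulator" (* guard runs before any tag switch *)
  | Accu_nat _ -> assert false        (* former accumulating branch *)
  | O -> "O"
  | S _ -> "S"

(* A closure masquerading as a nat, standing in for a stuck term. *)
let stuck : nat =
  let k = Sys.opaque_identity 1 in
  Obj.magic (fun (x : int) -> x + k)
```

Because the guard is evaluated first, the closure never reaches the constructor switch, which is where a closure-tagged block would otherwise be misread.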

ppedrot · 2021-04-07T15:27:09Z

As advised by @JasonGross I had a try at the UnsaturatedSolinasHeuristics/Tests.v from fiat_crypto. This PR is terrible: half of the time is spent in the accumulator test function and the evaluation is about 5 times as slow as the master branch (~200s vs. ~40s).

silene · 2021-04-07T15:55:10Z

Could you try replacing the C function by its OCaml counterpart?

let is_closure x =
  not (Obj.is_int x) && Obj.tag x = Obj.closure_tag

The idea is that Obj.is_int is a builtin, so the C function would only be called for non-trivial constructors.

Note that the code of the standard Obj.tag function is a bit awful, so you might want to keep a C function just for it:

external is_closure : 'a -> bool = "coq_native_is_clos" [@@noalloc]
let is_closure x =
  not (Obj.is_int x) && is_closure x

I do not expect much of an improvement, but perhaps we will no longer be in the 5x range.

ppedrot · 2021-04-07T15:58:35Z

@silene I can try, but it's going to be unsound for inductive types without non-constant constructors since the compiler might be allowed to statically remove this branch.

silene · 2021-04-07T16:02:47Z

Same trick as before, just put Sys.opaque_identity around x before calling Obj.is_int.

ppedrot · 2021-04-07T16:03:29Z

But then this defeats the optimization purpose?

silene · 2021-04-07T16:07:02Z

No, that is the nice thing about Sys.opaque_identity. It is the builtin identity, so it has no impact on the generated assembly code, except that the type system does not know about it.
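Put together, the suggestion could look like the sketch below. It uses `Obj.tag` in place of the C primitive so that it is self-contained, and `stuck_bool` is a hypothetical value standing in for an accumulator whose static type has only constant constructors.

```ocaml
(* Closure test with the integer check done first. Wrapping x in
   Sys.opaque_identity hides its static type from the optimizer, so
   the is_int branch should survive even when that type only has
   constant constructors. *)
let is_closure (x : 'a) : bool =
  let v = Obj.repr (Sys.opaque_identity x) in
  (* Obj.tag is only reached for heap blocks. *)
  not (Obj.is_int v) && Obj.tag v = Obj.closure_tag

(* A closure typed as bool, a type without non-constant constructors. *)
let stuck_bool : bool =
  let k = Sys.opaque_identity 0 in
  Obj.magic (fun (n : int) -> n + k)
```

For a genuine `true` or `false` the test stops at the integer check, which is the fast path the suggestion is after.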

ppedrot · 2021-04-07T17:25:57Z

No time difference when moving the is_int check to OCaml.

ppedrot · 2021-04-07T21:24:44Z

I will try to cook up a quick-and-dirty compilation scheme using the OCaml compiler-libs and see what happens. Maybe we can close this for the time being?

silene · 2021-04-08T08:12:15Z

Maybe we can close this for the time being?

Sure.

By the way, just for the record, it seems like we have a lot of leeway in how dirty our code can be without confusing the compiler. Indeed, I just noticed this piece of code in the OCaml compiler:

type t = A | B | .... | Z | Closure of dummy
let method_impl ... =
  match x with
  | A -> some closure
  ...
  | Z -> some closure
  | Closure _ as clo -> Obj.magic clo

In other words, closures are not even boxed in this function. They are stored as is, and even if the tag for the Closure constructor does not match the actual closure tag, the OCaml compiler does not care.

Unfortunately, it does not quite apply to our case, because our types allow for non-constant constructors. So, that trick would only work for a type such as bool:

type native_bool = Bool_true | Bool_false | Bool_closure of dummy
(* who cares if Bool_closure has tag 0 rather than 247? *)
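A sketch of that trick in isolation follows. It leans on unspecified compiler behavior, namely the observation above that a type with a single non-constant constructor is matched without inspecting the block tag, so treat it as an experiment rather than something guaranteed. `classify` and `stuck` are hypothetical names.

```ocaml
type dummy
type native_bool = Bool_true | Bool_false | Bool_closure of dummy

(* With one non-constant constructor, the match only needs to tell
   immediates from blocks; any block lands in the last branch. *)
let classify (b : native_bool) : string =
  match b with
  | Bool_true -> "true"
  | Bool_false -> "false"
  | Bool_closure _ -> "closure"

(* An unboxed closure stored directly at type native_bool. *)
let stuck : native_bool =
  let k = Sys.opaque_identity 1 in
  Sys.opaque_identity (Obj.magic (fun (n : int) -> n + k))
```

As soon as the type has two or more non-constant constructors the compiler has to look at the tag, which is why this does not carry over to general inductive types.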

gares reviewed Apr 2, 2021

silene reviewed Apr 2, 2021

ppedrot force-pushed the native-no-naked-pointer branch from 10db989 to ff7e3df on April 2, 2021 10:38

gares mentioned this pull request Apr 2, 2021

Coq 8.13.2 ocaml/opam-repository#18431

Merged

silene reviewed Apr 2, 2021

silene reviewed Apr 3, 2021

ppedrot force-pushed the native-no-naked-pointer branch from ff7e3df to bc46d8c on April 6, 2021 12:11

ppedrot closed this Apr 8, 2021
