Unboxed types #606

Merged
merged 1 commit into from Jul 25, 2016

@damiendoligez
Member

damiendoligez commented Jun 7, 2016

This PR introduces an annotation and an optimization for concrete types that have only one constructor with one argument, and immutable records with only one field. In both cases, the default representation is a memory block with header and one field. This memory block represents no useful information and we can get rid of it.
Example:

type t = A of string
let x = A "toto"
let _ = match x with A s -> s

In this case, the pattern-matching does not even read the header of the block that represents A s because it contains no useful information. With this patch, if we add a [@@unboxed] annotation on the type definition, the compiler will suppress the indirection block and represent the value directly as the string:

type t = A of string [@@unboxed]
let x = A "toto"
let () = assert (Obj.repr x == Obj.repr (match x with A s -> s))

This is useful (for example):

  • when defining type aliases that we don't want to mix up
  • when introducing polymorphic values by defining a record type with a single polymorphic field
  • when using a single-constructor, single-field GADT to introduce an existential type
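The three use cases above could be sketched as follows. This is only an illustration, not code from the patch: the type names (`user_id`, `email`, `id_fn`, `any_list`) and the helper functions are made up for the example. The existential case wraps `'a list` rather than a bare `'a`, to stay clear of the float array question raised later in the discussion.

```ocaml
(* Distinct wrappers around string that the type checker will not mix up. *)
type user_id = User_id of string [@@unboxed]
type email = Email of string [@@unboxed]

(* A record with a single polymorphic field. *)
type id_fn = { id : 'a. 'a -> 'a } [@@unboxed]

(* A single-constructor GADT hiding the element type of a list. *)
type any_list = Any_list : 'a list -> any_list [@@unboxed]

let string_of_user_id (User_id s) = s

(* The polymorphic field can be used at two different types. *)
let apply_twice f = (f.id 1, f.id "one")

(* The element type is unknown here, but the length is still available. *)
let length (Any_list l) = List.length l
```

Passing an `email` where a `user_id` is expected is still a type error, even though both share the representation of a plain string at run time.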

Some questions

  • About the name: I called these "unboxed types". Is there any objection and/or better idea?
  • About the annotation: unboxed is already used for a different (but related) purpose. Is it a good idea to reuse it?
  • About the annotation: currently you have to activate this optimization with an annotation. Should it be the default instead? See below for compatibility considerations.
  • About annotations: there is a trap in the current version: if you add a [@unboxed] annotation on the constructor or on the record field, it is ignored. Should it trigger a warning or error, or even just trigger the optimization anyway? (In fact, the same problem already exists with [@immediate]).

Future work

Some more work is needed to make it work nicely with the float array and float record optimizations: currently, if you write:

type t = A of {f : float} [@@unboxed]
type r = {f1 : t; f2 : t; f3: float}
type s = {f4 : float; f5 : float; f6: float}

then a value of type t is represented as a float, but a value of type r is not optimized into a float array, unlike type s.

A related problem is the optimization of array access for t array and the interference with the [@@immediate] annotation (type t = A of int [@@unboxed] [@@immediate] currently fails).

[update: all the above are now implemented]

Compatibility

If activated by default, this optimization will break the FFI because it changes the representation of values. It will also break the compatibility with old marshalled values.
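The marshalling incompatibility can be observed with a small sketch like the one below. The type names `boxed_t` and `unboxed_t` are hypothetical; the point is only that the unboxed value serializes exactly like the underlying string, while the default representation does not.

```ocaml
type boxed_t = B of string                (* default: a one-field block *)
type unboxed_t = U of string [@@unboxed]  (* represented as the string itself *)

(* Reference serialization of the bare string. *)
let raw = Marshal.to_string "toto" []

let unboxed_matches_raw = Marshal.to_string (U "toto") [] = raw
let boxed_matches_raw = Marshal.to_string (B "toto") [] = raw

let () =
  Printf.printf "unboxed matches raw: %b; boxed matches raw: %b\n"
    unboxed_matches_raw boxed_matches_raw
```

Data written with one representation and read back with the other would therefore be misinterpreted, which is the compatibility concern for old marshalled values.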

A more subtle incompatibility is with let rec:

type t = A of t [@@unboxed]
let rec x = A x

This must be rejected because it is (compiled as) the same as let rec x = x. The current compiler rejects it without any need for a patch but #556 will be made slightly more complex by this PR.

@jhjourdan

Contributor

jhjourdan commented Jun 7, 2016

About the let rec issue, another solution would be to make x equal to Val_unit. It actually has the intended semantics.

@lpw25

Contributor

lpw25 commented Jun 7, 2016

About annotations: there is a trap in the current version: if you add a [@unboxed] annotation on the constructor or on the record field, it is ignored. Should it trigger a warning or error, or even just trigger the optimization anyway? (In fact, the same problem already exists with [@immediate]).

A warning or error might be appropriate, but it should not just trigger the optimisation because there is a related optimisation: unboxing the fields and constructor arguments themselves. For example,

type t = {
  foo: int;
  bar: int;
}

type s = T of t [@unboxed]

could eventually mean that the t should be unboxed within the T constructor, rather than unboxing the T constructor itself.

@lpw25

Contributor

lpw25 commented Jun 7, 2016

From a quick scan, it is not clear to me whether the issue of the float array hack is handled properly. For example, what happens in the following case:

type ext = E : 'a -> ext [@@unboxed]

let _ = [| E 1.0; E 1 |]
@Drup

Contributor

Drup commented Jun 7, 2016

I really wish this could be made the default. Putting aside backward compatibility, there is no downside at all to this optimization (the let rec issue is inconsequential for most users).

If I understand correctly, this will break the FFI only if the C side constructs or destructs values of a datatype with one constructor, is that right? Does that even happen in practice?

@damiendoligez

Member

damiendoligez commented Jun 7, 2016

About the let rec issue, another solution would be to make x equal to Val_unit. It actually has the intended semantics.

It's a bit more complex than just let rec x = x : you need to handle chains of aliases of the form:

let rec x = y
and y = z
and z = x

Anyway, that's more of a discussion for #556 as it's pretty orthogonal to the present optimization.

@damiendoligez

Member

damiendoligez commented Jun 7, 2016

From a quick scan, it is not clear to me whether the issue of the float array hack is handled properly.

Indeed. Your example is not enough to trigger the problem but if I do:

type t = E : 'a -> t [@@unboxed];;
let a = Array.make 10 (E 1.0);;

then I get a (flat-allocated) float array where I can store ints. As far as I can tell, that should segfault, but for some reason it doesn't.

[edit: it does segfault after fixing another bug that was hiding this one]
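For readers unfamiliar with the float array optimization, here is a small sketch of what "flat-allocated" means. It assumes a compiler configured with flat float arrays (the default); `is_flat_float_array` is a helper name invented for this example.

```ocaml
(* True when the array uses the flat layout, i.e. raw doubles stored
   directly in the block instead of pointers to boxed floats. *)
let is_flat_float_array (a : 'a array) : bool =
  Obj.tag (Obj.repr a) = Obj.double_array_tag

let floats = Array.make 10 1.0     (* elements are floats *)
let strings = Array.make 10 "one"  (* elements are pointers *)

let () =
  Printf.printf "floats flat: %b, strings flat: %b\n"
    (is_flat_float_array floats) (is_flat_float_array strings)
```

With an unboxed existential wrapper, the first element decides the layout at run time, so an array created from `E 1.0` would be flat while later stores of `E 1` write a non-float word into it.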

@damiendoligez

Member

damiendoligez commented Jun 7, 2016

the let rec issue is inconsequential for most users

Indeed, I only noticed the let rec issue because it broke a test in the test suite. I tried on OPAM and none of the OPAM packages I could compile had such a let rec.

@lpw25

Contributor

lpw25 commented Jun 7, 2016

Indeed.

I think what is needed is a check for whether the argument can be either a float or something else. It is hard to define this property precisely. I think a rule allowing only the following three cases would be sufficient, but may be overly conservative:

  1. The argument type has no existential type variables
  2. The argument type is incompatible with float.
  3. The argument type is equal to float.
@yminsky


yminsky commented Jun 7, 2016

Is there any hope/plan of making this the default in the future?

@damiendoligez

Member

damiendoligez commented Jun 8, 2016

Is there any hope/plan of making this the default in the future?

I'd like to get some feedback on this. The incompatibilities don't seem to be really problematic, so right now I'm on the fence.

@yminsky


yminsky commented Jun 8, 2016

My view is that, except for the c binding issue, it's a pretty clear win. One could imagine a reasonable transition story: start with a flag to turn it on by default, and over time migrate to making it opt out instead of opt in, and then finally remove the old behavior.

If that was available, I believe we'd use that and maybe never bother with the annotation.

@alainfrisch

Contributor

alainfrisch commented Jun 8, 2016

I'm personally in favor of making this the default but I've a general tendency to be rather liberal in terms of breaking backward compatibility when useful.

Do we have a way to assess the impact of making this the default? I guess there aren't so many sum/record types with a single constructor/field, so perhaps a pass on public OPAM packages, excluding pure packages without C bindings, could give a good indication.

I'd like to point out also that the impact could be in theory larger than C bindings. For instance we have code at LexiFi that processes our runtime type representations and does some (Obj.)magic with the concrete representation of values. The same could happen with code-generators. Of course people using this are on their own (even though they make the same assumptions as in C bindings). Another potential issue is with alternative backends such as js_of_ocaml or Bucklescript (some Javascript code assuming a specific representation of OCaml values).

About the feature itself: this could come later, but I'd also love to be able to unbox specific constructors. Typically it should be possible to unbox at most one (in a given sum type) constructor taking a "string" argument. One could also support "unboxing" a constructor whose argument is itself a sum type (allocating tags properly to avoid clashes).

@damiendoligez

Member

damiendoligez commented Jun 8, 2016

I was thinking of simply making it the default and providing an annotation to turn it off on a specific type.

@lpw25

Contributor

lpw25 commented Jun 8, 2016

I'd be a little hesitant about making it the default whilst the float array hack still exists, because then some existing type definitions will have to become errors until annotated with [@@boxed].

@damiendoligez

Member

damiendoligez commented Jun 8, 2016

Typically it should be possible to unbox at most one (in a given sum type) constructor taking a "string" argument. One could also support "unboxing" a constructor whose argument is itself a sum type (allocating tags properly to avoid clashes).

This would need major changes to the compilation of pattern-matching. Indeed it will come later, if ever.

@DemiMarie

Contributor

DemiMarie commented Jun 9, 2016

I would like to be able to unbox float, int32, int64, and nativeint within records and algebraic datatypes. But that requires changes to the runtime (specifically the GC).

@bluddy


bluddy commented Jun 9, 2016

Not just the GC, but also generic comparison and generic serialization.

@mshinwell changed the title from "[WIP] Unboxed types" to "Unboxed types" on Jun 10, 2016

@DemiMarie

Contributor

DemiMarie commented Jun 10, 2016

My solution is to have all pointers contiguous at the start of the object. That massively reduces the overhead: the GC only looks at the first part of the object.

On Jun 9, 2016 4:13 PM, "Nicolas Ojeda Bar" wrote:

Also for reference:

http://caml.inria.fr/pub/ml-archives/caml-list/2001/01/2be66fbcb6844de11cac665cd28fbf0d.en.html



@alainfrisch

Contributor

alainfrisch commented Jun 10, 2016

One needs somehow to mark which prefix of the block must be scanned (i.e. contain normal values). This could be done in several ways:

  • Using a special marker value within the block (hence "wasting" one word per such block, but this is still better than boxing multiple fields). The advantage is that computing the size of the block is unchanged.
  • Storing separately in the header the number of scanned words in addition to the total size. At least on 32-bit, we don't have enough bits in the header in general, but we could split the current size field in two only when the tag is between 1 and no_scan_tag. This makes computing the actual block size a bit more complex (but still looking only at the header, so it should remain cheap). The rationale is that objects with such tags are never big (arrays have tag 0), since they correspond to sum type constructors. If this representation is used for records (with unboxed fields), one would use a non-0 tag for them.
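As a rough model of the second option, the sketch below splits a size field into a "scanned" part and a "total" part. Everything here is hypothetical: the 11/11 split of the 22-bit size field of a 32-bit header is an assumption chosen for illustration, and `pack_size`, `total_size` and `scanned_size` are invented names, not runtime functions.

```ocaml
(* Assumption: the 22-bit size field is split evenly into two 11-bit parts. *)
let field_bits = 11
let field_mask = (1 lsl field_bits) - 1

(* Pack the number of scanned words and the total size into one field. *)
let pack_size ~scanned ~total =
  if scanned < 0 || scanned > total || total > field_mask then
    invalid_arg "pack_size";
  (scanned lsl field_bits) lor total

(* Both values are recovered from the header alone. *)
let total_size packed = packed land field_mask
let scanned_size packed = packed lsr field_bits
```

The size limit that results (2047 words under this assumed split) is why the proposal restricts the scheme to tags that are not used for arrays.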

Generic operations would need to be adapted as well. Without keeping more layout information about these mixed blocks, it will be impossible to preserve the exact same behavior (e.g. inlined floats cannot simply be compared bit-wise), but this is probably fine as long as it is documented (at least if unboxing fields is explicit).

@DemiMarie

Contributor

DemiMarie commented Jun 11, 2016

My thought is to have the first word in such a "mixed block" have the 10 in the two lower bits, and the number of additional unscanned words in the higher bits. Since OCaml values will always have both of the low-order bits 0 (if they are pointers) or the low order bit 1 (otherwise) this is currently invalid, so it can be used for a special case.
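The encoding described here can be modelled in a few lines. This is a sketch on plain OCaml integers standing in for machine words; `classify`, `make_marker` and `unscanned_words` are hypothetical names, not part of the runtime.

```ocaml
type word_kind = Immediate | Pointer | Mixed_marker

(* Classify a word by its two low-order bits. *)
let classify (w : int) : word_kind =
  if w land 1 = 1 then Immediate      (* low bit 1: tagged integer *)
  else if w land 3 = 0 then Pointer   (* low bits 00: aligned pointer *)
  else Mixed_marker                   (* low bits 10: currently unused *)

(* Store the number of unscanned words above the 10 marker bits. *)
let make_marker (unscanned : int) : int = (unscanned lsl 2) lor 2

let unscanned_words (marker : int) : int = marker lsr 2
```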

My preference is for unboxing of fields to be the default at some point. Yes, this will break C stubs, but I also think that OCaml should move away from C stubs towards an FFI integrated with the compiler, in which OCaml, not C, is responsible for marshaling of data, either inline within generated code (for ocamlopt) or by compiler-generated C code (for ocamlc).

As for structural comparison, hashing, etc, I think that a bitwise comparison would be enough. I actually think that structural comparison/hashing of mutable objects is a misfeature, and that mutable objects would be better compared/hashed by object identity. But this is the wrong discussion for that. In any case, the only alternative that I can think of is to either (1) do a type lookup for each and every type (using even more metadata) or (2) do mandatory specialization of comparison and hashing (but what about polymorphic recursion?).

@damiendoligez

Member

damiendoligez commented Jun 15, 2016

My thought is to have the first word in such a "mixed block" have the 10 in the two lower bits, and the number of additional unscanned words in the higher bits. Since OCaml values will always have both of the low-order bits 0 (if they are pointers) or the low order bit 1 (otherwise) this is currently invalid, so it can be used for a special case.

Unfortunately, that breaks compaction because of infix closure pointers.

@DemiMarie

Contributor

DemiMarie commented Jun 16, 2016

My proposal would still treat closures specially (or not, with GPR #203). The first word (which points to the compiled code for the closure) would always be treated as a non-pointer (since compiled code never moves).

@damiendoligez

Member

damiendoligez commented Jun 20, 2016

The consensus at the latest developer meeting (2016-06-16) was that this should not be the default at first, until users of the FFI (especially the likes of ctypes and camlidl) have adapted. Also, it would be nice to design a tool that can help with the transition.

@yminsky


yminsky commented Jun 20, 2016

Would it make sense to add a flag that flips the default? That way, we could try it out inside of our walls, and perhaps learn more about the FFI issues. It would be nice, at least within our walled garden, to get the performance benefits without needing to litter our code with annotations.

I suppose an alternate approach would be for us to write a PPX that automatically adds the annotation to every single-entry variant.

@damiendoligez

Member

damiendoligez commented Jun 29, 2016

Are you thinking of a configuration flag or a compiler flag?

One problem with making it the default: when checking GADTs for unboxability, the conditions are rather complex, so I'd rather make [@@unboxed] the default for all single-entry variants (and records) and then you get an error if it's not unboxable (because of GADTs + float array optimization). In that (rare) case, you will have to add a [@@ocaml.boxed] annotation to the type declaration. This is rather ugly, but I don't think it would be a good idea to make the representation depend on the details of other type declarations.

@yminsky


yminsky commented Jun 30, 2016

I was thinking of a compiler flag, since it would allow you to change the behavior in a library by library way. For us, we'd likely leave the default behavior as is for externally developed libraries.

@lpw25 @diml : what do you guys think about the wisdom of having a compiler flag to flip the default, versus a PPX to determine the behavior? If we do the PPX, it will be easy for us to work out in practice what seems most convenient, and then maybe we can use that to inform what the API should look like in the compiler longer term.

@@ -331,6 +333,12 @@ and constructor_arguments =
| Cstr_tuple of type_expr list
| Cstr_record of label_declaration list

and unboxed_status =

@alainfrisch

alainfrisch Jul 13, 2016

Contributor

Shouldn't this be a record type such as {unboxed: bool; explicit: bool}? This would remove the need for helper functions in Btype.

@damiendoligez

damiendoligez Jul 13, 2016

Member

Yes, or even two different fields in the type_declaration record, but the overhead! Do you think the brevity of the code is worth one more word in the data structure?

@alainfrisch

alainfrisch Jul 14, 2016

Contributor

I don't think there would be any significant memory overhead with a record since the 4 possible values will be stored as structured constants (and thus shared). Are you concerned with the extra memory load?

And now that the worm is in the fruit, I expect some @@packed attribute so that record fields that can be represented with a few bits can be merged in a single word (and then the record @unboxed if a single word remains after packing) 😄

But seriously, it's fine to keep the 4 constructors.

@damiendoligez

damiendoligez Jul 15, 2016

Member

I don't think there would be any significant memory overhead with a record since the 4 possible values will be stored as structured constants (and thus shared). Are you concerned with the extra memory load?

Oh, I didn't think of that. It's so nice I think I'll change the code. I don't think the extra access time is worse than calling the helper functions.

@lpw25

Contributor

lpw25 commented Jul 13, 2016

In the OCaml world, "boxing" is almost always used in the context of atomic values (floats, ints)

I think that is only because OCaml only knows how to unbox those things at the moment. As OCaml's unboxing optimisations expand to more types I would still expect them to be called unboxing.

@mrvn

Contributor

mrvn commented Jul 13, 2016

Actually ints are not boxed, they are tagged. So it should be called untagging for ints. But floats are boxed and any more complex type will be boxed too. It is the more general term to use for both unboxing and untagging. Never heard untagging being used. Think of tags as mini boxes.

@lpw25

Contributor

lpw25 commented Jul 13, 2016

Actually ints are not boxed

I assumed we were talking about Int64, etc.
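The distinction between tagged and boxed values can be checked directly with `Obj`. A minimal sketch, where `is_immediate` is a helper name invented for this example:

```ocaml
(* True when the value is a tagged immediate rather than a pointer to a
   heap block. *)
let is_immediate (v : 'a) : bool = Obj.is_int (Obj.repr v)

let () =
  Printf.printf "int: %b, float: %b, int64: %b\n"
    (is_immediate 42) (is_immediate 1.0) (is_immediate 42L)
```

Plain `int` values are immediates, while `float` and `Int64.t` values passed to a polymorphic function are heap blocks, which is the sense of "boxed" used in this exchange.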

@damiendoligez

Member

damiendoligez commented Jul 13, 2016

"Unwrappable types can be either wrapped or unwrapped." Why not, one good point is that it avoids re-using a word that already describes a different user-visible concept.

@damiendoligez damiendoligez force-pushed the damiendoligez:unboxed-types branch 2 times, most recently from b69bc5e to 1c52075 Jul 13, 2016

@damiendoligez

Member

damiendoligez commented Jul 15, 2016

@alainfrisch I think it's cleaner like this. Could you review again?

Also, we need to agree on the name.

match Env.find_type p env with
| {type_unboxed = {unboxed = true; _}; _} ->
Misc.Stdlib.Option.value_default (fun x -> x.desc) ~default:sty
(Typedecl.get_unboxed_type_representation env ty)

@alainfrisch

alainfrisch Jul 15, 2016

Contributor

Shouldn't this logic be moved to Typeopt.scrape in order to benefit other functions that depend on it (e.g. is_base_type)?

@alainfrisch

alainfrisch Jul 15, 2016

Contributor

(Having is_base_type covered is useful for Translcore.specialize_comparison, for instance.)

else if unboxed then { unboxed = true; default = false }
else if !Clflags.unboxed_types then { unboxed = true; default = true }
else { unboxed = false; default = true }

@alainfrisch

alainfrisch Jul 15, 2016

Contributor

Perhaps write this with a pattern matching on the three relevant Booleans (with a comment to prevent clever developers from writing {unboxed=!Clflags.unboxed_types; default=true}, which would lose the nice sharing property).
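A standalone sketch of what such a pattern matching might look like. It is not the code that was merged: `status` and its labelled arguments are hypothetical, `flag` stands in for the value of `!Clflags.unboxed_types`, and the first case (an explicit boxed annotation, which here also wins over a conflicting unboxed one) is an assumption, since that branch is not visible in the excerpt.

```ocaml
type unboxed_status = { unboxed : bool; default : bool }

(* Each arm returns a literal record so that the four possible results are
   constants. Do not "simplify" the last two arms into
   { unboxed = flag; default = true }: that would build a fresh record on
   every call instead of returning a constant. *)
let status ~boxed ~unboxed ~flag =
  match boxed, unboxed, flag with
  | true, _, _ -> { unboxed = false; default = false }  (* assumed branch *)
  | false, true, _ -> { unboxed = true; default = false }
  | false, false, true -> { unboxed = true; default = true }
  | false, false, false -> { unboxed = false; default = true }
```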

@damiendoligez damiendoligez force-pushed the damiendoligez:unboxed-types branch from 1c52075 to 2a5c1c6 Jul 19, 2016

@damiendoligez

Member

damiendoligez commented Jul 19, 2016

@alainfrisch I've implemented both of your suggestions. Can we merge now? Do you still want to discuss the name?

@alainfrisch

Contributor

alainfrisch commented Jul 19, 2016

Can we merge now?

I'm not fully confident with the restrictions related to unboxed float arrays, but I don't think that more code review will address that on my side. So, for me: "yes".

Do you still want to discuss the name?

Nope!

@damiendoligez damiendoligez force-pushed the damiendoligez:unboxed-types branch 2 times, most recently from 3c6eddb to 9cc74e7 Jul 20, 2016

@damiendoligez damiendoligez force-pushed the damiendoligez:unboxed-types branch from 9cc74e7 to d5a6e50 Jul 21, 2016

@damiendoligez damiendoligez merged commit faed766 into ocaml:trunk Jul 25, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@@ -150,6 +151,7 @@ let number = function
| No_cmx_file _ -> 58
| Assignment_to_non_mutable_value -> 59
| Unused_module _ -> 60
| Unboxable_type_in_prim_decl _ -> 61
;;

let last_warning_number = 60

@alainfrisch

alainfrisch Jul 26, 2016

Contributor

@damiendoligez This should be bumped.

@gasche

gasche Jul 26, 2016

Member

Good catch, I bumped it in 7fee1ea .

@damiendoligez damiendoligez deleted the damiendoligez:unboxed-types branch Sep 5, 2016

@mmottl

Contributor

mmottl commented Sep 21, 2016

I have just run into an issue with the OCaml 4.04 beta2 using this feature. Consider the following code:

type ('a, 'kind) tree =
  | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ]) tree
  | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree

and 'a node = Node : ('a, _) tree -> 'a node  [@@ocaml.unboxed]

type 'a t = ('a, [ `inner ]) tree

The above will fail with:

Error: This type cannot be unboxed because
       it might contain both float and non-float values.
       You should annotate it with [@@ocaml.boxed].

Here is a kludgy workaround:

type ('a, 'kind, 'parent) tree =
  | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ], 'parent) tree
  | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) tree

type 'a node = Node : ('a, _, 'a node) tree -> 'a node  [@@ocaml.unboxed]

type 'a t = ('a, [ `inner ], 'a node) tree

I suspect that the type tree is still unavailable when the compiler handles type node in the first version, because it is defined within the same recursive definition. After breaking the recursion with the help of a type variable, the compiler can apparently see that tree is populated with non-floats. It's probably just a matter of preparing lookup tables for type definitions before handling the attribute.

@gasche

Member

gasche commented Sep 21, 2016

Thanks for the catch, I submitted a bug report to make sure we track this properly: PR#7364.
