Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Organise and simplify translation of primitives #1557

Merged
merged 2 commits into from Apr 9, 2018

Conversation

Projects
None yet
7 participants
@lpw25
Copy link
Contributor

lpw25 commented Jan 3, 2018

Currently the translation of builtin externals in bytecomp is pretty disorganized. It is handled across multiple parts of translcore.ml and different builtins are handled in different places. This has lead to a number of silly little bugs, e.g. this one or this bug that is still on trunk:

# module M : sig
    val send : CamlinternalOO.obj -> CamlinternalOO.tag -> CamlinternalOO.t
  end = CamlinternalOO ;;
    Characters -1--1:
  module M : sig
  
Error: The external function `%send' is not available

This PR tries to improve the situation. It moves all the handling for primitives into a new translprim.ml file, and uses an internal type:

type prim =
  | Primitive of Lambda.primitive
  | Comparison of comparison * comparison_kind
  | Raise of Lambda.raise_kind
  | Raise_with_backtrace
  | Lazy_force
  | Loc of loc_kind
  | Send
  | Send_self
  | Send_cache

to represent the various builtin externals. The cases other than Primitive are the ones that require special handling. These have previously been handled in different ad-hoc ways but here are handled by a fairly uniform set of functions in translprim.ml.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Jan 3, 2018

That sounds like an excellent idea to me, although I still have to look at what you have done.

cc @bobot @nojb, who have been working on #1465

@xclerc

This comment has been minimized.

Copy link
Contributor

xclerc commented Jan 3, 2018

I am unsure whether it should be done in this PR, but some cases of
Semantics_of_primitives.for_primitive seem to be wrong.
For instance, caml_format_{float,int}/caml_{int32,int64,nativeint}_fomat
do allocate. Similarly, reading from a big array might allocate, depending on
its kind.

@bobot
Copy link
Contributor

bobot left a comment

It is a very good improvement. Some remarks from the point of view of someone who try to add a new primitive.

| Send_cache, [obj; meth; cache; pos] ->
Lsend(Cached, meth, obj, [cache; pos], loc)
| Send_cache, _ ->
raise(Error(loc, Wrong_arity_builtin_primitive prim_name))

This comment has been minimized.

@bobot

bobot Jan 15, 2018

Contributor

Could this case be regrouped in one | (... | Loc | Send | Send_self | Send_cache ), _ -> , otherwise I think the user wonder what is the difference between all these lines.

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

Fixed.

(**************************************************************************)

(* Translation of primitives *)

This comment has been minimized.

@bobot

bobot Jan 15, 2018

Contributor

Could you add some guide of how to add a new primitive? Something simple perhaps like:

add your primitive to prim or one of the sub-datatype, then to primitives_table, finally let the typer guide you to the missing cases.

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

I decided against adding a comment. I don't think it makes adding a new primitive noticeably easier and these kinds of comments tend not to get updated when things change, leaving things in a worse state than if they weren't there.


exception Error of Location.t * error

(* Insertion of debugging events *)

This comment has been minimized.

@bobot

bobot Jan 15, 2018

Contributor

It is not clear when and where to add debugging events, could you point to the documentation?

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

I don't think there is documentation, nor do I know the rules and invariants. I tried simplifying things when writing this patch, but it made some tests around backtraces fail, so instead I was just careful to keep the behaviour the same.

@hhugo

This comment has been minimized.

Copy link
Contributor

hhugo commented Apr 2, 2018

@damiendoligez, can we consider this one for 4.07 or at least merge #1465

@hhugo

This comment has been minimized.

Copy link
Contributor

hhugo commented Apr 6, 2018

@damiendoligez, @lpw25, can we have this PR in 4.07 ?

@nojb

This comment has been minimized.

Copy link
Contributor

nojb commented Apr 7, 2018

I have been studying this PR and will post a review at some point today.

@nojb
Copy link
Contributor

nojb left a comment

I reviewed this PR.

I think it is a great improvement on the current state of affairs. I read the code carefully and am confident it is correct (modulo a small issue with the handling of the raise primitive). More importantly, I am confident that if there is a bug, it will be easy to find and fix thanks to the new code organization. This is in contrast with the current state of affairs where the handling of primitives is spread all over the place and many invariants are not tracked by the type system.

Some remarks about the code itself.

While the diff is large, a lot of it is due to moving code around and "type-directed" refactoring.

Conceptually, the PR does several things:

  • merge the code used by transl_primitive (which eta-expands a primitive) and transl_primitive_application (which translates a fully applied primitive). This cuts down on a lot of duplicated code, and fixes at least MPR#7674 and MPR#7666.
  • A more robust function to lookup primitives lookup_primitive instead of find_primitive. This would have produced a sensible error message for MPR#7666.
  • Clean up the insertion of debug events in the generated lambda code (functions wrap and wrap0). The current code here is so convoluted that it took me some time to convince myself that the new code (which is simpler) is actually equivalent.
  • Introduce a new type Translprim.prim which includes the usual Lambda.primitive but also other primitives that do not currently have a representation in the type system and are only recognized by their names. The "new" primitives are: comparison primitives, send, sendself, sendcache, and raise_with_backtrace. Having constructors corresponding to these primitives allows to use pattern matching and, in the case of comparisons, to handle them before specialization.
  • Some primitives that need special handling are lifted out of the Lambda.primitive type into the new Translprim.prim type. On the one hand pattern matching makes it obvious where special handling is needed. On the other hand, it also makes it clear that these primitives do not survive lambda-generation. The primitives affected are: raise, lazyforce, and loc_* primitives.

In an ideal world, having the PR split into smaller commits would make it easier to digest, but personally I don't feel it is reason enough to delay its integration.

| Compare, Compare_int32s -> Pccall caml_int32_compare
| Compare, Compare_int64s -> Pccall caml_int64_compare

let lamda_of_loc kind loc =

This comment has been minimized.

@nojb

nojb Apr 7, 2018

Contributor

s/lamda/lambda/

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

fixed

let prim = lookup_primitive loc p env path in
let has_constant_constructor = false in
let prim =
match specialize_primitive env ty ~has_constant_constructor prim with

This comment has been minimized.

@nojb

nojb Apr 7, 2018

Contributor

Why not have specialize_primitive return prim instead of prim option?

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

It is needed for implementing #1566.

| Raise_regular, Lvar argv when Hashtbl.mem try_ids argv ->
Raise_reraise
| _, _ ->
Raise_regular

This comment has been minimized.

@nojb

nojb Apr 7, 2018

Contributor

Should probably be kind instead of Raise_regular, to avoid changing Raise_notrace case.

This comment has been minimized.

@lpw25

lpw25 Apr 9, 2018

Author Contributor

Good catch. Fixed.

@nojb nojb added the high-priority label Apr 9, 2018

@lpw25 lpw25 force-pushed the lpw25:tidy-primitives branch from 6284170 to 656aa42 Apr 9, 2018

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Apr 9, 2018

I've rebased, addressed comments and added a changes entry.

@nojb

nojb approved these changes Apr 9, 2018

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Apr 9, 2018

The CI failure is unrelated to this patch (and has now been fixed on trunk).

@lpw25 lpw25 merged commit a7ee6a0 into ocaml:trunk Apr 9, 2018

0 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Apr 9, 2018

Inria's CI fails on ARM.
File "translprim.mli", line 18, characters 19-39:
Error: Unbound module Typedtree

Is this just a matter of running make alldepend?

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Apr 9, 2018

I ran make depend, and I can see the line:

bytecomp/translprim.cmi : typing/types.cmi typing/typedtree.cmi

in the commits, so this is a surprising error.

@lpw25

This comment has been minimized.

Copy link
Contributor Author

lpw25 commented Apr 9, 2018

Ah, there is aparently now a ocamldoc/stdlib_non_prefixed/.depend file that also needs updating, and gets updated by make alldepend. Mark has pushed the fix to trunk.

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Apr 9, 2018

@hhugo

This comment has been minimized.

Copy link
Contributor

hhugo commented Apr 9, 2018

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.