Making better use of extra slots in the symbol corresponding to the current unit #6343

Closed
vicuna opened this Issue Mar 7, 2014 · 4 comments

vicuna commented Mar 7, 2014

Original bug ID: 6343
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2015-12-11T18:25:48Z)
Resolution: fixed
Priority: normal
Severity: feature
Fixed in version: 4.02.0+dev
Category: back end (clambda to assembly)
Related to: #5537
Monitored by: @gasche @diml @ygrek @yakobowski

Bug description

The compilation of units has recently been changed to map value identifiers defined in sub-modules into extra slots of the symbol corresponding to the current unit. This makes it possible to access those values with a single indirection from the global symbol. Unfortunately, this ability is not used optimally. For instance, consider:

==============================
module A1 = struct
  module A2 = struct
    let r = ref 1
  end
end

let f () = !A1.A2.r
==============================

Here, A2 and r are stored in extra slots of the global (fields number 2 and 3 respectively). Unfortunately, the reference to r in f still goes through 4 indirections from the global symbol, while 2 would be enough.
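To make the indirection count concrete, here is a toy model of the layout described above. It is only a sketch: the `value` type, the `field` helper and the load counter are illustrative assumptions, not the runtime's actual representation, and slot 1 (the closure for f) is replaced by a dummy value.

```ocaml
(* Toy model of heap blocks: a value is either an immediate integer or a
   block of fields. Illustration only, not the real runtime representation. *)
type value =
  | Int of int
  | Block of value array

(* Number of field loads performed so far. *)
let loads = ref 0

(* Load field [i] of block [v], counting the load. *)
let field v i =
  match v with
  | Block fields -> incr loads; fields.(i)
  | Int _ -> invalid_arg "field: not a block"

(* Assumed layout, following the description: slot 0 = A1, slot 1 = f
   (dummy here), slot 2 = A2 (extra slot), slot 3 = r (extra slot). *)
let r = Block [| Int 1 |]   (* ref 1 *)
let a2 = Block [| r |]      (* A2 = struct let r = ... end *)
let a1 = Block [| a2 |]     (* A1 = struct module A2 = ... end *)
let global = Block [| a1; Int 0; a2; r |]

(* Path through the module blocks: global -> A1 -> A2 -> r -> contents. *)
let via_modules () =
  loads := 0;
  let v = field (field (field (field global 0) 0) 0) 0 in
  (v, !loads)

(* Path through the extra slot: global -> r -> contents. *)
let via_slot () =
  loads := 0;
  let v = field (field global 3) 0 in
  (v, !loads)
```

In this model both paths return the same contents, with 4 loads through the module blocks and 2 loads through the extra slot.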

On a related note, if we put an .mli on this unit which forces some non-trivial coercions on A1 (e.g. an empty signature for A1), then it's even worse, since the indirections don't start from the global symbol, but from the function's environment (which is now required, the closure is no longer constant). To fix this part, one could keep extra slots in the global to store the uncoerced version of module identifiers, so that they can be accessed from this root symbol.

vicuna commented Mar 7, 2014

Comment author: @alainfrisch

I've attached a patch which attacks the main issue (not using existing global slots as a faster way to access nested value identifiers) by enriching the notion of approximation tracked during the closure conversion with a new case to represent the nth field of a global.
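As a rough illustration of the idea (not the patch itself), the sketch below uses a simplified approximation type with a case for "field n of a global symbol" and a tiny expression language standing in for the code seen by closure conversion. The names `approx`, `expr`, `simplify` and the unit name "Foo" are hypothetical.

```ocaml
(* Simplified, hypothetical approximation type; a real compiler would track
   more cases (closures, constants, ...). *)
type approx =
  | Unknown
  | Tuple of approx array          (* block whose field approximations are known *)
  | Global_field of string * int   (* field [pos] of the global symbol [sym] *)

(* Tiny expression language standing in for the code being converted. *)
type expr =
  | Var of string
  | Global of string               (* the symbol of a compilation unit *)
  | Field of int * expr            (* load field [n] of a block *)

(* Simplify [e] under [env] (variable -> approximation) and return the
   rewritten expression together with its approximation. *)
let rec simplify env e =
  match e with
  | Var x ->
    let a = try List.assoc x env with Not_found -> Unknown in
    (e, a)
  | Global _ -> (e, Unknown)
  | Field (n, arg) ->
    let arg', a = simplify env arg in
    let field_approx =
      match a with
      | Tuple fields when n < Array.length fields -> fields.(n)
      | _ -> Unknown
    in
    (match field_approx with
     | Global_field (sym, pos) ->
       (* The value is known to live in a global slot: load it from there
          instead of walking through the intermediate blocks. *)
       (Field (pos, Global sym), field_approx)
     | _ -> (Field (n, arg'), field_approx))
```

For example, if A1 is approximated as `Tuple [| Tuple [| Global_field ("Foo", 3) |] |]`, then `simplify` rewrites `Field (0, Field (0, Var "A1"))` into `Field (3, Global "Foo")`, a single load from the global symbol.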

In the long tradition of meaningless but funny micro-benchmarks (Xavier, you're welcome :-)), I've tested the patch with:

module A1 = struct
  module A2 = struct
    let r = ref 1
  end
end

let f () =
  for i = 1 to 10000 do
    incr A1.A2.r
  done;
  !A1.A2.r

let () =
  for i = 1 to 100000 do ignore (f ()) done

and it runs about twice as fast as with the current trunk.

This would need to be checked, but I believe the patch could make it possible to simplify Translmod, which would no longer need to keep track of a lambda-substitution mapping identifiers to global field references.

That said, another approach would be to attack the issue directly in Translmod, in order to get nicer lambda code. One advantage is that it could in theory benefit bytecode more easily (if we switch to the transl_store way of compiling structures), which might be nicer for js_of_ocaml in particular.

vicuna commented Mar 7, 2014

Comment author: @alainfrisch

This seems to be related to #5537 and #5573.

vicuna commented Mar 7, 2014

Comment author: @alainfrisch

The patch also fixes the problem described in the "On a related note" paragraph of the original ticket. Consider for instance:

foo.mli:
module A1 : sig end
val f : unit -> int

foo.ml:
module A1 = struct let r = ref 1 end
let f () = !A1.r

The "r" value is put in a global slot, and the "A1" block is created by accessing this global slot. This information is preserved in the approximation of A1, so that when the lambda code accesses A1.r, the closure conversion knows that the value can be retrieved from the global slot.

If A1 is completely hidden from the interface (or if "r" is added to it, so that there is no coercion involved), the lambda code for accessing A1.r will be quite different (it will start from the global symbol, not from the local A1 block), but the end result will be the same.

vicuna commented Mar 10, 2014

Comment author: @alainfrisch

Patch committed to trunk (commit 14452). It has the effect of turning more functions into closed ones, even if they seemingly have free variables in the lambda code (this already used to be the case, but less frequently). In such cases, we can allocate the corresponding closures statically (commit 14453).
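A minimal sketch of why a function can become closed after this rewrite, assuming a hypothetical expression type and a hypothetical table `slots` that maps identifiers to global slots (neither is the compiler's actual data structure): once every seemingly free variable is replaced by a direct global field load, no free variables remain, so no environment is needed.

```ocaml
(* Hypothetical mini-language for function bodies. *)
type expr =
  | Var of string
  | Global_field of string * int   (* direct load from a unit's global symbol *)
  | Deref of expr                  (* [!e] *)

(* Replace variables known to live in a global slot by a direct load.
   [slots] maps an identifier to (symbol, position). *)
let rec rewrite slots = function
  | Var x as e ->
    (match List.assoc_opt x slots with
     | Some (sym, pos) -> Global_field (sym, pos)
     | None -> e)
  | Global_field _ as e -> e
  | Deref e -> Deref (rewrite slots e)

(* Variables of the body that are not parameters of the function. *)
let rec free_vars params = function
  | Var x -> if List.mem x params then [] else [x]
  | Global_field _ -> []
  | Deref e -> free_vars params e

(* A function is closed when its body has no free variables. *)
let is_closed params body = free_vars params body = []
```

With the body `Deref (Var "r")` and no parameters, `is_closed` is false before the rewrite and true after `rewrite [("r", ("Foo", 3))]`, which is the situation in which the closure could be allocated statically.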
