Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow unboxing across "static fail" #7017

Open
vicuna opened this issue Oct 13, 2015 · 6 comments

Comments

Projects
None yet
1 participant
@vicuna
Copy link

commented Oct 13, 2015

Original bug ID: 7017
Reporter: @alainfrisch
Status: confirmed (set by @damiendoligez on 2017-02-24T10:40:01Z)
Resolution: open
Priority: normal
Severity: feature
Category: back end (clambda to assembly)
Tags: optimization, patch
Related to: #4800 #6242
Monitored by: @gasche @diml jmeber @hcarty @alainfrisch

Bug description

Currently, arguments of "jumps" in cmm (i.e. Cexit/Ccatch) are always assumed to be generic values; they cannot transport e.g. unboxed floats. Here is an example of code which illustrates the addition of float boxing to cross those jumps:

  let f xs ys =
    match xs, ys with
    | [|a|], b
    | b, [|a|] -> let _ = b.(0) +. a in ()
    | _ -> ()

This produces:

(function camlFoo__f_1324 (xs/1325: val ys/1326: val)
 (catch
   (if (!= (or (>>u (load int (+a xs/1325 -4)) 10) 1) 3)
     (if (!= (or (>>u (load int (+a ys/1326 -4)) 10) 1) 3) 1a
       (let a/1345 (load float64u ys/1326)
         (exit 2 (alloc 2301 a/1345) xs/1325)))
     (let a/1344 (load float64u xs/1325)
       (exit 2 (alloc 2301 a/1344) ys/1326)))
 with(2 a/1327:val b/1328:val)
   (let match/1346 (+f (load float64u b/1328) (load float64u a/1327)) 1a)))

I attach a patch that: (i) adds support for passing other "types" of data in Ccatch/Cexit parameters; (ii) detect when, for a given Ccatch and a given parameter position, all corresponding Cexit sites pass a "boxing expression" (float or boxed integers), and in this case, uses an unboxed parameter instead. I have only been able to test it on float unboxing (with the code above), and I don't know if it would currently apply to boxed integers in practice. That said, various optimization proposals are based on using Ccatch/Cexit, and would certainly benefit from the optimization. I think in particular of #4800 (where gasche proposes to use Ccatch/Cexit to avoid allocating some tuples; but without the current patch, gasche's proposal forces to box floats); and of #6242, which compiles local functions used in tail position into Ccatch/Cexit.

With the patch, the cmm for the example above becomes

(function camlFoo__f_1324 (xs/1325: val ys/1326: val)
 (catch
   (if (!= (or (>>u (load int (+a xs/1325 -4)) 10) 1) 3)
     (if (!= (or (>>u (load int (+a ys/1326 -4)) 10) 1) 3) 1a
       (let a/1345 (load float64u ys/1326) (exit 2 a/1345 xs/1325)))
     (let a/1344 (load float64u xs/1325) (exit 2 a/1344 ys/1326)))
 with(2 a/1347:float b/1328:val)
   (let match/1346 (+f (load float64u b/1328) a/1347) 1a)))

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @alainfrisch

Branch created on github, adapted to current trunk:

https://github.com/alainfrisch/ocaml/tree/unbox_across_jumps

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 15, 2015

Comment author: @alainfrisch

I've has to extend cmm in order to detect float/boxed int constants at this level (see github).

I'm still not very satisfied with the current implementation, since it lacks simple cases such as:

let f a =
let x = a +. 1. in
let y = a *. 3. in
let x, y =
if x < y then x, y
else y, x
in
x +. 30. *. y

the reason being that the unboxing of x and y happens after the compilation of the tuple binding (into a "jump"), so the code for compiling catch doesn't "see" that the argument is going to be immediately boxed. This results in:

(function camlFoo__f_1203 (a/1204: val)
 (let
   (x/1269 (+f (load float64u a/1204) 1.)
    y/1268 (*f (load float64u a/1204) 3.))
   (catch
     (if (

Forcing unboxing earlier with:

let f a =
  let x = a +. 1. in
  let y = a *. 3. in
  let x, y =
    if x < y then x +. 0., y +. 0.
    else y +. 0., x +. 0.
  in
  x +. 30. *. y

indeed avoids boxing:

(function camlFoo__f_1203 (a/1204: val)
 (let
   (x/1275 (+f (load float64u a/1204) 1.)
    y/1274 (*f (load float64u a/1204) 3.))
   (catch
     (if (

One would need to reapply the optimization in subst_boxed_number. Alternatively, one should compile the body of a "let" binding already knowing that the bound id will be unboxed (as in #6866).

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 16, 2015

Comment author: @alainfrisch

The github branch has been merged with unbox_earlier, and now it succeeds in eliminating all allocations in:

let () =
  let module M = Nativeint in
  let x = ref (M.of_int 1) in
  for i = 1 to 1000000000 do
    let a, b =
      if i mod 2 = 0 then !x, M.of_int 1
      else M.of_int 2, !x
    in
    x := M.add (M.mul !x a) b
  done;

trunk / unbox_earlier: 3.6
unbox_across_jumps: 1.65

@vicuna

This comment has been minimized.

Copy link
Author

commented Feb 8, 2016

Comment author: @damiendoligez

@Frisch, can we close this PR now that the github branch is merged?

@vicuna

This comment has been minimized.

Copy link
Author

commented Feb 8, 2016

Comment author: @alainfrisch

can we close this PR now that the github branch is merged?

Which one? There is no PR corresponding to this branch. So please keep this open, I think this is a rather important (future) work.

@vicuna

This comment has been minimized.

Copy link
Author

commented Nov 27, 2018

Comment author: @alainfrisch

#2165

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.