Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using function float arguments prevents unboxing #7023

Closed
vicuna opened this issue Oct 16, 2015 · 4 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Oct 16, 2015

Original bug ID: 7023
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-07-13T07:27:23Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.03.1+dev
Fixed in version: 4.04.0 +dev / +beta1 / +beta2
Category: back end (clambda to assembly)
Monitored by: @diml jmeber

Bug description

This is to document a bad behavior of the current unboxing strategy. While let-bound identifiers are unboxed when the bound expression would always returns an explicitly boxed value, function arguments, or identifiers bound to already boxed values, are never unboxed, thus potentially preventing the unboxing of other let-bound identifiers.

Examples:

let f1 b x =
  let y =
    if b then x else x +. 1.
  in
  y *. 2.

let f2 b g =
  let x = g () in
  let y =
    if b then x  (* or g() *) else x +. 1.
  in
  y *. 2.

In both cases, the x identifier is not unboxed, which then prevents y to be unboxed itself (because the "then" branch returns something which is not considered to be a boxed thing). This means than "x +. 1" requires boxing when b is false.

One possible approach would be to relax Cmmgen.is_unboxed_number so that branching constructions (including if-then-else) are considered to return "immediately unboxable" number if at least one branch does so, while currently all branches must do so. But one needs to be careful, because:

  1. This could actually create more boxing (as in "let y = if b then g () else 1. in h y" when b is true).

  2. This might be unsafe if branches return different types (possible with GADTs).

1 is probably not too bad in practice. A possible way to address 2 is to keep more type information for bound identifiers at the Clambda level, or to keep track of "GADT opening" explicitly in this representation (as a special kind of branch). To illustrate the dangerous case, consider:

type _ is_float =
  | Yes: float is_float
  | No: 'a is_float

let f (type t) (b : t is_float) (g : unit -> t) : t =
  let x = g () in
  let y : t =
    match b with
    | Yes -> x +. 1.
    | No -> x
  in
  match b with
  | Yes -> y *. 1.
  | No -> y

which results in:

(function camlFoo__f_1259 (b/1261: val g/1262: val)
 (let
   (x/1263 (app (load val g/1262) 1a g/1262 val)
    y/1264
      (if (!= b/1261 1) x/1263 (alloc 2301 (+f (load float64u x/1263) 1.))))
   (if (!= b/1261 1) y/1264 (alloc 2301 (*f (load float64u y/1264) 1.)))))

and in this case, it would be incorrect to unbox y.

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 20, 2015

Comment author: @alainfrisch

I've created this branch:

alainfrisch/ocaml@unbox_earlier...alainfrisch:unbox_with_type

(on top of unbox_earlier) to experiment with keeping some type info about let-bound identifiers. In cmmgen, when translating a "let x = ... in ..." and "x" has been marked (by Translcore) to be a "boxed number", then the criterion for unboxing the bound identifier is relaxed; in case of a branch construct, only one branch needs to box its result to trigger the unboxing scheme (and not all branches).

@vicuna

This comment has been minimized.

Copy link
Author

commented Dec 4, 2015

Comment author: @alainfrisch

This commit:

alainfrisch@bcb5e35

eliminates boxing for the examples in this ticket. It remembers which let-bound identifiers are floats by marking them using a flag directly attached to the Ident.t (this seems to be the lighter way to pass the info down to cmm). This could be generalized to other types as well (and actually a single flag for all of them is enough; we just need to know that the identifier was not bound to an "unknown" type, i.e. a GADT existential).

@vicuna

This comment has been minimized.

Copy link
Author

commented Dec 9, 2015

Comment author: @alainfrisch

As part of the unbox_ref branch ( https://github.com/alainfrisch/ocaml/tree/unbox_ref ), I've also addressed the same issue differently by keeping the type info on the let node themselves.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 13, 2016

Comment author: @alainfrisch

Addressed by #336

The two examples now (in trunk) allocate only the final result, which is the expected behavior.

@vicuna vicuna closed this Jul 13, 2016

@vicuna vicuna added the back-end label Mar 14, 2019

@vicuna vicuna added this to the 4.03.1 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.