Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[patch] Avoid boxing float/int32/int64 when doing direct call #5894

vicuna opened this issue Jan 17, 2013 · 9 comments

[patch] Avoid boxing float/int32/int64 when doing direct call #5894

vicuna opened this issue Jan 17, 2013 · 9 comments


Copy link

@vicuna vicuna commented Jan 17, 2013

Original bug ID: 5894
Reporter: @chambart
Status: acknowledged (set by @damiendoligez on 2013-06-19T18:48:03Z)
Resolution: open
Priority: normal
Severity: feature
Target version: later
Category: back end (clambda to assembly)
Tags: patch
Related to: #5917
Monitored by: meyer tgazagna meurer @diml @jmeber @hcarty

Bug description

Functions taking floating points parameters always receive them boxed.
When a direct call is done, it is possible to specialise the call to
pass those parameters in registers. This also applies to boxed integers.
The return value can also be unboxed.

This allow avoiding all allocations in a function like

let (+) = Nativeint.add
let (-) = Nativeint.sub
let rec fib n =
if n < 2n then 0n + 1n else fib(n-1n) + fib(n-2n)

Additional information

This patch take care of not unboxing parameters when it is possible that this
could increase the number of allocation.

For instance a function like this

let bad (x:float) =
ignore [x] (* force boxing of x *)

bad will not be specialised because its parameter is used boxed inside its body.

In a case like:

let rec f x y =
let _ = x +. 1. in
let _ = y +. 1. in
h 1. x

and g a b =
let _ = a+.1. in
let _ = b+.1. in
f b 1.

and h c d =
let _ = c +. 1. in
let _ = d +. 1. in
let _ = g 1. c in
let _ = g 1. d in
bad d;

only the y parameter of f and a of g will be unboxed.

The return value will be unboxed if every possible returned value is unboxed.

For instance

x +. y is considered unboxed


if b
then x +. y
else List.hd l

is considered boxed because List.hd return an already boxed value.

Current problems:

  • Constants are considered boxed because they are already boxed at
    compile time, but this may not be the right default choice. It can still be
    circumvented by things like 0n + 1n in the fibonacci example.

  • Specialised functions code is duplicated:

    • It is possible to know that some functions will never be used in direct call:
      those can be compiled only one time
    • specialisation could be forbidden over a certain size
    • the generic version of function specialised only for parameters (not return
      value) could be compiled by adding a stub before the specialised version.

File attachments

Copy link

@vicuna vicuna commented Jan 17, 2013

Comment author: @gasche

Have you tested the efficiency of this optimization on some of the float-heavy benchmark code that started flowing in (from Alain for example)?

Specialised functions code is duplicated

It should be possible to have the boxed version just unbox, and then call the unboxed version; that would result in little additional compiled code. But this is only efficient if the inliner reliably removes this indirection layer (which amounts to unboxing at call-site).

Can the inliner be relied upon for this? How much does this degrade the performances in the case where no inlining can happen anyway (no .cmx, functor application...)?

(A way to work around the "not sure if boxing will be needed" problem is to specialize to a function that takes two arguments, the boxed and the unboxed data. This is better than the boxing version because you don't have to unbox at definition site, could actually not have much overhead if the unboxed data is passed efficiently, and guarantees that no additional allocation is necessary -- it's essentially a specialization of the technique of "On the runtime complexity of type-directed unboxing", Garrigue and Minamide, 1998. I don't expect it to work in practice, but you never know.)

Copy link

@vicuna vicuna commented Jan 18, 2013

Comment author: @alainfrisch

This is great news. I fully support this project!

will not be specialised because its parameter is used boxed inside its body.

I'm not sure this is a good criterion, although it is in line with the current intra-function behavior (don't unbox if there is a potential use site in boxed form). It could very well be the case that the parameter might be used boxed inside the body, but only in rare cases (e.g. to print an error message if some invariant is broken). If this disable the optimization, it means the caller might need to box the parameter.

Have you tried to combine your patch with the strategy I propose to deal with boxing (#5204), i.e. box "lazily" (on demand + memoization)?

Copy link

@vicuna vicuna commented Jan 21, 2013

Comment author: @chambart

On Alain's bench on my computer:
without patch:
time: 21.0s
minor_words: 2243147047
promoted_words: 450541

with patch (and a few forgoten cases present in my git branch):
time: 20.1s
minor_words: 1007129106
promoted_words: 238960

Notice that more than half the running time is taken in __ieee754_exp

The majority of other float heavy benchmarks I encountered, where written with
a big while loop such that nothing was boxed.

I will update the patch with a few more things: function parameters
where not considered as unboxable float so a function like

let f x = let a = if Random.bool () then x else x in a +.a

was compiled allocating x and loading from it.

This is fixed in my git branch:

testing it with opam:

opam remote add comp
opam switch install trunk+floatarg

No the inliner cannot (yet) generate direct calls so a code like:

let apply f x = f x
let g x = apply (+.) x

will generate a generic call to (+.) I have another patch in preparation
for that kind of things.

The main interest of avoiding boxing is to avoid some allocation, passing both
parameters would loose that. The cost of loading from the boxed value is not
that much and you would still have to pay for if at each function call (reloading
spilled registers).

I didn't merge both patch yet, I think that I will need to change the
stategy a bit. I am not completely sure of which cases will not be worth
unboxing, maybe when we can show that every possible path in the function
use the value boxed.

Off topic: How do I remove an attached file ? Is the bug reporter
supposed to have the rights to do that ?

Copy link

@vicuna vicuna commented Dec 4, 2015

Comment author: @alainfrisch

Will the many changes since Jan. 2013, and the ones coming with flambda, this is going to take some work bringing this up to date. But it is certainly something worth pushing forward.

Copy link

@vicuna vicuna commented Sep 6, 2017

Comment author: @alainfrisch

I've started to work on a similar approach: #1322

Copy link

@github-actions github-actions bot commented Apr 26, 2021

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Apr 26, 2021
Copy link

@gasche gasche commented Apr 26, 2021

So here is my understanding of the history of this work:

  • the 2013 patchset from @chambart went nowhere in the end, because it was never completely finished and pushed for upstreaming (... sounds a bit like the story of #608)
  • in September 2017 @alainfrisch's proposed a WIP PR #1322 inspired by those ideas; it relied on propagating more type information in the middle-end, then using them for unboxing decisions; there was a long discussion on many points, in particular about conflicts/overlaps with the then-ongoing work on flambda, and in October 2018 @alainfrisch closed that PR
  • @chambart remarked that he had worked on similar type propagation in #1029 (February 2017); Pierre did not push for this PR to be merged, it birotted away and went nowhere.
  • In November 2018 @alainfrisch proposed #2156, that would do both propagation of type information and function call parameter unboxing, based in large part on the #1029 patchset. @chambart reviewed it quickly, and requested to restrict the PR to type propagation, keeping additional unboxing for later. They both agreed (making this PR essentially a refresh of #1029) and it was merged in less than a month.
  • In November 2018 @alainfrisch sent a PR #2165 that would add unboxing not for function call parameters, but for static-catch parameter (they are related as we now have the "static function" optimization, also from Alain, that turn certain calls to local functions into static-raises). Actually it would first cleanup the unboxing strategy for let, and then work on static functions. There was a discussion of potential conflicts/overlaps with the then-ongoing work on debugging, and finally @alainfrisch was asked to split the PR in two, one part on let-unboxing (which stayed in this PR) and another on static-function unboxing (kept for later). @alainfrisch agreed, @lthls reviewed and the PR was merged in September 2019.
  • In June 2019 @alainfrisch sent the second part of #2165 as #8735 (why the number bump? I guess this coincide with the import of the Mantis bugtracker into Github issues), which was again reviewed by @lthls and merged in October 2019.

My conclusions from this history are as follows:

  • @chambart is surprisingly bad at getting his work merged upstream
  • @alainfrisch is surprisingly persistent to get his work merged upstream
  • we still don't have any unboxing of parameters to direct function calls (only for static/local functions and in general static raises), as proposed by @chambart in this pre-github issue and again by @alainfrisch in #1322.

I wonder if @alainfrisch is still planning to work on unboxing of direct call parameters? The criticisms in #1322 were varied:

  • This is less useful in presence of inlining (and now static function optimization). (I don't find this argument fully convincing.)
  • The optimization duplicate function bodies, maybe it should be opt-in to avoid bad consequences.
  • @lpw25 criticized ( #1322 (comment) ) some of the recent changes to let unboxing (not #2156, which came only later; possibly #336 ? ) which caused pessimizations for certain programs (as Xavier was warning about in the discussion), and I don't know if the issue has been addressed and is now resolved.

Copy link

@alainfrisch alainfrisch commented Apr 26, 2021

@gasche : you're (unsurprisingly) good at reconstructing and summarizing the history, thanks!

No, I don't have immediate plans to resume working on this topic. Mostly by lack of time; I still believe that supporting numerical code without forcing boxing on function boundaries is important. Btw, in the same vein, it could be good to support specialized calling conventions e.g. for functions returning multiple values (unallocated tuples). So perhaps a more general design would be needed.

Note that #1322 was already adapted to use an opt-in strategy (through attributes); also, it would also be possible to avoid duplication of bodies by having the generic version be implemented as a wrapper around the specialized one (except that it could add extra boxing).

@github-actions github-actions bot removed the Stale label Apr 28, 2021
Copy link

@github-actions github-actions bot commented May 2, 2022

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet

No branches or pull requests

3 participants