-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[patch] Avoid boxing float/int32/int64 when doing direct call #5894
Comments
Comment author: @gasche Have you tested the efficiency of this optimization on some of the float-heavy benchmark code that started flowing in (from Alain for example)?
It should be possible to have the boxed version just unbox, and then call the unboxed version; that would result in little additional compiled code. But this is only efficient if the inliner reliably removes this indirection layer (which amounts to unboxing at call-site). Can the inliner be relied upon for this? How much does this degrade the performances in the case where no inlining can happen anyway (no .cmx, functor application...)? (A way to work around the "not sure if boxing will be needed" problem is to specialize to a function that takes two arguments, the boxed and the unboxed data. This is better than the boxing version because you don't have to unbox at definition site, could actually not have much overhead if the unboxed data is passed efficiently, and guarantees that no additional allocation is necessary -- it's essentially a specialization of the technique of "On the runtime complexity of type-directed unboxing", Garrigue and Minamide, 1998. I don't expect it to work in practice, but you never know.) |
Comment author: @alainfrisch This is great news. I fully support this project!
I'm not sure this is a good criterion, although it is in line with the current intra-function behavior (don't unbox if there is a potential use site in boxed form). It could very well be the case that the parameter might be used boxed inside the body, but only in rare cases (e.g. to print an error message if some invariant is broken). If this disable the optimization, it means the caller might need to box the parameter. Have you tried to combine your patch with the strategy I propose to deal with boxing (#5204), i.e. box "lazily" (on demand + memoization)? |
Comment author: @chambart On Alain's bench on my computer: with patch (and a few forgoten cases present in my git branch): Notice that more than half the running time is taken in __ieee754_exp The majority of other float heavy benchmarks I encountered, where written with I will update the patch with a few more things: function parameters let f x = let a = if Random.bool () then x else x in a +.a was compiled allocating x and loading from it. This is fixed in my git branch: testing it with opam: opam remote add comp https://github.com/chambart/opam-compilers-repository.git Gabriel:
will generate a generic call to (+.) I have another patch in preparation The main interest of avoiding boxing is to avoid some allocation, passing both Alain: Off topic: How do I remove an attached file ? Is the bug reporter |
Comment author: @alainfrisch Will the many changes since Jan. 2013, and the ones coming with flambda, this is going to take some work bringing this up to date. But it is certainly something worth pushing forward. |
Comment author: @alainfrisch I've started to work on a similar approach: #1322 |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
So here is my understanding of the history of this work:
My conclusions from this history are as follows:
I wonder if @alainfrisch is still planning to work on unboxing of direct call parameters? The criticisms in #1322 were varied:
|
@gasche : you're (unsurprisingly) good at reconstructing and summarizing the history, thanks! No, I don't have immediate plans to resume working on this topic. Mostly by lack of time; I still believe that supporting numerical code without forcing boxing on function boundaries is important. Btw, in the same vein, it could be good to support specialized calling conventions e.g. for functions returning multiple values (unallocated tuples). So perhaps a more general design would be needed. Note that #1322 was already adapted to use an opt-in strategy (through attributes); also, it would also be possible to avoid duplication of bodies by having the generic version be implemented as a wrapper around the specialized one (except that it could add extra boxing). |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
Coming back to this, I'd really love to see this work continue. I've come to realize that relying on flambda isn't enough - we need the normative performance to be good. Having bad floating-point performance by default is not a good place to be for a general purpose language. |
Original bug ID: 5894
Reporter: @chambart
Status: acknowledged (set by @damiendoligez on 2013-06-19T18:48:03Z)
Resolution: open
Priority: normal
Severity: feature
Target version: later
Category: back end (clambda to assembly)
Tags: patch
Related to: #5917
Monitored by: meyer tgazagna meurer @diml @jmeber @hcarty
Bug description
Functions taking floating points parameters always receive them boxed.
When a direct call is done, it is possible to specialise the call to
pass those parameters in registers. This also applies to boxed integers.
The return value can also be unboxed.
This allow avoiding all allocations in a function like
let (+) = Nativeint.add
let (-) = Nativeint.sub
let rec fib n =
if n < 2n then 0n + 1n else fib(n-1n) + fib(n-2n)
Additional information
This patch take care of not unboxing parameters when it is possible that this
could increase the number of allocation.
For instance a function like this
let bad (x:float) =
ignore [x] (* force boxing of x *)
bad will not be specialised because its parameter is used boxed inside its body.
In a case like:
let rec f x y =
let _ = x +. 1. in
let _ = y +. 1. in
h 1. x
and g a b =
let _ = a+.1. in
let _ = b+.1. in
f b 1.
and h c d =
let _ = c +. 1. in
let _ = d +. 1. in
let _ = g 1. c in
let _ = g 1. d in
bad d;
only the y parameter of f and a of g will be unboxed.
The return value will be unboxed if every possible returned value is unboxed.
For instance
x +. y is considered unboxed
but
if b
then x +. y
else List.hd l
is considered boxed because List.hd return an already boxed value.
Current problems:
Constants are considered boxed because they are already boxed at
compile time, but this may not be the right default choice. It can still be
circumvented by things like 0n + 1n in the fibonacci example.
Specialised functions code is duplicated:
those can be compiled only one time
value) could be compiled by adding a stub before the specialised version.
File attachments
The text was updated successfully, but these errors were encountered: