-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Honor inline always
even for functions with optional arguments and default values
#12526
Honor inline always
even for functions with optional arguments and default values
#12526
Conversation
Just tested with flambda : it doesn't seem to be impacted, the function is fully inlined (resulting in a constant), but surprisingly the code takes 0.9s to run. Here is the cmm code for the loop with flambda:
and here is the code with Closure (and this PR):
(I'm a bit puzzled at why this is twice faster than the flambda version...) |
Let's leave aside the "Just use flambda" answer and focus on this PR. As a last note, I can confirm the weird performance results (where the Flambda code appears faster but runs slower), and the answer is just alignment issues. Both versions end up producing the same code for the loop (the reference is not used, so its contents are not even computed; this is detected in the |
Well it's just that if the user asks to avoid inlining, it's better to honor the request. One might argue that it could be interpreted as "don't inline the body, but ok to inline the wrapper". But that doesn't fit immediately in the current code, which simply ignores the attribute (whatever his payload is) after the split (because we end up with several function, and the attribute is only supported for a single function). But I think that what is implemented in the current PR is more faithful to the author's request of not inlining. |
Yes, that's what Flambda does.
I'm not sure what you meant by that, but the attribute for the original function is faithfully propagated to the inner function. For Closure, this doesn't matter, as the only call to the inner function is in the wrapper and Closure can't inline the worker in the wrapper anyway.
I don't agree with that, and I don't think we should change something that is not broken without more evidence that it bothers users. |
Ok, I've pushed a commit so that the behavior only changes in the FTR, with Closure and a function with optional arguments (with default values):
I don't find that particularly useful, but I don't really care, and I can see the benefit of having a behavior slightly closer to flambda. |
@lthls : I added a changelog entry with you as a reviewer, but it's not clear to me whether you actually reviewed the code change or only agreed in principle with its description. Let me know! |
adf48aa
to
0803762
Compare
Do you think I should add a non-regression test (e.g. tracking memory allocation in my example)? I don't see any actual tests of inlining in the testsuite. |
inline always
even for functions with optional arguments and default values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a test for optimisations around optional arguments in testsuite/tests/asmcomp/optargs.ml
. You could add something with inlining attributes in this file if you have a test in mind.
And since the discussion here may prove interesting to readers of the code, you might want to add a reference to this PR in the comment that you've updated.
I'm fine with the PR as it currently stands though.
0803762
to
665f5ae
Compare
Thanks for reminding about the test I wrote :-) I've extended it to check non-regression for the problem fixed by this PR. I've rewritten the history of the branch to include the test first. This makes it easy to check that the test fails before the fix (and succeeds after!). The PR should be squashed-merged to avoid that commit that doesn't pass the testsuite. @lthls : if you are happy with the PR, can you merge it? |
154290d
to
d80896d
Compare
d80896d
to
3535c36
Compare
…default values (ocaml#12526) * Add a non regression test * Honor ocaml.inline even for function with option arguments and default values --------- Co-authored-by: Alain Frisch <alain@frisch>
(Note: this is only about the Closure middle-end; I haven't checked Flambda.)
Functions with optional arguments and default values are split into an inner function and wrapper, which is responsible for filling in missing arguments with their default values (and then tail-calling the inner function). The rationale is that the wrapper can then be inlined on call sites, avoiding in particular the need to wrap passed optional arguments with
Some
and deconstruct thatSome
immediately.However, we don't want this split in two cases:
inline always
==> we want to inline the whole function, not just the wrapper. (The way inlining currently works, after a split, the inner function is never inlined in the wrapper.)inline never
==> we should honor that, and not even inline the wrapper (which makes the split useless).This PR disables the split in these cases. One could try to be more clever, and also avoid the split if the inlining heuristics triggers on the whole function, but this is more tricky (the function size that determines if we are below the inlining threshold is computed on Clambda, but the split is done at the Lambda level). Honoring the
inline
attribute is simple and could be seen as a bug fix.Here is a micro-benchmark to illustrate the benefits of this PR:
With trunk, this takes 2.8s on my machine. With this PR, it takes 0.45s (6x faster), thanks to the float unboxing made possible by inlining.
Without the
inline
attributes, trunk and PR behave the same.With
inline never
, we have trunk->2.7s and PR->2.9s. The slowdown is intentional; it corresponds to that lack of inlining of the wrapper with the PR.Shall I create a non-regression test (perhaps based on observing Gc.allocated_bytes in the example above)?