Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance degradation on 4.07.0+flambda #7849

Open
vicuna opened this Issue Sep 15, 2018 · 6 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link
Collaborator

commented Sep 15, 2018

Original bug ID: 7849
Reporter: Eugene
Status: confirmed (set by @gasche on 2018-09-15T21:36:12Z)
Resolution: open
Priority: normal
Severity: tweak
Version: 4.07.0
Category: middle end (typedtree to clambda)

Bug description

Here is the code: https://pastebin.com/TfVzyzgC
If compiled with 4.05.0+flambda, it basically optimizes long (>>=) and map chain away.
On 4.07.0, however, it runs much longer and binary is bigger.

Steps to reproduce

$ ~/.opam/4.05.0+flambda/bin/ocamlopt t.ml -o test_05 -O3
$ ~/.opam/4.07.0+flambda/bin/ocamlopt t.ml -o test_07 -O3
$ time ./test_05
5000000100000000

real 0m0.170s
user 0m0.170s
sys 0m0.000s
$ time ./test_07
5000000100000000

real 0m1.293s
user 0m1.289s
sys 0m0.004s

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 15, 2018

Comment author: @gasche

I could reproduce the issue.

You need -O3 for the optimization to kick in, -O2 is not enough.

4.06.1 also performs the expected optimization, so the regression is between 4.06 and 4.07.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 15, 2018

Comment author: @gasche

Bisecting seems to indicate that the regression was introduced in

Flambda: Add [Closure_origin.t] to trace inlined functions (merged) (#1707)
#1707

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 16, 2018

Comment author: @lpw25

I haven't investigated in detail, but from the look of that code and the bisection, I would say that this is an expected regression. That commit is there to prevent the inliner getting into an infinite loop. It is difficult to tell the difference between the code in this example and code that causes an infinite loop.

I have a branch -- worked on in succession by two interns -- that rewrites how the inlining heuristics and limits work so that they have a predictable semantics. This branch should work properly on this kind of example, but it won't be ready until OCaml 4.09.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 16, 2018

Comment author: @lpw25

That commit is there to prevent the inliner getting into an infinite loop

I should clarify this part before someone suggests using a timer or a big counter.

That commit adjusts how we detect recursive function calls. Flambda's current semantics are supposed to limit the depth of recursive function calls to the (IIRC) "unrolling-depth" and there were cases before 4.07 where we would fail to detect the recursion. So the bug was that the unrolling depth was being ignored -- as a symptom of this bug we could also get infinite loops.

In this code, there are patterns which look like recursion to the recursion detection and so they are being artificially limited by the unrolling depth.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 17, 2018

Comment author: @damiendoligez

So the solution would be to use the -inline-max-unroll command-line option described in http://caml.inria.fr/pub/docs/manual-ocaml/flambda.html#speculation right?

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 17, 2018

Comment author: @lpw25

Possibly, although you might make things blow-up pretty quickly. I can't look at the example code right now due to a proxy (perhaps we should discourage using pastebin for reproduction cases), but IIRC it had genuine recursive calls mixed up with higher-order calls that look recursive to flambda. If you change the max unroll count then you're going to inline both of those cases. Same goes for annotating the calls with [@unroll n].

@vicuna vicuna added the middle-end label Mar 14, 2019

@nojb nojb added the flambda label Mar 16, 2019

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.