Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance bug: pushing let-alias inside loop body #5205

Closed
vicuna opened this issue Jan 17, 2011 · 3 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Jan 17, 2011

Original bug ID: 5205
Reporter: @alainfrisch
Assigned to: @xavierleroy
Status: closed (set by @xavierleroy on 2012-09-25T18:38:37Z)
Resolution: fixed
Priority: high
Severity: minor
Fixed in version: 3.13.0+dev
Category: ~DO NOT USE (was: OCaml general)
Has duplicate: #5288
Monitored by: @ygrek till "Julien Signoles" @hcarty @Chris00 @alainfrisch

Bug description

ocamlopt inlines some local bindings that are pure and used once. However, the notion of "used once" is syntactic, which allows some non-trivial expressions (typically field projection) to be pushed within a loop body instead of being evaluated once outside the loop.

Here is an example:

=================================================================
type t1 = {x:int}
type t2 = {y:t1}

let f {y = {x = x1}} {y = {x = x2}} {y = {x = x3}} {y = {x = x4}} =
(*
let x1 = abs x1 in
let x2 = abs x2 in
let x3 = abs x3 in
let x4 = abs x4 in
*)

let r = ref 0 in
for i = 1 to 1000000 do
for j = 1 to 1000 do
r := x1 + x2 + x3 + x4;
done
done;
!r

let () =
let r = {y = { x = 2 }} in
Printf.printf "%i\n%!" (f r r r r)

On a Linux x64 (Intel Core i7, 2.80 GHz), this program takes 2.34 seconds
and only 1.18 seconds after uncommenting the first lines in f. The effect of these lines is to force the field accesses to be done outside the loop. (On simpler loops, the processor does an amazing job of removing the indirection overhead.)

The same problem appear because of pushing field accessed inside lambdas, e.g.:

=================================================================
let rec f () = r := x1 + x2 + x3 + x4 in
for i = 1 to 1000000 do
for j = 1 to 1000 do
f ()
done
done;

(Timing goes from 5s to 1.5s by enabling the 'let xi = abs xi' lines.)

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 17, 2011

Comment author: @alainfrisch

The faulty optimization is implemented in Simplif.simplify_lets. I suggest to keep track of the depth where let variables are bound (bumping the depth when entering a lambda or loop body) and used, and not inline aliases which are used at a deeper level than their definition.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 17, 2011

Comment author: @alainfrisch

Suggested fix attached.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 17, 2011

Comment author: @alainfrisch

As an extra benefit, the patch avoids keeping references to useless parts of data structures.

I'm wondering whether inlining let-aliases used only once is really a good heuristic. Isn't it generally better to load from memory early in order to improve pipelining? Has someone done any benchmark to see if the heuristic is useful?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.