Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix a caching problem in OO code causing a 4.11 performance regression #10096

Merged
merged 3 commits into from Dec 26, 2020

Conversation

gasche
Copy link
Member

@gasche gasche commented Dec 22, 2020

The regression was introduced by #9096 in this part of the diff:

             | Tcf_method (name, _, Tcfk_concrete (_, exp)) ->
-                let met_code = msubst true (transl_exp exp) in
+                let scopes = enter_method_definition ~scopes name.txt in
+                let met_code =
+                  msubst true (transl_scoped_exp ~scopes exp) in

transl_exp would use Translobj.oo_wrap on functions, while the new
transl_scoped_exp does not.

The regression was first reported by @giltho in #10092, as it
results in a large slowdown (2x slower) on the Gillian codebase, which
uses an object-oriented visitor in a performance-critical way.

The regression was introduced by ocaml#9096 in this part of the diff:

             | Tcf_method (name, _, Tcfk_concrete (_, exp)) ->
-                let met_code = msubst true (transl_exp exp) in
+                let scopes = enter_method_definition ~scopes name.txt in
+                let met_code =
+                  msubst true (transl_scoped_exp ~scopes exp) in

transl_exp would use Translobj.oo_wrap on functions, while the new
transl_scoped_expr does not.

The regression was first reported by Sacha Ayoun in ocaml#10092, as it
results in a large slowdown (2x slower) on the Gillian codebase, which
uses an object-oriented visitor in a performance-critical way.
@gasche
Copy link
Member Author

gasche commented Dec 23, 2020

Here is a microbenchmark that reproduces the problem:

let test n =
  let obj = object
    method add42 i = 42 + i
  end in
  obj#add42 n

let n_iters = int_of_string Sys.argv.(1)

let () =
  for i = 1 to n_iters do
    ignore (test i)
  done

On my machine, ./oo.opt 10_000_000 takes

  • 0.20s on 4.10 (or with this PR)
  • 0.87s on 4.11 (or trunk)

In this microbenchmark, class initialization takes more time than actual usage of the object. I suppose that this corresponds to real-world use-cases where object-oriented visitors are created locally within traversal functions, but then used on very small inputs most of the time.

@garrigue
Copy link
Contributor

I see another occurrence of transl_scoped_exp inside Translcore.transl_bound_exp.
Is that one safe? It seems to me that it is again going to miss a call to oo_wrap.

@lpw25
Copy link
Contributor

lpw25 commented Dec 25, 2020

This PR fixes transl_scoped_exp so other calls to it should be fine.

The underlying problem was that transl_scoped_exp called straight into transl_function when it saw a function rather than going through transl_exp. This is so that it skips the logic in transl_exp0 that adds an additional scope when it sees a function, but it also skipped the logic for calling Translobj.oo_wrap by mistake.

@gasche
Copy link
Member Author

gasche commented Dec 25, 2020

The fix adds an oo_wrap in transl_scoped_exp itself, in such a way that transforming transl_exp in transl_scoped_exp should preserve the amount of oo-wrapping. In particular, the direct call to transl_scoped_exp which I pointed to is fixed, but transl_bound_exp (the other callsite for transl_scoped_exp) also gains the property that its amount of wrapping is equivalent to transl_exp, so its two calls callsites are now also safe. I believe that the regression observed in the wild (and my repro case) came from the callsite I pointed at, but the uses of transl_bound_exp may have been cause for regressions of their own (I am not sure).

@gasche
Copy link
Member Author

gasche commented Dec 25, 2020

Excellent, we now have two reviewers for this PR!

Copy link
Contributor

@lpw25 lpw25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the bug, but there is still a call to:

List.iter (Translattribute.check_attribute e) e.exp_attributes;

is transl_exp that is being inadvertently left out by transl_scoped_exp.

This pattern seems quite error prone. I think replacing transl_scoped_exp by an additional boolean parameter on transl_exp would be better. Perhaps renaming transl_exp to transl_exp1 and then having transl_exp and transl_scoped_exp that call into it with the boolean parameter set to true and false respectively.

@gasche
Copy link
Member Author

gasche commented Dec 25, 2020

@lpw25 Thanks for the suggestion. I pushed a new commit in the direction that you suggest. This makes the code nicer, but the change is also more invasive, so I guess it should be reviewed carefully.

Copy link
Contributor

@lpw25 lpw25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd probably have left transl_bound_exp as it was, but the patch is fine either way.

lambda/translcore.ml Show resolved Hide resolved
@gasche
Copy link
Member Author

gasche commented Dec 25, 2020

There is an interesting question here, asked at an interesting time of the year: do we want to backport this fix to 4.11 and make a new minor release? Personally I think this would be a good idea, given that (4.11 is still the last release and) we are fixing a performance regression that can be very large (2x slowdown) for some users workflow existing in the wild. (cc @Octachron)

@nojb
Copy link
Contributor

nojb commented Dec 25, 2020

There is an interesting question here, asked at an interesting time of the year: do we want to backport this fix to 4.11 and make a new minor release? Personally I think this would be a good idea, given that (4.11 is still the last release and) we are fixing a performance regression that can be very large (2x slowdown) for some users workflow existing in the wild. (cc @Octachron)

I haven't checked, but I suspect LexiFi's codebase (which uses objects heavily but has not yet been ported to 4.11) would be impacted by this. Based on that, I would be in favour of backporting to 4.11, even if there is no minor release done.

@gasche
Copy link
Member Author

gasche commented Dec 25, 2020

What I will do is to include this change in a "4.11 maintenance branch" section of the Changes, and backport in the 4.11 branch. We can decide having a bugfix release later on.

@Octachron Octachron added this to the 4.12 milestone Dec 26, 2020
Changes Outdated Show resolved Hide resolved
@Octachron
Copy link
Member

I agree that this seems worthwhile to backport this fix to both 4.12 and 4.11 .

@gasche gasche merged commit 9f53c6b into ocaml:trunk Dec 26, 2020
gasche added a commit that referenced this pull request Dec 26, 2020
fix a caching problem in OO code causing a 4.11 performance regression

(cherry picked from commit 9f53c6b)
gasche added a commit that referenced this pull request Dec 26, 2020
fix a caching problem in OO code causing a 4.11 performance regression

(cherry picked from commit 9f53c6b)
@gasche
Copy link
Member Author

gasche commented Dec 26, 2020

I cherry-picked in 4.11 ( 3e3d7f9 ) and 4.12 ( fc5061c ).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants