Cmm arithmetic optimisations #17

Closed
wants to merge 3 commits into
from

6 participants
@stedolan
Contributor

stedolan commented Feb 18, 2014

Some optimisations on Cmm expressions involving tagging or constants. Here's a (best-case) example:

let int_of_digits a b c = 
  100 * (Char.code a - Char.code '0') + 
   10 * (Char.code b - Char.code '0') +
    1 * (Char.code c - Char.code '0')

Trunk compiles this to:

 (+
   (+ (+ (* 200 (>>s (+ a/1020 -96) 1)) (* 20 (>>s (+ b/1021 -96) 1)))
     (* 2 (>>s (+ c/1022 -96) 1)))
   1)

This branch compiles this to:

(+ (+ (+ ( * a/1020 100) ( * b/1021 10)) c/1022) -10766)

Floating all of the constant operations, tags, etc. out of arithmetic means they can be merged into one addition.
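
To see where the single constant comes from, here is a small sketch that transcribes both Cmm expressions into plain OCaml over tagged words (an integer n is stored as 2n + 1). The helper names `tag`, `trunk`, `branch` and `agree_on_all_digits` are made up for this illustration and are not part of the patch. Multiplying a tagged word by 100 gives 200A + 100, so the merged constant should be -(9600 + 100) - (960 + 10) - (96 + 1) + 1 = -10766.

```ocaml
(* An OCaml int n is represented at the Cmm level by the tagged word 2n + 1. *)
let tag n = (2 * n) + 1

(* Trunk's output, transcribed over already-tagged arguments a, b, c. *)
let trunk a b c =
  (200 * ((a - 96) asr 1))
  + (20 * ((b - 96) asr 1))
  + (2 * ((c - 96) asr 1))
  + 1

(* This branch's output over the same tagged arguments. *)
let branch a b c = (a * 100) + (b * 10) + c - 10766

(* Compare both forms on every triple of decimal digit characters. *)
let agree_on_all_digits () =
  let digits = List.init 10 (fun d -> tag (Char.code '0' + d)) in
  List.for_all
    (fun a ->
      List.for_all
        (fun b -> List.for_all (fun c -> trunk a b c = branch a b c) digits)
        digits)
    digits
```

The check only covers digit characters on the host's native integers; it is meant as a sanity check of the arithmetic, not as a proof for every input.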

Cmm arithmetic optimisations.
Constant additions and tagging operations are moved out of
subexpressions when possible. Often they can be merged.
asmcomp/cmmgen.ml
match (c1, c2) with
| (Cconst_int n1, c2) -> mul_int (untag_int c1) (decr_int c2)
| (_, _) -> mul_int (decr_int c1) (untag_int c2)

@chambart

chambart Feb 19, 2014

Contributor

From the name of the function, it is not obvious that the result is the tagged result minus one. You should probably move the 'incr_int' here.
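
The point about "the tagged result minus one" can be illustrated on plain integers. This is a rough sketch with hypothetical helper names (`tag`, `untag`, `mul_tagged_minus_one`, `mul_tagged`), not the patch's code: untagging one operand and clearing the tag bit of the other gives 2ab after multiplication, and the final increment restores the tag.

```ocaml
(* Tagged representation: n is stored as 2n + 1. *)
let tag n = (2 * n) + 1
let untag t = t asr 1

(* Mirrors mul_int (untag_int c1) (decr_int c2): a * (2b) = 2ab,
   which is the tagged product minus one. *)
let mul_tagged_minus_one t1 t2 = untag t1 * (t2 - 1)

(* Adding one (the role of incr_int) yields the properly tagged product. *)
let mul_tagged t1 t2 = mul_tagged_minus_one t1 t2 + 1
```

Keeping the increment next to the multiplication, as the review suggests, means a reader never sees the intermediate "off by one" value under a name that hides it.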

asmcomp/cmmgen.ml
when n1 > 0 && n2 > 0 && n1 + n2 < size_int * 8 ->
Cop(Clsl, [c; Cconst_int (n1 + n2)])
| (Cop(Caddi, [c1; Cconst_int n1]), Cconst_int n2)
when no_overflow_lsl n1 n2 ->

@chambart

chambart Feb 19, 2014

Contributor

It is probably a good idea to check that n2 is positive (or do that in no_overflow_lsl), I am not certain that the behaviour of shift is the same on all architectures.
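
A guard along the lines suggested here might look like the sketch below. The body of the case under review is not shown above, so both the definition of `no_overflow_lsl` and the rewrite in `shift_sum` (a hypothetical name) are assumptions for illustration; the sketch also works on host integers rather than target machine words.

```ocaml
(* Hypothetical guard, not the patch's definition: n lsl k is considered safe
   when the shift amount is in range and no significant bit of n is lost. *)
let no_overflow_lsl n k =
  0 <= k && k < Sys.int_size && min_int asr k <= n && n <= max_int asr k

(* Assumed rewrite: (c + n) lsl k becomes (c lsl k) + (n lsl k), but only
   when the guard holds; otherwise no rewrite is proposed. *)
let shift_sum c n k =
  if no_overflow_lsl n k then Some ((c lsl k) + (n lsl k)) else None
```

Putting the sign check inside the guard means every caller gets it for free, which is one of the two options mentioned in the review comment.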

asmcomp/cmmgen.ml
@@ -133,16 +134,13 @@ let mul_int c1 c2 =
sub_int (Cconst_int 0) c
| (c, Cconst_int n) | (Cconst_int n, c) when n = 1 lsl Misc.log2 n->
Cop(Clsl, [c; Cconst_int(Misc.log2 n)])

@chambart

chambart Feb 19, 2014

Contributor

You should use lsl_int here
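
For context, the guard `n = 1 lsl Misc.log2 n` in the hunk above selects constants that are exact powers of two, so the multiplication can become a shift. A minimal sketch on plain integers follows; `log2` here is a local stand-in assumed to behave like the compiler's `Misc.log2` (floor of the base-2 logarithm), and `is_power_of_two` and `mul_by_const` are hypothetical names.

```ocaml
(* Stand-in for Misc.log2 (an assumption): floor of the base-2 logarithm. *)
let rec log2 n = if n <= 1 then 0 else 1 + log2 (n asr 1)

(* Same test as the guard in the hunk: true exactly for 1, 2, 4, 8, ... *)
let is_power_of_two n = n = 1 lsl log2 n

(* Multiply by a constant, using a shift when the constant allows it. *)
let mul_by_const x n = if is_power_of_two n then x lsl log2 n else x * n
```

In the compiler the shift would be built with the `lsl_int` smart constructor rather than a raw `Cop(Clsl, ...)`, so that any simplifications it performs also apply here.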

@stedolan

stedolan Feb 23, 2014

Contributor

Thanks for the quick response, sorry it took me a while to get back. I've fixed the things you pointed out.

A couple of days ago, you said that it might have O(n^2) behaviour. I'm fairly sure it doesn't: add_int only recurses when it actually finds constants, which will only happen at the root of an expression generated by these functions.

In a test, compiling a function f x = x + x + ... + x, I did observe O(n^2) runtimes. But add_int was only called O(n) times, and the problem exists on older versions of ocamlopt.
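
The recursion argument can be sketched on a toy expression type. This is not the Cmm type or the patch's `add_int`; `expr`, `add_const` and `add` are simplified stand-ins written for this illustration. The smart constructors keep any constant at the root of a sum, so `add` only recurses when an operand already has a constant at its root.

```ocaml
(* Toy expression language standing in for Cmm integer expressions. *)
type expr =
  | Const of int
  | Var of string
  | Add of expr * expr

(* Add a constant, merging it with a constant already at the root. *)
let rec add_const e n =
  if n = 0 then e
  else
    match e with
    | Const m -> Const (m + n)
    | Add (e', Const m) -> add_const e' (m + n)
    | _ -> Add (e, Const n)

(* Add two expressions, floating root constants outwards. The recursive
   calls happen only in the two cases that found a constant to float. *)
let rec add e1 e2 =
  match (e1, e2) with
  | Const n, e | e, Const n -> add_const e n
  | Add (a, Const n), b -> add_const (add a b) n
  | a, Add (b, Const n) -> add_const (add a b) n
  | _ -> Add (e1, e2)
```

For example, adding `x + 3` and `y + 4` built with these constructors yields `(x + y) + 7`, with one merged constant at the root.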

@chambart

chambart Feb 23, 2014

You can use Sys.word_size instead of redetecting it

@chambart

chambart Feb 23, 2014

Contributor

I know, I deleted the comment for that reason... The quadratic behaviour for a long addition is probably due to register allocation (or more precisely constraints generation).

Now this patch looks ok to me.

@vouillon

vouillon Oct 10, 2014

Member

It would be great to optimize logical operations as well.

@bobot

bobot Nov 11, 2014

Contributor

This PR was discussed at the last developers' meeting, which I had the pleasure of attending. To sum up:

  • Style: add a bar at the start of each match.
  • Overflow: seems good, no obvious mistake.
  • Use Sys.word_size.
  • For the record: this optimization can break other optimizations, such as let-introduction of shared terms, but it should be a general win.
  • OK for integration once the style and Sys.word_size points are fixed.
@DemiMarie

DemiMarie Dec 17, 2014

Contributor

Would it be possible for me to write & submit a fixed patch?

@stedolan

stedolan Dec 18, 2014

Contributor

@drbo no objection from me! Sorry I haven't had a chance to fix this up recently.

Coding style fixes, finally.
Add | on initial match case, and use Sys.word_size
@stedolan

stedolan Dec 29, 2014

Contributor

Finally fixed this, sorry everyone for the delay.

@gasche

gasche Feb 7, 2015

Member

Merged in trunk@15820, thanks!

@gasche gasche closed this Feb 7, 2015

bactrian pushed a commit that referenced this pull request Feb 7, 2015

Cmm arithmetic optimisations.
Constant additions and tagging operations are moved out of
subexpressions when possible. Often they can be merged.

From Stephen Dolan:
  #17

git-svn-id: http://caml.inria.fr/svn/ocaml/trunk@15820 f963ae5c-01c2-4b8c-9fe0-0dff7051ff02

nojb added a commit to nojb/ocaml that referenced this pull request Apr 12, 2015

bactrian pushed a commit that referenced this pull request Apr 18, 2015

stedolan added a commit to stedolan/ocaml that referenced this pull request Aug 18, 2015

mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Jul 1, 2016

lpw25 pushed a commit to lpw25/ocaml that referenced this pull request Aug 22, 2016

Merge pull request #17 from ocamllabs/clone
Added continuation cloning

@stedolan stedolan deleted the stedolan:linear-constant-opts branch Mar 10, 2017
