Compiler performance #322

Open
adriaanm opened this Issue Mar 2, 2017 · 10 comments

Member

adriaanm commented Mar 2, 2017

Discuss ideas for improving compiler performance in this issue. Break out detailed issues and use the t:performance label.

Measure

Tune

  • Experiment with changes to the compiler to improve performance by around 30%
  • Polish up these changes and apply them to 2.12.x or 2.13.x as appropriate
  • Find an optimal set of JVM parameters (e.g. -XX:MaxInlineLevel=18 seems to help)
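
When trying out JVM parameters, it helps to confirm that a flag actually reached the JVM running the compiler. The sketch below is illustrative only: `JvmFlagCheck` and `setsOption` are hypothetical names, and it assumes the flag was given on the command line (with scalac, JVM options are typically forwarded using a `-J` prefix, e.g. `-J-XX:MaxInlineLevel=18`).

```scala
import java.lang.management.ManagementFactory
// On 2.13+, scala.jdk.CollectionConverters._ is the non-deprecated equivalent.
import scala.collection.JavaConverters._

object JvmFlagCheck {
  /** True if any JVM argument explicitly sets the given -XX option,
    * e.g. option = "MaxInlineLevel" matches "-XX:MaxInlineLevel=18". */
  def setsOption(jvmArgs: Seq[String], option: String): Boolean =
    jvmArgs.exists(_.startsWith(s"-XX:$option="))

  def main(args: Array[String]): Unit = {
    // Arguments the current JVM was started with (not program arguments).
    val jvmArgs = ManagementFactory.getRuntimeMXBean.getInputArguments.asScala.toList
    println(s"MaxInlineLevel set explicitly: ${setsOption(jvmArgs, "MaxInlineLevel")}")
  }
}
```

This only reports whether the option was passed; whether it helps still has to be measured with a benchmark.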

adriaanm modified the milestone: 2.13 Mar 2, 2017

retronym self-assigned this Mar 27, 2017

Member

jvican commented Mar 28, 2017

The numbers reported in scala/scala#5785 (comment) are terrific. Looking forward to this.

Member

retronym commented May 18, 2017

Misc items to optimize:

Member

lrytz commented Jun 20, 2017

Drive-by comment: the unpickler traverses the index twice (https://github.com/scala/scala/blob/v2.12.2/src/reflect/scala/reflect/internal/pickling/UnPickler.scala#L82-L100). It may be more efficient to record the "children" and "symbolAnnotation" indices during the first traversal.
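
A rough sketch of that idea, using a toy model rather than the actual UnPickler code: the tag constants, `SinglePassIndex`, and the `readSymbol` / `readDeferred` callbacks are all hypothetical stand-ins. The first walk handles symbol entries and remembers where the "children" and "symbolAnnotation" entries are, so the follow-up step visits only those positions instead of the whole index.

```scala
import scala.collection.mutable.ArrayBuffer

object SinglePassIndex {
  // Hypothetical entry tags; the real pickle format defines its own constants.
  val SymbolTag   = 1
  val ChildrenTag = 2
  val SymAnnotTag = 3

  /** Walks the index once. Symbol entries are read immediately; positions of
    * children / symbol-annotation entries are recorded and read afterwards. */
  def readAll(tags: Array[Int], readSymbol: Int => Unit, readDeferred: Int => Unit): Unit = {
    val deferred = new ArrayBuffer[Int]
    var i = 0
    while (i < tags.length) {
      tags(i) match {
        case SymbolTag                 => readSymbol(i)
        case ChildrenTag | SymAnnotTag => deferred += i // remember for later
        case _                         => ()            // not needed up front
      }
      i += 1
    }
    // Only the remembered positions are visited here, not the full index.
    deferred.foreach(readDeferred)
  }
}
```

Whether this is a net win depends on how many entries are deferred, since the buffer itself costs an allocation per run.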

Member

lrytz commented Aug 7, 2017

Another misc item to optimize (better done after the backend refactoring scala/scala#6012 is merged): the ClassWriter uses ClassBType.jvmWiseLUB when computing stack map frames. The results are cached in the ASM ClassWriter, but we create a new ClassWriter for each class (and, looking at its implementation, instances cannot be reused), so the cache does not carry over between classes. It might make sense to cache the LUBs ourselves.
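
A minimal sketch of such a cache, assuming the LUB of two classes can be treated as a pure function of their internal names. `LubCache` and the `computeLub` parameter are hypothetical; in the compiler the computation would be whatever `jvmWiseLUB` does today. The point is only that one cache instance outlives the per-class ClassWriter.

```scala
import scala.collection.mutable

/** Memoizes LUB lookups across classes. Not thread-safe as written. */
final class LubCache(computeLub: (String, String) => String) {
  private val cache = mutable.HashMap.empty[(String, String), String]

  // Exposed only to make the effect of caching visible in this sketch.
  var misses = 0

  def lub(a: String, b: String): String =
    cache.getOrElseUpdate((a, b), {
      misses += 1
      computeLub(a, b)
    })
}
```

If class files are written in parallel, the map would need to be a concurrent one, or the cache would have to be kept per thread.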

Member

retronym commented Aug 18, 2017

Here are some options that disable parts of the compiler for maximum performance.

Option                   Downside
-no-specialization       More boxing at call sites using Tuple2, Function0/1/2, etc.
-opt:l:none              Default settings still run a DCE local optimization (why?)
-Yno-generic-signatures  javac won't understand Scala generics

Collectively, these bring compile times for the scalap benchmark down to 0.88x of the baseline. These are areas we could look at optimizing so they don't incur such a penalty in the first place.
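
For anyone who wants to try these in a build, a fragment along the following lines should work in sbt. This is a sketch, not a recommendation: the flag spellings are the ones listed above and may differ between compiler versions, and the downsides in the table apply to every module that uses the setting.

```scala
// build.sbt fragment: trade the features listed above for compile speed.
scalacOptions ++= Seq(
  "-no-specialization",      // more boxing at call sites
  "-opt:l:none",             // no optimizer levels enabled
  "-Yno-generic-signatures"  // javac will not see Scala generics
)
```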

Member

lrytz commented Aug 18, 2017

> Default settings still run a DCE local optimization (why?)

It's to avoid some ugly or unexpected bytecode. Without it,

    def f(b: Boolean) = { if (b) throw new Exception(); 1 }

gives

    ILOAD 1
    IFEQ L1
    NEW java/lang/Exception
    DUP
    INVOKESPECIAL java/lang/Exception.<init> ()V
    ATHROW
    NOP
    NOP
    ATHROW

Our code gen produces the following (seen by enabling AsmUtils.traceClassEnabled / traceClassPattern):

    ILOAD 1
    IFEQ L2
    NEW java/lang/Exception
    DUP
    INVOKESPECIAL java/lang/Exception.<init> ()V
    ATHROW
    GOTO L4
   L4
    ICONST_1
    IRETURN

But then the classfile writer replaces the GOTO with NOP; ... NOP; ATHROW (with the same bytecode size). See comment here.

You can get more NOPs, for example with def f = { ???; println("hi" + 1) }

Jason points out we could flag methods where we insert ATHROW after a genLoad of type Nothing, and run DCE only on those. We can use jardiff to see if there are other cases where DCE does some work.
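
The selective approach could look roughly like the toy model below. Everything here is hypothetical (`Insn`, `Method`, the `throwsAfterNothing` flag, and the simplistic `dce`); it is not the backend's data model. It only shows the shape of the idea: code gen sets a flag when it emits an ATHROW after loading a value of type Nothing, and dead code elimination is skipped for methods without the flag.

```scala
object SelectiveDce {
  sealed trait Insn
  case object AThrow extends Insn
  final case class Label(name: String) extends Insn
  final case class Other(text: String) extends Insn

  /** `throwsAfterNothing` stands in for the flag code gen would set. */
  final case class Method(name: String, insns: List[Insn], throwsAfterNothing: Boolean)

  /** Toy DCE: drops instructions that follow an ATHROW until the next label.
    * Assumes every label is reachable, which a real DCE would not. */
  def dce(insns: List[Insn]): List[Insn] = {
    var dead = false
    insns.filter {
      case _: Label => dead = false; true
      case AThrow   => val keep = !dead; dead = true; keep
      case _        => !dead
    }
  }

  /** Runs DCE only on flagged methods; others are returned untouched. */
  def optimize(m: Method): Method =
    if (m.throwsAfterNothing) m.copy(insns = dce(m.insns)) else m
}
```

In this model the unreachable `GOTO` from the example above is removed before the classfile writer sees it, so there is nothing left to turn into NOPs.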

Member

retronym commented Aug 21, 2017

> Jason points out we could flag methods where we insert ATHROW after a genLoad of type Nothing, and run DCE only on those.

scala/scala#6044

Member

adriaanm commented Oct 4, 2017

Various (ongoing/planned) efforts:

  • symbol table reuse
  • figure out why compiler is slower when run in sbt
  • (allow/document how to) disable specialization for code bases that don't need it
  • backend IO pipelining

More ambitious:

  • use something like dotty's type assigner after type checking instead of full typer (erasure, pattern matcher make heavy use of typer where a more focussed solution is possible)
  • mutable trees (reduce GC / tree copying)
  • optimize typing transforms (more lightweight modeling of scope than a full instance of typer)

Member

sjrd commented Oct 4, 2017

> (allow/document how to) disable specialization for code bases that don't need it

It would be really, really nice if we could disable specialization in the entire Scala.js ecosystem, including on the standard library itself.

densh commented Jan 22, 2018

> (allow/document how to) disable specialization for code bases that don't need it

It would be great to turn it off on Native as well.
