New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize baseTypeIndex #5829

Merged
merged 3 commits into from May 3, 2017

Conversation

Projects
None yet
4 participants
@retronym
Member

retronym commented Apr 5, 2017

Before:

[info] Benchmark                                   (extraArgs)  (source)    Mode  Cnt     Score   Error  Units
[info] HotScalacBenchmark.compile                                         sample  350  1468.390 ± 5.570  ms/op

After:

[info] Benchmark                                   (extraArgs)  (source)    Mode  Cnt     Score   Error  Units
[info] HotScalacBenchmark.compile                                         sample  385  1421.226 ± 5.465  ms/op

Which is a 3% improvement.

@scala-jenkins scala-jenkins added this to the 2.12.3 milestone Apr 5, 2017

@lrytz

lrytz approved these changes Apr 5, 2017

Very nice!

@@ -27,7 +27,6 @@ private[reflect] trait SynchronizedOps extends internal.SymbolTable
trait SynchronizedBaseTypeSeq extends BaseTypeSeq {
override def apply(i: Int): Type = gilSynchronized { super.apply(i) }
override def rawElem(i: Int) = gilSynchronized { super.rawElem(i) }
override def typeSymbol(i: Int): Symbol = gilSynchronized { super.typeSymbol(i) }

This comment has been minimized.

@lrytz

lrytz Apr 5, 2017

Member
Found 1 binary incompatibilities (21 were filtered out)
=======================================================
 * method typeSymbol(Int)scala.reflect.internal.Symbols#Symbol in interface
   scala.reflect.runtime.SynchronizedOps#SynchronizedBaseTypeSeq does not have
   a correspondent in current version

This comment has been minimized.

@retronym

retronym Apr 5, 2017

Member

I can restore this, but remove the lock which hasn't been necessary since 2.12.0, due to other refactorings.

@retronym

This comment has been minimized.

Member

retronym commented Apr 19, 2017

After rebasing, I've had to modify two recently added tests that no longer induce a stub symbol error.

@lrytz

This comment has been minimized.

Member

lrytz commented Apr 19, 2017

cc @jvican

@lrytz

lrytz approved these changes Apr 19, 2017

@adriaanm

This comment has been minimized.

Member

adriaanm commented Apr 24, 2017

Needs a rebase?

!! 1205 - run/t5717.scala                           [output differs]
% diff /home/jenkins/workspace/scala-2.12.x-validate-test@2/test/files/run/t5717-run.log /home/jenkins/workspace/scala-2.12.x-validate-test@2/test/files/run/t5717.check
@@ -1 +1 @@
-error: error writing a/B: java.nio.file.FileSystemException t5717-run.obj/a/B.class: Not a directory
+error: error writing a/B: t5717-run.obj/a/B.class: t5717-run.obj/a is not a directory

retronym added some commits Mar 13, 2017

Rework baseTypeIndex to use linear, rather than binary search
This appears to be faster, most likely due to the cost of the
`Symbol.isLess`, which needs to force the info of element's
type symbols to compute `baseTypeSeqLen`.
More compact base type sequence lookup array
Refactors the previous commit to store symbol ids (ints),
rather than references to symbols, in the lookup array.

This halves the space consumed by the elements of the array.
For instance, a 13 element base type symbol array will fit
in a 64 byte cache line.
@retronym

This comment has been minimized.

Member

retronym commented Apr 30, 2017

I gathered the distribution of base type seq lengths in our benchmark.

https://docs.google.com/spreadsheets/d/1rfVo8dvW6ajH_AtX8jPWs_J_nmHthiAaKUSV1ULLLO0/edit?usp=sharing

@adriaanm suggested the possibility of a hybrid scheme where we use a binary search until we narrow down to a small range and then switch to a linear search of the symbol id array at smaller ranges.

@retronym

This comment has been minimized.

Member

retronym commented Apr 30, 2017

Before and after this change, compiling scalap and better-files from scala/compiler-benchmark, we see a .93x reduction in base type seqs created now that we don't use Symbol#isLess, which forced baseTypeSeq.

#base type seqs               : 22045
avg base type seq length      : 7.5
  of which for compound types : 7353 (33.4%)
  of which for typerefs       : 14370 (65.2%)
  of which for singletons     : 2 (0.0%)
#base type seqs               : 20534
avg base type seq length      : 7.7
  of which for compound types : 6418 (31.3%)
  of which for typerefs       : 13794 (67.2%)
  of which for singletons     : 2 (0.0%)

Old code, for reference:

    /** A total ordering between symbols that refines the class
     *  inheritance graph (i.e. subclass.isLess(superclass) always holds).
     *  the ordering is given by: (_.isType, -_.baseTypeSeq.length) for type symbols, followed by `id`.
     */
    final def isLess(that: Symbol): Boolean = {
      def baseTypeSeqLength(sym: Symbol) =
        if (sym.isAbstractType) 1 + sym.info.bounds.hi.baseTypeSeq.length
        else sym.info.baseTypeSeq.length
      if (this.isType)
        (that.isType &&
         { val diff = baseTypeSeqLength(this) - baseTypeSeqLength(that)
           diff > 0 || diff == 0 && this.id < that.id })
      else
        that.isType || this.id < that.id
    }

    /**
     *  @param sym the class symbol
     *  @return    the index of given class symbol in the BaseTypeSeq of this type,
     *             or -1 if no base type with given class symbol exists.
     */
    def baseTypeIndex(sym: Symbol): Int = {
      val bts = baseTypeSeq
      var lo = 0
      var hi = bts.length - 1
      while (lo <= hi) {
        val mid = (lo + hi) / 2
        val btssym = bts.typeSymbol(mid)
        if (sym == btssym) return mid
        else if (sym isLess btssym) hi = mid - 1
        else if (btssym isLess sym) lo = mid + 1
        else abort("sym is neither `sym == btssym`, `sym isLess btssym` nor `btssym isLess sym`")
      }
      -1
    }
@retronym

This comment has been minimized.

Member

retronym commented Apr 30, 2017

I've added a worksheet to the spreadsheet above that shows the spread of results of baseTypeIndex. 60% of the queries return -1, which means they'll incur the full cost of the search (linear or binary...). Maybe we could get fancy with a bloom filter to cheaply answer some of those ones from an even smaller chunk of memory, but I don't feel like its worth it at this stage.

@adriaanm

This comment has been minimized.

Member

adriaanm commented May 1, 2017

Could you also gather some data on compiling the std lib / compiler?

@retronym

This comment has been minimized.

Member

retronym commented May 2, 2017

60% of the queries return -1, which means they'll incur the full cost of the search

... however, the majority of those are are against very short base type sequences (90% of failed baseTypeIndex queries are have baseTypeSeq.lengh <= 7.

Out of curiousity, I checked what symbols were involved.

The most common tp.baseTypeSeq.typeSymbol(0) in queries where tp.baseTypeSeq.baseTypeIndex(sym) == -1 were:

3299 class Unit
3774 class Codec
3962 package <root>
4340 package scalax
4430 package rules
4788 trait CanBuildFrom
4888 package scalasig
6976 package tools
7086 package scalap
8163 class Float
8384 class Double
8445 class Any
16021 class Int
16252 class Object
17328 class Nothing
17547 class Long

The winner is Long, from the formal parameter type of implicits in Predef.

On the other side, the most common sym-s, where tp.baseTypeIndex(sym) == -1:

1062 trait TraversableLike
1153 trait ClassTag
1250 class Tuple2
1292 object Some
1442 object Statics
1528 trait Rule
1584 object Predef
1725 trait Seq
2354 class Object
2484 package runtime
2847 class Float
2852 class Char
2942 object ScalaRunTime
3138 class Short
3262 class Double
3608 class Long
3638 class Byte
5984 class Boolean
6120 package scala
6387 class String
14181 class Int
19686 class Array
19921 trait Function1
26255 class <byname>

Inspecting the base type sequence of a package class looks fishy, at least some of the cases there are eliminated by a simple change the superaccessors, which I've added to this pull request.

@retronym

This comment has been minimized.

Member

retronym commented May 2, 2017

I also benchmarked compiling the library before/after this change. The results exceeded the 3% speedup I reported above. I won't take the exact numbers (~7-10%) as more than a guide, as I'm even less sure about how to manage the run-to-run noise when in compiling larger codebases under JMH.

@retronym

This comment has been minimized.

Member

retronym commented May 2, 2017

My suggestion is that we merge this change without implementing the suggestion of hybrid binary/linear search, but keep an eye out in our ongoing profiling for remaining hotspots in baseTypeIndex that might be helped by that approach.

@adriaanm

This comment has been minimized.

Member

adriaanm commented May 2, 2017

Sounds good!

@retronym retronym merged commit 7038e34 into scala:2.12.x May 3, 2017

6 checks passed

cla @retronym signed the Scala CLA. Thanks!
Details
combined All previous commits successful.
integrate-ide [5092] SUCCESS. Took 10 s.
Details
validate-main [5790] SUCCESS. Took 72 min.
Details
validate-publish-core [5576] SUCCESS. Took 5 min.
Details
validate-test [4909] SUCCESS. Took 66 min.
Details
@retronym

This comment has been minimized.

Member

retronym commented May 15, 2017

I've collected the distributions of base type sequence lengths, and the length of the search in baseTypeIndex for the compiler and library and added them to the spreadsheet. They show a slightly longer tail of longer sequences.

FTR, here's the code I added to generate the extra statistics: https://github.com/scala/scala/compare/cf47d4c...retronym:faster/baseytypeindex-stats?expand=1

*** Cumulative statistics at phase jvm
...
  baseTypeSeq length          : 4: 4251, 3: 3754, 2: 2501, 5: 1561, 8: 1350, 7: 1242, 1: 762, 6: 546, 11: 437, 13: 420, 23: 351, 9: 343, 10: 263, 17: 215, 18: 208, 38: 205, 14: 202, 12: 202, 24: 191, 21: 136, 15: 135, 45: 134, 16: 129, 25: 110, 39: 102, 20: 92, 27: 81, 30: 76, 22: 69, 26: 68, 19: 59, 37: 55, 35: 41, 29: 30, 44: 28, 36: 26, 46: 23, 31: 22, 48: 20, 33: 18, 40: 16, 47: 14, 41: 13, 28: 13, 43: 12, 32: 7, 56: 6, 55: 5, 51: 5, 34: 3, 0: 1
...
  baseTypeIndex result        : -3: 92823, 1: 34970, 0: 29070, 4: 28029, -2: 26738, 2: 24711, -1: 22824, -6: 22189, 3: 17062, -5: 14507, -4: 11535, 6: 10879, -8: 10243, 5: 7603, -7: 5844, 8: 4070, 7: 3213, 15: 2793, 18: 2429, 20: 2186, 13: 2096, 33: 1888, -23: 1884, 16: 1865, 22: 1834, 10: 1817, -10: 1767, 11: 1707, 9: 1671, -29: 1575, 19: 1540, 35: 1478, 28: 1335, -12: 1196, -21: 1194, 27: 1106, 26: 1075, 43: 1072, 21: 1029, 12: 1017, 14: 973, -13: 930, -20: 862, -14: 816, -9: 702, 25: 644, -28: 604, 29: 547, -38: 518, 23: 507, 24: 428, 37: 424, 36: 395, -31: 383, 34: 339, 17: 337, -11: 311, 42: 242, -35: 239, -26: 237, 30: 233, 31: 231, -27: 202, 45: 200, 38: 197, -17: 183, -24: 179, 32: 139, -44: 118, 40: 111, -19: 108, -22: 96, -30: 96, 39: 77, -43: 74, -25: 71, -16: 70, 44: 61, 41: 55, -18: 50, 46: 46, -36: 45, -40: 28, -33: 27, -51: 20, -41: 18, 47: 15, -46: 13, -47: 12, -39: 10, -37: 9, -48: 8, -45: 6, -15: 5, -32: 4, 51: 2, 50: 1, -55: 1
time in backend
pbpaste | perl -pne 's/:/\t/g' | perl -pne 's/,/\n/g' | sort -n

@retronym retronym referenced this pull request Jun 7, 2017

Merged

Faster base class sets #2676

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment