Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala 2.13.2 Vector prepending performance regression #12027

Closed
oxbowlakes opened this issue Jun 4, 2020 · 7 comments · Fixed by scala/scala#9036
Closed

Scala 2.13.2 Vector prepending performance regression #12027

oxbowlakes opened this issue Jun 4, 2020 · 7 comments · Fixed by scala/scala#9036

Comments

@oxbowlakes
Copy link

oxbowlakes commented Jun 4, 2020

I have a mature application currently running scala 2.13.1 and performing its daily task comfortably with -Xmx4g and taking about 30-60s. If I make the change to compile/run using 2.13.2 (with no other changes) and run, it goes out of memory (even when increased to -Xmx16g) after about 5 minutes of running. If I capture a stack trace when running (and when it throws an OutOfMemoryError), it looks like this:

04-Jun-2020 18:30:19:915: [stdout - STDOUT] [1]: java.lang.OutOfMemoryError: Java heap space
        at scala.collection.immutable.VectorBuilder.advance1(Vector.scala:1518)
        at scala.collection.immutable.VectorBuilder.advance(Vector.scala:1512)
        at scala.collection.immutable.VectorBuilder.addArr1(Vector.scala:1478)
        at scala.collection.immutable.VectorBuilder.$anonfun$addVector$1(Vector.scala:1492)
        at scala.collection.immutable.VectorBuilder.$anonfun$addVector$1$adapted(Vector.scala:1492)
        at scala.collection.immutable.VectorBuilder$$Lambda$810/0x000000080103f840.apply(Unknown Source)
        at scala.collection.immutable.VectorStatics$.foreachRec(Vector.scala:1794)
        at scala.collection.immutable.VectorStatics$.foreachRec(Vector.scala:1800)
        at scala.collection.immutable.VectorBuilder.addVector(Vector.scala:1800)
        at scala.collection.immutable.VectorBuilder.addAll(Vector.scala:1502)
        at scala.collection.immutable.Vector.appendedAll0(Vector.scala:211)
        at scala.collection.immutable.Vector1.appendedAll0(Vector.scala:394)
        at scala.collection.immutable.Vector.appendedAll(Vector.scala:199)
        at scala.collection.immutable.Vector.appendedAll(Vector.scala:113)
        at scala.collection.SeqOps.concat(Seq.scala:192)
        at scala.collection.SeqOps.concat$(Seq.scala:192)
        at scala.collection.AbstractSeq.concat(Seq.scala:1152)
        at scala.collection.IterableOps.$plus$plus(Iterable.scala:723)
        at scala.collection.IterableOps.$plus$plus$(Iterable.scala:723)
        at scala.collection.AbstractIterable.$plus$plus(Iterable.scala:920)

This is a single Vector being built up by repeatedly prepending Vectors which would typically be of size 1 and occasionally of size 2. There would be around 250_000 such vectors.

Here is the reproduction.

Scala 2.13.1

Welcome to Scala 2.13.1 (OpenJDK 64-Bit Server VM, Java 13.0.3).
Type in expressions for evaluation. Or try :help.

scala> (1 to 1_000_000).foldLeft(Vector.empty[Int]) { (v, i) => Vector(i) ++ v }
res0: scala.collection.immutable.Vector[Int] = Vector(1000000, 999999, 999998, 999997, 999996, 999995, 999994, 999993, 999992, 999991, 999990, 999989, 999988, 999987, 999986, 999985...

Scala 2.13.2

Welcome to Scala 2.13.2 (OpenJDK 64-Bit Server VM, Java 13.0.3).
Type in expressions for evaluation. Or try :help.

scala> (1 to 1_000_000).foldLeft(Vector.empty[Int]) { (v, i) => Vector(i) ++ v }
<WAIT FOR A LONG TIME>
@oxbowlakes oxbowlakes changed the title Scala 2.13.2 Vector regression Scala 2.13.2 Vector prepending regression Jun 4, 2020
@oxbowlakes
Copy link
Author

Note that this does not get any better using explicit prependedAll:

Welcome to Scala 2.13.2 (OpenJDK 64-Bit Server VM, Java 13.0.3).
Type in expressions for evaluation. Or try :help.

scala> (1 to 1_000_000).foldLeft(Vector.empty[Int]) { (v, i) => v prependedAll Vector(i) }
<WAIT FOR A LONG TIME>

@retronym
Copy link
Member

retronym commented Jun 4, 2020

I'm taking a look to understand why this used to perform okay.

But you'll get better performance in both cases with something like:

Vector.from((1 to 1_000_000).reverseIterator.flatMap(i => Vector(i)))

Internally, that will create a single VectorBuilder, append all the subsequences, then create the final result Vector.

@retronym
Copy link
Member

retronym commented Jun 4, 2020

The case n if this.size < (n >>> Log2ConcatFaster) && suffix.isInstanceOf[Vector[_]] => used to kick in for this example, but the new implementation didn't bring that forward. 😕

  // appendAll (suboptimal but avoids worst performance gotchas)
  override def appendedAll[B >: A](suffix: collection.IterableOnce[B]): Vector[B] = {
    import Vector.{Log2ConcatFaster, TinyAppendFaster}
    if (suffix.iterator.isEmpty) this
    else {
      suffix match {
        case suffix: collection.Iterable[B] =>
          suffix.size match {
            // Often it's better to append small numbers of elements (or prepend if RHS is a vector)
            case n if n <= TinyAppendFaster || n < (this.size >>> Log2ConcatFaster) =>
              var v: Vector[B] = this
              for (x <- suffix) v = v :+ x
              v
            case n if this.size < (n >>> Log2ConcatFaster) && suffix.isInstanceOf[Vector[_]] =>
              var v = suffix.asInstanceOf[Vector[B]]
              val ri = this.reverseIterator
              while (ri.hasNext) v = ri.next() +: v
              v
            case _ => super.appendedAll(suffix)
          }
        case _ => super.appendedAll(suffix)
      }
    }
  }

@szeiger
Copy link
Member

szeiger commented Jun 4, 2020

Right, appendedAll0 should implement this check for the LHS, too, in case it's a degenerated append like in this case. It only checks the RHS at the moment. And prependedAll doesn't check at all.

@retronym retronym changed the title Scala 2.13.2 Vector prepending regression Scala 2.13.2 Vector prepending performance regression Jun 5, 2020
@retronym
Copy link
Member

retronym commented Jun 5, 2020

Thanks, @szeiger. I'll add the prependedAll special cases to my PR, too.

@SethTisue SethTisue added this to the 2.13.3 milestone Jun 5, 2020
@SethTisue
Copy link
Member

SethTisue commented Jun 5, 2020

@oxbowlakes great catch! kudos for isolating & reporting

@som-snytt
Copy link

Also thanks for the circumspect locution, "mature application", in lieu of, "old code".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment