Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ByteString.toArray performance degradation in Scala 2.13 #28114

Closed
galbarm opened this issue Nov 4, 2019 · 16 comments · Fixed by #28116
Closed

ByteString.toArray performance degradation in Scala 2.13 #28114

galbarm opened this issue Nov 4, 2019 · 16 comments · Fixed by #28116
Assignees
Milestone

Comments

@galbarm
Copy link

@galbarm galbarm commented Nov 4, 2019

  println(s"${Instant.now} before")
  val b = new Array[Byte](1 * 1024 * 1024)
  var bb = ByteString.empty

  for (_ <- 0 to 1000){
    bb ++ b
  }

  println(s"${Instant.now} after")

The ouput of the code above in Scala 2.12 is:
2019-11-04T10:06:01.225Z before
2019-11-04T10:06:01.745Z after

in Scala 2.13 it is:
2019-11-04T10:06:33.833Z before
2019-11-04T10:06:36.729Z after

5.5 times more slow.

@galbarm

This comment has been minimized.

Copy link
Author

@galbarm galbarm commented Nov 4, 2019

An issue that is likely to be related:
akka/akka-http#2788

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Nov 4, 2019

Ouch, thanks for investigating and letting us know!

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Nov 4, 2019

Verified that it is around an order of magnitude slower in a jmh bench.

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Nov 4, 2019

@galbarm I cannot however find anywhere in the Akka HTTP WS infra (or elsewhere) calling ByteString.++(Array[Byte]), did you find it while profiling, can you point us to the use site?

@galbarm

This comment has been minimized.

Copy link
Author

@galbarm galbarm commented Nov 4, 2019

@johanandren I didn't find an exact spot calling ByteString.++(Array[Byte])

I witness 3 things:

  1. Slow server websocket speed only when using AkkaHttpServer via it's Play wrapper (play-akka-http-server) in Scala 2.13. Using standalone AkkaHttpServer, or Play wrapper over Netty did not have any issues.
    akka.http.play.WebSocketHandler currentFrameData ++= data in line 103 being the main suspect.
  2. ByteString & ByteIterator being the only classes having seperate code between Scala 2.12 & 2.13 (https://github.com/akka/akka/tree/master/akka-actor/src/main/scala-2.13%2B/akka/util)
  3. ByteString having at least one method (++) performing significantly more slowly in Scala 2.13

Left it for you to trying to connect the things together :)

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Nov 4, 2019

Both currentFrameData and data is a ByteString in https://github.com/playframework/playframework/blob/ab6697f494a000ba8d19768e9d04eed3bb8cdbf7/transport/server/play-akka-http-server/src/main/scala/akka/http/play/WebSocketHandler.scala#L103

There could perhaps be some other place where that is the case though, I think it'd be better to take a step back and profile the app on 2.13 and get concrete evidence about what is slow.

@plokhotnyuk

This comment has been minimized.

Copy link

@plokhotnyuk plokhotnyuk commented Nov 4, 2019

@johanandren could you please check if -XX:MaxInlineLevel=18 helps for Scala 2.13?

It can be related with scala/bug#11627

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Nov 4, 2019

@plokhotnyuk thanks for the tips, looks like in our case it is about overloads for toArray/copyToArray though.

@johanandren johanandren changed the title ByteString concat performance degradation in Scala 2.13 ByteString.toArray performance degradation in Scala 2.13 Nov 4, 2019
johanandren added a commit to johanandren/akka that referenced this issue Nov 4, 2019
johanandren added a commit to johanandren/akka that referenced this issue Nov 4, 2019
johanandren added a commit that referenced this issue Nov 6, 2019
johanandren added a commit to johanandren/akka that referenced this issue Nov 6, 2019
@johanandren johanandren added this to the 2.5.27 milestone Nov 6, 2019
@johanandren johanandren reopened this Nov 6, 2019
johanandren added a commit that referenced this issue Nov 19, 2019
@galbarm

This comment has been minimized.

Copy link
Author

@galbarm galbarm commented Dec 18, 2019

I'm trying to test with both Akka 2.5.27 & 2.6.1, but the issue seems to persist.
with akka-actor 2.6.1, the code

  println(s"${Instant.now} before")
  val b = new Array[Byte](1 * 1024 * 1024)
  var bb = ByteString.empty

  for (_ <- 0 to 1000){
    bb ++ b
  }

  println(s"${Instant.now} after")

still prints
2019-12-18T13:23:37.185Z before
2019-12-18T13:23:39.803Z after

While in Scala 2.12 it prints:
2019-12-18T13:24:30.926Z before
2019-12-18T13:24:31.394Z after

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Dec 18, 2019

I don't think we specifically did any work on ++, only on copyToArray which we could confirm that it degraded performance in Akka HTTP websockets. I'll take your reproducer linked from the Akka HTTP issue out for a spin and see what shows up in a profiler.

(And on what is going on with ++ as well)

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Dec 18, 2019

Doing a quick profiling of the sample app I can see that it uses an unfortunate path for ++ on 2.13, and we will improve that as it should be rather easy to do in a more optimal way, however, the sample app spends a very small percentage (~1.1%) on doing that so it is not what is dominating (and so is nothing else obvious, it's pretty evenly spread out over 1-2% tasks) that means fixing that won't give the web socket use case any fantastic speed ups.

Here's a flame graph of the app running a 30 second websocket stream of elements in from the client:
profile-play-2.7.svg.zip

@galbarm

This comment has been minimized.

Copy link
Author

@galbarm galbarm commented Dec 18, 2019

@johanandren I also took another spin for the websocket use case and I think I've found the problem.
Here's a code for demonstration:

object Main extends App{
  println(s"${Instant.now} before")

  for (_ <- 0 to 100){
    var bb = ByteString.empty
    for (_ <- 1 to 100){
      bb ++= ByteString(new Array[Byte](1000 * 100))
    }
    bb.toArray
  }

  println(s"${Instant.now} after")

}

The code above takes 1 sec to complete on Scala 2.12 and 7 secs on 2.13.
It is clearly being caused by having a ByteString composed of many smaller ones, and then invoking toArray

As for why it's related to the web sockets issue, in FrameEventParser.scala line 149:
val buffer = bytes.toArray[Byte]

With large ws messages, this is being called on a ByteString composed of many smaller ones.

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Dec 18, 2019

Ah, yes, I think Johannes eluded to that, the many small parts of a rope somewhere.

@johanandren

This comment has been minimized.

Copy link
Member

@johanandren johanandren commented Dec 20, 2019

Work on the composite bytestrings in progress in #28419

@galbarm

This comment has been minimized.

Copy link
Author

@galbarm galbarm commented Dec 29, 2019

@johanandren
Any ETA on this? Can I expect a fix in 2.6.2?
This is blocking our project from upgrading to Scala 2.13.

@raboof

This comment has been minimized.

Copy link
Member

@raboof raboof commented Dec 30, 2019

@galbarm that PR is looking pretty healthy so I think we can expect it to make it for 2.6.2 (or otherwise 2.6.3), likely somewhere in January.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.