Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make concatenation of empty chunks 600 times faster #7396

Merged

Conversation

RafalSumislawski
Copy link
Contributor

Concatenation of empty Chunks is very slow. This is caused by Empty match {case Empty => ???} being an expensive operation. Usually pattern machine on case object is very cheap, because their default equals is cheap. Not in this case. Empty inherits equals from Chunk.

The focus of this PR is to improve the performance of concatenation of empty chunks. In addition to that I've extended the benchmarks. There's more that can be done, but ensuring that there are no performance regressions takes a lot of time designing benchmarks and running them for hours, so I scaled the scope of the PR down to this very simple change which ensures that concatenation of empty chunks is as fast as concatenation of non-empty chunks.

This PR extends benchmark coverage of Chunk concatenation.

Benchmarks

Pay attention to the improvement in the leftConcat0 and rightConcat0 benchmarks. No regressions in other benchmarks.

Before

[info] Benchmark                                       (size)   Mode  Cnt          Score         Error  Units
[info] ChunkConcatBenchmarks.balancedConcatOnce         10000  thrpt   10  194930350.454 ± 2798462.583  ops/s
[info] ChunkConcatBenchmarks.balancedConcatRecursive    10000  thrpt   10      16074.025 ±      69.190  ops/s
[info] ChunkConcatBenchmarks.leftConcat0                10000  thrpt   10          1.768 ±       0.004  ops/s
[info] ChunkConcatBenchmarks.leftConcat1                10000  thrpt   10       1173.516 ±       2.286  ops/s
[info] ChunkConcatBenchmarks.leftConcat10               10000  thrpt   10      16301.090 ±      39.158  ops/s
[info] ChunkConcatBenchmarks.rightConcat0               10000  thrpt   10          1.858 ±       0.004  ops/s
[info] ChunkConcatBenchmarks.rightConcat1               10000  thrpt   10       1142.122 ±       2.001  ops/s
[info] ChunkConcatBenchmarks.rightConcat10              10000  thrpt   10      15818.502 ±     568.069  ops/s

After

After
[info] Benchmark                                      (size)   Mode  Cnt          Score         Error  Units
[info] ChunkConcatBenchmarks.balancedConcatOnce        10000  thrpt   10  196492469.987 ± 3233479.713  ops/s
[info] ChunkConcatBenchmarks.balancedConcatRecursive   10000  thrpt   10      16834.564 ±     256.764  ops/s
[info] ChunkConcatBenchmarks.leftConcat0               10000  thrpt   10       1144.024 ±       3.648  ops/s
[info] ChunkConcatBenchmarks.leftConcat1               10000  thrpt   10       1181.615 ±       5.951  ops/s
[info] ChunkConcatBenchmarks.leftConcat10              10000  thrpt   10      16262.446 ±      72.468  ops/s
[info] ChunkConcatBenchmarks.rightConcat0              10000  thrpt   10       1143.104 ±       1.150  ops/s
[info] ChunkConcatBenchmarks.rightConcat1              10000  thrpt   10       1149.469 ±       1.610  ops/s
[info] ChunkConcatBenchmarks.rightConcat10             10000  thrpt   10      15900.734 ±     167.511  ops/s

Side note: This should significantly improve performance of ZValidated.validateAll (zio-prelude) and other operations on ZValidated, because concatenation of often empty Chunks of logs is involved in these operations.

Copy link
Contributor

@adamgfraser adamgfraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like just needs a fix for one linting error!

@RafalSumislawski
Copy link
Contributor Author

Fixed

Copy link
Contributor

@adamgfraser adamgfraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@shawjef3
Copy link
Contributor

shawjef3 commented Oct 6, 2022

@RafalSumislawski I think your benchmark results are reversed

@RafalSumislawski
Copy link
Contributor Author

@shawjef3 no, why? These are ops/s (higher == better). 1.768 ops/s before this improvement. 1144.024 ops/s after.

@shawjef3
Copy link
Contributor

shawjef3 commented Oct 6, 2022

@RafalSumislawski jeez right! I do my own benchmarks with the inverse unit so this always gets me.

@adamgfraser adamgfraser merged commit ad7da11 into zio:series/1.x Oct 6, 2022
@shawjef3
Copy link
Contributor

shawjef3 commented Oct 6, 2022

Adding

    if (isEmpty) {
      that
    } else if (that.isEmpty) {
      self
    } else {

to the beginning of Chunk#++ improves performance even more.

Before

[info] Benchmark                                      (size)   Mode  Cnt          Score         Error  Units
[info] ChunkConcatBenchmarks.leftConcat0               10000  thrpt    5       1463.820 ±      13.910  ops/s
[info] ChunkConcatBenchmarks.rightConcat0              10000  thrpt    5       1459.460 ±      45.090  ops/s

After

[info] Benchmark                           (size)   Mode  Cnt      Score      Error  Units
[info] ChunkConcatBenchmarks.leftConcat0    10000  thrpt    5  75810.357 ± 1133.715  ops/s
[info] ChunkConcatBenchmarks.rightConcat0   10000  thrpt    5  67140.985 ±  825.722  ops/s

@shawjef3
Copy link
Contributor

shawjef3 commented Oct 6, 2022

Sorry I missed that this was closed. I'll make another PR.

adamgfraser pushed a commit that referenced this pull request Oct 17, 2022
* Fix Gen.instant (#7325)

* Make concatenation of empty chunks 600 times faster (#7396)

* Extend benchmark coverage of Chunk concatenation

* Improve performance of concatenation of empty Chunks

* replace unused identifier with _

* Improve Chunk concatenation performance when either chunk is empty. (#7399)

* Update things for Scala Native (still no running tests) (#7409)

* Scala Native: run tests

* Native: Run scala-native-loop in OneShot#get

* Native: Test / fork := false

* ci.yaml: testPlatforms on Scala 3

* ci.yaml: testPlatforms 180 min timeout

* ci.yaml: testPlatforms 360 min timeout

* ci.yaml: testPlatforms on Scala 3

* Update versions

* Update versions

* Update munit

* Update things for Scala Native (still no running tests)

* Update

* Scala 2.12.16, 2.13.8

* fix merge

Co-authored-by: LUC DUZAN <strokyl@users.noreply.github.com>
Co-authored-by: Rafał Sumisławski <rafal.sumislawski@gmail.com>
Co-authored-by: shawjef3 <shawjef3@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants