
Optimize IO's encoding #90

Merged
merged 11 commits into from Nov 30, 2017

Conversation

7 participants
@alexandru
Member

alexandru commented Nov 28, 2017

Fixes #89 and improves performance.

This implementation is inspired by the internal encoding of the Monix Task — it's not an exact port because Monix's Task is cancellable and has to do more work because of that. Also some minor optimizations were left out for now until further benchmarks (e.g. having a separate state for IO.apply was left out), but the gist is here.
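The gist of such a trampolined encoding can be sketched as follows — a minimal, illustrative ADT plus run-loop, not the actual cats-effect or Monix implementation (the real `IO` has more states: `RaiseError`, `Async`, error-handling frames, etc.):

```scala
// Minimal sketch of a trampolined IO-like encoding (illustrative only).
sealed trait MiniIO[+A] {
  def flatMap[B](f: A => MiniIO[B]): MiniIO[B] = FlatMap(this, f)
}
final case class Pure[+A](a: A) extends MiniIO[A]
final case class Delay[+A](thunk: () => A) extends MiniIO[A]
final case class FlatMap[A, +B](source: MiniIO[A], f: A => MiniIO[B]) extends MiniIO[B]

// The run-loop keeps continuations in an explicit stack instead of
// nesting calls on the JVM stack, which is what keeps deep binds cheap.
def run[A](io: MiniIO[A]): A = {
  var current: MiniIO[Any] = io
  var stack: List[Any => MiniIO[Any]] = Nil
  while (true) {
    current match {
      case FlatMap(source, f) =>
        stack = f.asInstanceOf[Any => MiniIO[Any]] :: stack
        current = source
      case Pure(a) =>
        stack match {
          case Nil       => return a.asInstanceOf[A]
          case f :: rest => stack = rest; current = f(a)
        }
      case Delay(thunk) =>
        val a = thunk()
        stack match {
          case Nil       => return a.asInstanceOf[A]
          case f :: rest => stack = rest; current = f(a)
        }
    }
  }
  sys.error("unreachable")
}
```

Because the interpreter loops instead of recursing, both tail-recursive and deeply nested `flatMap` chains evaluate in constant JVM stack space.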

Benchmarking (last update: 2017-11-29)

The PR includes a JMH setup with benchmarks/vPrev and benchmarks/vNext as sub-projects for measuring the performance impact of changes, compared with whatever previous version we want.

In order to run the benchmarks one needs to execute the script:

./benchmarks/run-benchmarks-all

The results will be dumped in benchmarks/results.

ShallowBindBenchmark

This measures a plain tail-recursive flatMap loop. Previous version:

Benchmark                   (size)   Mode  Cnt     Score    Error  Units
ShallowBindBenchmark.async   10000  thrpt   20    49.247 ±  4.749  ops/s
ShallowBindBenchmark.delay   10000  thrpt   20  1589.975 ± 32.753  ops/s
ShallowBindBenchmark.pure    10000  thrpt   20  2556.298 ± 35.352  ops/s

PR changes:

Benchmark                   (size)   Mode  Cnt     Score    Error  Units
ShallowBindBenchmark.async   10000  thrpt   20    82.844 ± 14.171  ops/s
ShallowBindBenchmark.delay   10000  thrpt   20  3619.257 ± 49.135  ops/s
ShallowBindBenchmark.pure    10000  thrpt   20  6291.561 ± 52.802  ops/s

Over twice the throughput.
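The shape of the loop being measured can be sketched with the standard library's trampolined TailRec standing in for IO (illustrative names, not the benchmark's actual source):

```scala
import scala.util.control.TailCalls._

// Tail-recursive bind loop: the flatMap is the last operation in each
// step, so the run-loop's continuation stack stays shallow.
// (Sketch with TailRec standing in for IO; not the benchmark's code.)
def shallowLoop(i: Int, size: Int): TailRec[Int] =
  if (i < size) done(i + 1).flatMap(n => tailcall(shallowLoop(n, size)))
  else done(i)
```

For example, `shallowLoop(0, 10000).result` runs 10000 binds without growing the call stack.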

DeepBindBenchmark

This one measures a non-tail-recursive flatMap loop (like in issue #89). Previous version:

Benchmark                (size)   Mode  Cnt     Score     Error  Units
DeepBindBenchmark.async    3000  thrpt   20     2.704 ±   0.062  ops/s
DeepBindBenchmark.delay    3000  thrpt   20     4.354 ±   0.076  ops/s
DeepBindBenchmark.pure     3000  thrpt   20  2883.429 ± 103.256  ops/s

After PR changes:

Benchmark                (size)   Mode  Cnt     Score     Error  Units
DeepBindBenchmark.async    3000  thrpt   20   375.433 ±  19.089  ops/s
DeepBindBenchmark.delay    3000  thrpt   20  5485.506 ±  71.269  ops/s
DeepBindBenchmark.pure     3000  thrpt   20  7089.853 ± 111.027  ops/s

The differences are dramatic here, due to the memory usage of the old encoding on non-tail-recursive binds.
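The non-tail-recursive shape from issue #89 can be sketched the same way with TailRec (illustrative, not the benchmark source) — the recursive call happens before the bind, so continuations pile up and the interpreter must absorb them on the heap:

```scala
import scala.util.control.TailCalls._

// Non-tail-recursive bind: the recursion sits *inside* the source of
// the flatMap, so O(n) continuations accumulate; a stack-safe run-loop
// handles them on the heap, a naive one overflows.
// (Sketch with TailRec standing in for IO; not the benchmark's code.)
def deepSum(n: Int): TailRec[Long] =
  if (n == 0) done(0L)
  else tailcall(deepSum(n - 1)).flatMap(acc => done(acc + n))
```

For example, `deepSum(3000).result` evaluates 3000 left-nested binds safely.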

AttemptBenchmark

This one measures the performance of attempt, both for the happy path and for handling errors.

Previous version:

Benchmark                     (size)   Mode  Cnt     Score    Error  Units
AttemptBenchmark.errorRaised   10000  thrpt   20   349.944 ±  3.795  ops/s
AttemptBenchmark.happyPath     10000  thrpt   20  2282.834 ± 23.092  ops/s

After the PR changes:

Benchmark                     (size)   Mode  Cnt     Score    Error  Units
AttemptBenchmark.errorRaised   10000  thrpt   20  2150.224 ± 25.714  ops/s
AttemptBenchmark.happyPath     10000  thrpt   20  2284.496 ± 39.242  ops/s

The difference is dramatic when errors actually get handled.
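What attempt provides — materializing failures as values instead of unwinding through handlers — can be sketched with the standard library's Try (a stand-in only; IO's real attempt operates on the run-loop's error channel):

```scala
import scala.util.Try

// attempt-like operation: run a computation and reify its outcome as
// an Either instead of letting the exception propagate.
// (Sketch; not IO's actual implementation.)
def attemptSketch[A](thunk: => A): Either[Throwable, A] =
  Try(thunk).toEither

attemptSketch(40 + 2)                                  // Right(42)
attemptSketch[Int](throw new RuntimeException("boom")) // a Left(...)
```

Handling the Left branch in a subsequent flatMap is what the errorRaised case measures; the new encoding makes that path nearly as cheap as the happy path.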

HandleErrorBenchmark

This one measures the performance of handleErrorWith, which is optimized in the new version.

Previous version:

Benchmark                         (size)   Mode  Cnt    Score     Error  Units
HandleErrorBenchmark.errorRaised   10000  thrpt   20  282.673 ±   3.525  ops/s
HandleErrorBenchmark.happyPath     10000  thrpt   20  979.575 ± 130.618  ops/s

After PR changes:

Benchmark                         (size)   Mode  Cnt     Score    Error  Units
HandleErrorBenchmark.errorRaised   10000  thrpt   20  1939.384 ± 14.190  ops/s
HandleErrorBenchmark.happyPath     10000  thrpt   20  3068.821 ± 37.069  ops/s
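The semantics being measured can be sketched with Either standing in for a completed IO (a stand-in, not the real implementation — in the actual run-loop the happy-path win comes from being able to skip error-handler frames when popping the continuation stack, as the review comment below about filtering them out hints):

```scala
// handleErrorWith-like semantics: on success pass the value through,
// on failure invoke the handler. (Sketch with Either standing in for
// a completed IO; names here are illustrative.)
def handleErrorWithSketch[A](fa: Either[Throwable, A])(
    h: Throwable => Either[Throwable, A]): Either[Throwable, A] =
  fa.fold(h, a => Right(a))
```

On the happy path the handler `h` is never touched, which is why an encoding that can bypass registered handlers pays almost nothing for them.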

Moving Forward

More optimizations are possible, but at this point this provides a good baseline — other micro-optimizations can come in separate PRs, along with the proof that they work.

@alexandru alexandru changed the title from Optimize IO's encoding to WIP: Optimize IO's encoding Nov 28, 2017

@alexandru alexandru force-pushed the alexandru:new-io branch from 3c14c72 to 1bf9821 Nov 28, 2017

@alexandru alexandru changed the title from WIP: Optimize IO's encoding to Optimize IO's encoding Nov 28, 2017

@alexandru alexandru referenced this pull request Nov 28, 2017

Closed

Issues with recursion #89

@codecov-io

codecov-io commented Nov 28, 2017

Codecov Report

Merging #90 into master will increase coverage by 0.83%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #90      +/-   ##
==========================================
+ Coverage   85.93%   86.77%   +0.83%     
==========================================
  Files          19       20       +1     
  Lines         384      378       -6     
  Branches       21       27       +6     
==========================================
- Hits          330      328       -2     
+ Misses         54       50       -4

@alexandru alexandru changed the title from Optimize IO's encoding to WIP: Optimize IO's encoding Nov 28, 2017

alexandru added some commits Nov 28, 2017

}

/** Pops the next bind function from the stack, but filters out
* `Mapping.OnError` references, because we know they won't do

@mpilquist

mpilquist Nov 29, 2017

Member

s/Mapping.OnError/IOFrame.ErrorHandler/g

@mpilquist mpilquist self-requested a review Nov 29, 2017

@mpilquist

This looks great. Interested to see some updated benchmarks.

@pchlupacek

Collaborator

pchlupacek commented Nov 29, 2017

@alexandru this is excellent news. Btw, does the new IO encoding support flatMapping over the result, i.e. flatMap { result: Either[Throwable, A] => ??? }?

@alexandru

Member

alexandru commented Nov 29, 2017

@pchlupacek the new internal encoding is optimized for flatMap-ing over both successful values and errors, however we are not exposing it in the API.

Monix's Task has a transformWith[B](f: A => Task[B], g: Throwable => Task[B]): Task[B] operator. I would like that in IO as well for efficiency reasons, but this PR is already too big, so I'd propose it in a separate PR.
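For reference, the semantics of such a transformWith can be emulated today via attempt plus flatMap, at the cost of an intermediate Either allocation — sketched here with Either standing in for an already-evaluated IO (hypothetical naming; this operator is not part of IO's API in this PR):

```scala
// Hypothetical transformWith semantics, with Either standing in for a
// completed IO[A]: dispatch to f on success, to g on failure. A real
// IO#transformWith would avoid the intermediate Either that
// attempt + flatMap has to allocate.
def transformWith[A, B](fa: Either[Throwable, A])(
    f: A => Either[Throwable, B],
    g: Throwable => Either[Throwable, B]): Either[Throwable, B] =
  fa.fold(g, f)
```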

@pchlupacek

Collaborator

pchlupacek commented Nov 29, 2017

@alexandru yeah, it would be excellent to have that flatMap-ish design if possible, with the same performance as the normal flatMap.

@alexandru

Member

alexandru commented Nov 29, 2017

@pchlupacek note that these changes make handleErrorWith much, much lighter for the happy path.

A transformWith would be better still, but even without it you should see big improvements with this PR for defensive code.

See the HandleErrorBenchmark, which is relevant here.

@alexandru alexandru changed the title from WIP: Optimize IO's encoding to Optimize IO's encoding Nov 29, 2017

@alexandru alexandru requested a review from djspiewak Nov 29, 2017

@djspiewak

Collaborator

djspiewak commented Nov 29, 2017

@alexandru This looks very impressive at first glance. Thanks for taking the time to do this! I won't have a chance to review it potentially for a few days. @mpilquist has already given his stamp of approval, so if you guys want to move forward, feel free to do so. :-) Otherwise, I'll get to it asap.

@non

Member

non commented Nov 29, 2017

👍 This looks great, thanks @alexandru!

@mpilquist

Member

mpilquist commented Nov 30, 2017

@alexandru Could you take a look at the compilation failure on the 2.10 build? Once that's fixed, I think we can go ahead and merge this given Daniel's response from yesterday.

@mpilquist mpilquist merged commit 964e8d0 into typelevel:master Nov 30, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@LukaJCB

Collaborator

LukaJCB commented Nov 30, 2017

Amazing work!

@alexandru

Member

alexandru commented Nov 30, 2017

@mpilquist thanks for the review and the merge.

@djspiewak when you have the time, please publish a hash version for testing purposes.

This might not be the last PR for performance optimizations. I'm tormenting myself with some profiling tools from Intel, with a UI made in 1994, and doing experiments, but as I said, it would be better to introduce further optimizations piecemeal, with some proof that they work.
