Further optimize IODeferred #3435

Merged


armanbilge
Member

@armanbilge armanbilge commented Feb 19, 2023

Just getting it into the record: unfortunately, benchmarks showed that this made no difference at best and was slower at worst. The idea was to cache the get: IO[A] value in the atomic reference instead of A, similar to what IOFiber does for join.

after

[info] Benchmark                   (count)   Mode  Cnt     Score   Error  Units
[info] DeferredBenchmark.cancel         10  thrpt   20    39.504 ± 1.192  ops/s
[info] DeferredBenchmark.cancel        100  thrpt   20     4.421 ± 0.127  ops/s
[info] DeferredBenchmark.cancel       1000  thrpt   20     0.401 ± 0.018  ops/s
[info] DeferredBenchmark.complete       10  thrpt   20    30.081 ± 3.468  ops/s
[info] DeferredBenchmark.complete      100  thrpt   20     7.963 ± 1.865  ops/s
[info] DeferredBenchmark.complete     1000  thrpt   20     1.374 ± 0.194  ops/s
[info] DeferredBenchmark.get            10  thrpt   20  1050.021 ± 9.871  ops/s
[info] DeferredBenchmark.get           100  thrpt   20   163.874 ± 1.474  ops/s
[info] DeferredBenchmark.get          1000  thrpt   20    16.155 ± 0.368  ops/s

before

[info] Benchmark                   (count)   Mode  Cnt    Score    Error  Units
[info] DeferredBenchmark.cancel         10  thrpt   20   42.230 ±  1.265  ops/s
[info] DeferredBenchmark.cancel        100  thrpt   20    4.343 ±  0.138  ops/s
[info] DeferredBenchmark.cancel       1000  thrpt   20    0.407 ±  0.016  ops/s
[info] DeferredBenchmark.complete       10  thrpt   20   31.948 ±  3.528  ops/s
[info] DeferredBenchmark.complete      100  thrpt   20    8.625 ±  1.620  ops/s
[info] DeferredBenchmark.complete     1000  thrpt   20    1.292 ±  0.267  ops/s
[info] DeferredBenchmark.get            10  thrpt   20  919.962 ± 90.952  ops/s
[info] DeferredBenchmark.get           100  thrpt   20  160.503 ±  1.369  ops/s
[info] DeferredBenchmark.get          1000  thrpt   20   20.408 ±  0.176  ops/s

@armanbilge armanbilge closed this Feb 19, 2023
@armanbilge armanbilge changed the title Further optimize IODeferred Attempt to optimize IODeferred Feb 19, 2023
@armanbilge armanbilge reopened this Apr 16, 2023
@armanbilge
Member Author

armanbilge commented Apr 16, 2023

Let's try this again, now that we're actually using this specialization lol. Benchmarks incoming ...

@armanbilge
Member Author

armanbilge commented Apr 17, 2023

Huh 🤔 complete got faster but get got much slower and that seems bonkers. This PR replaces an IO.defer(...) with a volatile read of an IO.pure(...).
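
For readers following along, here is a minimal sketch of the two read paths being compared. It is not the actual IODeferred source: the class names ValueCell and CachedCell are hypothetical, complete is a plain method rather than an effect, and the "not yet completed" case simply fails instead of suspending the fiber. It only illustrates the difference between building the result inside IO.defer(...) on every run and handing back an IO.pure(...) that was stored in the atomic reference at completion time.

```scala
import java.util.concurrent.atomic.AtomicReference
import cats.effect.IO
import cats.effect.unsafe.implicits.global

// Hypothetical variant 1: the reference holds the raw value.
// Every run of `get` goes through IO.defer, re-reads the reference,
// and allocates a fresh IO.pure for the result.
final class ValueCell[A] {
  private val state = new AtomicReference[Option[A]](None)

  // None is a singleton, so the reference comparison in compareAndSet works.
  def complete(a: A): Boolean = state.compareAndSet(None, Some(a))

  def get: IO[A] = IO.defer {
    state.get() match {
      case Some(a) => IO.pure(a)
      // Assumption for this sketch: fail instead of waiting for completion.
      case None => IO.raiseError(new IllegalStateException("empty"))
    }
  }
}

// Hypothetical variant 2: the reference holds the IO itself.
// `complete` stores IO.pure(a) once; `get` is just a volatile read.
final class CachedCell[A] {
  // Placeholder for the "empty" state; a real Deferred would suspend here.
  private val empty: IO[A] = IO.raiseError(new IllegalStateException("empty"))
  private val state = new AtomicReference[IO[A]](empty)

  def complete(a: A): Boolean = state.compareAndSet(empty, IO.pure(a))

  def get: IO[A] = state.get()
}

object Demo extends App {
  val v = new ValueCell[String]
  val c = new CachedCell[String]
  v.complete("done")
  c.complete("done")
  println(v.get.unsafeRunSync()) // prints "done"
  println(c.get.unsafeRunSync()) // prints "done"
}
```

Note that in this simplified sketch the two variants read the reference at different times (when the effect runs versus when get is called), so the comparison is only meaningful for reads after completion.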

This PR

Benchmark                   (count)   Mode  Cnt    Score    Error  Units
DeferredBenchmark.cancel         10  thrpt   20   44.488 ±  2.784  ops/s
DeferredBenchmark.cancel        100  thrpt   20    5.303 ±  0.117  ops/s
DeferredBenchmark.cancel       1000  thrpt   20    0.449 ±  0.018  ops/s
DeferredBenchmark.complete       10  thrpt   20   32.755 ±  5.155  ops/s
DeferredBenchmark.complete      100  thrpt   20    8.012 ±  1.811  ops/s
DeferredBenchmark.complete     1000  thrpt   20    1.367 ±  0.044  ops/s
DeferredBenchmark.get            10  thrpt   20  702.181 ± 16.228  ops/s
DeferredBenchmark.get           100  thrpt   20   74.141 ±  3.685  ops/s
DeferredBenchmark.get          1000  thrpt   20    7.310 ±  0.139  ops/s

series/3.x

Benchmark                   (count)   Mode  Cnt     Score    Error  Units
DeferredBenchmark.cancel         10  thrpt   20    44.243 ±  1.105  ops/s
DeferredBenchmark.cancel        100  thrpt   20     4.598 ±  0.179  ops/s
DeferredBenchmark.cancel       1000  thrpt   20     0.433 ±  0.025  ops/s
DeferredBenchmark.complete       10  thrpt   20    22.119 ±  7.781  ops/s
DeferredBenchmark.complete      100  thrpt   20     7.368 ±  1.470  ops/s
DeferredBenchmark.complete     1000  thrpt   20     0.970 ±  0.284  ops/s
DeferredBenchmark.get            10  thrpt   20  1639.649 ± 60.727  ops/s
DeferredBenchmark.get           100  thrpt   20   258.978 ± 12.768  ops/s
DeferredBenchmark.get          1000  thrpt   20    28.177 ±  0.913  ops/s

@armanbilge
Member Author

There we go :)

Benchmark                    (count)   Mode  Cnt     Score     Error  Units
DeferredBenchmark.cancel          10  thrpt   20    47.890 ±   0.389  ops/s
DeferredBenchmark.cancel         100  thrpt   20     4.971 ±   0.110  ops/s
DeferredBenchmark.complete        10  thrpt   20    30.252 ±   5.486  ops/s
DeferredBenchmark.complete       100  thrpt   20     7.318 ±   2.162  ops/s
DeferredBenchmark.getAfter        10  thrpt   20  3532.888 ± 101.376  ops/s
DeferredBenchmark.getAfter       100  thrpt   20   985.239 ±  19.928  ops/s
DeferredBenchmark.getBefore       10  thrpt   20  1753.401 ±  17.655  ops/s
DeferredBenchmark.getBefore      100  thrpt   20   294.561 ±  13.756  ops/s

@armanbilge armanbilge changed the title Attempt to optimize IODeferred Further optimize IODeferred Apr 17, 2023
@durban
Contributor

durban commented Apr 18, 2023

Is there a reason this can't be for series/3.4.x?

@armanbilge
Member Author

Good question, technically no, but there are some existing conflicts between the branches due to asyncCheckAttempt that we've hit before, and I didn't feel like flaring them up again /shrug. Also 3.5.x is around the corner ... right? 😛

Member

@djspiewak djspiewak left a comment

Nicely done! :-)

@djspiewak djspiewak merged commit 6b20063 into typelevel:series/3.x Apr 18, 2023
35 checks passed