Optimise Kleisli with specialized `Function1` implementation #4211

bplommer · 2022-05-23T12:24:52Z

~~Adds an implementation of Kleisli that directly wraps a given F[B] rather than lifting it into a function.~~ Adds a `StrictConstFunction1` type to optimise Kleisli instances lifted from effectful values. This is intended to improve the performance of tagless final algebras running in Kleisli, and it is borne out by the cats-effect deep bind benchmark, lifted to Kleisli in the obvious way: relative to the Kleisli baseline ("before"), throughput with the current implementation increases by about 65% on async, about 54% on delay, and about 12% on pure.

cats-effect deep bind: IO

jmh:run -i 10 -wi 10 -f 2 -t 1 cats.effect.benchmarks.DeepBindBenchmark

Benchmark                (size)   Mode  Cnt      Score     Error  Units
DeepBindBenchmark.async   10000  thrpt   20   2961.801 ±  35.939  ops/s
DeepBindBenchmark.delay   10000  thrpt   20  11702.678 ± 429.010  ops/s
DeepBindBenchmark.pure    10000  thrpt   20  14763.333 ± 119.990  ops/s

cats-effect deep bind: Kleisli (before):

jmh:run -i 5 -wi 5 -f 1 -t 1 cats.effect.benchmarks.DeepBindBenchmark

Benchmark                (size)   Mode  Cnt     Score    Error  Units
DeepBindBenchmark.async   10000  thrpt    5  1144.643 ±  5.384  ops/s
DeepBindBenchmark.delay   10000  thrpt    5  2844.455 ± 46.374  ops/s
DeepBindBenchmark.pure    10000  thrpt    5  4395.099 ± 12.977  ops/s

cats-effect deep bind: Kleisli (first implementation in this PR):

jmh:run -i 5 -wi 5 -f 1 -t 1 cats.effect.benchmarks.DeepBindBenchmark

Benchmark                (size)   Mode  Cnt     Score    Error  Units
DeepBindBenchmark.async   10000  thrpt    5  1825.018 ± 21.151  ops/s
DeepBindBenchmark.delay   10000  thrpt    5  4857.402 ± 26.344  ops/s
DeepBindBenchmark.pure    10000  thrpt    5  5459.568 ± 59.821  ops/s

cats-effect deep bind: Kleisli (current implementation in this PR, with `StrictConstFunction1`):

jmh:run -i 5 -wi 5 -f 1 -t 1 cats.effect.benchmarks.DeepBindBenchmark

Benchmark                (size)   Mode  Cnt     Score     Error  Units
DeepBindBenchmark.async   10000  thrpt    5  1892.768 ±  28.334  ops/s
DeepBindBenchmark.delay   10000  thrpt    5  4397.772 ± 254.866  ops/s
DeepBindBenchmark.pure    10000  thrpt    5  4908.705 ±  28.870  ops/s
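The quoted percentages can be re-derived from the "before" and "current implementation" tables above. The following is only a small arithmetic check; the object and method names are hypothetical and the scores are copied from those two tables.

```scala
object SpeedupCheck {
  // Scores in ops/s, copied from the "Kleisli (before)" table.
  val before: Map[String, Double] =
    Map("async" -> 1144.643, "delay" -> 2844.455, "pure" -> 4395.099)

  // Scores in ops/s, copied from the "current implementation" table.
  val current: Map[String, Double] =
    Map("async" -> 1892.768, "delay" -> 4397.772, "pure" -> 4908.705)

  // Relative throughput increase of `current` over `before`, in percent.
  def increasePercent(name: String): Double =
    (current(name) / before(name) - 1.0) * 100.0
}
```

Calling `SpeedupCheck.increasePercent` with `"async"`, `"delay"` and `"pure"` gives values close to the figures quoted in the description. The error margins in the tables are not taken into account here.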

bplommer · 2022-05-23T13:41:20Z

Thought of a better way to do this, by specializing Function1 rather than Kleisli - converting back to draft.

johnynek · 2022-05-23T18:08:40Z

core/src/main/scala/cats/data/Kleisli.scala

@@ -204,7 +209,7 @@ sealed private[data] trait KleisliFunctions {
   * }}}
   */
  def liftF[F[_], A, B](x: F[B]): Kleisli[F, A, B] =
-    Kleisli(_ => x)


I wonder if we can find other places in cats where we have _ => x and replace with StrictConstFunction1 profitably.

Maybe! But see #4211 (comment)
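As a rough illustration of the replacement discussed in this thread, a constant function can be given its own type so that composition is able to recognise it. This is a minimal sketch under assumptions, not the code merged in this PR: the body of `StrictConstFunction1` below is a guess at the general shape, and `StrictConstDemo`, `viaLambda` and `viaStrictConst` are hypothetical names.

```scala
// Hypothetical sketch; the class added by the PR may differ in detail.
final case class StrictConstFunction1[A](a: A) extends (Any => A) {
  // Ignores its argument and returns the already-evaluated value.
  def apply(arg: Any): A = a

  // Composing after a constant gives another constant, computed once up front.
  override def andThen[B](g: A => B): Any => B = StrictConstFunction1(g(a))

  // Whatever runs before a constant cannot affect its result.
  override def compose[A0](g: A0 => Any): A0 => A = this
}

object StrictConstDemo {
  // Stand-in for `_ => x`: an ordinary lambda that ignores its input.
  def viaLambda[A, B](x: B): A => B = _ => x

  // The same constant function, but with a type that composition can detect.
  def viaStrictConst[A, B](x: B): A => B = StrictConstFunction1(x)
}
```

Both constructors return functions with the same result for any input. The difference is that chaining `andThen` on the second one collapses to a single constant instead of building a chain of nested function calls, which is presumably where the benchmark gains come from.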

core/src/main/scala/cats/data/StrictConstFunction1.scala

bplommer · 2022-05-24T08:21:47Z

The benchmarks for this implementation (added to the bottom of the PR description) aren't actually as good as the one from b576136 that specialised Kleisli rather than Function1, which surprised me a bit (though they're still a big improvement) - I changed a few things though, so I need to try changing things more incrementally to see what's making the difference.

bplommer · 2022-05-24T10:43:14Z

On further thought, I don't think we can so blithely assume referential transparency as this PR does - for example we provide a MonadThrow instance for Future that can be lifted into Kleisli, which I'm fairly sure this will break. Maybe instead of making this a drop-in change we need to provide new constructors with stronger constraints - possibly Defer?

Unfortunately I don't have time to think this through just now. Any thoughts welcomed!

Edit: Defer isn't going to give the needed guarantee - it still allows for non-RT constructors, e.g. in Eval. It seems like we'd need a marker trait that guarantees lack of non-RT constructors.
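The referential-transparency concern can be made concrete with `Future`. The sketch below assumes that the optimised constant function applies `andThen` eagerly; `EagerConst` is a hypothetical stand-in for that behaviour, not the class from the PR, and `RtDemo` and `runTwice` are likewise illustrative names.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object RtDemo {
  // Hypothetical constant function whose andThen runs `g` once, at composition time.
  final case class EagerConst[A](a: A) extends (Any => A) {
    def apply(arg: Any): A = a
    override def andThen[B](g: A => B): Any => B = EagerConst(g(a))
  }

  // Composes a side-effecting Future continuation after the lifted value,
  // calls the composed function twice, and reports how often the effect ran.
  def runTwice(lift: Future[Int] => (Any => Future[Int])): Int = {
    val counter = new AtomicInteger(0)
    val step: Future[Int] => Future[Int] =
      _.map { n => counter.incrementAndGet(); n + 1 }
    val composed = lift(Future.successful(1)).andThen(step)
    Await.result(composed(()), 1.second)
    Await.result(composed(()), 1.second)
    counter.get()
  }

  // Plain lambda: the continuation runs on every call.
  def withLambda: Int = runTwice(x => _ => x)

  // Eager constant: the continuation runs once, when the functions are composed.
  def withEagerConst: Int = runTwice(x => EagerConst(x))
}
```

With the plain lambda the effect runs once per call, while the eager constant runs it once in total. For a referentially transparent `F` the two are indistinguishable, which is why the optimisation is only sound under the RT assumption.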

bplommer · 2022-05-24T14:35:03Z

Discussion on referential transparency is taking place on Discord at https://discord.com/channels/632277896739946517/633329569402978313/978609454465564752 - the consensus seems to be that we can assume RT, and there are no guarantees about what happens when that assumption is violated.

bplommer · 2022-05-25T12:07:10Z

Re-opening this for review - there's probably scope for some further optimisation but the benefits as they stand are pretty clear.

core/src/main/scala/cats/data/Kleisli.scala

bplommer added 3 commits May 23, 2022 13:09

Specialise Kleisli for LiftF

3cdd6b9

Fix doctest

0317323

Revert changes to Kleisli methods that are overridden for liftF

b576136

bplommer marked this pull request as ready for review May 23, 2022 13:05

Simplified Kleisli specialization

1ae8957

bplommer marked this pull request as draft May 23, 2022 13:40

bplommer added 3 commits May 23, 2022 15:08

Simplified Kleisli specialization - separate StrictConstFunction1, reduce diff

cfda4cc

Kleisli is final again

09334a6

Revert run -> apply

e9b9b5e

armanbilge added this to the 2.8.0 milestone May 23, 2022

johnynek reviewed May 23, 2022

Nested package declarations

79f4d61

bplommer added 3 commits May 24, 2022 16:25

Kleisli refinements

59e4bb4

Merge remote-tracking branch 'origin/main' into lifted-kleisli

09b47ba

add headers

9dcdb1c

bplommer marked this pull request as ready for review May 25, 2022 12:07

bplommer changed the title ~~Add specialised Kleisli implementation for Kleisli.liftF~~ Optimise Kleisli with specialized Function1 implementation May 25, 2022

bplommer commented May 25, 2022

johnynek previously approved these changes May 28, 2022

Revert incorrect change

81ff10c

bplommer dismissed johnynek’s stale review via 81ff10c May 29, 2022 12:59

johnynek approved these changes May 29, 2022

armanbilge approved these changes May 30, 2022

johnynek merged commit 0fd2fdb into typelevel:main May 30, 2022
