Replace dunai by automata/state machines #299

Merged (10 commits) on May 13, 2024

@turion (Owner) commented on Mar 11, 2024

Motivation

Unfortunately, two reasons force me to drop the dependency on dunai and replace it with an implementation of effectful Mealy state machines.

  1. dunai is incompatible with transformers 0.6, mtl 2.3, and GHC 9.6. See "Allow GHC 9.6" (#215) and "dunai: Does not build with transformers-0.6 because of ListT" (ivanperez-keera/dunai#402).
  2. There are fundamental performance issues with dunai. See "Space leak in rhine-bayes example program?" (#227) and "dunai: morphGS is probably inefficient" (ivanperez-keera/dunai#370).

Work done

This PR replaces dunai with effectful Mealy state machines in the initial encoding. The implementation is heavily inspired by https://github.com/lexi-lambda/incremental/blob/master/src/Incremental/Fast.hs and https://github.com/turion/essence-of-live-coding. It solves the two problems above:

  1. The automaton implementation is compatible with transformers 0.6. Notably, it does not support the old ListT.
  2. A benchmark is added which shows dramatic performance improvements: the automaton word count runs in about 85 ms, against about 900 ms for dunai.
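
To illustrate what "initial encoding" means here, the following is a minimal sketch, not the code from this PR. The names `Final`, `Automaton`, `arrA`, `sumFrom`, `>>>>` and `runList` are made up for this example. The final encoding is the continuation-based style of dunai's MSF; the initial encoding stores an explicit, existentially hidden state together with a non-recursive step function, which is presumably what gives GHC more room to inline and simplify.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Final encoding: each step returns the output together with the
-- continuation, i.e. the automaton to use for the next step.
newtype Final m a b = Final { stepFinal :: a -> m (b, Final m a b) }

-- Initial encoding (hypothetical type for this sketch): a hidden state `s`
-- and a step function that is not recursive in the automaton itself.
data Automaton m a b = forall s. Automaton s (s -> a -> m (s, b))

-- Lift a pure function. The state is trivial.
arrA :: Applicative m => (a -> b) -> Automaton m a b
arrA f = Automaton () (\() a -> pure ((), f a))

-- Running sum of the inputs, starting from the given value.
sumFrom :: Applicative m => Int -> Automaton m Int Int
sumFrom s0 = Automaton s0 (\s a -> let s' = s + a in pure (s', s'))

-- Sequential composition: the combined state is the pair of both states.
-- No recursion is needed, only the two step functions run in sequence.
(>>>>) :: Monad m => Automaton m a b -> Automaton m b c -> Automaton m a c
Automaton s1 f1 >>>> Automaton s2 f2 = Automaton (s1, s2) $ \(t1, t2) a -> do
  (t1', b) <- f1 t1 a
  (t2', c) <- f2 t2 b
  pure ((t1', t2'), c)

-- Feed a list of inputs through an automaton and collect the outputs.
-- All recursion lives in this single driver loop.
runList :: Monad m => Automaton m a b -> [a] -> m [b]
runList (Automaton s0 f) = go s0
  where
    go _ [] = pure []
    go s (a : rest) = do
      (s', b) <- f s a
      (b :) <$> go s' rest
```

With these definitions, `runList (sumFrom 0 >>>> arrA (* 2)) [1, 2, 3]` yields `[2, 6, 12]`. The sketch keeps states lazy for brevity; a real implementation would likely want strict state to avoid space leaks.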

Benchmark

The benchmark is a simple word count implementation which counts the words of Shakespeare's complete works. Its purpose is to show how much overhead dunai and rhine introduce. It includes:

  • An idiomatic Rhine implementation
  • An idiomatic state automaton implementation (without Rhine clocks)
  • An idiomatic dunai implementation
  • Three direct implementations using text
    • Using IORefs
    • Reading line by line, but avoiding IORefs
    • Lazy text
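
For orientation, the first two direct implementations might look roughly like the sketch below. This is an assumption about their shape, not the benchmark code from this PR; `countWordsIORef`, `countWordsNoIORef` and `countFile` are hypothetical names, and the sketch assumes the `text` package is available.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.IORef (modifyIORef', newIORef, readIORef)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (Handle, IOMode (ReadMode), hIsEOF, withFile)

-- Variant 1: read line by line and accumulate the count in an IORef.
countWordsIORef :: Handle -> IO Int
countWordsIORef h = do
  ref <- newIORef 0
  let loop = do
        eof <- hIsEOF h
        if eof
          then pure ()
          else do
            line <- TIO.hGetLine h
            -- Strict update, so no chain of thunks builds up in the reference.
            modifyIORef' ref (+ length (T.words line))
            loop
  loop
  readIORef ref

-- Variant 2: read line by line, threading the count through a strict
-- accumulator argument instead of a mutable reference.
countWordsNoIORef :: Handle -> IO Int
countWordsNoIORef h = go 0
  where
    go !n = do
      eof <- hIsEOF h
      if eof
        then pure n
        else do
          line <- TIO.hGetLine h
          go (n + length (T.words line))

-- Helper (hypothetical): run one of the counters on a file.
countFile :: (Handle -> IO Int) -> FilePath -> IO Int
countFile counter path = withFile path ReadMode counter
```

Both variants do the same work per line, so any difference measured between them comes from how the running count is carried, not from the text processing.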

For a non-Haskell baseline, the standard wc completes the benchmark in 30 ms on my reference machine.

Result with state automata

benchmarking WordCount/rhine
time                 239.8 ms   (233.2 ms .. 253.6 ms)
                     0.998 R²   (0.993 R² .. 1.000 R²)
mean                 214.1 ms   (196.9 ms .. 225.8 ms)
std dev              19.36 ms   (9.243 ms .. 27.70 ms)
variance introduced by outliers: 16% (moderately inflated)

benchmarking WordCount/automaton
time                 84.87 ms   (77.64 ms .. 92.24 ms)
                     0.992 R²   (0.986 R² .. 0.998 R²)
mean                 91.60 ms   (88.83 ms .. 94.73 ms)
std dev              5.279 ms   (3.818 ms .. 7.088 ms)
variance introduced by outliers: 18% (moderately inflated)

benchmarking WordCount/dunai
time                 900.6 ms   (789.3 ms .. 972.3 ms)
                     0.998 R²   (0.997 R² .. 1.000 R²)
mean                 848.3 ms   (821.7 ms .. 868.4 ms)
std dev              28.27 ms   (12.92 ms .. 39.29 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking WordCount/Text/IORef
time                 90.62 ms   (87.74 ms .. 94.72 ms)
                     0.997 R²   (0.994 R² .. 1.000 R²)
mean                 89.66 ms   (88.41 ms .. 90.98 ms)
std dev              2.116 ms   (1.502 ms .. 3.124 ms)

benchmarking WordCount/Text/no IORef
time                 154.8 ms   (148.1 ms .. 163.4 ms)
                     0.998 R²   (0.996 R² .. 1.000 R²)
mean                 152.5 ms   (149.9 ms .. 155.1 ms)
std dev              3.750 ms   (2.495 ms .. 5.029 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking WordCount/Text/Lazy
time                 128.0 ms   (102.9 ms .. 141.7 ms)
                     0.962 R²   (0.841 R² .. 0.996 R²)
mean                 136.6 ms   (126.1 ms .. 155.2 ms)
std dev              21.80 ms   (10.23 ms .. 34.99 ms)
variance introduced by outliers: 48% (moderately inflated)

  • The naive Haskell baseline (Text/IORef) takes around 90 ms, 3 times as long as wc. There are Haskell word count implementations that are faster than wc, but this benchmark is about the overhead introduced by the frameworks, so the baseline is the fastest program that I can imagine a Rhine program being optimized to.
  • Automata achieve this baseline.
  • Avoiding IORef is slower than the baseline by a factor of about 1.4 (lazy text, 128 ms) to 1.7 (line by line, 155 ms).
  • Rhine with automata is about 2.5x slower than the baseline. This is not ideal, but I find it acceptable. Further investigation would be necessary to find out what additional overhead is introduced.
  • dunai is far behind, with a 10x slowdown against the baseline.

Comparison: No automata

I will introduce the benchmark before this PR, in #285. With Rhine depending on dunai, the benchmark is vastly slower: on the order of 100x slower than wc, and still over 2x slower than plain dunai, which means that clock erasure does not optimize well. The abstractions introduced in dunai-dependent Rhine are far from zero-cost.

benchmarking rhine
time                 2.368 s    (2.173 s .. 2.609 s)
                     0.999 R²   (0.996 R² .. 1.000 R²)
mean                 2.984 s    (2.696 s .. 3.352 s)
std dev              410.3 ms   (14.38 ms .. 509.5 ms)
variance introduced by outliers: 24% (moderately inflated)

Open questions, to dos

  • Is there an existing published automaton library? I'm not aware of any. https://github.com/lexi-lambda/incremental/blob/master/src/Incremental/Fast.hs is neither published nor maintained, and all existing implementations I'm aware of (see machines) use the final encoding, which doesn't optimize well.
  • Should Automata be a separate library (e.g. also listT)? In the long run certainly, but I'm of a mind to keep it in the rhine repository until this PR is merged, and pull it out later.
  • Merge "Add benchmarks" (#285) first
  • Clean up git history
  • Check whether Rhine now exports everything that it previously re-exported from dunai.
  • Make sure MSF has all the same type class instances that dunai MSF has
  • ResamplingBuffers should also have initial encoding, this should speed up larger SNs.
  • Separate issue for benchmarking and improving the performance (further inlining?) of exception handling: "Benchmarks for automaton" (#311)
  • Check whether stack.yaml can be simplified because of removed reverse dependencies & version bounds. Answer: No, it cannot, since I'm still using dunai in the benchmarks.
  • Separate issue: benchmarks comparing automaton on its own against e.g. streaming, machines etc.: "Benchmarks for automaton" (#311)

@turion turion left a comment

  • Check whether all primitives in automata are inlined
  • Fix copyrights, make clear that API is inspired heavily by dunai

@turion turion force-pushed the dev_automata branch 3 times, most recently from fe28b13 to 7195216 Compare April 10, 2024 15:15
@turion turion force-pushed the dev_automata branch 4 times, most recently from 14986af to f8f7205 Compare April 18, 2024 15:03
@turion turion force-pushed the dev_automata branch 2 times, most recently from 09b273e to 8e7b836 Compare April 22, 2024 10:10
@turion turion force-pushed the dev_automata branch 3 times, most recently from 64f9a8b to 46240bd Compare May 3, 2024 17:55
@turion turion force-pushed the dev_automata branch 2 times, most recently from 30a3b78 to a7cfe80 Compare May 10, 2024 09:45

turion commented May 10, 2024

CC @ners I'm pretty close to merging now :) one leftover FIXME and a final review from my side. In case you want to review, feel free!

@turion turion force-pushed the dev_automata branch 4 times, most recently from 4951fa0 to 4ef6c57 Compare May 13, 2024 07:40
@turion turion force-pushed the dev_automata branch 2 times, most recently from f2c35e4 to 6ea30bf Compare May 13, 2024 08:30
@turion turion merged commit 1991995 into master May 13, 2024
13 checks passed
@turion turion deleted the dev_automata branch May 13, 2024 09:44