PLT-834 Experiment: refactor CEK machine for adding a stepper #4909

thealmarty · 2022-10-13T17:31:41Z

This branch refactors the CEK machine such that we can add a stepper to it easier. It's for benchmarking purposes. Do not merge.

Pre-submit checklist:

Branch
- Tests are provided (if possible)
- Commit sequence broadly makes sense
- Key commits have useful messages
- Relevant tickets are mentioned in commit messages
- Formatting, PNG optimization, etc. are updated
PR
- (For external contributions) Corresponding issue exists and is linked in the description
- Targeting master unless this is a cherry-pick backport
- Self-reviewed the diff
- Useful pull request description
- Reviewer requested

thealmarty · 2022-10-13T17:34:33Z

/benchmark plutus-benchmark:validation

thealmarty · 2022-10-13T17:36:23Z

/benchmark plutus-benchmark:validation

kwxm · 2022-10-14T00:42:18Z

The automatic benchmarking is still broken. I'll run the benchmarks manually on the benchmarking machine and paste the results in here.

kwxm · 2022-10-14T02:06:41Z

Oh.

Script                              Old            New         Change
----------------------------------------------------------------------------------
auction_1-1                       182.9 μs	 1.847 ms    +909.8%
auction_1-2                       776.7 μs	 9.276 ms   +1094.3%
auction_1-3                       762.9 μs	 9.215 ms   +1107.9%
auction_1-4                       230.8 μs	 2.487 ms    +977.6%
auction_2-1                       183.4 μs	 1.866 ms    +917.4%
auction_2-2                       774.4 μs	 9.235 ms   +1092.5%
auction_2-3                       984.0 μs	 12.50 ms   +1170.3%
auction_2-4                       761.8 μs	 9.226 ms   +1111.1%
auction_2-5                       231.5 μs	 2.489 ms    +975.2%
crowdfunding-success-1            215.2 μs	 2.251 ms    +946.0%
crowdfunding-success-2            215.2 μs	 2.244 ms    +942.8%
crowdfunding-success-3            214.6 μs	 2.242 ms    +944.7%
currency-1                        279.4 μs	 3.237 ms   +1058.6%
escrow-redeem_1-1                 397.7 μs	 4.445 ms   +1017.7%
escrow-redeem_1-2                 398.8 μs	 4.449 ms   +1015.6%
escrow-redeem_2-1                 469.5 μs	 5.354 ms   +1040.4%
escrow-redeem_2-2                 470.1 μs	 5.361 ms   +1040.4%
escrow-redeem_2-3                 470.2 μs	 5.339 ms   +1035.5%
escrow-refund-1                   162.6 μs	 1.595 ms    +880.9%
future-increase-margin-1          279.9 μs	 3.226 ms   +1052.6%
future-increase-margin-2          640.6 μs	 7.620 ms   +1089.5%
future-increase-margin-3          640.3 μs	 7.583 ms   +1084.3%
future-increase-margin-4          598.4 μs	 6.768 ms   +1031.0%
future-increase-margin-5          966.6 μs	 11.08 ms   +1046.3%
future-pay-out-1                  279.1 μs	 3.225 ms   +1055.5%
future-pay-out-2                  638.9 μs	 7.671 ms   +1100.7%
future-pay-out-3                  638.3 μs	 7.620 ms   +1093.8%
future-pay-out-4                  963.2 μs	 11.05 ms   +1047.2%
future-settle-early-1             279.0 μs	 3.234 ms   +1059.1%
future-settle-early-2             638.3 μs	 7.548 ms   +1082.5%
future-settle-early-3             637.8 μs	 7.546 ms   +1083.1%
future-settle-early-4             737.6 μs	 7.651 ms    +937.3%
game-sm-success_1-1               451.7 μs	 4.874 ms    +979.0%
game-sm-success_1-2               200.8 μs	 2.107 ms    +949.3%
game-sm-success_1-3               763.8 μs	 9.248 ms   +1110.8%
game-sm-success_1-4               231.9 μs	 2.499 ms    +977.6%
game-sm-success_2-1               450.5 μs	 4.903 ms    +988.3%
game-sm-success_2-2               200.2 μs	 2.109 ms    +953.4%
game-sm-success_2-3               766.5 μs	 9.034 ms   +1078.6%
game-sm-success_2-4               232.9 μs	 2.489 ms    +968.7%
game-sm-success_2-5               766.4 μs	 9.092 ms   +1086.3%
game-sm-success_2-6               231.8 μs	 2.487 ms    +972.9%
multisig-sm-1                     470.9 μs	 5.157 ms    +995.1%
multisig-sm-2                     463.4 μs	 4.920 ms    +961.7%
multisig-sm-3                     463.0 μs	 5.008 ms    +981.6%
multisig-sm-4                     468.7 μs	 5.082 ms    +984.3%
multisig-sm-5                     679.1 μs	 7.714 ms   +1035.9%
multisig-sm-6                     470.4 μs	 5.183 ms   +1001.8%
multisig-sm-7                     457.6 μs	 4.944 ms    +980.4%
multisig-sm-8                     464.0 μs	 5.022 ms    +982.3%
multisig-sm-9                     468.0 μs	 5.100 ms    +989.7%
multisig-sm-10                    679.9 μs	 7.711 ms   +1034.1%
ping-pong-1                       382.0 μs	 4.174 ms    +992.7%
ping-pong-2                       382.9 μs	 4.188 ms    +993.8%
ping-pong_2-1                     221.5 μs	 2.363 ms    +966.8%
prism-1                           165.3 μs	 1.686 ms    +920.0%
prism-2                           487.6 μs	 5.191 ms    +964.6%
prism-3                           409.2 μs	 4.712 ms   +1051.5%
pubkey-1                          141.7 μs	 1.366 ms    +864.0%
stablecoin_1-1                    1.057 ms	 10.17 ms    +862.2%
stablecoin_1-2                    194.4 μs	 2.031 ms    +944.8%
stablecoin_1-3                    1.193 ms	 11.92 ms    +899.2%
stablecoin_1-4                    208.3 μs	 2.195 ms    +953.8%
stablecoin_1-5                    1.470 ms	 16.13 ms    +997.3%
stablecoin_1-6                    257.6 μs	 2.777 ms    +978.0%
stablecoin_2-1                    1.055 ms	 10.17 ms    +864.0%
stablecoin_2-2                    194.8 μs	 2.029 ms    +941.6%
stablecoin_2-3                    1.197 ms	 12.05 ms    +906.7%
stablecoin_2-4                    207.7 μs	 2.193 ms    +955.8%
token-account-1                   206.8 μs	 2.261 ms    +993.3%
token-account-2                   371.9 μs	 4.413 ms   +1086.6%
uniswap-1                         479.7 μs	 6.070 ms   +1165.4%
uniswap-2                         241.7 μs	 2.632 ms    +989.0%
uniswap-3                         1.877 ms	 24.12 ms   +1185.0%
uniswap-4                         340.7 μs	 3.746 ms    +999.5%
uniswap-5                         1.288 ms	 15.16 ms   +1077.0%
uniswap-6                         330.0 μs	 3.620 ms    +997.0%
vesting-1                         405.6 μs	 4.555 ms   +1023.0%

kwxm · 2022-10-14T02:10:30Z

I tried again and got similar results.

michaelpj · 2022-10-14T09:11:03Z

That's more or less what I'd expect!

michaelpj · 2022-10-14T09:34:46Z

Shower thoughts.

We really need to get rid of the creation of the state objects. I don't think there's any way GHC is going to be clever enough to eliminate those, the place where they're destructed is just way too far from where they're constructed. I think that means that instead we probably need to do something with continuations:

{-# INLINE computeStep #-}
computeStep :: (... -> r) -> .... -> r
computeStep k =
  go ...
    -- instead of Computing $ ...
    k ...

So computeStep gets inlined, at which point we hopefully know k, so GHC can optimize it properly. Except it's going to be more complicated than that because:

We need to pass continuations for all the different cases, so three of them.
The continuations are recursive with the main function! I'm not sure how to make this nice, might just have to try it.

kwxm · 2022-10-14T09:59:22Z

continuations

Oh no.

plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Evaluation/Machine/Cek/Internal.hs

effectfully · 2022-10-20T21:42:33Z

@michaelpj

We really need to get rid of the creation of the state objects.

No, we just need to recurse in the worker of the production machine and not recurse in the worker of the debugging machine. Something along these lines (pseudocode, doesn't type check, omitting all frames etc business for clarity):

enterComputeCek debug term = computeCekStep term where
    computeCek = if debug then computeCekDebug else computeCekStep
    {-# NOINLINE computeCek #-}  -- Making sure the `if` is only evaluated once.

    computeCekDebug term = pure $ Computing term

    computeCekStep (Force _ body) = computeCek body
    <other_clauses>

Here you'll still return a CekState on the production path, but it'll always be Terminating, because it's the good old recursive CEK machine, just wraps the result with Terminating for the purposes of unifying the interface for the two kinds of the machine. Given that the machine is tail-recursive, this additional wrapping will affect performance in a negligible way.

It's probably still going to be a bit slower (not by 10x for sure, though), because this alignment likely breaks compilation to join points, but then you can do the class with a type-level Bool trick and replace the NOINLINE pragma with the INLINE one and let GHC do its magic. You could also mark enterComputeCek as INLINE, apply it to False in a standalone definition and make sure everything inlines properly, but that is much less reliable than matching on a Bool at the type level.

michaelpj · 2022-10-21T10:16:10Z

Sneaky, yes indeed that would probably work. I was trying to abstract out the recursion, which would probably work worse anyway.

thealmarty · 2022-10-24T15:15:14Z

So I'm not sure we can say that we're sharing the code in this way. We still need to abstract out the "steps" for the debugging machine. It's basically a case match with a way of returning the same type at the end (CekState). I'm not sure what that's buying us?

effectfully · 2022-10-24T17:02:19Z

We still need to abstract out the "steps" for the debugging machine.

We still need Computing, Returning and Terminating and whatever logic they require, but we don't need to duplicate computeCek, returnCek, applyEvaluate etc (just like in your current PR, except with better performance).

I'm not sure what that's buying us?

Performance.

michaelpj · 2022-10-25T11:06:18Z

So I'm not sure we can say that we're sharing the code in this way. We still need to abstract out the "steps" for the debugging machine.

My understanding of what Roman is proposing is that it would not duplicate the logic for the steps. We would be sharing most of the code.

michaelpj · 2023-04-03T10:51:31Z

I think we went the other way for now, we can resurrect this if we need it.

thealmarty added Do not merge EXPERIMENT Experiments that we probably don't want to merge labels Oct 13, 2022

thealmarty added 2 commits October 13, 2022 10:35

Refactor CEK machine to add stepper.

9a0cd4c

Fix warnings.

619d1ee

thealmarty force-pushed the thealmarty/plt-834-experiment-refactor-cek branch from c21409b to 619d1ee Compare October 13, 2022 17:36

Replace env term with Closure datatype.

efbf5bb

thealmarty commented Oct 19, 2022

View reviewed changes

plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Evaluation/Machine/Cek/Internal.hs Show resolved Hide resolved

effectfully mentioned this pull request Oct 20, 2022

PLT-834: ADR for a steppable CEK machine for debugging. #4915

Merged

10 tasks

thealmarty mentioned this pull request Oct 27, 2022

PLT-1084 ADR for how to share code between production and debugging CEK machines. #4930

Merged

10 tasks

thealmarty closed this Dec 20, 2022

thealmarty reopened this Dec 20, 2022

michaelpj closed this Apr 3, 2023

kwxm deleted the thealmarty/plt-834-experiment-refactor-cek branch November 10, 2023 10:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PLT-834 Experiment: refactor CEK machine for adding a stepper #4909

PLT-834 Experiment: refactor CEK machine for adding a stepper #4909

thealmarty commented Oct 13, 2022

thealmarty commented Oct 13, 2022

thealmarty commented Oct 13, 2022

kwxm commented Oct 14, 2022

kwxm commented Oct 14, 2022

kwxm commented Oct 14, 2022

michaelpj commented Oct 14, 2022

michaelpj commented Oct 14, 2022

kwxm commented Oct 14, 2022

effectfully commented Oct 20, 2022

michaelpj commented Oct 21, 2022

thealmarty commented Oct 24, 2022

effectfully commented Oct 24, 2022 •

edited

michaelpj commented Oct 25, 2022

michaelpj commented Apr 3, 2023

PLT-834 Experiment: refactor CEK machine for adding a stepper #4909

PLT-834 Experiment: refactor CEK machine for adding a stepper #4909

Conversation

thealmarty commented Oct 13, 2022

thealmarty commented Oct 13, 2022

thealmarty commented Oct 13, 2022

kwxm commented Oct 14, 2022

kwxm commented Oct 14, 2022

kwxm commented Oct 14, 2022

michaelpj commented Oct 14, 2022

michaelpj commented Oct 14, 2022

kwxm commented Oct 14, 2022

effectfully commented Oct 20, 2022

michaelpj commented Oct 21, 2022

thealmarty commented Oct 24, 2022

effectfully commented Oct 24, 2022 • edited

michaelpj commented Oct 25, 2022

michaelpj commented Apr 3, 2023

effectfully commented Oct 24, 2022 •

edited