Kwxm/improve plc timing (SCP-1872) #2738

kwxm · 2021-02-16T10:22:23Z

This improves the timing feature in the plc command, so it should now be possible to quickly get reasonable benchmark times without having to set up lots of Criterion stuff. You can now say plc evaluate <prog> -X <N> to run a program N times and see the mean execution time. You can also use -x for 100 repetitions.

This turned out to be more difficult than I expected. I had a version using Criterion.Measurement.measure (see the documentation), which evaluates a function a given number of times and returns various statistics, including the total execution time. However, this gives odd results. It turns out that if you repeatedly execute a PLC program on the CEK machine the first evaluation can take significantly longer than subsequent ones (especially for programs which execute very quickly), perhaps because of warm-up effects. You can see the effect of this in the attached PDFs, which show the time reported by measure for 1,2,3,... executions of a couple of programs (one just multiplies 88 by 44, the other is a crowfunding validator). The reported average time starts off large and then drops asymptotically towards a stable value: this is because the atypically large first measurement drags the mean upwards, but the distortion decreases as large numbers of more typical measurements appear. Criterion's measure function doesn't retain the individual times, so it's impossible to adjust it. To overcome this I've written my own code which collects n+1 measurements and discards the first result (and if you look at the individual numbers you can see that the first one really is atypical). This gives much more consistent results. Possibly this isn't a sensible thing to do: in the real world the machine will be receiving a sequence of different validators to execute one after the other and maybe any warm-up effects won't have the opportunity to appear.

Here are the files mentioned above:
mult.pdf
crowdfunding.pdf

Next PR: adding an option to report budgeting information.

Pre-submit checklist:

Branch
- Commit sequence broadly makes sense
- Key commits have useful messages
- Relevant tickets are mentioned in commit messages
PR
- Self-reviewed the diff
- Useful pull request description
- Reviewer requested
If you updated any cabal files or added Haskell packages:
- nix-shell shell.nix --run updateMaterialized to update the materialized Nix files
- Update hie-*.yaml files if needed
If you changed any Haskell files:
- nix-shell shell.nix --run fix-stylish-haskell to fix any formatting issues
If you changed any Purescript files:
- nix-shell shell.nix --run fix-purty to fix any formatting issues

Pre-merge checklist:

Someone approved it
Commits have useful messages
Review clarifications made it into the code
History is moderately tidy; or going to squash-merge

kwxm · 2021-02-16T10:25:18Z

plutus-core/exe/Main.hs

@@ -647,13 +687,14 @@ runEval (EvalOptions language inp ifmt evalMode printMode printtime) =
                  CK  -> PLC.unsafeEvaluateCk  PLC.defBuiltinsRuntime
                  CEK -> PLC.unsafeEvaluateCek PLC.defBuiltinsRuntime
            body = void . PLC.toTerm $ prog
-        () <-  Exn.evaluate $ rnf body
+            !_ = rnf body


This seems to give more consistent results than Exn.evaluate. Is it causing deeper evaluation?

It really shouldn't. Even with Exn.evaluate removed the original line should be forcing the program fully.

I tried again and maybe I was imagining it, or there was something else going on. I'll take another look later just to be sure.

Yeah, not sure if there's really any difference. There do seem to be slight differences for very fast programs, but it could just be statistical noise. If there is a difference I don't think it's terribly significant.

effectfully

Really wondering what's up with that first atypical execution.

effectfully · 2021-02-16T11:13:07Z

plutus-core/exe/Main.hs

-languageMode :: Parser Language
-languageMode = typedPLC <|> untypedPLC
+languagemode :: Parser Language
+languagemode = typedPLC <|> untypedPLC


I liked the previous version more, but whatever.

Yeah, the others are camel case too

I did that because we've got types with names like LanguageMode, parsers for them with names like langaugeMode, and variables with names like languagemode, and there was some inconsistency in how the latter two were capitalised. It might be less confusing to have languageModeParser and so on.

effectfully · 2021-02-16T11:26:40Z

plutus-core/exe/Main.hs

@@ -647,13 +687,14 @@ runEval (EvalOptions language inp ifmt evalMode printMode printtime) =
                  CK  -> PLC.unsafeEvaluateCk  PLC.defBuiltinsRuntime
                  CEK -> PLC.unsafeEvaluateCek PLC.defBuiltinsRuntime
            body = void . PLC.toTerm $ prog
-        () <-  Exn.evaluate $ rnf body
+            !_ = rnf body


It really shouldn't. Even with Exn.evaluate removed the original line should be forcing the program fully.

plutus-core/exe/Main.hs

michaelpj · 2021-02-16T11:43:42Z

plutus-core/exe/Main.hs

-languageMode :: Parser Language
-languageMode = typedPLC <|> untypedPLC
+languagemode :: Parser Language
+languagemode = typedPLC <|> untypedPLC


Yeah, the others are camel case too

plutus-core/exe/Main.hs

michaelpj · 2021-02-16T11:44:54Z

plutus-core/exe/Main.hs

+          NoTiming -> case evaluate body of
+                        PLC.EvaluationSuccess v -> succeed v
+                        PLC.EvaluationFailure   -> exitFailure
+          Timing n -> timeEval n evaluate body >> exitSuccess


Seems like we should still use the exit status? Which I guess will be the same every time, but still.

Yeah. I was too lazy to do that, but since you spotted it I'll fix it.

michaelpj · 2021-02-16T11:46:38Z

plutus-core/exe/Main.hs

+we measure the evaluation time (n+1) times and discard the first result. -}
+timeEval :: NFData a => Integer -> (t -> a) -> t -> IO ()
+timeEval n evaluate prog =
+    if n <= 0 then error "Error: the number of repetitions should be at least 1"


Style nit: I'd probably use two clauses with the first one having a | n <=0 guard on it. Why? it means that when you do the do block for the second clause you save a level of indentation.

I'm not sure that that helped. Having the guards is a bit clearer, but then emacs indented it like this:

| otherwise = do (results, times) <- unzip . tail <$> for (replicate (fromIntegral (n+1)) prog) (timeOnce evaluate) let mean = (fromIntegral $ sum times) / (fromIntegral n) :: Double

It looks a bit odd with the stuff inside the do-block to the left of the guard. I can cope with that though.

plutus-core/exe/Main.hs

Co-authored-by: Michael Peyton Jones <michael.peyton-jones@iohk.io>

kwxm · 2021-02-17T06:06:46Z

Really wondering what's up with that first atypical execution.

I think the standard thing to do is to shrug your shoulders and say that it's probably got to do with caching. That's kind of plausible here: maybe the AST gets loaded into cache on the first iteration and then it's available quickly for later iterations. I got the results by using a shell loop to run plc repeatedly, and I suppose that'll wipe out the cache because you have to load the source file from disk and parse it each time, and you can't know that you'll get the same AST each time. I don't know how you'd check that though.

There are some other odd features: in the first graph it descends pretty steeply but then it clearly jumps up again at 159 iterations. The times for 150 to 170 look like this:

 4.651 4.466 4.744 4.478 4.392 4.469 4.406 5.173 4.463 7.167 7.018 6.878 7.264 7.105 6.749 6.725 6.716 6.855 6.642 6.881 8.822

Criterion does a garbage collection before each individual run, and each run does exactly the same thing, so I don't know why that would happen. There's another upwards jump at 995:

2.916  2.951  2.834  3.039  2.807  3.616  3.977  3.709  3.526  3.440  3.925

That's a tiny program though (the execution times are in microseconds), so it won't take a lot to disturb the numbers. For longer-running ones things seem to be a bit smoother.

kwxm added 3 commits February 15, 2021 08:09

Add parser code for plc timing options

8b84093

WIP: use own timing code

34b57e9

Improve timing feature in plc command

bd16d81

kwxm changed the title ~~Kwxm/improve plc timing~~ Kwxm/improve plc timing (SCP-1872) Feb 16, 2021

Remove debug code

7b920bd

kwxm commented Feb 16, 2021

View reviewed changes

kwxm requested review from michaelpj and effectfully February 16, 2021 10:28

kwxm added Costing Anything relating to costs, fees, gas, etc. Plutus Exe labels Feb 16, 2021

Correct spacing

f7dcf56

effectfully approved these changes Feb 16, 2021

View reviewed changes

michaelpj approved these changes Feb 16, 2021

View reviewed changes

kwxm and others added 5 commits February 16, 2021 14:28

Update plutus-core/exe/Main.hs

fa5d4db

Co-authored-by: Michael Peyton Jones <michael.peyton-jones@iohk.io>

Update plutus-core/exe/Main.hs

9175d64

Co-authored-by: Michael Peyton Jones <michael.peyton-jones@iohk.io>

Address review comments

21f935e

Address review comments

391e4f8

Address review comments

c14b7ad

kwxm merged commit d16b997 into master Feb 17, 2021

kwxm deleted the kwxm/improve-plc-timing branch February 17, 2021 12:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kwxm/improve plc timing (SCP-1872) #2738

Kwxm/improve plc timing (SCP-1872) #2738

kwxm commented Feb 16, 2021 •

edited

kwxm Feb 16, 2021

effectfully Feb 16, 2021

kwxm Feb 16, 2021

kwxm Feb 17, 2021 •

edited

effectfully left a comment

effectfully Feb 16, 2021

michaelpj Feb 16, 2021

kwxm Feb 16, 2021

effectfully Feb 16, 2021

michaelpj Feb 16, 2021

michaelpj Feb 16, 2021

kwxm Feb 16, 2021

kwxm Feb 17, 2021

michaelpj Feb 16, 2021

kwxm Feb 17, 2021

kwxm commented Feb 17, 2021 •

edited

Kwxm/improve plc timing (SCP-1872) #2738

Kwxm/improve plc timing (SCP-1872) #2738

Conversation

kwxm commented Feb 16, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kwxm Feb 17, 2021 • edited

Choose a reason for hiding this comment

effectfully left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kwxm commented Feb 17, 2021 • edited

kwxm commented Feb 16, 2021 •

edited

kwxm Feb 17, 2021 •

edited

kwxm commented Feb 17, 2021 •

edited