Merge branch 'master' into koz/bitwise

IntersectMBO · Aug 8, 2022 · 3bc7b48 · 3bc7b48
2 parents 09dd2dd + 2f85fbf
commit 3bc7b48
Show file tree

Hide file tree

Showing 6 changed files with 34 additions and 23 deletions.
diff --git a/plutus-conformance/README.md b/plutus-conformance/README.md
@@ -17,11 +17,11 @@ The tests currently cover or will cover the Haskell and Agda implementation of:
 
 ## Adding/updating test outputs
 
-To update test outputs, use the accept test option of the tests. E.g., to have the test results overwriting the `.expected` files in the Haskell implementation test suite (`haskell-conformance`) , run:
+To update or add test outputs, use the accept test option of the tests. E.g., to have the test results overwriting the `.expected` files in the Haskell implementation test suite (`haskell-conformance`) , run:
 
 `cabal test haskell-conformance --test-options=--accept`
 
-There is also an executable (`add-test-output`) for adding test output with more refined options:
+There is also an executable (`add-test-output`) for adding test output to a specific directory:
 
 E.g., run
 
@@ -39,25 +39,33 @@ For the manual, run:
 
 ## Executable for Haskell implementation
 
-(WIP) `haskell-implementation` is an executable for Haskell implementation CLI testing/usage. 
+(WIP) `haskell-implementation` is an executable for Haskell implementation CLI testing/usage.
 
 ## The Plutus Conformance Test Suite Library
 
 The library provides functions that users can import and run conformance tests with their own implementation. At the moment the tests can only be run against another Haskell implementation. More support will be added later. Of course, one can wrap an arbitrary executable in Haskell. See an explanation [here](https://www.fpcomplete.com/blog/2017/02/typed-process/) and [the related documentation](https://www.stackage.org/haddock/lts-19.11/typed-process-0.2.10.1/System-Process-Typed.html).
 
 ## Untyped Plutus Core Program Evaluation
 
-The UPLC evaluation tests ensure conformance of evaluation of untyped plutus core programs. The expected output may contain:
+The UPLC evaluation tests ensure conformance of evaluation of untyped plutus core programs.
 
-1. "parse error"
+### The CEK machine
+
+Currently we have tested the conformance of UPLC evaluation against our *Haskell* and *Agda* implementations of the CEK machine. Note that we are not testing conformance of a *reducer*. We are testing an *evaluator*. One noticeable difference between a reducer and an evaluator is that a reducer reduces an ill-defined term to an `error` term while an evaluator has a special *error state*. See section 6.1 of our specification for more details. <!--TODO add link to the spec when it's ready. -->
+
+### Expected outputs
+
+The expected output may contain:
+
+#### "parse error"
 
 The input files are expected to have the concrete syntax. The expected output will show "parse error" when the parser fails to parse the input file.
 
-2. "evaluation error"
+#### "evaluation error"
 
 If evaluation fails with an error, the expected output will show "evaluation error".
 
-3. An untyped plutus core program
+#### An untyped plutus core program
 
 This means the input file successfully evaluates to the output program as per the specification. The evaluated program is represented in the concrete syntax.
 
@@ -93,13 +101,12 @@ The `plc` executable can be used to type check programs. Run `cabal run plc type
 
 ## Contributing
 
-We welcome contributions and comments to this repository. Feel free to open an issue. 
+We welcome contributions and comments to this repository. Feel free to open an issue.
 
-If we add the tests you share, we will acknowledge your contribution and post a link back to your repository. 
+If we add the tests you share, we will acknowledge your contribution and post a link back to your repository.
 
 ## Acknowledgement
 
 We are grateful to these external partners for their contributions:
 
 - Runtime Verification Inc. ([runtimeverification/plutus-core-semantics](https://github.com/runtimeverification/plutus-core-semantics/tree/master/tests))
-
diff --git a/plutus-conformance/agda/Spec.hs b/plutus-conformance/agda/Spec.hs
@@ -45,13 +45,9 @@ failingTests = [
     , "test-cases/uplc/evaluation/example/churchZero"
     , "test-cases/uplc/evaluation/example/force-lam"
     , "test-cases/uplc/evaluation/example/succInteger"
-    , "test-cases/uplc/evaluation/example/DivideByZero"
-    , "test-cases/uplc/evaluation/example/DivideByZeroDrop"
     , "test-cases/uplc/evaluation/example/churchSucc"
     , "test-cases/uplc/evaluation/term/lam"
     , "test-cases/uplc/evaluation/term/delay-lam"
-    , "test-cases/uplc/evaluation/term/error"
-    , "test-cases/uplc/evaluation/failure/ifThenElse-no-force"
     ]
 
 main :: IO ()

diff --git a/plutus-core/testlib/PlutusCore/Generators/NEAT/Spec.hs b/plutus-core/testlib/PlutusCore/Generators/NEAT/Spec.hs
@@ -92,7 +92,7 @@ tests genOpts@GenOptions{} =
 {- NOTE:
 
 The tests below perform multiple steps in a pipeline, they take in
-kind & type or type & term and then peform operations on them passing
+kind & type or type & term and then perform operations on them passing
 the result along to the next one, sometimes the result is passed to
 several operations and/or several results are later combined and
 sometimes a result is discarded. Quite a lot of this is inherently
@@ -139,7 +139,7 @@ prop_typePreservation tyG tmG = do
     evaluateCkNoEmit defaultBuiltinsRuntime tm `catchError` handleError ty
   withExceptT TypeError $ checkType tcConfig () tmCK (Normalized ty)
 
--- |Property: check if both the typed CK and untyped CEK machines produce the same ouput
+-- |Property: check if both the typed CK and untyped CEK machines produce the same output
 -- modulo erasure.
 --
 prop_agree_termEval :: ClosedTypeG -> ClosedTermG -> ExceptT TestFail Quote ()
@@ -443,7 +443,7 @@ _mapTest GenOptions{..} t f = testGroup "a bunch of tests" $ map (f t) examples
 -- | given a prop, generate one test
 packAssertion :: (Show e) => (t -> a -> ExceptT e Quote ()) -> t -> a -> Assertion
 packAssertion f t a =
-  case (runQuote . runExceptT $ f t a) of
+  case runQuote . runExceptT $ f t a of
     Left  e -> assertFailure $ show e
     Right _ -> return ()
 

diff --git a/plutus-core/testlib/PlutusCore/Generators/NEAT/Term.hs b/plutus-core/testlib/PlutusCore/Generators/NEAT/Term.hs
@@ -62,8 +62,8 @@ NOTE: We don't just want to enumerate arbitrary types but also normal
 
       Two approaches spring to mind:
 
-      1. We could define a seperate AST for normal types and possibly
-      also a seperate AST for terms with normalized types. This would
+      1. We could define a separate AST for normal types and possibly
+      also a separate AST for terms with normalized types. This would
       be the safest option as it would then be impossible to generate
       a non-normalized type that claimed to be normalized. This is how
       it's done in the metatheory package. The downside is that there

diff --git a/plutus-metatheory/src/Main.lagda b/plutus-metatheory/src/Main.lagda
@@ -553,12 +553,17 @@ postulate showU : TermU -> String
 
 {-# COMPILE GHC showU = T.pack . show #-}
 
+-- Note that according to the specification ◆ should reduce to an error term,
+-- but here the untyped PLC term evaluator
+-- matches the behaviour of the Haskell implementation:
+-- an error is thrown with ◆. 
 runU : TermU → Either ERROR TermU
 runU t = do
   tDB ← withE scopeError $ U.scopeCheckU0 (convTmU t)
   □ V ← withE runtimeError $ U.stepper maxsteps (ε ; [] ▻ tDB)
-    where ◆  → return (unconvTmU U.UError)
-          _    → inj₁ (runtimeError gasError)
+    where      
+    ◆ → inj₁ (runtimeError userError) -- ◆ returns a `userError` runtimeError.
+    _ → inj₁ (runtimeError gasError)
   return (unconvTmU (U.extricateU0 (U.discharge V)))
 
 {-# COMPILE GHC runU as runUAgda #-}

diff --git a/plutus-metatheory/test/TestNEAT.hs b/plutus-metatheory/test/TestNEAT.hs
@@ -126,11 +126,14 @@ prop_Term tyG tmG = do
   -- turn it into an untyped de Bruijn term
   tmUDB <- withExceptT FVErrorP $ U.deBruijnTerm tmU
   -- reduce the untyped term
-  tmUDB' <- withExceptT (\e -> (Ctrex (CtrexTermEvaluationFail "untyped CEK" tyG tmG))) $ liftEither $ runUAgda tmUDB
+  tmUDB' <- case runUAgda tmUDB of
+      Left (RuntimeError UserError) -> pure $ U.Error ()
+      _ -> withExceptT (\e -> Ctrex (CtrexTermEvaluationFail "untyped CEK" tyG tmG))
+          $ liftEither $ runUAgda tmUDB
   -- turn it back into a named term
   tmU' <- withExceptT FVErrorP $ U.unDeBruijnTerm tmUDB'
   -- reduce the original de Bruijn typed term
-  tmDB'' <- withExceptT (\e -> (Ctrex (CtrexTermEvaluationFail "typed CEK" tyG tmG))) $
+  tmDB'' <- withExceptT (\e -> Ctrex (CtrexTermEvaluationFail "typed CEK" tyG tmG)) $
     liftEither $ runTCEKAgda tmDB
   -- turn it back into a named term
   tm'' <- withExceptT FVErrorP $ unDeBruijnTerm tmDB''