From 4eccf797d31a98d0ad19438f782d0630aad818e0 Mon Sep 17 00:00:00 2001 From: Maximilian Algehed Date: Tue, 23 Sep 2025 10:00:12 +0200 Subject: [PATCH] Import docs and start cleaning them up --- docs/DesignPrinciples.md | 1822 ++++++++++++++++++++++++++++++++++++++ docs/Manual.md | 1673 ++++++++++++++++++++++++++++++++++ 2 files changed, 3495 insertions(+) create mode 100644 docs/DesignPrinciples.md create mode 100644 docs/Manual.md diff --git a/docs/DesignPrinciples.md b/docs/DesignPrinciples.md new file mode 100644 index 0000000..ddbb287 --- /dev/null +++ b/docs/DesignPrinciples.md @@ -0,0 +1,1822 @@ +# Design Principles of the Constrained Generators System + +This document describes the design of the internals of the Constrained Generators System. +It attempts to explain how the system works. + +There is another document +*Constrained Generators Manual* that can be found +[here](https://github.com/input-output-hk/constrained-generators/blob/master/docs/DesignPrinciples.md). +This explains the user facing API, and teaches users to write specifications. It is +deliberately silent about the internals of the system. + +In this document we explore the principles that make the system work. To do that we deliberately omit some of the advanced +features having to do with explanations, interfacing with advanced features of Test.QuickCheck, existentials, reifications, +hints, and Generics. We expect another document to deal with those features, which add considerable complexity to the code base. +So we use a much simplified system to describe just the basic principles of how one can add logical properties to +a simple typed lambda calculus, to build a first order logic with typed functions. + +## Constrained Generators is a First-Order Logic + +A First-order typed logic (FOTL) has 4 components, where each component uses types to ensure well-formedness. + +1. **Terms** consisting of + - Variables: `x`, `y` . + - Constants: `5`, `"abc"`, `True` . + - Applications: `elem_ "abc" xs`, `x ==. y`. + - Applications apply a function symbol (i.e. `elem_`) to a list of Terms, + - In an infix application the function symbol lies between its two Term arguments .(i.e. `abc ==. y`) +2. **Predicates** (Assert (x ==. 5)). Predicates are the assertions of boolean typed terms. +3. **Connectives** (And, Or, Not, =>, ...). Connectives make more complex Predicates out of simpler ones. +4. **Quantifiers** (Forall, Exists) + +The **Constrained generators** system is a FOTL implemented as an embedded domain specific language in Haskell. +It allows programmers to write Haskell programs with type `Spec T` that denotes a set of random +values for the type `T`, that are subject to a set of constraints expressed as Predicates. This supports property +based testing where a completely random set of values may not be useful. + +## Overview of the Design Principles + +We list the set of Design Principles here. In the rest of the paper we will go over these in much +more detail. + +1. Every function symbol has `Syntax`, `Semantics`, and `Logic` capabilities. While the concepts of + `Syntax` and `Semantics` are familiar ones, `Logic` capabilities, include the the notion of property + propagation. Consider an application term `(f x 8)` that has some property + `P`, and mentions exactly one variable `x`. Propagation works as follows. For every function symbol `f`, we need to be able to do the following. + Given `(hasProperty (f x 8) P)` find property `Q` such that `(hasProperty x Q)` + +2. The FOTL for the Constrained Generators System is embedded in the Haskell datatypes `Spec a`, `Term a` and `Pred` + +3. Associated with some types `t`, is another type called the `TypeSpec T` . This an associated type family + of the class `HasSpec T`. If a type has a `HasSpec T` instance, then we can write solveable + constraints for a variable of type `T`. A `TypeSpec T` is a type that encodes type specific information + on how to generate constrained values for the type `T`. + +4. A set of constraints about a single variable of type `t` is collected into a `Spec t`. The type `Spec t` + is a type with several constructors, one of which embeds the type-specific `TypeSpec t`. The other constructors + embed non type-specific constraint properties. The type `(Spec t)` is how we encode properties for type `t`. + +5. The type `Spec t` has a Monoid instance. So two `Spec t` values can be combined `(spec1 <> spec2)`, which + describes a new `Spec t` where all the properties of `spec1` and `spec2` hold. + +6. We solve a set of constraints one variable at a time. The order in which we solve the variables is called + a `SolverPlan`. We can generate a random set of values that meet all the constraints, by picking the first variable + from the plan. Then choosing a random value for that variable that meets the constraints on it. Then + adding that value to an `Env`. Then the extended `Env` is substituted into the rest of the plan. This now has + one less variable. We then solve the next variable in the plan, until the plan is empty, in which case the Env, + contains a solution for every variable. + +7. Given a term `t :: Term T` and 1 or more `Pred` that only mention `t`, compute a `(Spec T)` from the `Pred`s. + This works by computing a `Spec` for each `Pred`, and then use the `Monoid` operation `(<>)` on `(Spec T)` to combine + them all into 1 comprehensive `(Spec T)`. This step is how we relate the type `Pred` to the type `Spec` + +8. We can extract a random value from a `Spec t`, by using the function `genFromSpec :: Spec t -> QuickCheck.Gen t` + +## What is the Constrained Generators System good for? + +Once we have written a `Specification` what can we do with it? Specifications have two interpretations. + +1. We can interpret them as a generator of values the meet all the constraints inside the specification. +2. We can interpret them as a function that checks if a given value meets all the constraints inside the specification. + + +The first interpretation of the specification is the function `genFromSpec` + +``` +--| Generate a value from the spec in the QuickCheck 'Gen' monad +genFromSpec:: (HasCallStack, HasSpec a) => Specification a -> QuickCheck.Gen a +``` + +This function is very useful when writing QuickCheck properties. With it we can write +`Gen` instances that are not completely random, but instead meet a set of constraints. +This is the actual purpose of the system. Construct mostly random values, that meet some constraints +that inter-relate many different objects. + +Consider a system of 4 objects, named by the variables, _w,x,y,z_ where we want to test the QuickCheck *implication* property + +`(w < x && x < y && y < z) ==> property (w < z)` We might write a QuickCheck property like this + +``` +prop1 :: Gen Property +prop1 = do + (w,x,y,z) <- arbitrary :: Gen (Int,Int,Int,Int) + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +The problem with this is that the probability that the condition `(w < x && x < y && y < z)` is True, for random +`w`, `x`, `y`, and `z`, is pretty low, so the property will pass vacuously most of the time, making a poor test. +We can observe this by running the property. + +``` +ghci> quickCheck prop1 +*** Gave up! Passed only 29 tests; 1000 discarded tests. +``` + +A vacuous pass, becomes a QuickCheck `discard`, so we cannot find even 100 successful passes. +We can do a better job by constraining a semi-random set of 4 variables by the the condition `(w < x && x < y && y < z)`, +then use `genFromSpec` to get the constrained set of random values. We do this in two steps. +First write a specification that expresses the constraint `(w < x && x < y && y < z)`, then second, +embed the call to `genFromSpec` in a QuickCheck property. + +``` +spec4 :: Spec (Integer, Integer, Integer, Integer) +spec4 = constrained $ \x -> + match x $ \a b c d -> And [Assert $ a <=. b, Assert $ b <=. c, Assert $ c <=. d] + + +prop2:: Gen Property +prop2 = do + (w,x,y,z) <- genFromSpec spec1 + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +Now the test passes. + +``` +ghci> quickCheck prop2 ++++ OK, passed 100 tests. +``` + +Now this isn't a very good test either, since the precedent is alway true. A better solution would be to +generate a mix, where the precedent is True most of the time, but sometimes False. + +``` +prop3:: Gen Property +prop3 = do + (w,x,y,z) <- frequency [(9,genFromSpec spec1),(1,arbitrary)] + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +Observe the result: + +``` +ghci> quickCheck prop3 ++++ OK, passed 100 tests; 7 discarded. +``` +This makes it possible to write conditional 'implication' properties that have a high +probability of not being vacuously true. + +## The FOTL is embedded in the Haskell datatypes `Term a` and `Pred` + +Now lets look closely at the specification, identifying the different parts which make it a statement in a *FOTL* + +``` +spec4 :: Spec (Integer, Integer, Integer, Integer) +spec4 = constrained $ \x -> + match x $ \a b c d -> And [Assert $ a <=. b, Assert $ b <=. c, Assert $ c <=. d] +``` +1. `constrained` says we are writing a `Spec` and the Haskell lambda expression introduces the the variable `x` that + names the tuple of 4 integers. +2. `match` says we are decomposing the 4-tuple, and naming each of its components `a`, `b`, `c`, `d`. +3. `And` is a connective, saying we are making 1 big predicate out of 3 smaller ones. +4. `Assert` says we are asserting that something is true. `Assert` makes a predicate out of a Boolean valued term. +5. `a <=. b` is a Boolean valued term, and `(<=.)` an infix function symbol. + +Here are the complete definitions for `Term` and `Pred` + +``` +data Term a where + -- ^ An application of a function symbol `t` to a list of arguments + App :: + forall t dom rng. + AppRequires t dom rng => + t dom rng -> + List Term dom -> + Term rng + -- ^ A literal constant + Lit :: (Typeable a, Eq a, Show a) => a -> Term a + -- ^ A variable + V :: HasSpec a => Var a -> Term a + +data Pred where + ElemPred :: forall a. HasSpec a => Bool -> Term a -> NonEmpty a -> Pred + And :: [Pred] -> Pred + Exists :: ((forall b. Term b -> b) -> GE a) -> Binder a -> Pred + ForAll :: (Container t a, HasSpec t, HasSpec a) => Term t -> Binder a -> Pred + DependsOn :: (HasSpec a, HasSpec b) => Term a -> Term b -> Pred + Assert :: Term Bool -> Pred + TruePred :: Pred + FalsePred :: NonEmpty String -> Pred + Case :: HasSpec (Either a b) => Term (Either a b) -> Binder a -> Binder b -> Pred + Let :: Term a -> Binder a -> Pred + Subst :: HasSpec a => Var a -> Term a -> Pred -> Pred +``` + +## Every function symbol has `Syntax`, `Semantics`, and `Logic` properties. + +A function symbol can have three kinds of operations. + +1. Syntax. Used, if we want to display the symbol. +2. Semantics. Used, if we want to evaluate a `Term` with that symbol, i.e. `5 <=. 1` evalutes to `False`. +3. Logic. Used, if we want to reason about a `Term` with that symbol, + - i.e. if Term `(x +. 2)` has property `elem_ (x +. 2) [3..6]` + - what does that say about variable `x`. It says that `elem_ x [1..4]` + - This logic operation is called propagation. For any function symbol `f` and any property `p`, + if `(f x) has p`, then what property `q` makes `x has q` true? Propagation computes properties about + the variable in an application, from the property about the application itself. + +Note that if a Term is not inside an `Assert` all we need are the `Syntax` and `Semantic` properties. +Every function symbol inside an `Assert` must have `Logic` properties. Here is how we capture this +in the Constrained Generators System. Constructors of a type with kind `([Type] -> Type -> Type)` +are candidates for function symbols. We make the constuctors real function symbols by +making `Syntax`, `Semantics` and `Logic` instances for that type. + +``` +-- | Syntactic operations are ones that have to do with the structure and appearence of the type. +class Syntax (t :: [Type] -> Type -> Type) where + inFix :: forall dom rng. t dom rng -> Bool + inFix _ = False + name :: forall dom rng. t dom rng -> String + +-- | Semantic operations are ones that give the function symbol, meaning as a function. +-- I.e. how to apply the function to a list of arguments and return a value, +-- or to apply meaning preserving rewrite rules. +class Syntax t => Semantics (t :: [Type] -> Type -> Type) where + semantics :: forall d r. t d r -> FunTy d r -- e.g. FunTy '[a,Int] Bool == a -> Int -> Bool + rewriteRules :: forall ds rng. (TypeList ds, Typeable ds, HasSpec rng, All HasSpec ds) => + t ds rng -> List Term ds -> Evidence (Typeable rng, Eq rng, Show rng) -> Maybe (Term rng) + rewriteRules t l Evidence = Lit <$> (applyFunSym @t (semantics t) l) + +-- | Logical operations are one that support reasoning about how a function symbol +-- relates to logical properties, that we call Spec's +class (Typeable t, Syntax t, Semantics t) => Logic (t :: [Type] -> Type -> Type) where + propagate :: + (AppRequires t as b, HasSpec a) => + t as b -> ListCtx as (HOLE a) -> Spec b -> Spec a +``` + +Here is an example of a type whose constructors are candidates for function symbols. Note that +`IntegerSym` has kind `([Type] -> Type -> Type)` + +``` +data IntegerSym (dom :: [Type]) rng where + PlusW :: IntegerSym '[Integer, Integer] Integer + MinusW :: IntegerSym '[Integer, Integer] Integer + LessOrEqW :: IntegerSym '[Integer, Integer] Bool + GreaterOrEqW :: IntegerSym '[Integer, Integer] Bool +``` + +## The function propagateSpec, and nested function symbols. + +Consider the Term `((size_ x +. Lit 3) <=. Lit 12)` with a bunch of nested function symbols, with just 1 variable `x` +To solve this, we will have to use `propagate` for each of the nested function symbols `size_`, `(+.)`, and `(<=.)`. +We do this by working our way from the outermost function symbol `(<=.)`, to the middle function symbol `(+.)`, to the +innermost function symbol `size_`. This is the role of the function `propagateSpec` + +Given a `Term` with just 1 variable `x`, we can build a context with exactly one `CtxHole`, replacing the variable `x` +Applied to `((size_ x +. Lit 3) <=. Lit 12)` we get the context + +``` +CtxApp <=. (CtxApp +. (CtxApp size_ (Unary CtxHole) :<| 3) :<| 12) +``` + +Working our way from outside in, we first propagate (<=.), then (+.), then (size_). This reduces in several steps. +Note that we have deliberately **bolded** the **`$`** , to note visually where the last argument to `propagateSpec` +occurs. Note after each step, this last arg becomes a 1 step larger composition of `propagate`. + +1. `propagateSpec (CtxApp <=. (CtxApp +. (CtxApp size_ (Unary CtxHole) :<| 3) :<| 12))` **`$`** `spec` +2. `propagateSpec (CtxApp +. (CtxApp size_ (Unary CtxHole) :<| 3)))` **`$`** `(propagate <=. (HOLE:<| 12) spec)` +3. `propagateSpec (CtxApp size_ (Unary CtxHole)))` **`$`** `(propagate +. (HOLE:<| 3) (propagate <=. (HOLE :<| 12) spec))` +4. **`$`** `propagate size_ (Unary HOLE) (propagate +. (HOLE:<| 3) (propagate <=. (HOLE :<| 12) spec)))` + +Note the pattern in the code below that defines `propagateSpec`. +The recursize call to 'propagateSpec' is on the pattern variable `ctx` which is the +part of the context pointed to by the arrows (:<|) and (:|>), and this recurive call to `propagateSpec` is +applied to a new spec computed by 'propagate', where the variable `ctx` is replaced by HOLE. +We end up on line **4** above, with three nested calls to `propagate`. Here is the defintion of `propagateSpec` + +``` +propagateSpec :: forall v a. HasSpec v => Ctx v a -> Spec a -> Spec v +propagateSpec context spec = case context of + CtxHole -> spec + CtxApp f (Unary ctx) | Evidence <- ctxHasSpec ctx -> propagateSpec ctx (propagate f (Unary HOLE) spec) + CtxApp f (ctx :<| v) | Evidence <- ctxHasSpec ctx -> propagateSpec ctx (propagate f (HOLE :<| v) spec) + CtxApp f (v :|> ctx) | Evidence <- ctxHasSpec ctx -> propagateSpec ctx (propagate f (v :|> HOLE) spec) +``` + + +## Inductive tuples, The List type, and their operations + +Before we can study `Syntax`, `Semantics` and `Logic` instances for `IntegerSym`, we need to +discuss the datatype `List` and the constraints `TypeList`, `Typeable`, `HasSpec`, `All`, and `AppRequires` + +The Constrained Generators System is a **typed** first order functional language +(i.e. one without first class functions). As such, the datatypes that define it must track the +type of every object. To do this we use the `Typeable` class from the Haskell module `Data.Typeable`. When a value or +function has a `Typeable t` constraint, the Haskell compiler stores a `TypeRep` for `t` in the code. There are ways for +programs to access the type stored in this `TypeRep`. + +The data type `List` implements heterogeneous lists. One can think of this as an inductive tuple. For example + +``` +data List (f :: k -> Type) (as :: [k]) where + Nil :: List f '[] + (:>) :: f a -> List f as -> List f (a : as) + +args :: List Term '[Int,Bool,String] +args = Lit 5 :> Lit True :> Lit "abc" :> Nil +``` +Is an inductive representation of the tuple `(Lit 5, Lit True,Lit "abc") :: (Term Int, Term Bool, Term String)` + +`List`s and the constraints listed above make it possible to write Haskell functions that can manipulate +typed `Term t`, and other type indexed datatypes. + +The `TypeList ds` class constraint, says that `ds` has kind `[Type]` +(just like the second type parameter of `(List Term '[Int,Bool,String])` +The methods of the `TypeList` class, along with the +type family `FunTy`, are used to express the type of, and implement functions that manipulate `List`s + +``` +type family FunTy ts res where + FunTy '[] a = a + FunTy (a : as) r = a -> FunTy as r +``` + +Ie. `FunTy ds Bool` rewrites to `Integer -> Bool -> [Char] -> Bool` + +``` +-- | Higher-order functions for working on `List`s +class TypeList ts where + uncurryList :: FunTy (MapList f ts) r -> List f ts -> r + uncurryList_ :: (forall a. f a -> a) -> FunTy ts r -> List f ts -> r + curryList :: (List f ts -> r) -> FunTy (MapList f ts) r + curryList_ :: (forall a. a -> f a) -> (List f ts -> r) -> FunTy ts r + unfoldList :: (forall a as. List f as -> f a) -> List f ts +``` +Here are some examples that illustrate the methods of `TypeList` + +``` +-- | Fold over a (List Term ts) with 'getTermsize' which consumes a Term component for each element of the list +ex1:: Maybe Int +ex1 = uncurryList getTermsize args1 + where args1 :: List Term '[Int,Bool,String] + args1 = Lit 5 :> Lit True :> Lit "abc" :> Nil + getTermsize :: Term Int -> Term Bool -> Term String -> Maybe Int + getTermsize (Lit n) (Lit b) (Lit s) = Just $ if b then n else length s + getTermsize _ _ _ = Nothing +``` + +``` +-- Fold over a list with two parts 'unLit' and 'getSize' +ex2 :: Int +ex2 = uncurryList_ unLit getsize args2 + where unLit :: forall a. Term a -> a + unLit (Lit n) = n + getsize :: Int -> Bool -> String -> Int + getsize n b s = if b then n else length s + args2 :: List Term '[Int,Bool,String] + args2 = Lit 5 :> Lit True :> Lit "abc" :> Nil +``` + +Construct a function over `Terms` from a function over `(List Term '[a,b,c])` + +``` +ex3 :: Term a -> Term b -> Term c -> String +ex3 = curryList crush + where crush :: (List Term '[a,b,c] -> String) + crush (a :> b :> c :> Nil) = show a ++ show b ++ show c +``` + +Construct a function `FunTy '[a,b,c] r` from a function over `(List T '[a,b,c] -> r)` +and a function from `(a -> T a)` + +``` +ex4 ::Int -> Bool -> String -> Int +ex4 = curryList_ one totalLength + where totalLength :: List [] '[Int,Bool,String] -> Int + totalLength (n :> b :> s :> Nil) = length n + length b + length s + one :: a -> [a] + one x = [x] +``` + + +The constraint `All F` applies constraint `F` to all elements in a type list. + +`All :: forall k. (k -> Constraint) -> [k] -> Constraint` + +So `All HasSpec '[ a, b, c]` reduces to `(HasSpec a, HasSpec b, HasSpec c)` + +## The HasSpec class + +The `HasSpec` class is at the heart of the system. It does two things + +1. Identifies the operations necessary to generate random values for a type. +2. Provides the interface to QuickCheck (`genFromSpecT`) that supports writing impliication properties. + +``` +class HasSpec a where + -- | The `TypeSpec a` is the type-specific version of `Spec a` for type `a` + type TypeSpec a + + -- | The `TypeSpec a` that has no constraints. + anySpec :: TypeSpec a + + -- | A function, akin to `Semigroup(<>)`, that combines two `TypeSpec`. It is different + -- in that it returns a general `Spec` rather than a type specific `TypeSpec` + combineSpec :: TypeSpec a -> TypeSpec a -> Spec a + + -- | Generate a value that satisfies the `TypeSpec`. + -- The key property for this generator is soundness: + -- ∀ a ∈ genFromTypeSpec spec. a `conformsTo` spec + genFromTypeSpec :: (HasCallStack, MonadGenError m) => TypeSpec a -> GenT m a + + -- | Check conformance to the spec. + conformsTo :: HasCallStack => a -> TypeSpec a -> Bool + + -- | Convert a spec to predicates: + -- The key property here is: + -- ∀ a. a `conformsTo` spec == a `conformsTo` constrained (\t -> toPreds t spec) + toPreds :: Term a -> TypeSpec a -> Pred + + -- | This is used to detect self inconsistencies in a (TypeSpec t) + -- guardTypeSpec ty --> ErrorSpec message, if ty is inconsistent + guardTypeSpec :: TypeSpec a -> Spec a + guardTypeSpec ty = typeSpec ty +``` + +In a later sections we present `HasSpec` instances for several +types: [Bool](#HasSpecBool), [Integer](#HasSpecInteger), [Set a](#HasSpecSet), +[Pair(a,b)](HasSpecPair) and [Either a b](HasSpecEither) + +## The Spec datatype + +The type `Spec` is the datatype we use to describe properties about the variables in the `Spec`. +It has 5 construtors. + + ``` + data Spec a where + -- | There are no constraints at all. + TrueSpec :: Spec a + + -- | The `Spec` is inconsistent + ErrorSpec :: NonEmpty String -> Spec a + +-- | The Spec encodes a FOTL statement, with Predicates encoded in the type `Pred` + SuspendedSpec :: HasSpec a => Var a -> Pred -> Spec a + +-- | The solution is exactly the elements in the Non empty list + MemberSpec :: NonEmpty a -> Spec a + +-- | The solution is embedded in the type-specific `TypeSpec a` + TypeSpec :: HasSpec a => TypeSpec a -> [a] -> Spec a +``` + + +## HasSpec Bool instance + +The `HasSpec Bool` instance is relatively simple, since the type `Bool` has only two elements. + +``` +instance HasSpec Bool where + type TypeSpec Bool = Set Bool + + anySpec = Set.fromList [False,True] + + combineSpec s s' = typeSpec (Set.union s s') + + genFromTypeSpec set + | Set.null set = fatalError "genFromTypeSpec @Set where the typeSpec is Set.empty" + | otherwise = oneofT (map pure (Set.toList set)) + + guardTypeSpec s + | Set.null s = ErrorSpec $ pure "guardTypeSpec @Set where the typeSpec is Set.empty" + | otherwise = TypeSpec s [] + + conformsTo i set = Set.member i set + + toPreds v set = case Set.toList set of + [] -> FalsePred (pure "toPreds @Set where the typeSpec is Set.empty") + (x : xs) -> ElemPred True v (x :| xs) +``` + +1. The `TypeSpec` for `Bool` is just a set of boolean values. There are only 4 distinct values + in this set. `{}`, `{True}`, `{False}`, and `{False,True}` Of these four the empty set `{}` is + not self consistent, so we will have to handle this carefully. + +2. The method `anySpec` is the `TypeSpec` which makes no constraints. That means every `Bool` value must + be in the set. Since `Bool` has only two values, the `anySpec` must be the set `{False,True}` + +3. The method `guardTypeSpec` returns an `ErrorSpec` if the `TypeSpec` is inconsistent. There is only + one inconsistent set of `Bool`, so return `ErrorSpec` if the set is null. + +4. The method `consformsTo` just tests if the input `i :: Bool` is in the set of booleans. + +5. The method `toPreds` returns `FalsePred` if the `TypeSpec` is the null set, and if not + returns the `ElemPred` that say each item in the set must be in the list computed from that set. + +### Function symbols for Bool + +There is only one function symbol for `Bool` , the negation operation on `Bool`, `not`. + +``` +data BoolSym (dom :: [Type]) rng where + NotW :: BoolSym '[Bool] Bool + +deriving instance Eq (BoolSym dom rng) + +instance Show (BoolSym dom rng) where show = name + +instance Syntax BoolSym where + name NotW = "not_" + inFix _ = False + +instance Semantics BoolSym where + semantics NotW = not +``` + +The `Syntax` and `Semantics` instances are trivial, and should need no explanation. In order to +talk about the `Logic` instance we must study the type of the `Logic` method `propagate`. +The type of the method `propagate` is +``` +propagate :: (Logic t, AppRequires t as b, HasSpec a) => + t as b -> ListCtx as (HOLE a) -> Spec b -> Spec a +``` +A `ListCtx` is a `Term` with a single `HOLE` and no variables (i.e. everything else is a literal constant). +For a unary function there is only one context `(Unary HOLE)` because a unary function has only one input, +it can only have a `HOLE`, no literal constants are possible. + +``` +instance Logic BoolSym where + propagate _ _ TrueSpec = TrueSpec + propagate _ _ (ErrorSpec msgs) = ErrorSpec msgs + propagate NotW (Unary HOLE) (SuspendedSpec v ps) = + constrained $ \v' -> Let (App NotW (v' :> Nil)) (v :-> ps) + propagate NotW (Unary HOLE) spec = + caseBoolSpec spec (equalSpec . not) +``` +The only interesting case is the last one. The first 3 only rely on the properties that +propagating over `TrueSpec`, `ErrorSpec` or `SuspendedSpec` rely on their special properties. + +The last case boils down to the following. if `(not x) conformsTo spec` what does that say about `x` +To make this concrete we will name the hole `x`, consider the 3 cases for `Spec`. + +1. `(not x) conformsTo {False}` what does that say about `x` ? `x` must be True +2. `(not x) conformsTo {True}` what does that say about `x` ? `x` must be False +3. `(not x) conformsTo {False,True}` what does that say about `x` ? `x` can be either True or False (`anySpec`) + +The function `caseBoolSpec` formalizes this. We build a list of values that the `spec` conforms to. +There are only two values (True and False), so we try them all. If the list is empty, +there is no solution so return an `ErrorSpec` , if the list has more than 1 element, and since Bool has +only two elements, the solution is the Spec with no constraints, i.e. the Monoid value, `mempty` for +`Spec Bool` which is defined to be `(TypeSpec anySpec [])` +which using the definition of `anySpec` is `(TypeSpec (Set.fromList [False,True]) [])` +If the list has exactly 1 element, we apply the continuation to that element, and let the caller +of `caseBoolSpec` decide what to do. + +``` +caseBoolSpec :: (HasSpec Bool, HasSpec a) => Spec Bool -> (Bool -> Spec a) -> Spec a +caseBoolSpec spec cont = case possibleValues spec of + [] -> ErrorSpec (NE.fromList ["No possible values in caseBoolSpec"]) + [b] -> cont b + _ -> mempty + where + possibleValues s = filter (flip conformsToSpec s) [True, False] +``` + +Finally, add the function symbol, with (`name NotW` == `"not_"`) as a function over terms + +``` +not_ :: Term Bool -> Term Bool +not_ x = App NotW (x :> Nil) +``` + +## HasSpec Integer instance + +We introduce the datatype `Range`. This will be the type specific `TypeSpec` for `Integer` +It has both a `Semigroup` and `Monoid` instance. A `Range` denotes a contiguous interval. +The Range `(Interval Nothing Nothing)` denotes the interval from negative infinity to positive infinity. +A `Range` with a single `Nothing` denotes infinity on only one end. + +``` +data Range = Interval (Maybe Integer) (Maybe Integer) deriving (Eq, Show) + +unionWithMaybe :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a +unionWithMaybe f x y = case (x,y) of + (Nothing,Nothing) = Nothing + (Just lo, Just hi) = Just (f lo hi) + (Nothing , Just hi) = Just hi + (Just lo, Nothing) = Just lo + +instance Semigroup Range where + Interval ml mu <> Interval ml' mu' = + Interval + (unionWithMaybe max ml ml') + (unionWithMaybe min mu mu') + +instance Monoid Range where + mempty = Interval Nothing Nothing +``` + +Just a note. If two Ranges are completely disjoint, with no overlap, then +the operation `<>` leads to an inconsistent Range where lower bound is greater than the upper bound. +``` +(Range (Just 5) (Just 10)) <> (Range (Just 12) (Just 20)) ---> Range (Just 12) (Just 10) + \ \ / / + \---------\---------------max / + \ 12 / + \---------------------min + 10 +``` + +The `HasSpec` instance requires an instance for the type family `TypeSpec Integer` and five other methods. + +``` +instance HasSpec Integer where + type TypeSpec Integer = Range + + anySpec = Interval Nothing Nothing + + combineSpec s s' = guardTypeSpec (s <> s') + + guardTypeSpec r@(Interval (Just n) (Just m)) + | n > m = ErrorSpec (pure ("lower bound greater than upper bound\n" ++ show r)) + | otherwise = typeSpec r + guardTypeSpec range = typeSpec range + + genFromTypeSpec (Interval ml mu) = do + n <- sizeT + chooseT =<< constrainInterval (ml <|> lowerBound) (mu <|> upperBound) (fromIntegral n) + + conformsTo i (Interval ml mu) = maybe True (<= i) ml && maybe True (i <=) mu + + toPreds v (Interval ml mu) = + Foldable.fold $ + [Assert $ Lit l <=. v | l <- maybeToList ml] + ++ [Assert $ v <=. Lit u | u <- maybeToList mu] +``` + +1. The method `anySpec` is the TypeSpec that has no constraints, so we make it the `Range` from -∞ to +∞ . +2. The method `combineSpec` combines two `TypeSpec` so that the result has the constraints of both. We use the + `Semigroup` instance on `Range`, which does exactly that. +3. The method `guardTypeSpec` returns an `ErrorSpec` if its input of type `Range` is inconsistent. + This happens only when a `(Just lo)` bound is greater than a `(Just hi)` bound. Which usually arises + when we `combineSpec` two disjoint `Ranges` as explained above. If a `Range` has + one or more `Nothing` it is always consistent. +4. The method `genFromTypeSpec` generates a random value in the `GenT` monad (a richer version of the QuickCheck + `Gen` monad that has mechanisom for handling errors, which the `Gen` monad does not). We extract the `QuickCheck.Gen` + size parameter to scale the random values according to magnitude of the bounds stored in the `Range`. The Haskell + functions `sizeT` and `chooseT` are the `GenT` version of the QuickCheck operations `getSize` and `choose`. +5. The method `conformsTo` checks that `i` is in the given `Range`. +6. The method `toPreds` translates a `Range` into a pair of assertions. Note that if either of the + bounds of `ml` or `mu` is `Nothing` the function `maybeToList` returns the null list, so the list + containing the `Assert` term for that bound is also empty. So Nothing ends up not adding anything + to the `Pred` produced. +7. The function `constrainInterval` uses the `n` input to choose a random value whose + size is scaled to the actual bounds in the range. + +### Function symbols for Integer + +To make the `HasSpec Integer` instance usefull we must introduce function symbols on Integer, and then provide +`Syntax`, `Semantics` and `Logic` instances for those function symbols. Recall that a datatype whose +constructors denote function symbols must have kind `([Type] -> Type -> Type)`. + +``` +data IntegerSym (dom :: [Type]) rng where + PlusW :: IntegerSym '[Integer, Integer] Integer + MinusW :: IntegerSym '[Integer, Integer] Integer + NegateW :: IntegerSym '[Integer] Integer + LessOrEqW :: IntegerSym '[Integer, Integer] Bool + GreaterOrEqW :: IntegerSym '[Integer, Integer] Bool +``` + +One we have `(Syntax IntegerSym)`, `(Semantics IntegerSym)` and `(Logic IntegerSym)` instances +we will have four new function symbols. We use the kinds `(dom :: [Type])` and `(rng :: Type)` to encode the type +of the function symbol as `(FunTy dom rng)` which for `PlusW` would be `(+) :: Integer -> Integer -> Integer` +The function symbol `name` will be a Haskell function over terms. So for `PlusW` we would have +`(+.) :: Term Integer -> Term Integer -> Term Integer` + +``` +instance Syntax IntegerSym where + name PlusW = "+." + name MinusW = "-." + name NegateW = "negate_" + name LessOrEqW = "<=." + name GreaterOrEqW = ">=." + inFix NegateW = False + inFix _ = True + +instance Semantics IntegerSym where + semantics PlusW = (+) + semantics MinusW = (-) + semantics NegateW = negate + semantics LessOrEqW = (<=) + semantics GreaterOrEqW = (>=) +``` + +These function symbols have `name`s that align with their Haskell counterparts, but have a trailing `.` in their name. +The `Syntax` and `Sematics` are exactly what you would think. The `Logic` instance method `propagate` +requires some explanation. First some helper functions for building `Spec Integer` without using any +function symbols. They will be useful when we study the `Logic Integer` instance. + +``` +negateRange :: Range -> Range +negateRange (Interval ml mu) = Interval (negate <$> mu) (negate <$> ml) + +minus :: Integer -> Integer -> Integer +minus n x = n - x + +geqSpec :: Integer -> Spec Integer +geqSpec n = typeSpec (Interval (Just n) Nothing) + +leqSpec :: Integer -> Spec Integer +leqSpec n = typeSpec (Interval Nothing (Just n)) + +gtSpec :: Integer -> Spec Integer +gtSpec n = typeSpec (Interval (Just (n + 1)) Nothing) + +ltSpec :: Integer -> Spec Integer +ltSpec n = typeSpec (Interval Nothing (Just (n - 1))) +``` + +1. The function `minus` is just `(-)` with its arguments flipped. Partially applied as `(minus n)` this is quite usefull. +2. If we have a range that includes [-1,0,1,2,3] `(Interval (Just (-1)) (Just 3))`, then the negation of that + range would include [1,0,-1,-2,-3] `(Interval (Just (-3)) (Just 1))` We can compute that by negating + and switching the bounds. + +3. The `(geqSpec 6)` means `x` such that `x >= 6`. We encode that as `(Interval (Just 6) Nothing)` +4. The Haskell functions `leqSpec`, `gtSpec`, and `ltSpec` use similar reasoning + + +``` +instance Logic IntegerSym where + propagate tag ctx spec = case (tag, ctx, spec) of + (_, _, TrueSpec) -> TrueSpec + (_, _, ErrorSpec xs) -> ErrorSpec xs + (f, context, SuspendedSpec v ps) -> + constrained $ \v' -> Let (App f (fromListCtx context v')) (v :-> ps) + (LessOrEqW, HOLE :<| l, bspec) -> + caseBoolSpec bspec $ \case True -> leqSpec l; False -> gtSpec l + (LessOrEqW, l :|> HOLE, bspec) -> + caseBoolSpec bspec $ \case True -> geqSpec l; False -> ltSpec l + (GreaterOrEqW, HOLE :<| x, spec1) -> + propagate LessOrEqW (x :|> HOLE) spec1 + (GreaterOrEqW, x :|> HOLE, spec2) -> + propagate LessOrEqW (HOLE :<| x) spec2 + (NegateW, Unary HOLE, TypeSpec interval cant) -> typeSpec (negateRange interval) <> notMemberSpec (map negate cant) + (NegateW, Unary HOLE, MemberSpec xs) -> MemberSpec $ NE.nub $ fmap negate xs + (PlusW, HOLE :<| n, TypeSpec (Interval lo hi) bad) -> + TypeSpec (Interval ((minus n) <$> lo) ((minus n) <$> hi)) (map (minus n) bad) + (PlusW, HOLE :<| n, MemberSpec xs) -> + MemberSpec (fmap (minus n) xs) + (PlusW, n :|> HOLE, TypeSpec (Interval lo hi) bad) -> + TypeSpec (Interval ((minus n) <$> lo) ((minus n) <$> hi)) (map (minus n) bad) + (PlusW, n :|> HOLE, MemberSpec xs) -> MemberSpec (fmap (minus n) xs) + (MinusW, HOLE :<| n, TypeSpec (Interval lo hi) bad) -> + TypeSpec (Interval ((+ n) <$> lo) ((+ n) <$> hi)) (map (+ n) bad) + (MinusW, HOLE :<| n, MemberSpec xs) -> + MemberSpec (fmap (+ n) xs) + (MinusW, n :|> HOLE, TypeSpec (Interval lo hi) bad) -> + TypeSpec (negateRange (Interval ((minus n) <$> lo) ((minus n) <$> hi))) (map (minus n) bad) + (MinusW, n :|> HOLE, MemberSpec xs) -> + MemberSpec (fmap (minus n) xs) +``` + +The type of the method `propagate` is +``` +propagate :: (Logic t, AppRequires t as b, HasSpec a) => + t as b -> ListCtx as (HOLE a) -> Spec b -> Spec a +``` + +Notice that `t as b` is a function symbol, for example `LessOrEqW :: IntegerSym '[Integer,Integer] Bool`. +A `ListCtx` is a `Term` with a single `HOLE` and no variables (i.e. everything else is a literal constant). +For a binary function there are exactly two contexts `(HOLE :<| n)` and `(n :|> HOLE)`. Here `n` is a constant, +so for `LessOrEqW` there are two cases to consider, one for each type of +context (`HOLE` on the right, and `HOLE` on the left). Lets consider the case where the `HOLE` is on the left. + +Recall the things an the left of the `=` are known, and we are trying to compute the `intSpec` on the right of the `=`. + +1. `propagate LessOrEqW (HOLE :<| intConstant) boolSpec = intSpec` + To make this concrete we will name the hole `x`, and give a concrete value for `intConstant` which is `7`. + Since `boolSpec` could be either `True` or `False`, we have two concrete visual representations of the example. + - `(x <=. 7) = True`, so what do we know about `x` ? We know `(x <=. 7)` + - `(x <=. 7) = False`, so what do we know about `x` ? We know `(x > 7)` + +Given a property about a function call, how do we translate that property to one about the variable that is an input to the function call. + +This is the code that implements this case branch above. Note `caseBoolSpec` allows us to consider +the two cases separately. + +``` (LessOrEqW, HOLE :<| l, bspec) -> + caseBoolSpec bspec $ \case True -> leqSpec l; False -> gtSpec l +``` + +All the cases follow similar reasoning. In the cases for `PlusW` and `MinusW` and `NegateW`, the known `Spec` will have type +`(Spec Integer)` rather than `(Spec Bool)` because unlike `LessOrEqW` whose range is `Bool`, the range of +`MinusW` is `Integer`. One other difference is that we must consider that the `(Spec Integer)` can be one of +the two `Spec` constructors `MemberSpec` or `TypeSpec`. For `LessOrEqW` we didn't need to handle both cases +since the function `caseBoolSpec` does that analysis internally. Lets consider the cases for `MinusW` where the +`HOLE` is on the left. + +When the output spec is a `TypeSpec` we have something like this. + +``` +(MinusW, HOLE :<| n, TypeSpec (Interval lo hi) bad) -> + TypeSpec (Interval ((+ n) <$> lo) ((+ n) <$> hi)) (map (+ n) bad) +``` +Lets use the same trick we did earlier by naming the hole `x`, and concretizing the constants. +The concrete example is: + +``` +(x - 3) conformsTo (TypeSpec (Interval (Just 6) (Just 12)) [7]) +``` + +So the range is bewteen 6 and 12, but cannot be 7. So what does that say about `x` +It says `x` must be bewteen 9(6+3) and 15(12+3), but cannot be 10(7+3). Which correponds to the code above. + +When the code is a `MemberSpec` we have something like this + +``` +(MinusW, HOLE :<| n, MemberSpec xs) -> MemberSpec (fmap (+ n) xs) +``` +With concrete example `(x - 3) conformsTo MemberSpec [8,13]` So what does that say about `x` +It says `x conformsTo MemberSpec [11(8+3), 16(13+3)`. Which corresponds to the code. + + +Finally we define the lifted versions of the function symbols that work on `Term` + +``` +(<=.) :: Term Integer -> Term Integer -> Term Bool +(<=.) x y = App LessOrEqW (x :> y :> Nil) + +(>=.) :: Term Integer -> Term Integer -> Term Bool +(>=.) x y = App GreaterOrEqW (x :> y :> Nil) + +(+.) :: Term Integer -> Term Integer -> Term Integer +(+.) x y = App PlusW (x :> y :> Nil) + +(-.) :: Term Integer -> Term Integer -> Term Integer +(-.) x y = App MinusW (x :> y :> Nil) + +negate_ :: Term Integer -> Term Integer +negate_ x = App NegateW (x :> Nil) +``` + + +## HasSpec (Set a) instance + +The `HasSpec` instance for `Set` is interesting, because + +1. It has a rich set of function symbols, +2. It is the first type that we have seen that has a `Container` instance. +3. Introduces a rather complex `TypeSpec` to account for the first two. + +It is best to start our explanation with the `Container` class. A type can have a `(Container t e)` instance +if the type `t` stores elements of type `e`, and we can enumerate over the `e` in `t`. The operation +on constainer types is that they support the quantifier `forAll` which states that every element `e` in `t` +meets some constraint. In this minimal implementation, only the `Set` type supports `Container` but in the full +system, with `HasSpec` instances for types such as `List` and `Map` , could also have `Container` instances. + +``` +class Container t e | t -> e where + fromForAllSpec :: (HasSpec t, HasSpec e) => Spec e -> Spec t + forAllToList :: t -> [e] + +forAll :: (Container t a, HasSpec t, HasSpec a) => Term t -> (Term a -> Pred) -> Pred +``` + +Set supports the function symbols defined by the constructors of the `SetSym` datatype. + +``` +data SetSym (dom :: [Type]) rng where + MemberW :: (HasSpec a, Ord a) => SetSym [a, Set a] Bool + SizeW :: (HasSpec a, Ord a) => SetSym '[Set a] Integer + SubsetW :: (HasSpec a, Ord a) => SetSym [Set a, Set a] Bool +``` + +So the type specific `TypeSpec` will have to encode at least 4 distinct properties. +Set membership, set size, subset, and the quantification over its elements. + +We will use this type `SetSpec` as the `TypeSpec` for `Set` . Like many `TypeSpecs`, it +has both `Semigroup` and `Monoid` instances + +``` +data SetSpec a = SetSpec {setMust :: Set a, setAll :: Spec a, setCount :: Spec Integer} + +instance (Ord a, HasSpec a) => Semigroup (SetSpec a) where + SetSpec must es size <> SetSpec must' es' size' = + SetSpec (must <> must') (es <> es') (size <> size') + +instance (Ord a, HasSpec a) => Monoid (SetSpec a) where + mempty = SetSpec mempty mempty TrueSpec +``` + +We will complete the `HasSpec (Set a)` instance description in a number of steps. + +1. Describe the `Container` instance +2. Describe the `Syntax` and `Semantics` instances for `SetSym` +3. Describe some helper functions needed to define the `HasSpec` instance +4. Describe the `HasSpec` instance +5. Describe the `Logic` instance of `SetSym` + +### `Container` instance for `Set` + +``` +instance Ord s => Container (Set s) s where + fromForAllSpec e = typeSpec $ SetSpec mempty e TrueSpec + forAllToList = Set.toList +``` + +1. The method `forAllToList` turns all the elemenst in the container into list. +2. The method `fromForAllSpec` lifts a `Spec e` describing the property every + element the set must have, into a `Spec t` describing the property the whole set should have. + Due to to some forethought, this is quite easy. We embed `e` in the `setAll` field of `SetSpec` + and fill the other components with with values that enforce no constraints. + +### `Syntax` and `Semantics` instances for `SetSym` + +``` +setSize :: Set a -> Integer +setSize = toInteger . Set.size + +deriving instance Eq (SetSym dom rng) + +instance Show (SetSym dom rng) where show = name + +instance Syntax SetSym where + name MemberW = "member_" + name SizeW = "size_" + name SubsetW = "subset_" + inFix _ = False + +instance Semantics SetSym where + semantics MemberW = Set.member + semantics SizeW = setSize + semantics SubsetW = Set.isSubsetOf + + rewriteRules SubsetW (Lit s :> _ :> Nil) Evidence | null s = Just $ Lit True + rewriteRules SubsetW (x :> Lit s :> Nil) Evidence | null s = Just $ x ==. Lit Set.empty + rewriteRules MemberW (t :> Lit s :> Nil) Evidence + | null s = Just $ Lit False + | [a] <- Set.toList s = Just $ t ==. Lit a + rewriteRules t l Evidence = Lit <$> (applyFunSym @SetSym (semantics t) l) +``` + +Most of this is straight-forward. The only new part is that this is the first `Semantics` instance that supports +`rewriteRules` over the function symbols. Here are the rewrite rules in a more relaxed notation where `{a}` +denote a Set with 1 element `a`, and `{}` denotes the empty set. + +- `subset_ (Lit {}) x --> Lit True` +- `subset_ x (Lit {}) --> x ==. Lit {}` +- `member_ t (Lit {}) --> Lit False` +- `member_ t (Lit {a}) --> t ==. Lit a` +- `member_ (Lit t) (Lit ts) --> Lit (member t ts )` + +### Helper functions needed to define the `HasSpec (Set a)` instance + +``` +setSize :: Set a -> Integer +setSize = toInteger . Set.size + +guardSetSpec :: (HasSpec a, Ord a) => SetSpec a -> Spec (Set a) +guardSetSpec (SetSpec must elemS ((<> geqSpec 0) -> size)) + | Just u <- knownUpperBound size + , u < 0 = + ErrorSpec (("guardSetSpec: negative size " ++ show u) :| []) + | not (all (`conformsToSpec` elemS) must) = + ErrorSpec (("Some 'must' items do not conform to 'element' spec: " ++ show elemS) :| []) + | isErrorLike size = ErrorSpec ("guardSetSpec: error in size" :| []) + | isErrorLike (geqSpec (setSize must) <> size) = + ErrorSpec $ + ("Must set size " ++ show (setSize must) ++ ", is inconsistent with SetSpec size" ++ show size) + :| [] + | otherwise = typeSpec (SetSpec must elemS size) + +knownUpperBound :: Spec Integer -> Maybe Integer +knownUpperBound TrueSpec = Nothing +knownUpperBound (MemberSpec as) = Just $ maximum as +knownUpperBound ErrorSpec {} = Nothing +knownUpperBound SuspendedSpec {} = Nothing +knownUpperBound (TypeSpec (Interval lo hi) cant) = upper lo hi + where + upper _ Nothing = Nothing + upper Nothing (Just b) = listToMaybe $ [b, b - 1 ..] \\ cant + upper (Just a) (Just b) + | a == b = a <$ guard (a `notElem` cant) + | otherwise = listToMaybe $ [b, b - 1 .. a] \\ cant +``` + +1. `setSize` computes the size of a Haskell Set as an Integer (not an Int). +2. `knownUpperBound` computes a concrete upperbound from a Spec if one can be found. +3. `guardSetSpec` looks for inconsistencies in a `SetSpec` + - We can compute an upperbound from the `setCount` field, and that + upperbound is not positive, all set sizes are positive. + - Some element that must be in the set, does not meet the `setAll` field `Spec` + - The `setCount` field `Spec` is inconsistent + +### `HasSpec` instance for Set + +``` +instance (Container (Set a) a, Ord a, HasSpec a) => HasSpec (Set a) where + type TypeSpec (Set a) = SetSpec a + + anySpec = SetSpec Set.empty TrueSpec TrueSpec + + combineSpec x y = guardSetSpec (x <> y) + + conformsTo s (SetSpec must es size) = + and + [ setSize s `conformsToSpec` size + , must `Set.isSubsetOf` s + , all (`conformsToSpec` es) s + ] + + toPreds s (SetSpec m es size) = + Foldable.fold $ + -- Don't include this if the must set 'm' is empty + [Assert $ subset_ (Lit m) s | not $ Set.null m] + ++ [ forAll s (\e -> satisfies e es) + , satisfies (size_ s) size + ] + + guardTypeSpec = guardSetSpec + + genFromTypeSpec (SetSpec must e _) + | any (not . (`conformsToSpec` e)) must = + genErrorNE + ( NE.fromList + [ "Failed to generate set" + , "Some element in the must set does not conform to the elem specification" + , "Unconforming elements from the must set:" + , unlines (map (\x -> " " ++ show x) (filter (not . (`conformsToSpec` e)) (Set.toList must))) + , "Element Specifcation" + , " " ++ show e + ] + ) + -- Special case when elemS is a MemberSpec. + -- Just union 'must' with enough elements of 'xs' to meet 'szSpec' + genFromTypeSpec (SetSpec must (MemberSpec xs) szSpec) = do + let szSpec' = szSpec <> geqSpec (setSize must) + choices <- pureGen $ shuffle (NE.toList xs \\ Set.toList must) + size <- fromInteger <$> genFromSpecT szSpec' + let additions = Set.fromList $ take (size - Set.size must) choices + pure (Set.union must additions) + genFromTypeSpec (SetSpec must elemS szSpec) = do + let szSpec' = szSpec <> geqSpec (setSize must) + sizecount <- + explain "Choose a size for the Set to be generated" $ + genFromSpecT szSpec' + let targetSize = sizecount - setSize must + explainNE + ( NE.fromList + [ "Choose size count = " ++ show sizecount + , "szSpec' = " ++ show szSpec' + , "Picking items not in must = " ++ show (Set.toList must) + , "that also meet the element test: " + , " " ++ show elemS + ] + ) + $ go 100 targetSize must + where + go _ n s | n <= 0 = pure s + go tries n s = do + e <- + explainNE + ( NE.fromList + [ "Generate set member at type " ++ showType @a + , " number of items starting with = " ++ show (Set.size must) + , " number of items left to pick = " ++ show n + , " number of items already picked = " ++ show (Set.size s) + ] + ) + $ withMode Strict + $ suchThatWithTryT tries (genFromSpecT elemS) (`Set.notMember` s) + + go tries (n - 1) (Set.insert e s) +``` + +1. The method `TypeSpec` has 3 components + - `setMust :: Set a`. Elements that must be in the Set. Captures `memberW` function symbol information. + - `setAll :: Spec a`. Property that must be True about all elements in the set. + Captures `forAll` information from the `Container` class. + - `setCount :: Spec Integer`. The size of the set. Captures `sizeW` function symbol information. +2. The method `anySpec`. All three componets are set to `TrueSpec` +3. The method `combineSpec`. Guard the result of using the `Monoid (SetSpec t)` instance to combine the two inputs. +4. The method `conformsTo`. Individually test all three parts of the `SetSpec` against the input. +5. The method `toPreds` . Split the `setSpec` into 3 sets of `Pred`, one for each field, and join them all together + using `Foldable.fold` that uses the `Monoid(Pred)` instance. +6. The method `guardTypeSpec`. Uses the `guardSetSpec` function described above. +7. The method `genFromSpecT`. This a complicated function that must deal with all tree fields of + the `SetSpec` simultaneously. We will step through them one at a time. + - The first clause `(genFromTypeSpec (SetSpec must e _))` tests for inconsistencies + between the `must` set and the `setAll` Spec `e` that must hold for every elememt. + - The second clause `(genFromTypeSpec (SetSpec must (MemberSpec xs) szSpec))` handles + the case when the `setAll` Spec is a `(MemberSpec xs)`. This tells us that every element + must come from `xs`. We know the `must` set is a subset of `xs` or we wouldn't have gotten past + the first clause. We compute a bunch of `choices` from the elements in `xs` but not in `must`. + Then we pick a valid size, take the correct number of elements from the `choices`, call it `additions`, + and return the union of the `must` and `additions` set. + - The third clause `(genFromTypeSpec (SetSpec must elemS szSpec))` must satisfy all 3 sub-Specs with + no special information. So we start by picking a satisfying size `sizeCount`. Compute `targetSize` that + accounts for the `must` set, Then loop using `(go 100 targetSize must)`. This starts with the set `must` + then tries to add `targetSize` additional elements to `must` each of which meets the `setAll` spec `elemS`. + For each of the `targetSize` iterations, we get `100` tries to find an element that + meets `elemS` and is not already in the set `s` we have accumulated so far. + + +### `Logic` instance of `SetSym` + +``` +instance Logic SetSym where + propagate tag ctx spec = case (tag, ctx, spec) of + (_, _, TrueSpec) -> TrueSpec + (_, _, ErrorSpec es) -> ErrorSpec es + (f, context, SuspendedSpec v ps) -> constrained $ \v' -> Let (App f (fromListCtx context v')) (v :-> ps) + (MemberW, HOLE :<| (s :: Set a), spec1) -> + caseBoolSpec spec1 $ \case + True -> memberSpec (Set.toList s) (pure "propagateSpecFun on (Member x s) where s is Set.empty") + False -> notMemberSpec s + (MemberW, e :|> HOLE, spec2) -> + caseBoolSpec spec2 $ \case + True -> typeSpec $ SetSpec (Set.singleton e) mempty mempty + False -> typeSpec $ SetSpec mempty (notEqualSpec e) mempty + (SizeW, Unary HOLE, spec3) -> typeSpec (SetSpec mempty mempty spec3) + (SubsetW, HOLE :<| big, spec4) -> caseBoolSpec spec4 $ \case + True -> constrained $ \small -> + And + [ Assert $ size_ small <=. Lit (setSize big) + , forAll small $ \x -> Assert $ member_ x (Lit big) + ] + False -> constrained $ \small -> + exists (\eval -> headGE $ Set.difference big (eval small)) $ \e -> + And + [ -- set `DependsOn` e, + Assert $ not_ $ member_ e (Lit big) + , Assert $ member_ e small + ] + (SubsetW, small :|> HOLE, spec5) -> caseBoolSpec spec5 $ \case + True -> typeSpec $ SetSpec small TrueSpec mempty + False -> constrained $ \big -> + exists (\eval -> headGE $ Set.difference (eval big) small) $ \e -> + And + [ -- set `DependsOn` e, + Assert $ member_ e (Lit small) + , Assert $ not_ $ member_ e big + ] +``` + +## SolverPlans + +The function `genFromSpecT` generates a random value from a `Spec`. Of the 5 constructors of `Spec`, the case analysis over +4 of them (`ErrorSpec`, `TrueSpec`, `TypeSpec`, and `MemberSpec`) is straight forward. The 5th, `SuspendedSpec` is more +involved, in that it consists of 1 or more `Pred`, and possibly multiple variables. Here is a simplified version of `genFromSpecT` +to illustrate that point. + +``` +genFromSpecT :: + forall a m. (HasCallStack, HasSpec a, MonadGenError m) => Spec a -> GenT m a +genFromSpecT (simplifySpec -> spec) = case spec of + MemberSpec as -> pureGen (elements (NE.toList as)) -- Pick from `as` + TrueSpec -> genFromSpecT (typeSpec $ anySpec @a) -- Use the type specific `genFromSpecT` on no constraints + SuspendedSpec x preds -> do + env <- genFromPreds mempty preds -- Somehow solve all the `preds` + findEnv env x + TypeSpec s cant -> do -- Use the type specific `genFromSpecT, + mode <- getMode + genFromTypeSpec s `suchThatT` (`notElem` cant) + ErrorSpec e -> genErrorNE e -- raise an error. +``` + + +The key to `genFromPreds` is to turn `preds` into a `SolverPlan` which consists of choosing an order +to solve all the variables in `preds`. For each variable, we create a `SolverStep` which includes that variable, +and the subset of `preds` that mention it. Then we solve each `SolverStep` in the order they appear in the plan. + +Let's step throught the process on a the simple `SuspendedSpec` **spec3** given below. + +``` +spec3 :: Spec (Integer, Integer, Integer) +spec3 = constrained $ \ v4 -> + match v4 $ \ v3 v1 v0 -> And [Assert $ v3 <=. v1, Assert $ v1 <=. v0] +``` + +In the code block below, each `SolverStep` includes the variable name, followed by `<-` and a list of sub predicates, terminated by `---` + +``` + SolverPlan + Linearization: + v_0 <- + --- + v_1 <- + assert $ v_1 <=. v_0 + --- + v_3 <- + assert $ v_3 <=. v_1 + --- + v_2 <- + assert $ fst_ v_2 ==. v_1 + assert $ snd_ v_2 ==. v_0 + --- + v_4 <- + assert $ head_ v_4 ==. v_3 + assert $ tail_ v_4 ==. v_2 + --- +``` + +The plan orders the variables in this order `[v_0, v_1, v_3, v_2, v_4]`. It has the property that in each step, the sub predicates +mention only the variable for that step, and variables that appear in previous steps. The reader should verify that this is true. Solving +the plan, chooses the first variable and a conforming random value, adds it to the environment, and then substituting that value into +the rest of the plan, and then solving that shorter plan, until no variables are left. Let's step through the example. + +Choosing `v_0` and a random conforming value `19`, note how the the `Pred` for `v_1` after substitution +`(assert $ v_1 <=. 19)` simplifies to the `TypeSpec (Interval Nothing (Just 19)) []` . +Every substitution may enable further simplifcation, and replaces that variable with something without any variables +in the succeeding `SolverSteps`. + +``` +Env {v_0 -> 19} +Step v_1 + SolverPlan + Linearization: + v_1 <- + TypeSpec (Interval Nothing (Just 19)) [] + --- + v_3 <- + assert $ v_3 <=. v_1 + --- + v_2 <- + TypeSpec (Cartesian (TrueSpec @(Integer)) (MemberSpec [ 19 ])) [] + assert $ fst_ v_2 ==. v_1 + --- + v_4 <- + assert $ head_ v_4 ==. v_3 + assert $ tail_ v_4 ==. v_2 + --- +``` + +Choosing `v_1` and a random conforming value `-2`, and subsituting we get + +``` +Env {v_0 -> 19, v_1 -> -2} +Step v_3 + SolverPlan + Linearization: + v_3 <- + TypeSpec (Interval Nothing (Just (-2))) [] + --- + v_2 <- + TypeSpec (Cartesian (MemberSpec [ -2 ]) (MemberSpec [ 19 ])) [] + --- + v_4 <- + assert $ head_ v_4 ==. v_3 + assert $ tail_ v_4 ==. v_2 +``` + +Choosing `v_3`, and a conforming value `-29`, and substituting, we get + +``` +Env { v_0 -> 19, v_1 -> -2, v_3 -> -29 } +Step v_2 + SolverPlan + Linearization: + v_2 <- + TypeSpec (Cartesian (MemberSpec [ -2 ]) (MemberSpec [ 19 ])) [] + --- + v_4 <- + TypeSpec (MemberSpec [ -29 ],TrueSpec @((Integer,Integer))) [] + assert $ tail_ v_4 ==. v_2 +``` + +Choosing `v_2`, note how the there is only one conforming solution `(-2,19)`, after substituting we get + +``` +Env { v_0 -> 19, v_1 -> -2, v_2 -> (-2,19), v_3 -> -29 } +Step v_4 + SolverPlan + Linearization: + v_4 <- + MemberSpec [ (-29,-2,19) ] +``` + +Finally there is only one variable left to choose `v_4`, and one conforming value, which completes the final environment for every variable +in the original `SuspendedSpec`. + +``` +Env { v_0 -> 19, v_1 -> -2, v_2 -> (-2,19), v_3 -> -29 , v_4 -> (-29,-2,19) } +``` + +Since the `constrained` library function defining the original `spec3`, binds the variable `v_4`, +the solution to the `Spec` can be found in the final environment bound by `v_4` which is `(-29,-2,19)`, the other +variables are intermediate, and are used only to build the final solution. + +``` +spec3 = constrained $ \ v_4 -> + match v_4 $ \ v_3 v_1 v_0 -> And [Assert $ v_3 <=. v_1, Assert $ v_1 <=. v_0] +``` + +## Translating `Pred` to `Spec` and the use of `propagate` + +For each SolverStep. in the process above, we have a variable `v` and a `Pred`. We need to convert that `Pred` into a `Spec` +for `v`. The strategy for generating things from `Pred`s is relatively straightforward +and relies on one key fact: any constraint that has only one free variable `v` +and where `v` occurs only once can be turned into a `Specification` for `v`. To way we do that, is to use the +function `computeSpecSimplified` which runs in the `GE` monad which can collect errors +if they occur. The function `localGESpec` catches `GenError`s and turns them into `ErrorSpec` , and re-raises +`FatalSpec` if that occurs. + +``` +-- | We want `genError` to turn into `ErrorSpec` and we want `FatalError` to turn into `FatalError` +localGESpec ge = case ge of + (GenError xs) -> Result $ ErrorSpec (catMessageList xs) + (FatalError es) -> FatalError es + (Result v) -> Result v +``` + +Here is a partial definition for `computeSpecSimplified` that illustrates the important operation of handling +multiple `Pred` embedded in the FOTL connective `And` + +``` +computeSpecSimplified :: forall a. (HasSpec a, HasCallStack) => Var a -> Pred -> GE (Spec a) +computeSpecSimplified x preds = localGESpec $ case simplifyPred preds of + And ps -> do + spec <- fold <$> mapM (computeSpecSimplified x) ps + case spec of + SuspendedSpec y ps' -> pure $ SuspendedSpec y $ simplifyPred ps' + s -> pure s + ... +``` + +The function works as follows +1. Guard the `GE` computation with `localGESpec`. This may: return a valid `Result`, return a `ErrorSpec`, or raise a Haskell Error. +2. First apply simplifying rules to `preds` +3. Then a case analysis over the resulting simplified `Pred` is preformed +4. If the case is `And`, map `computeSpecSimplified` over each of them, and then `fold` over the results using the `Monoid` operator `(<>)` for `Spec` + - One may visualize this as `(computeSpecSimplified x1 <> computeSpecSimplified x2 <> ... <> computeSpecSimplified xN)` + - Recall each of the sub-`Pred` mentions only the variable `x`, so all the resulting `Spec` are for the same variable + - Each application of `(<>)` either builds a richer `Spec` for `x`, encompassing all the constraints, or + returns an `ErrorSpec` if they are inconsistent. +5. If the final resulting `Spec` is a `SuspendedSpec`, simplify it's `Pred` and return +6. If it is any other `Spec` just return the `Spec` + +The complete definition follows. The other (non-And) rules fall into two cases. +1. A literal constant occurs, and we can compute the final `Spec` using the properties of constants. +3. A non literal `Term` occurs. Which must mention the single variable `x` and some (possibly nested) function symbols. + - We use `toCtx` to build a 1-Hole context, and then use `propagateSpecM` + (which calls `propagateSpec`, which makes one or more calls to `propagate`), + to compute the result. + +``` +computeSpecSimplified :: + forall a. (HasSpec a, HasCallStack) => Var a -> Pred -> GE (Spec a) +computeSpecSimplified x pred3 = localGESpec $ case simplifyPred pred3 of + And ps -> do + spec <- fold <$> mapM (computeSpecSimplified x) ps + case spec of + SuspendedSpec y ps' -> pure $ SuspendedSpec y $ simplifyPred ps' + s -> pure s + ElemPred True t xs -> propagateSpecM (MemberSpec xs) (toCtx x t) + ElemPred False (t :: Term b) xs -> propagateSpecM (TypeSpec @b (anySpec @b) (NE.toList xs)) (toCtx x t) + TruePred -> pure mempty + FalsePred es -> genErrorNE es + Assert (Lit True) -> pure mempty + Assert (Lit False) -> genError (show pred3) + Assert t -> propagateSpecM (equalSpec True) (toCtx x t) + ForAll (Lit s) b -> fold <$> mapM (\val -> computeSpec x $ unBind val b) (forAllToList s) + ForAll t b -> do + bSpec <- computeSpecBinderSimplified b + propagateSpecM (fromForAllSpec bSpec) (toCtx x t) + Case (Lit val) as bs -> runCaseOn val as bs $ \va vaVal psa -> computeSpec x (substPred (singletonEnv va vaVal) psa) + Case t as bs -> do + simpAs <- computeSpecBinderSimplified as + simpBs <- computeSpecBinderSimplified bs + propagateSpecM (typeSpec (SumSpec simpAs simpBs)) (toCtx x t) + + Let t b -> pure $ SuspendedSpec x (Let t b) + Exists k b -> pure $ SuspendedSpec x (Exists k b) +``` + + + +## HasSpec (Either a b) instance + +We include the `HasSpec (Either a b)` instance because it illustrates how one may make a `HasSpec` instance +for a datatype with more than one constructor function. In the full system such datatypes are handled +by using a Sum of Products representation, which is recursive, nested version of the Pairs and Either. +This adds considerable complexity to the system, but the key ideas are fuly illustrated by the pair +`(a,b)` and `(Either a b)` instances. + +The `TypeSpec` for `(Either a b)` is the datatype `(SumSpec a b)` + +``` +data SumSpec a b = SumSpec (Spec a) (Spec b) + deriving (Show) + +deriving instance (Eq (Spec a), Eq (Spec b)) => Eq (SumSpec a b) + +guardSum :: forall a b. (HasSpec a, HasSpec b) => Spec a -> Spec b -> Spec (Either a b) +guardSum (ErrorSpec es) (ErrorSpec fs) = ErrorSpec (es <> fs) +guardSum (ErrorSpec es) _ = ErrorSpec (NE.cons "sum error on left" es) +guardSum _ (ErrorSpec es) = ErrorSpec (NE.cons "sum error on right" es) +guardSum s s' = typeSpec $ SumSpec s s' +``` + +The `HasSpec (Either a b)` instance is quite simple, essentially carrying around two different +`Specs` in `SumSpec a b`. One for values `(Left a)` and a different one for values `(Right b)`. +Note the special construtor `Case` of `Pred` specially designed for the type `Either`. + +``` +instance (HasSpec a, HasSpec b) => HasSpec (Either a b) where + type TypeSpec (Either a b) = SumSpec a b + + anySpec = SumSpec mempty mempty + + combineSpec (SumSpec a b) (SumSpec c d) = guardSum (a <> c) (b <> d) + + conformsTo (Left a) (SumSpec sa _) = conformsToSpec a sa + conformsTo (Right b) (SumSpec _ sb) = conformsToSpec b sb + + toPreds x (SumSpec a b) = Case x (bind $ \y -> satisfies y a) (bind $ \y -> satisfies y b) + + genFromTypeSpec (SumSpec (simplifySpec -> sa) (simplifySpec -> sb)) + | emptyA, emptyB = genError "genFromTypeSpec @SumSpec: empty" + | emptyA = Right <$> genFromSpecT sb -- Gen a `b` object + | emptyB = Left <$> genFromSpecT sa -- Gen a `a` object + | otherwise = oneofT [Left <$> genFromSpecT sa, Right <$> genFromSpecT sb] + where + emptyA = isErrorLike sa + emptyB = isErrorLike sb +``` + +The function symbols on the Either type are embedded in the `EitherSym` datatype + +``` +data EitherSym (dom :: [Type]) rng where + LeftW :: EitherSym '[a] (Either a b) + RightW :: EitherSym '[b] (Either a b) + +deriving instance Eq (EitherSym dom rng) + +instance Show (EitherSym dom rng) where show = name + +instance Syntax EitherSym where + name LeftW = "left_" + name RightW = "right_" + inFix _ = False + +instance Semantics EitherSym where + semantics LeftW = Left + semantics RightW = Right + +instance Logic EitherSym where + propagateTypeSpec LeftW (Unary HOLE) (SumSpec sl _) cant = sl <> foldMap notEqualSpec [a | Left a <- cant] + propagateTypeSpec RightW (Unary HOLE) (SumSpec _ sr) cant = sr <> foldMap notEqualSpec [a | Right a <- cant] + + propagateMemberSpec LeftW (Unary HOLE) es = + case [a | Left a <- NE.toList es] of + (x : xs) -> MemberSpec (x :| xs) + [] -> + ErrorSpec $ + pure $ + "propMemberSpec (left_ HOLE) on (MemberSpec es) with no Left in es: " ++ show (NE.toList es) + propagateMemberSpec RightW (Unary HOLE) es = + case [a | Right a <- NE.toList es] of + (x : xs) -> MemberSpec (x :| xs) + [] -> + ErrorSpec $ + pure $ + "propagate (Right HOLE) on (MemberSpec es) with no Right in es: " ++ show (NE.toList es) + +left_ :: (HasSpec a, HasSpec b) => Term a -> Term (Either a b) +left_ x = App LeftW (x :> Nil) + +right_ :: (HasSpec a, HasSpec b) => Term b -> Term (Either a b) +right_ x = App RightW (x :> Nil) +``` + + +## HasSpec (a,b) instance + +Pairs are the dual of `Either` having two components for a single constructor. They suggest a method for making `HasSpec` instances +for datatypes where a single constructor has multiple components. We will look at binary pairs first. Then look at how to generalize that +by nesting binary pairs. The `TypeSpec` for binary pairs is the type `PairSpec`. Which simply pairs two `Spec`. Since +`Spec` has a `Monoid` instance, it is trival to make a `Monoid` instance for `PairSpec`. + +``` +data PairSpec a b = Cartesian (Spec a) (Spec b) + +instance (HasSpec a, HasSpec b) => Show (PairSpec a b) where + show (Cartesian l r) = "(Cartesian " ++ "(" ++ show l ++ ") (" ++ show r ++ "))" + +instance (HasSpec a, HasSpec b) => Semigroup (PairSpec a b) where + (Cartesian x y) <> (Cartesian a b) = Cartesian (x <> a) (y <> b) + +instance (HasSpec a, HasSpec b) => Monoid (PairSpec a b) where mempty = Cartesian mempty mempty + +``` +Grouping together two `Spec` is a common occurence. It happens in `SumSpec` (the `TypeSpec` for `Either`) above, +and `PairSpec` (the `TypeSpec` for `(a,b)`) here. So combining two `Spec` where one of them may be an `ErrorSpec` +deserves a few helper functions. In a sense `hasError` lifts the `HasSpec` method `guardTypeSpec` from type specific `TypeSpec` to +the more general type `Spec`. + +``` +-- | Return (Just errormessage) if the input contains an `ErrorSpec` +hasError :: forall a. Spec a -> Maybe (NonEmpty String) +hasError (ErrorSpec ss) = Just ss +hasError (TypeSpec x _) = + case guardTypeSpec @a x of + ErrorSpec ss -> Just ss + _ -> Nothing +hasError _ = Nothing + + +-- | Given two 'Spec', return an 'ErrorSpec' if one or more is an 'ErrorSpec' +-- If neither is an 'ErrorSpec' apply the continuation 'f' + +guardPair :: forall a b. (HasSpec a, HasSpec b) => Spec a -> Spec b -> Spec (a, b) +guardPair specA specB = handleErrors specA specB (\s t -> typeSpec (Cartesian s t)) + +-- | If either of the first two arguments are an ErrorSpec, return an ErrorSpec. +-- If Both are error free, then apply the third argument, a continuation, to +-- the error-free inputs. This pattern occurs frequently. + +handleErrors :: Spec a -> Spec b -> (Spec a -> Spec b -> Spec c) -> Spec c +handleErrors spec1 spec2 f = case (hasError spec1, hasError spec2) of + (Just m1, Just m2) -> ErrorSpec (m1 <> m2) + (Just m1, _) -> ErrorSpec m1 + (_, Just m2) -> ErrorSpec m2 + (Nothing, Nothing) -> f spec1 spec2 + +The `HasSpec (a,b)` instance for pairs is quite simple, relying on the `HasSpec` instances for `a` and `b`. + +``` +instance (HasSpec a, HasSpec b) => HasSpec (a, b) where + type TypeSpec (a, b) = PairSpec a b + + anySpec = Cartesian mempty mempty + + combineSpec (Cartesian a b) (Cartesian a' b') = guardPair (a <> a') (b <> b') + + conformsTo (a, b) (Cartesian sa sb) = conformsToSpec a sa && conformsToSpec b sb + + guardTypeSpec (Cartesian x y) = guardPair x y + + genFromTypeSpec (Cartesian sa sb) = (,) <$> genFromSpecT sa <*> genFromSpecT sb + + toPreds x (Cartesian sf ss) = + satisfies (fst_ x) sf + <> satisfies (snd_ x) ss +``` + +The function symbols for pairs are encoded in the datatype `PairSym`. Its `Syntax` and `Semantics` instances are +straightforward, except there are two `rewriteRules` which explain how `FstW` and `SndW` distribute over `PairW`. + +``` +data PairSym (dom :: [Type]) rng where + FstW :: PairSym '[(a, b)] a + SndW :: PairSym '[(a, b)] b + PairW :: PairSym '[a, b] (a, b) + + toPreds x (Cartesian sf ss) = satisfies (fst_ x) sf <> satisfies (snd_ x) ss +``` + +### Syntax, Semantics, and Logic instances for PairSym + +``` +deriving instance Eq (PairSym dom rng) + +instance Show (PairSym dom rng) where show = name + +instance Syntax PairSym where + name FstW = "fst_" + name SndW = "snd_" + name PairW = "pair_" + inFix _ = False + +instance Semantics PairSym where + semantics FstW = fst + semantics SndW = snd + semantics PairW = (,) + rewriteRules FstW (Pair x _) Evidence = Just x + rewriteRules SndW (Pair _ y) Evidence = Just y + rewriteRules t l Evidence = Lit <$> applyFunSym @PairSym (semantics t) l +``` + +The `Syntax` and `Semantics` instances are very simple, except for the `rewriteRules` which tell +how `FstW` and `SndW` project over a `PairW` application. The pattern matching over +the over the application of the `PairW` application uses the Haskell Pattern synonym `Pair` + +``` +-- | Create a Haskell Pattern where (Pair x y) matches (App PairW (x :> y :> Nil)) +pattern Pair :: + forall c. () => forall a b. (c ~ (a, b), HasSpec a, HasSpec b) => Term a -> Term b -> Term c +pattern Pair x y <- App (getWitness -> Just PairW) (x :> y :> Nil) +``` + +``` +instance Logic PairSym where + propagateTypeSpec FstW (Unary HOLE) ts cant = typeSpec $ Cartesian (TypeSpec ts cant) TrueSpec + propagateTypeSpec SndW (Unary HOLE) ts cant = typeSpec $ Cartesian TrueSpec (TypeSpec ts cant) + propagateTypeSpec PairW (a :|> HOLE) sc@(Cartesian sa sb) cant + | a `conformsToSpec` sa = sb <> foldMap notEqualSpec (sameFst a cant) + | otherwise = + ErrorSpec + ( NE.fromList + ["propagate (pair_ " ++ show a ++ " HOLE) has conformance failure on a", show (TypeSpec sc cant)] + ) + propagateTypeSpec PairW (HOLE :<| b) sc@(Cartesian sa sb) cant + | b `conformsToSpec` sb = sa <> foldMap notEqualSpec (sameSnd b cant) + | otherwise = + ErrorSpec + ( NE.fromList + ["propagate (pair_ HOLE " ++ show b ++ ") has conformance failure on b", show (TypeSpec sc cant)] + ) + + propagateMemberSpec FstW (Unary HOLE) es = typeSpec $ Cartesian (MemberSpec es) TrueSpec + propagateMemberSpec SndW (Unary HOLE) es = typeSpec $ Cartesian TrueSpec (MemberSpec es) + propagateMemberSpec PairW (a :|> HOLE) es = + case (nub (sameFst a (NE.toList es))) of + (w : ws) -> MemberSpec (w :| ws) + [] -> + ErrorSpec $ + NE.fromList + [ "propagate (pair_ HOLE " ++ show a ++ ") on (MemberSpec " ++ show (NE.toList es) + , "Where " ++ show a ++ " does not appear as the fst component of anything in the MemberSpec." + ] + propagateMemberSpec PairW (HOLE :<| b) es = + case (nub (sameSnd b (NE.toList es))) of + (w : ws) -> MemberSpec (w :| ws) + [] -> + ErrorSpec $ + NE.fromList + [ "propagate (pair_ HOLE " ++ show b ++ ") on (MemberSpec " ++ show (NE.toList es) + , "Where " ++ show b ++ " does not appear as the snd component of anything in the MemberSpec." + ] + +sameFst :: Eq a1 => a1 -> [(a1, a2)] -> [a2] +sameFst a ps = [b | (a', b) <- ps, a == a'] + +sameSnd :: Eq a1 => a1 -> [(a2, a1)] -> [a2] +sameSnd b ps = [a | (a, b') <- ps, b == b'] +``` + +Finally we introduce the actual function symbols, that create `Term` + +``` +fst_ :: (HasSpec a, HasSpec b) => Term (a, b) -> Term a +fst_ x = App FstW (x :> Nil) + +snd_ :: (HasSpec a, HasSpec b) => Term (a, b) -> Term b +snd_ x = App SndW (x :> Nil) + +pair_ :: (HasSpec a, HasSpec b) => Term a -> Term b -> Term (a, b) +pair_ a b = App PairW (a :> b :> Nil) +``` + +## Generalizing to N-ary Products and Sums + +Here we explain how we can write `HasSpec` instances for tuples. The idea is +to write a series of `HasSpec` instances, that taken together, build arbitrary tuples inductively. +The base case is the binary tuple instance we defined above, and the inductive case +adds one more component to instance that is one component smaller in size. Here is +an example of that process for ternary tuples. For the moment the most important part to +study is the `TypeSpec` type familiy instance that makes clear the inductive step. + +``` +instance (HasSpec a, HasSpec b, HasSpec c) => HasSpec (a, b, c) where + type TypeSpec (a, b, c) = (Spec a, Spec (b, c)) -- Add one more component to binary tuples. + + anySpec = (mempty @(Spec a), mempty @(Spec (b, c))) + + combineSpec (a, b) (a', b') = guardTypeSpec (a <> a', b <> b') + + conformsTo (a, b, c) (sa, sbc) = conformsToSpec a sa && conformsToSpec (b, c) sbc + + guardTypeSpec (a, bc) = + handleErrors + a + bc + ( \x y -> case (x :: Spec a, y :: Spec (b, c)) of + (MemberSpec xs, MemberSpec ys) -> + -- Given two MemberSpec, build one MemberSpec, by joining all combinations + MemberSpec + ( NE.fromList + [ (a', b', c') + | a' <- NE.toList xs + , (b', c') <- NE.toList ys + ] + ) + (specA, specBC) -> constrained $ \p -> And [satisfies (head3_ p) specA, satisfies (tail3_ p) specBC] + ) + + genFromTypeSpec (a, bc) = f <$> genFromSpecT a <*> genFromSpecT bc + where + f a' (b', c') = (a', b', c') + toPreds x (a, bc) = satisfies (head3_ x) a <> satisfies (tail3_ x) bc +``` +The other interesting parts to notice is how we put the two pieces (an `a` with a `(b,c)`) together or take them apart, +to build and deconstruct the ternary tuple. This can be found in the definition of `guardTypeSpec` , which is called +inside of `combineSpec`, to merge the simple sub-cases on smaller tuples. +Deconstructing the ternary tuple happens in `anySpec`, `conformsTo`, `toPreds` and `genFromTypeSpec`. + +The last interesting part is the function symbols `head3_` and `tail3_`, which we will come back to in a moment. + +Just to emphasize the technique, we give the 4-tuple instance, to be sure to further demonstrate the inductive nature. +Here adding an `a` part, to ternary tuple `(b,c,d)`. + +``` +instance (HasSpec a, HasSpec b, HasSpec c, HasSpec d) => HasSpec (a, b, c, d) where + type TypeSpec (a, b, c, d) = (Spec a, Spec (b, c, d)) + + anySpec = (mempty @(Spec a), mempty @(Spec (b, c, d))) + + combineSpec (a, bcd) (a', bcd') = guardTypeSpec (a <> a', bcd <> bcd') + + conformsTo (a, b, c, d) (sA, sBCD) = conformsToSpec a sA && conformsToSpec (b, c, d) sBCD + + guardTypeSpec (a, bcd) = + handleErrors + a + bcd + ( \x y -> case (x, y) of + (MemberSpec xs, MemberSpec ys) -> + MemberSpec + ( NE.fromList + [ (s, b, c, d) + | s <- NE.toList xs + , (b, c, d) <- NE.toList ys + ] + ) + (specA, specBCD) -> constrained $ \ps -> And [satisfies (head4_ ps) specA, satisfies (tail4_ ps) specBCD] + ) + + genFromTypeSpec (a, bcd) = f <$> genFromSpecT a <*> genFromSpecT bcd + where + f a' (b, c, d) = (a', b, c, d) + toPreds x (a, bcd) = satisfies (head4_ x) a <> satisfies (tail4_ x) bcd +``` + +### The function symbols for 3 and 4 tuples. + +Function symbols, to break Bigger tuples into sub-tuples + +``` +data TupleSym (ds :: [Type]) r where + Head3W :: All HasSpec '[a, b, c] => TupleSym '[(a, b, c)] a + Tail3W :: All HasSpec '[a, b, c] => TupleSym '[(a, b, c)] (b, c) + Head4W :: All HasSpec '[a, b, c, d] => TupleSym '[(a, b, c, d)] a + Tail4W :: All HasSpec '[a, b, c, d] => TupleSym '[(a, b, c, d)] (b, c, d) + +deriving instance Eq (TupleSym ds r) + +instance Show (TupleSym ds r) where show = name + +instance Syntax TupleSym where + inFix _ = False + name Head3W = "head3_" + name Tail3W = "tail3_" + name Head4W = "head4_" + name Tail4W = "tail4_" + +instance Semantics TupleSym where + semantics Head3W = \(a, _b, _c) -> a + semantics Tail3W = \(_a, b, c) -> (b, c) + semantics Head4W = \(a, _b, _c, _d) -> a + semantics Tail4W = \(_a, b, c, d) -> (b, c, d) +``` + +The `Logic` instance shows how to propagate `Spec` over the function +symbols for 3-tuples `Head3W` and `Tail3W`, and 4-tuples `Head4W` and `Tail4W` + +``` +instance Logic TupleSym where + propagate _ _ TrueSpec = TrueSpec + propagate _ _ (ErrorSpec msgs) = ErrorSpec msgs + propagate symW (Unary HOLE) (SuspendedSpec v ps) = + constrained $ \v' -> Let (App symW (v' :> Nil)) (v :-> ps) + propagate Head3W (Unary HOLE) specA = anyTail3 specA + propagate Tail3W (Unary HOLE) specBC = anyHead3 specBC + propagate Head4W (Unary HOLE) specA = anyTail4 specA + propagate Tail4W (Unary HOLE) specBCD = anyHead4 specBCD + +-- The Spec for a 3-tuple, (a,b,c) where their is no constraint on 'a' +anyHead3 :: forall a b c. (HasSpec a, HasSpec b, HasSpec c) => Spec (b, c) -> Spec (a, b, c) +anyHead3 specBC = typeSpec @(a, b, c) (mempty @(Spec a), specBC) + +anyTail3 :: forall a b c. (HasSpec a, HasSpec b, HasSpec c) => Spec a -> Spec (a, b, c) +anyTail3 specA = typeSpec (specA, mempty @(Spec (b, c))) + +anyHead4 :: + forall a b c d. (HasSpec a, HasSpec b, HasSpec c, HasSpec d) => Spec (b, c, d) -> Spec (a, b, c, d) +anyHead4 specBCD = typeSpec (mempty @(Spec a), specBCD) + +anyTail4 :: + forall a b c d. (HasSpec a, HasSpec b, HasSpec c, HasSpec d) => Spec a -> Spec (a, b, c, d) +anyTail4 specA = typeSpec (specA, mempty @(Spec (b, c, d))) +``` + +So the point of this rather large set of examples is that + +1. The `HasSpec` instances for `(c,d)`, `(b,c,d)` and `(a,b,c,d)` are very similar +2. Each tuple is built using the definition of the tuple one size smaller. +3. This suggests an inductive structure for tuples. + +It is possible to write a single `HasSpec` instance for all tuples. But this adds quite a bit +of complexity to the system. + +1. It needs to handle definitions using Derive-Generics +2. It needs to understand how to handle arbitrary Sum-of-Products +3. It needs to provide default method instances that use + [Default method signatures](https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/default_signatures.html) for every method of `HasSpec`. + +The amount of extra code is very large, and actually hides the the important ideas behind +the Design Principles we discuss in this document. So we have factored it out of the code for +this minimal implementation. diff --git a/docs/Manual.md b/docs/Manual.md new file mode 100644 index 0000000..611ef02 --- /dev/null +++ b/docs/Manual.md @@ -0,0 +1,1673 @@ +# Constrained Generators Manual + +The markdown source of this file can be obtained +[here](https://github.com/input-output-hk/constrained-generators/blob/master/docs/Manual.md) + +All the examples in this file can be obtained +[here](https://github.com/input-output-hk/constrained-generators/blob/master/examples/Constrained/Examples/ManualExamples.hs) + + +**Table of Contents** + +- [Constrained Generators Manual](#constrained-generators-manual) + - [Constrained Generators is a First-Order Logic](#constrained-generators-is-a-first-order-logic) + - [Design Goals of the Library](#design-goals-of-the-library) + - [HasSpec instances](#hasspec-instances) + - [Building logic specifications using Haskell functions](#building-logic-specifications-using-haskell-functions) + - [Another example using conjunction and simple arithmetic](#another-example-using-conjunction-and-simple-arithmetic) + - [Function Symbols](#function-symbols) + - [Predefined HasSpec instances and their function symbols.](#predefined-hasspec-instances-and-their-function-symbols) + - [Function symbols for numeric types](#function-symbols-for-numeric-types) + - [` Function symbols for Bool`](#-function-symbols-for-bool) + - [Function symbols for List](#function-symbols-for-list) + - [Function symbols for Set](#function-symbols-for-set) + - [Function symbols for Map](#function-symbols-for-map) + - [Generating from and checking against specifications](#generating-from-and-checking-against-specifications) + - [How we solve the constraints](#how-we-solve-the-constraints) + - [How to pick the variable order](#how-to-pick-the-variable-order) + - [The total definition requirement ](#the-total-definition-requirement) + - [Using Match to introduce new variables for subcomponents](#using-match-to-introduce-new-variables-for-subcomponents) + - [Overloaded types in the library](#overloaded-types-in-the-library) +- [Library functions to build Term, Pred, Specification](#library-functions-to-build-term-pred-specification) + - [From Term to Pred](#from-term-to-pred) + - [For all elements in a container type (List, Set, Map)](#for-all-elements-in-a-container-type-list-set-map) + - [Reification](#reification) + - [Disjunction, choosing between multiple things with the same type](#disjunction-choosing-between-multiple-things-with-the-same-type) + - [Primed library functions which are compositions with match](#primed-library-functions-which-are-compositions-with-match) + - [Constructors and Selectors](#constructors-and-selectors) + - [Naming introduced lambda bound Term variables](#naming-introduced-lambda-bound-term-variables) + - [Existential quantifiers](#existential-quantifiers) + - [Conditionals](#conditionals) + - [Explanations](#explanations) + - [Operations to define and use Specifications](#operations-to-define-and-use-specifications) + - [Utility functions](#utility-functions) + - [Escape Hatch to QuickCheck Gen monad](#escape-hatch-to-quickcheck-gen-monad) +- [Strategy for constraining a large type with many nested sub-components.](#strategy-for-constraining-a-large-type-with-many-nested-sub-components) +- [Writing HasSpec instances by hand.](#writing-hasspec-instances-by-hand) + - [Strategy 1 using GHC.Generics](#strategy-1-using-ghcgenerics) + - [Strategy 2 writing your own SimpleRep instance](#strategy-2-writing-your-own-simplerep-instance) + - [Strategy 3 defining the SimpleRep instance in terms of another type with a SimpleRep instance](#strategy-3-defining-the-simplerep-instance-in-terms-of-another-type-with-a-simplerep-instance) + - [Strategy 4, bypassing SimpleRep, and write the HasSpec instance by Hand](#strategy-4-bypassing-simplerep-and-write-the-hasspec-instance-by-hand) +- [A look into the internals of the system.](#a-look-into-the-internals-of-the-system) + + + +## Constrained Generators is a First-Order Logic + + +A First-order typed logic (FOTL) has 4 components, where each component uses types to ensure well-formedness. + +1. **Terms** consisting of + - Variables: `x`, `y` . + - Constants: `5`, `"abc"`, `True` . + - Applications: `elem_ "abc" xs`. Applications apply a function symbol (i.e. `elem_`) to a list of Terms, or support infix application (i.e. `abc ==. y`) +2. **Predicates** (Assert (x ==. 5)). Predicates are the assertions of boolean typed terms. +3. **Connectives** (And, Not, =>). Connectives make more complex Predicates out of simpler ones. +4. **Quantifiers** (Forall, Exists) + +The **Constrained generators** system is almost a FOTL implemented as an embedded domain specific language in Haskell. +We say almost, because it doesn't really have a complete set of connectives. +It allows programmers to write Haskell programs with type `Specification T` that denote a set of random +values for the type `T`, that are subject to a set of constraints expressed as Predicates. This supports property +based testing where a completely random set of values may not be useful. + + +## Design Goals of the Library + +The system was designed to have several important properties + +1. A Specification determines a `QuickCheck` generator for generating random values subject to constraints. +2. A Specification determines a check that a particular value meets all the specifications constraints. +3. If a Specification is over constrained, the system tries to explain where the conflict occurs. +4. The system is extensible so specifications can be extended to any type. +5. The system is modular so it can be broken into independant modules. + +The first part is implemented by the function `genFromSpec` + +``` +genFromSpec :: HasSpec a => Specification a -> QuickCheck.Gen a +``` + +The second is implemented by the function `conformsToSpec` + +``` +conformsToSpec :: HasSpec a => a -> Specification a -> Bool +``` + +The third is embodied by the error messages when solving a constraint fails + + +The **Constrained Generators** system is implemented as an embeded domain specific language in Haskell. +The Terms, Predicates, Connectives, and Quantifiers of the first order logic are embedded into three +Haskell datatypes (`Specification t`, `Term t`, and `Pred`). There is a rich (extendable) library +of Haskell functions that can be used to define and construct values of these types. The library is +implemented in a such a way, that the four parts of the logic are defined in ways that are similar +to Haskell expressions with type Bool. + +Let us look at a simple example, and study how this is done. Below is a Haskell declaration that defines a +specification of a pair of Int (`p`) , subject to the constraint that the first component (`x`) +is less than or equal to the second component (`y`) plus 2 + +``` +leqPair :: Specification (Int, Int) +leqPair = constrained $ \ p -> + match p $ \ x y -> + assert (x <=. (y + lit 2)) +``` + +The library uses Haskell lambda expressions to introduce variables in the Term language of the system, +and Haskell functions to build Terms and Predicates. The Haskell function `lit` takes Haskell values +and turns them into constants in the Term language. We give monomorphic types to the Haskell functions used in the +above definitions. We discuss more general types in a [later section](#overloaded). + +``` +constrained :: HasSpec a => (Term a -> Pred) -> Specification a + +match :: (HasSpec a, HasSpec b) => Term (a,b) -> (Term a -> Term b -> Pred) -> Pred + +lit :: HasSpec a -> a -> Term a + +assert :: Term Bool -> Pred + +(<=.) :: OrdLike a => Term a -> Term a -> Term Bool +``` + +The Haskell Constraint `HasSpec a` states that the type `a` has been admitted to the system as one of the types +that can be subject to constraints. The system comes with a large set of `HasSpec` instances, including ones for: + +1. Bool +2. Tuples +3. Sums +4. Numeric types (Int, Integer, Natural, Int8, Int16, Int32, Int64, Word8, Word16, Word32, Word64) +5. Lists +6. Maps +7. Sets +8. Trees +9. Maybe +10. Either +11. () + +## HasSpec instances + +`HasSpec` instances can always be added to admit more types. Any type with a `GHC.Generics(Generic)` instance can be +given a default instance by using its Sum-of-Products generic definition. In the Cardano Ledger System +over 200 types in the Ledger code base have been given `HasSpec` instances, either by using the `GHC.Generics` path, or by writing the instances by hand. + +## Building logic specifications using Haskell functions + +Note that `constrained` and `match` take functions, which abstract over terms, and return `Specification` and `Pred`. +Using the library functions, variables in the Term language are always introduced using Haskell lambda abstractions. And the library +functions combine these into Terms, Preds, and Specifications. + +## Another example using conjunction and simple arithmetic + +Suppose we want to put more than one simple condition on the pair of Ints. We would do that using the connective `And` that converts a `[Pred]` into a `Pred` + +``` +sumPair :: Specification (Int, Int) +sumPair = constrained $ \p -> + match p $ \x y -> + And + [ assert $ x <=. y + , assert $ y >=. 20 + , assert $ x + y ==. 25 + ] +``` + +This example also re-illustrates that `(Term Int)` has a (partial) Num instance, and that we can constrain +multiple (different) variables using simple `Num` methods (`(+)`, `(-)`, and `negate`). Note also +the operator: `(==.) :: (Eq n, HasSpec n) => Term n -> Term n -> Term Bool` + +## Function Symbols + +Note that `(<=.)` , and `(==.)` are two of the function symbols in the first order logic. They obey a +useful naming convention. Infix function symbols corresponding to Haskell infix operators have +corresponding infix operators, lifting Haskell infix functions with type `(a -> b -> c)`, to library infix +functions which have analogous types `(Term a -> Term b -> Term c)` +and are named using the convention that we add the dot `(.)` to the end of the Haskell operator. + +A similar naming convention holds for function symbols which are not infix, except instead of adding a +dot to the end of the Haskell name, we add an underscore (`_`) to the end of the Haskell functions's +name. Some examples follow. + +``` +(fst :: (a,b) -> a)` **to** +(fst_ :: (HasSpec a,HasSpec b) => Term (a,b) -> Term a) +``` +``` +(snd :: (a,b) -> b) **to** +(snd_ :: (HasSpec a,HasSpec b) => Term (a,b) -> Term b) +``` + +``` +(not :: Bool -> Bool) **to** +(not_ :: Term Bool -> Term Bool) +``` + +``` +(member :: a -> Set a -> Bool) **to** +(member_ :: HasSpec a => Term a -> Term (Set a) -> Term Bool) +``` + +While the underscored function symbols, may appear to be just to an Applicative lifting over `Term`, that is not the case. +An Applicative lifting would allow the base function to be applied under the `Term` type, but The underscored function symbols +also know how to reason logically about the function. + + +## Predefined HasSpec instances and their function symbols. + +In order to write specifications for a particular type, that type must have a `HasSpec` instance. +A type with a `HasSpec` instance might have a number of Function Symbols that operate on that type. +There are a number of types that have predefined `HasSpec` instances. As a reference, we list them +here along with the types of their function symbols. + +### Function symbols for numeric types + +`(Int, Integer, Natural, Int8, Int16, Int32, Int64, Word8, Word16, Word32, Word64)` + +The function symbols of numeric types are: + + 1. `(<=.) :: OrdLike a => Term a -> Term a -> Term Bool` + 2. `(<.) :: OrdLike a => Term a -> Term a -> Term Bool` + 3. `(>=.) :: OrdLike a => Term a -> Term a -> Term Bool` + 4. `(>.) :: OrdLike a => Term a -> Term a -> Term Bool` + 5. `(==.) :: (Eq a, HasSpec a) => Term a -> Term a -> Term Bool` + 6. A partial Num instance for (Term n) where n is a Numeric type. Operators `(+)`, `(-)`, `(*)` + +### Function symbols for Bool + +The function symbols of `Bool` are: + + 1. `(||.) :: Term Bool -> Term Bool -> Term Bool` infix `or` + 2. `not_ :: Term Bool -> Term Bool` + +### Function symbols for List + +`HasSpec a => HasSpec [a]` + +The function symbols of `[a]` are: + + 1. `foldMap_ :: (Sized [a], Foldy b, HasSpec a) => (Term a -> Term b) -> Term [a] -> Term b` + 2. `singletonList_ :: (Sized [a], HasSpec a) => Term a -> Term [a]` + 3. `append_ :: (Sized [a], HasSpec a) => Term [a] -> Term [a] -> Term [a]` + +### Function symbols for Set + +`HasSpec a => HasSpec (Set a)` + +The function symbols of `(Set a)` are: + + 1. `singleton_ :: (Ord a, HasSpec a) => Term a -> Term (Set a)` + 2. `union_ :: (Ord a, HasSpec a) => Term (Set a) -> Term (Set a) -> Term (Set a)` + 3. `subset_ :: (Ord a, HasSpec a) => Term (Set a) -> Term (Set a) -> Term Bool` + 4. `member_ :: (Ord a, HasSpec a) => Term a -> Term (Set a) -> Term Bool` + 5. `disjoint_ :: (Ord a, HasSpec a) => Term (Set a) -> Term (Set a) -> Term Bool` + 6. `fromList_ :: (Ord a, HasSpec a) => Term [a] -> Term (Set a)` + +### Function symbols for Map + +`(HasSpec k, HasSpec v) => HasSpec (Map k v)` + +The function symbols of `(Map k v)` are: + +1. `dom_ :: Ord k => Term (Map k v) -> Term (Set k)` +2. `rng_ :: Ord k => Term (Map k v) -> Term [v]` +3. `lookup_ :: Ord k => Term k -> Term (Map k v) -> Term (Maybe v)` + +## Generating from and checking against specifications + +Once we have written a `Specification` what can we do with it? Specifications have two interpretations. + +1. We can interpret it as a generator of values the meet all the constraints inside the specification. +2. We can interpret it as a function that checks if a given value meets all the constraints inside the specification. + + +The first interpretation of the specification is the function `genFromSpec` + +``` +--| Generate a value from the spec in the QuickCheck monad 'Gen' +genFromSpec:: (HasCallStack, HasSpec a) => Specification a -> QuickCheck.Gen a +``` + +This function is very useful when writing QuickCheck properties. With it we can write +`Gen` instances that are not completely random, but instead meet a set of constraints. + +Consider a system of 4 variables _w,x,y,z_ where we want to test the QuickCheck *implication* property +`(w < x && x < y && y < z) ==> property (w < z)` +We might write a QuickCheck property like this + +``` +prop1 :: Gen Property +prop1 = do + (w,x,y,z) <- arbitrary :: Gen (Int,Int,Int,Int) + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +The problem with this is that the probability that the condition `(w < x && x < y && y < z)` is True, for random +`w`, `x`, `y`, and `z`, is pretty low, so the property will pass vacuously most of the time, making a poor test. +We can observe this by + +``` +ghci> quickCheck prop1 +*** Gave up! Passed only 29 tests; 1000 discarded tests. +``` + +A vacuous pass, becomes a QuickCheck `discard`, so we cannot find 100 successful passes. +We can do a better job by constraining the condition using `genFromSpec` + +``` +spec1 :: Specification (Int,Int,Int,Int) +spec1 = constrained' $ \ w x y z -> [w <. x, x <. y, y <. z] + +prop2:: Gen Property +prop2 = do + (w,x,y,z) <- genFromSpec spec1 + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +Now the test passes. + +``` +ghci> quickCheck prop2 ++++ OK, passed 100 tests. +``` + +Now this isn't a very good test either, since the precedent is alway true. A better solution would be to +generate a mix, where the precedent is True most of the time, but sometimes False. Like this. + +``` +prop3:: Gen Property +prop3 = do + (w,x,y,z) <- frequency [(9,genFromSpec spec1),(1,arbitrary)] + pure $ (w < x && x < y && y < z) ==> property (w < z) +``` + +Observe the result: + +``` +ghci> quickCheck prop3 ++++ OK, passed 100 tests; 7 discarded. +``` + +The `7 discarded` arises because in most random samples of `w`, `x`, `y`, and `z`, the ordering constraint will +not be true, so a discard will occur. Because the ratio of constrained to random is 9 to 1, so the number +of failures is small enough that we don't `give up`. The role of the `arbitrary` is to guard against the possibility +that the constrained spec might be over constrained, and in that case will be values that will never be chosen in a random test. +Which means we will not be choosing a good sample of the test space, so some bugs might escape detection. +By adjusting the test so that any `arbitrary` samples, that are not ordered, pass the test, we identify +a piuntin the sample space that the code does not handle correctly. In a later section we discuss how to use +QuickChrck `classify` and `cover` write better tests. + +The purpose of constrained generators is to make it possible to write conditional 'implication' properties +that have a high probability of not being vacuously true, and to combine this with other random +techniques to make better samples of the test-space. + + +The second interpretation of the specification is as a constraint checker, implemented as the function. + +``` +conformsToSpec :: HasSpec a => a -> Specification a -> Bool +``` + +## How we solve the constraints + +The strategy for generating things from `Pred`s is relatively straightforward +and relies on one key fact: any constraint that has only one free variable `x` +and where `x` occurs only once can be turned into a `Specification` for `x`. + +We say that such constraints _define_ `x` and given a set of constraints `ps` +and a variable `x` we can split `ps` into the constraints that define `x` and +any constraints that don't. We can then generate a value from `x` by computing +a spec for each defining constraint in `ps` and using the `Semigroup` structure +of `Specification`s to combine them and give them to `genFromSpecT`. Once we obtain a +value for `x` we can substitute this value in all other constraints and pick +another variable to solve. + +For example, given the following constraints on integers `x` and `y` + +``` + x <. 10 + 3 <=. x + y <. x +``` + +we see that `x <. 10` and `3 <=. x.` are defining constraints for `x` and there +are no defining constraints for `y`. We compute a `Specification` for `x` for each +constraint, in this case `x <. 10` turns into something like `(-∞,10)` and +`3 <=. x` turns into `[3, ∞)`. We combine the specs to form `[3, 10)` from which we +can generate a value, e.g. 4 (chosen by fair dice roll). We then substitute +`[x := 4]` in the remaining constraints and obtain `y <. 4`, giving us a defining +constraint for `y`. + +### How to pick the variable order + +At this point it should be relatively clear that the order we pick for the +variables matters a great deal. If we choose to generate `y` before `x` in our +example we will have no defining constraints for `y` and so we pick a value for +it freely. But that renders `x` unsolvable if `y > 9` - which will result in +the generator failing to generate a value (one could consider backtracking, but +that is very computationally expensive so _relying_ on it would probably not be +wise). + +Computing a good choice of variable order that leaves the least room for error +is obviously undecidable and difficult and we choose instead an explicit +syntax-directed variable order. Specifically, variable dependency in terms is +_left-to-right_, meaning that the variables in `x + y <. z` will be solved in +the order `z -> y -> x`. On top of that there is a constraint `dependsOn y x` +that allows you to overwrite the order of two variables. Consequently, the +following constraints will be solved in the order `z -> x -> y`: + +``` + x + y <. z + y `dependsOn` x +``` + +A consequence of this is that it is possible to form dependency loops by +specifying multiple constraints, e.g. in: + +``` + x <. y + y <. x + 10 +``` + +However, this situation can be addressed by the introduction of `dependsOn` to +settle the order. It is worth noting that the choice of order in `dependsOn` +is important as it affects the solvability of the constraints (as we saw +above). We leave the choice of `dependsOn` in the example below as an exercise +for the reader. + +``` + x <. y + y <. x + 10 + 0 <. x + ? `dependsOn` ? +``` + +### The total definition requirement + +For the sake of efficiency we require that all constraints are dispatched as +defining constraints for a variable before we begin solving. We call this the +total definition requirement. This requirement is necessary because a set of +constraints with left over constraints are unlikely to be solvable. + +Consider the following example for `p :: (Int, Int)` + +``` +fst_ p <. snd_ p +``` + +in which there is no defining constraint for `p`, which would lead us to +compute the spec `mempty` for `p` during solving - meaning we would pick an +arbitrary `p` that is irrespective of the constraints. This is problematic as +the probability of picking `p = (x, y)` such that `x <. y` is roughly `1/2`, as +you add more constraints things get much worse. + +The principal problem above is that information that is present in the +constraints is lost, which would force us to rely on a `suchThat` approach to +generation - which will become very slow as constraint systems grow. + +### Using Match to introduce new variables for subcomponents + +A solution to the total definition requirement is to *introduce more variables*. +We can rewrite the problematic `fst p <. snd p` example below as: + +``` +fst_ p ==. x +snd_ p ==. y +x <. y +``` + +The dependency graph for these constraints will be the following: + +``` +x `dependsOn` y +p `dependsOn` x +``` + +This configuration is solvable, one picks `y` first, then picks `x <. y` +and finally constructs `p = (x, y)`. + +Note that (1) we introduced more variables than were initially in the +constraints - these need to be bound somewhere - and (2) the order of +`fst_ p = x` is important, `p` depends on `x`, and not the other way +around. + +To do both of these things at the same time we use the `match` construct +to the language: + +``` +match :: Term (a,b) -> (Term a -> Term b -> Pred) -> Pred +``` + +Since `p` has type `(Term (Int,Int))` we can redo the example + +``` +fst_ p ==. x +snd_ p ==. y +x <. y +``` + +using match. This shows how to bring the new variables `x` and `y` into scope.. + +``` +match p $ \ x y -> x <. y +``` + + +## Overloaded types in the library + +In previous sections we provided some types for several of the library functions: `constrained`, `match`, + + +``` +constrained:: HasSpec a => (Term a -> Pred) -> Specification a + +match :: (HasSpec a, HasSpec b) => + Term (a,b) -> (Term a -> Term b -> Pred) -> Pred +``` + +It turns out, that these functions would be much more useful with more general types. This also applies to some other +library functions (`reify`, `caseOn`, etc.) we have not yet introduced. The general type of `constrained` is: + +``` +constrained:: (IsPred p, HasSpec a) => (Term a -> p) -> Specification a +``` + +This general type allows the function passed to `constrained` to be any function, that given a Term, returns any type that acts like a `Pred`. +The class IsPred is defined as follows, and has 4 instances. + +``` +class Show p => IsPred p where + toPred :: p -> Pred + +instance IsPred Bool +instance IsPred p => IsPred [p] +instance IsPred (Term Bool) +instance IsPred Pred +``` + +Thus the following would be type-correct calls to constrained. + +``` +ex1 :: Specification Int +ex1 = constrained $ \ x -> True +-- Any Haskell Boolean value + +ex2 :: Specification Int +ex2 = constrained $ \ x -> x ==. lit 3 +-- Any Haskell term, with type (Term Bool) + +ex3 :: Specification Int +ex3 = constrained $ \ x -> [ x <=. lit 2, x >=. lit 5 ] +-- A list of things that act like a Pred + +ex4 :: Specification Int +ex4 = constrained $ \ x -> assert $ x == lit 9 +-- Anything with type Pred +``` + +The type of `match` is also overloaded. It supports writing specifications over type with sub-components, allowing +each sub-component to be individually constrained. + +The `match` library function is used to introduce new `Term` variables for the sub-components of another type. +If `t` has type `T`, and `T` has 3 sub-components, then the `match` function would take a Haskell +lambda expression with three parameters. `(match t $ \ x y z -> ...)`. Its overloaded type is: + +``` +match + :: (HasSpec a, IsProductType a, IsPred p) => + Term a -> FunTy (MapList Term (ProductAsList a)) p -> Pred +``` + +The meaning of this is a bit hard to parse: `IsProductType a`. It means the type `a` is isomorphic to a product type. +I.e. isomorphic to `(t1,t2, ..., tn)` So all tuples would work. So would any type whose constructor had +one or more arguments, So would any type whose HasSpec instance was derived via the GHC.Generics instance. +So in summary, if the type `a` has **n** distinct parts, then the constraint (`IsProductType a`) is met, +and the interpretation of the `FunTy` is a function with **n** parameters. + +``` +type FunTy (MapList Term (ProductAsList a)) p = t1 -> t2 -> ... -> tn -> p +``` + +# Library functions to build Term, Pred, Specification + +## From Term to Pred +1. `assert :: IsPred p => p -> Pred` + +`assert` lifts a `(Term Bool)` to a `Pred`. by using the `IsPred` class, we can often get around using it, but it becomes necessary when we want to use the `And` operator, and the operands of `And` are a mix of `Pred`, `(Term Bool), and other operations. Here is a very simple use. Further examples illustrate its use in more challenging contexts. + +``` +ex5 :: Specification [Int] +ex5 = constrained $ \ xs -> assert $ elem_ 7 xs +``` + +Note that `elem_` is the function symbol corresponding to `Data.List.elem`. + +## For all elements in a container type (List, Set, Map) +1. `forAll :: (Forallable t a, HasSpec t, HasSpec a, IsPred p) => Term t -> (Term a -> p) -> Pred` + +The library function `forAll` is used to impose a constraint on every element of a container type. There are +three `Forallable` instances in the Base system. + +``` +class Forallable t e | t -> e where +instance Ord k => Forallable (Map k v) (k, v) +instance Ord a => Forallable (Set a) a +instance Forallable [a] a +``` + +Here is an example of its use. + +``` +ex6 :: Specification [Int] +ex6 = constrained $ \ xs -> + forAll xs $ \ x -> [ x <=. 10, x >. 1] +``` + +We sample this specification using the library function `debugSpec` as follows + +``` +ghci> debugSpec ex6 +constrained $ \ v_1 -> + forall v_0 in v_1 $ + {assert $ v_0 <=. 10 + assert $ v_0 >. 1} +[6,3,7,4,9,10,5,6,10,5,8,3,6,7,8,5,5,6] +True +``` + +The library function `debugSpec` prints 3 things + +1. The internal representation of the specification. +2. A sample solution. +3. The boolean result of testing if the sample solution, actually conforms to the spec. + +If the specification fails to find a solution, it prints out an explanation of why it failed. +Most of the time this means the spec was overconstrained, and the explanation attempts to identify +the part of the specification that cause this conflict. It also returns an explanation when +the default (right to left) dependency order does not work, but this is an area where there +could be a lot of improvement. + +## Reification +1. `reifies :: (HasSpec a, HasSpec b) => Term b -> Term a -> (a -> b) -> Pred` +2. `reify :: (HasSpec a, HasSpec b, IsPred p) => Term a -> (a -> b) -> (Term b -> p) -> Pred` +3. `assertReified :: (HasSpec Bool, HasSpec a) => Term a -> (a -> Bool) -> Pred` + +Reification is used to produce a constrained value that can be obtained as the application of a Haskell function to +the value of another constrained term. This ensures a known relationship between the two. That relationship is exactly +the action that Haskell function computes from the value of the second `Term` to obtain the first. Unfortunately this +doesn't quite make sense as the Haskell function work on values not `Terms`. Reification makes it possible +to use any Haskell function from `(a -> b)` as if it had type `(Term a -> Term b)`. +This comes in several flavors, two of which have an internal (hidden) existential constraint. + +The first use: `(reifies b a f)`, says a term `b` can be obtained from a term `a` using the Haskell function `(f :: a -> b)`. +Here is an example of its use. Internally, it works by forcing the solution of `a` before solving for `b`, applying the `f` to +the solution for `a`, and then constructing a `(Term b)` obtained from this value. + +``` +ex7 :: Specification (Int,[Int]) +ex7 = constrained $ \ pair -> + match pair $ \ n xs -> + reifies n xs sum +``` + +Note that `sum` is a Haskell function, not a function symbol in the system that can be applied to `Term`s. +The system solves for `xs`, +then applies `sum` to the list obtained by solving `xs`. Then binds the `Term` variable `n` to that literal value. +Here is a sample solution. + +``` +ghci> debugSpec ex7 +constrained $ \ v_3 -> + [to v_3/v_2]let v_1 = prodFst_ v_2 in + let v_0 = prodSnd_ v_2 in reifies v_1 v_0 +(82,[-24,22,28,19,16,24,-20,1,16]) +True +``` + +The second operation `reify` can be defined using `reifies` by placing an existential constraint on the range of the function. +Here is an example of the use of `reify` + +``` +ex8 :: Specification ([Int],[Int]) +ex8 = constrained $ \ pair -> + match pair $ \ xs1 xs2 -> + [ assert $ sizeOf_ xs1 <=. 5 + , forAll xs1 $ \ x -> x <=. 10 + , reify xs1 reverse $ \ t -> xs2 ==. t + ] +``` + +Here we are saying there exists `t` which can be obtained by `reverse`ing `xs1` and that `xs2` equals `t` +Here is a sample solution. + +``` +ghci> debugSpec ex8 +constrained $ \ v_4 -> + [to v_4/v_3]let v_2 = prodFst_ v_3 in + let v_1 = prodSnd_ v_3 in + {assert $ sizeOf_ v_2 <=. 5 + forall v_0 in v_2 $ assert $ v_0 <=. 10 + exists v_0 in + {reifies v_0 v_2 + Explain ["reify v_2 somef $"] $ assert $ v_1 ==. v_0}} +([9,-1,-11],[-11,-1,9]) +True +``` + +The third operation `assertReified` can be used to place a boolean constraint on the existential. +Here is an example of its use. + +``` +ex9 :: Specification Int +ex9 = constrained $ \x -> + [ assert $ x <=. 10 + , assertReified x (<= 3) -- Note that (<= 3) is a Bool function on values, not Terms. + ] +``` + +Because`x` is constrained to be less than or equal to 10, the boolean function +`(<=3)` is falling back to a call to QuickCheck `suchThat` . Note we must solve for `x` before we can +apply the boolean function, to the only way to do that is to keep trying until `suchThat` also gets a value +`(<= 3)` as well. + +The reification library functions allow one to use regular Haskell functions on values to write +constraints, but they come at the cost of forcing the solution order of the variables. The `Term`s +being reified, must be solved before the `Term`s mentioned in the Haskell lambda expression +with the `Pred` based range. I.e. in `reify xs1 reverse $ \ t -> xs2 ==. t` the term `xs1` must +be solved before the terms `t` and `xs2` in the lambda expression `\ t -> xs2 ==. t)` can be solved. +This is because the value assigned to `xs1` must be passed to the Haskell function `reverse`. + +## Disjunction, choosing between multiple things with the same type + +1. `caseOn`, `branch`, `branchW` +2. `chooseSpec` + +Sometimes we want to choose between several different specifications for the same type. This come in two flavors. + +- When the choices we want to make arise from different constructors of the same (sum of products) type +- When we have two logically distinct constraints, either of which will meet our needs. + +Let's look at the first. Multiple constuctors from the same type. This uses the `caseOn` library functions and its two +helper functions `branch` (where each constructor is choosen equally) and `branchW` (where we can give weights, to determine the frequency each constructor is choosen). The type of `caseOn` + +``` +caseOn + :: Term a + -> FunTy + (MapList + (Weighted Binder) (Constrained.Spec.SumProd.Cases (SimpleRep a))) + Pred +``` + +The way to interpret this, is that `caseOn` is a function we apply to a `(Term a)` and *n* different branches, one for each +of the constructors of type `a`. Each branch is either weighted (using `(weight, branchW ...)` or +unweighted using `(branch ...)`, where the `...` is a function with *m* parameters, one for each of the +subcomponents of that constructor). First we introduce a sum of products type `Three` and use the GHC.Generics +instance to derive the HasSpec instance. + +``` +data Three = One Int | Two Bool | Three Int deriving (Ord, Eq, Show, Generic) +instance HasSimpleRep Three +instance HasSpec Three +``` + +Here is an example using the mechanism with no weights (`branch`), where every branch is equilikely. + +``` +ex10 :: Specification Three +ex10 = constrained $ \ three -> + caseOn three + (branch $ \ i -> i ==. 1) -- One + (branch $ \ b -> assert (not_ b)) -- Two + (branch $ \ j -> j ==. 3) -- Three +``` + +Note the trailing Haskell comments, they remind us which constructor we are dealing with. The system +expects the branches to be in the same order the constructors are introduced in the `data` defintions for `Three` + +Here is another example using the weighted mechanism (`branchW`). + +``` +ex11 :: Specification Three +ex11 = constrained $ \ three -> + caseOn three + (branchW 1 $ \ i -> i <. 0) -- One, weight 1 + (branchW 2 $ \ b -> assert b) -- Two, weight 2 + (branchW 3 $ \ j -> j >. 0) -- Three, weight 3 +``` + + Another sample solution of this +spec can be found in the section about `monitor` in a [later section](#weights), where we can +observe that some branches are more likely than others. + +The second way to specify disjunctions is to choose `chooseSpec`, where the two choices use the same +type, but are distinguished logically by the two input specifications. The type of +`chooseSpec` is as follows, where the `Int` determines the frequency of each choice. + +``` +chooseSpec:: HasSpec a => (Int, Specification a) -> (Int, Specification a) -> Specification a +``` + +Here is an example. + +``` +ex12 :: Specification (Int,[Int]) +ex12 = chooseSpec + (5, constrained $ \ pair -> + match pair $ \ total xs -> [ total >. lit 10, sum_ xs ==. total , sizeOf_ xs ==. lit 3]) + (3, constrained $ \ pair -> + match pair $ \ total xs -> [ total <. lit 10, sum_ xs ==. total, sizeOf_ xs ==. lit 6]) +``` + +Here are two samples, one using the first spec (total > 10 and size == 3), the other using the +second spec (total < 10 and size == 6). We have elided printing the internal representation (using ...) +because it is rather large. + +``` +ghci> debugSpec ex12 +constrained $ \ v_5 -> ... +(41,[15,3,23]) +True + +ghci> debugSpec ex12 +constrained $ \ v_5 -> ... +(9,[3,0,0,2,2,2]) +True +``` + +## Primed library functions which are compositions with match + +Some library functions introduce a new term variable using a Haskell lambda-abstraction. If that variable +has sub-components, then we often need to `match` that variable in order to introduce new term +variables for each of those components. The following library functions make that easier, because they are +predefined to compose the unprimed base library functions with `match` + +1. `forAll'` +2. `constrained'` +3. `reify'` + +Here are three examples using these primed library functions. Each example has two parts. The first part +defined interms of the un-primed function and `match`, and the second defined nterms of the primed librbary function. + +The primed version of `forAll` is `forAll'` + +``` +ex13a :: Specification [(Int,Int)] +ex13a = constrained $ \ xs -> + forAll xs $ \ x -> match x $ \ a b -> a ==. negate b + +ex13b :: Specification [(Int,Int)] +ex13b = constrained $ \ xs -> + forAll' xs $ \ a b -> a ==. negate b +``` + +The primed version of `constrained` is `constrained'` + +``` +ex14a :: Specification (Int,Int,Int) +ex14a = constrained $ \ triple -> + match triple $ \ a b c -> [ b ==. a + lit 1, c ==. b + lit 1] + +ex14b :: Specification (Int,Int,Int) +ex14b = constrained' $ \ a b c -> [ b ==. a + lit 1, c ==. b + lit 1] +``` + +The primed version of `reify` is `reify'` + + +``` +ex15a :: Specification (Int,Int,Int) +ex15a = constrained $ \ triple -> + match triple $ \ x1 x2 x3 -> + reify x1 (\ a -> (a+1,a+2)) $ \ t -> + match t $ \ b c -> [x2 ==. b, x3 ==. c] + +ex15b :: Specification (Int,Int,Int) +ex15b = + constrained $ \ triple -> + match triple $ \ x1 x2 x3-> + reify' x1 (\ a -> (a+1,a+2)) $ \ b c -> [x2 ==. b, x3 ==. c] +``` + +## Constructors and Selectors +1. `onCon` +2. `sel` +4. `isJust` + +In Haskell we can define data types with multiple constructors, and constructors with multiple sub-components. +The library functions `onCon`, `sel`, and `isJust`, allow us to constrain such types in a way less verbose +than using the `caseOn` library function. Consider the following + +``` +ex16 :: Specification Three +ex16 = constrained $ \ three -> + caseOn three + (branchW 1 $ \ i -> i ==. lit 1) -- One, weight 1 + (branchW 2 $ \ b -> assert (not_ b)) -- Two, weight 2 + (branchW 3 $ \ j -> j ==. 3) -- Three, weight 3 +``` + +We can express the same constraints using the `onCon` library function. To use `onCon` one must +define the `HasSpec` instance for the type using the `GHC.Generics` instance mechanism (see the example of the type `Three` in the *Disjunction* section). This makes sure the +system knows about the constructors and selectors of the type. To use `onCon` one must +type apply it to a String that names one of the constructors. Like this `(onCon @"One" ...)` . +This requires that the GHC language directive`{-# LANGUAGE DataKinds #-}` be in the source file. +Then apply it to a Term with the type returned by that constructor, followed by a Haskell function with one parameter +for each subcomponent of that constructor, that returns a Pred. Here is `ex16` redone using three `onCon` predicates. + +``` +ex17:: Specification Three +ex17 = constrained $ \ three -> + [ onCon @"One" three (\ x -> x==. lit 1) + , onCon @"Two" three (\ x -> not_ x) + , onCon @"Three" three ( \ x -> x==. lit 3) + ] +``` + +The real power of `onCon` is when you only want to constrain one (or a subset) of the constructors of a type, +and the other constructors remain unconstrained. Here we only constrain the constructor `Three` and the constructors +`One` and `Two` remain unconstrained. + +``` +ex18:: Specification Three +ex18 = constrained $ \ three -> + onCon @"Three" three ( \ x -> x==. lit 3) +``` + +Here is another example where we only constrain the constructor `Just` of the maybe type. +``` +ex19 :: Specification (Maybe Bool) +ex19 = constrained $ \ mb -> onCon @"Just" mb (\ x -> x==. lit False) +``` + +Haskell allows the definition of data types with named selectors. Here is an example. + +``` +data Dimensions + where Dimensions :: + { length :: Int + , width :: Int + , depth :: Int } -> Dimensions + deriving (Ord, Eq, Show, Generic) +instance HasSimpleRep Dimensions +instance HasSpec Dimensions +``` + +This introduces Haskell functions with types + +``` +length :: Dimensions -> Int +width :: Dimensions -> Int +depth :: Dimensions -> Int +``` + +If we use the `GHC.Generics` path to derive the `HasSimpleRep` and the `HasSpec` +instances, the we can use the `sel` library function to create lifted versions of +the Haskell selector functions like this. Nothe that the lifted `width` uses the trailing +under score convention: `width_` because it manipulates `Term`s not values. + +``` +width_ :: Term Dimensions -> Term Int +width_ d = sel @1 d +``` + +This requires the `DataKinds` directive, and importing `GHC.Generics` and `GHC.TypeLits` to work. + +``` +{-# LANGUAGE DataKinds #-} +import GHC.Generics +import GHC.TypeLits +``` + +When we type-apply the library function `sel` to a type-level `Natural` number like this `(sel @1)` it +selects the `ith` selector function. The selectors are numbered from `0` to `n-1` . Selection +can always be expressed using `match` like this: + +``` +ex20a :: Specification Dimensions +ex20a = constrained $ \ d -> + match d $ \ l w d -> [ l >. lit 10, w ==. lit 5, d <. lit 20] +``` + +which can be reexpressed using `sel` as this. + +``` +ex20b :: Specification Dimensions +ex20b = constrained $ \ d -> + [sel @0 d >. lit 10 + ,sel @1 d ==. lit 5 + ,sel @2 d <. lit 20] +``` + +When we wish to constrain just a subset of the subcomponents, selectors make it possible +to write more concise `Specification`s. + +``` +ex21 :: Specification Dimensions +ex21 = constrained $ \ d -> width_ d ==. lit 1 +``` + + +## Naming introduced lambda bound Term variables +1. [var|name|] + +When we use a library function that introduces new Term variable using a Haskell lambda expression, the system +gives the Haskell variable a unique Term level name such as `v0` or `v1` or `v2` etc. When the specification is +over constrained, the system attempts to explain what is wrong, but when doing this, it uses the internal +Term level names that were assigned to the Haskell variables. This means the error messages are often hard to +understand. For example consider the over constrained specification. It is over constrained because +`left` cannot be simultaneously equal to `right` and `right + lit 1` + +``` +ex22a :: Specification (Int,Int) +ex22a = constrained $ \ pair -> + match pair $ \ left right -> + [ left ==. right, left ==. right + lit 1] +``` + +When we attempt to solve this specification for `pair` we get the following error message + +``` +debugSpec ex22a +StepPlan for variable: v_1 fails to produce Specification, probably overconstrained. +Original spec ErrorSpec + Intersecting: + MemberSpec [20] + MemberSpec [19] + Empty intersection +``` + +Note the error message is in terms of the internal Term name `v1`. Which is not very useful. + +To make error messages clearer we can name Haskell lambda bound variables using `[var|left|]` instead of just `left`. In order to do this we must have the following directives in our file. + +``` +{-# LANGUAGE QuasiQuotes #-} +{-# LANGUAGE ViewPatterns #-} +``` + +Here is the same example using this naming schema. + +``` +ex22b :: Specification (Int,Int) +ex22b = constrained $ \ [var|pair|] -> + match pair $ \ [var|left|] [var|right|] -> [ left ==. right, left ==. right + lit 1] +``` + +We now get a better error message + +``` +ghci> debugSpec ex22b +StepPlan for variable: left_1 fails to produce Specification, probably overconstrained. +Original spec ErrorSpec + Intersecting: + MemberSpec [29] + MemberSpec [28] + Empty intersection +``` + + +## Existential quantifiers +1. `exists` +2. `unsafeExists` + + +Sometimes we want to constrain a variable, in terms of another internal or hidden variable. +A classic example is constraining a number to be odd. A number `x` is odd, if there exists +another internal number `y`, such that, `x` is equal to `y + y + 1` + + +Here is an example. + +``` +ex22 :: Specification Int +ex22 = constrained $ \ [var|oddx|] -> + unsafeExists + (\ [var|y|] -> [Assert $ oddx ==. y + y + 1]) +``` + +Why do we call the library function `unsafeExists`? It is unsafe because the library function +`conformsToSpec` will fail to return `True` when called on a generated result. Why? +Because the system does not know how to find the internal, hidden variable. To solve this +the safe function `exists` takes two Haskell lambda expressions. The first is a function +that tells how to compute the hidden variable from the values returned by solving the constraints. +The second is the normal use of a Haskell lambda expression to introduce a Term variable +naming the hidden variable. Here is an example. + +``` +ex23 :: Specification Int +ex23 = ExplainSpec ["odd via (y+y+1)"] $ + constrained $ \ [var|oddx|] -> + exists + -- first lambda, how to compute the hidden value for 'y' + (\eval -> pure (div (eval oddx - 1) 2)) + + -- second lambda, introduces name for hidden value 'y' + (\ [var|y|] -> [Assert $ oddx ==. y + y + 1]) +``` + +One might ask what is the role of the parameter `eval` in the first lambda expression passed to `exists`. +Recall that we are trying assist the function `conformsToSpec` in determining if a completely known value +conforms to the specification. In this context, every Term variable is known. But in the specification we +only have variables with type `(Term a)`, what we need are variables with type `a`. The `eval` functional parameter +has type + +``` +eval :: Term a -> a +``` + +This, lets us access those values. In the example the Term variable `(oddx :: Term Int)` is in scope, so +`(eval oddx :: Int)`, and since this is only used when calling `conformsToSpec`, we actually have the +value of the `oddx` for `eval` to return. Finally `(div (eval oddx - 1) 2))` tells us how to compute +the value of the hidden variable. + + +## Conditionals +1. `whenTrue` +2. `ifElse` + +If one has a term `x` with type `(Term Bool)` one could use the value of this term to +define a Specification conditional on this term by using the `(caseOn x ...)` library function. +There are two more concise library function one can use instead. + +``` +whenTrue :: IsPred p => Term Bool -> p -> Pred +ifElse :: (IsPred p, IsPred q) => Term Bool -> p -> q -> Pred +``` + +For example consider the data type `Rectangle` where the selector `square` indicates +that the rectangle has equal length width and length. + +``` +data Rectangle = Rectangle { wid :: Int, len :: Int, square :: Bool} + deriving (Show, Eq,Generic) +instance HasSimpleRep Rectangle +instance HasSpec Rectangle +``` + +We can enforce this in a specification as follows. + +``` +ex26 :: Specification Rectangle +ex26 = constrained' $ \ wid len square -> + [ assert $ wid >=. lit 0 + , assert $ len >=. lit 0 + , whenTrue square (assert $ wid ==. len) + ] + ``` + +Note that there are no constraints relating `wid` and `len` if the selector `square` is `False` +The library function `ifElse` allows two separate sets of constraints, one when the +`square` is `True` and a completely separate set when `square` is `False` + +``` +ex27 :: Specification Rectangle +ex27 = constrained' $ \ wid len square -> + ifElse square + (assert $ wid ==. len) -- This is the True branch + [ assert $ wid >=. lit 0 -- This compound `Pred` is the False branch + , assert $ len >=. lit 0 ] +``` + +## Explanations +1. `assertExplain` +2. `explanation` +3. `ExplainSpec` + +Explanations allow the writer of a specification to add textual message, that can be used +if solving a specification fails. These library functions have type + +``` +explanation :: NonEmpty String -> Pred -> Pred +assertExplain :: IsPred p => NonEmpty String -> p -> Pred +ExplainSpec :: [String] -> Specification a -> Specification a +``` + +Here is a specification with no explanations + +``` +ex28a :: Specification (Set Int) +ex28a = constrained $ \ s -> + [ assert $ member_ (lit 5) s + , forAll s $ \ x -> [ x >. lit 6, x <. lit 20] + ] +``` +This specification is over constrained. Here is what the error message returns. + +``` +ghci> debugSpec ex28a +Some 'must' items do not conform to 'element' spec: TypeSpec [7..19] [] +While combining 2 SetSpecs + (SetSpec must=[] elem=TypeSpec [7..19] [] size=TrueSpec @(Integer)) + (SetSpec must=[ 5 ] elem=TrueSpec @(Int) size=TrueSpec @(Integer)) +``` + +By adding an `ExplainSpec` like this + +``` +ex28b :: Specification (Set Int) +ex28b = ExplainSpec ["5 must be in the set"] $ + constrained $ \ s -> + [ assert $ member_ (lit 5) s + , forAll s $ \ x -> [ x >. lit 6, x <. lit 20] + ] +``` + +we get a better explanation + +``` +ghci> debugSpec ex28b +5 must be in the set + +Some 'must' items do not conform to 'element' spec: TypeSpec [7..19] [] +While combining 2 SetSpecs + (SetSpec must=[] elem=TypeSpec [7..19] [] size=TrueSpec @(Integer)) + (SetSpec must=[ 5 ] elem=TrueSpec @(Int) size=TrueSpec @(Integer)) +``` + +``` +ex28c :: Specification (Set Int) +ex28c = + constrained $ \ s -> explanation (pure "5 must be in the set") + [ assert $ member_ (lit 5) s + , forAll s $ \ x -> [ x >. lit 6, x <. lit 20] + ] +``` + +Where to add explanations is some what of an art. In many cases explanations are simply +lost. We are looking for examples where explanations get lost so that we can improve the system. + +## Operations to define and use Specifications +1. `satisfies` +2. `equalSpec` +3. `notEqualSpec` +4. `notMemberSpec` +5. `leqSpec` +6. `ltSpec` +7. `geqSpec` +8. `gtSpec` +5. `cardinality` + +There are a number of library functions that create specifications directly +and do not use the `constrained` library function. These are particulary useful +in conjunction with the library function `satisfies` which converts a `Specification` into a `Pred` + +``` +satisfies :: HasSpec a => Term a -> Specification a -> Pred +equalSpec :: a -> Specification a +notEqualSpec :: HasSpec a => a -> Specification a +notMemberSpec :: (HasSpec a, Foldable f) => f a -> Specification a +leqSpec :: OrdLike a => a -> Specification a +ltSpec :: OrdLike a => a -> Specification a +geqSpec :: OrdLike a => a -> Specification a +gtSpec :: OrdLike a => a -> Specification a +cardinality :: (Number Integer, HasSpec a) => Specification a -> Specification Integer +``` + +Here is an example of the use of `satisfies` in conjunction with `notMemberSpec` + +``` +ex29 :: Specification Int +ex29 = constrained $ \ x -> + [ assert $ x >=. lit 0 + , assert $ x <=. lit 5 + , satisfies x (notMemberSpec [2,3]) -- meaning (x /= 2) and (x /= 3) +``` + +In essence this spec says `x` is either `0`, `1`, `4`, or `5`, where +`2` and `3` are excluded by the `notMemberSpec` + +## Utility functions +1. `simplifyTerm` +2. `simplifySpec` +3. `genFromSpecT` +4. `genFromSpec` +5. `genFromSpecWithSeed` +6. `debugSpec` + +These functions are generally useful when writing and debugging specifications. + +``` +-- | Simplify a Term +simplifyTerm :: Term a -> Term a + +-- | Simplify a Spec, sometimes makes big simplications +simplifySpec :: HasSpec a => Specification a -> Specification a + +-- Generate a value from the Spec in the internal monad GenT +genFromSpecT + :: (HasSpec a,MonadGenError m) => + Specification a -> GenT m a + +-- Generate a value from the spec in the QuickCheck monad 'Gen' +genFromSpec + :: (GHC.Stack.Types.HasCallStack, HasSpec a) => + Specification a -> QuickCheck.Gen a + +-- | generate a value from a Spec using a seed (reproduceable) +genFromSpecWithSeed + :: HasSpec a => Int -> Int -> Specification a -> a + +-- | Run a Spec in the IO monad, prints helpful intermediate information +debugSpec :: HasSpec a => Specification a -> IO () + +``` + + +## Call backs to the QuickCheck Gen monad +1. `monitor` + +The function `forAllSpec` allows one to turn a `Specification` into a QuickCheck `Property` + +``` +forAllSpec :: + (HasSpec a, QuickCheck.Testable p) => + Specification a -> (a -> p) -> QuickCheck.Property +``` + +The librrary function `monitor` allows specification writers to access some of the QuickCheck property +modifiers, like `classify`, `label`, and `cover`, by turning them into a `Pred` using `monitor` + +``` +monitor + :: ((forall a. Term a -> a) -> QuickCheck.Property -> QuickCheck.Property) + -> Pred +``` + +The monitor `Pred` has no effect, unless the `Specification` that embeds the `monitor` call, is lifted +to a `Property` using `forAllSpec. Here is a example. + +``` +ex30 :: Specification (Int, Int) +ex30 = constrained $ \ [var|p|] -> + match p $ \ [var|x|] [var|y|] -> + [ assert $ x /=. 0 + , -- You can use `monitor` to add QuickCheck property modifiers for + -- monitoring distribution, like classify, label, and cover, to your + -- specification + monitor $ \eval -> + QuickCheck.classify (eval y > 0) "positive y" + . QuickCheck.classify (eval x > 0) "positive x" + ] +``` +The `monitor` library function use the same programming device as `reify`, it accepts a function +(with an functional argument, usually called `eval`, that turns a `(Term a)` into a value `a`. +Thus each successfull generation of the `Specification` lifts one or more `Specification` +variables ( `x` and `y` in the example) to a value that is then +passed to the `QuickCheck` modifier (`classify` in the example). Then `forAllSpec` turns the modifier on. +Here are two examples. The first just runs the specification using `debugSpec`, since this is not inside a call to +`forAllSpec`, no modification is created. + +``` +ghci> debugSpec ex30 +constrained $ \ v_3 -> + [to p_3/v_2]let v_1 = prodFst_ v_2 in + let v_0 = prodSnd_ v_2 in + {assert $ not_ (x_1 ==. 0) + monitor} +(-12,-15) +True +``` + +The second example uses `forAllSpec` to create a `Property`, that passes as long as the generator does not fail. + +``` +prop31 :: QuickCheck.Property +prop31 = forAllSpec ex30 $ \_ -> True +``` + +We can now use `QuicCheck.quickCheck` to test the property. + +``` +ghci> QuickCheck.quickCheck $ prop31 ++++ OK, passed 100 tests: +52% positive x +46% positive y +``` + +Where the `classify` statistics are reported. + + +### Classify with weights example. +As a final example we redo the weighted branch example `ex11` from the discussion of `caseOn` above. +We modify it by adding a `monitor` predicate. The purpose of this example is two fold. + +1. Illustrate the use of `monitor` +2. Demonstrate that the weighted `caseOn` changes the frequency at which the branches are choosen. + +``` +ex11m :: Specification Three +ex11m = constrained $ \ three -> + [ caseOn three + (branchW 1 $ \ i -> i <. 0) -- One, weight 1 + (branchW 2 $ \ b -> assert b) -- Two, weight 2 + (branchW 3 $ \ j -> j >. 0) -- Three, weight 3 + , monitor $ \eval -> + case (eval three) of + One _ -> QuickCheck.classify True "One should be about 1/6" + Two _ -> QuickCheck.classify True "Two should be about 2/6" + Three _ -> QuickCheck.classify True "Three should be about 3/6" + ] + +propex11 :: QuickCheck.Property +propex11 = forAllSpec ex11m $ \_ -> True + +ex33 = QuickCheck.quickCheck $ propex11 +``` + +If we run `quickCheck` on `propex11` we get the following results. We run it several times to illustrate +that the weights are some what random, but do change the frequency. + +``` +ghci> ex33 ++++ OK, passed 100 tests: +45% Three should be about 3/6 +38% Two should be about 2/6 +17% One should be about 1/6 + +ghci> ex33 ++++ OK, passed 100 tests: +63% Three should be about 3/6 +21% Two should be about 2/6 +16% One should be about 1/6 + +ghci> ex33 ++++ OK, passed 100 tests: +52% Three should be about 3/6 +36% Two should be about 2/6 +12% One should be about 1/6 + +``` + +# Strategy for constraining a large type with many nested sub-components. + +When writing a `Specification` for a complex type with deeply nested subcomponents, the first thing that we need +to do, is introduce a `Term` variable for each subcomponent that needs to be constrained. A good way to start is +to build a skeleton `Specification` that does nothing more than just bring into scope a variable for each +(possibly nested) subcomponent. We build a deeply nested example by building on top of the types from +previous examples `Three` and `Rectangle`. We introduce a new deeply nested type `Nested` + +``` +data Nested = Nested Three Rectangle [Int] + deriving (Show,Eq,Generic) +instance HasSimpleRep Nested +instance HasSpec Nested +``` + +It is important that we derive `Show`, `Eq` and `Generic`, as these are needed to automatically derive the +`HasSimpleRep` and `HasSpec` instances. Now we write a `skeleton` specification. The process is to +introduce a `Term` variable for each nested component. We use the library functions +`constrained`, `match`, `caseOn` and `forAll`, each of which takes a Haskell lambda expression, +introducing a Haskell variable that is bound to a `Term` variable for each component. The goal in +this stage is to just introduce a variable for each subcomponent. So we use `TruePred` which places +no constraints on each of these subcomponents. We use named variables, and use a comment to label each `branch` +with the name of the construtor. + +``` +skeleton :: Specification Nested +skeleton = constrained $ \ [var|nest|] -> + match nest $ \ [var|three|] [var|rect|] [var|line|] -> + [ (caseOn three) + (branch $ \ i -> TruePred) -- One + (branch $ \ b -> TruePred) -- Two + (branch $ \ j -> TruePred) -- Three + , match rect $ \ [var|wid|] [var|len|] [var|square|] -> [ TruePred ] + , forAll line $ \ [var|point|] -> TruePred + ] +``` + +Once we have the `skeleton` compiling, we can replace each `TruePred` with some constraints. +In this stage, we worry just about the constraints, and which `Pred` to use, and not about how we bring +a variable into scope for each subcomponent. Experience show that we have way fewer compiler +errors, using this one step at a time strategy. + +# Writing HasSpec instances by hand. + +The `HasSpec` constraint tells us that a particular type has been admitted to the system, and that we can now +write constraints for that type. We have used four different strategies for writing HasSpec instances. We list +them here in increasing difficulty for the programmer, when writing an instance `(HasSpec T)` for some +new datatype `T`. + +1. Use a derived `GHC.Generic` instance for `T` to define a default `SimpleRep` instance. This uses automatic +means to define the `(SimpleRep T)` instance as a Sum-of-Products type. Then this Sum-of-Products type is used +to make a `(HasSpec T)` instance, using the default instances on Sum-Of-Products for the methods of `(HasSpec T)`. + +2. Write our own `(SimpleRep T)` instance. We need to this when the type `T` has some invariants not captured +by the stuctural description captured by Sum-Of-Products type. This requires a good understanding of the +relationship between the `SimpleRep` and `HasSpec` instances, and some of the internals of the system. + +3. Define the `(SimpleRep T)` instance in terms of another type `S` that already has a HasSpec instance. +This method works by thinking of the `T` as a newtype-like copy of `S`, and uses the `(match t)` library function +to bring into scope a variable of type `(Term S)`, and to then express the constraints in terms of `S`. +A classic example of this to define `(StrictSeq a)` in terms of `( [a] )`. This is possible because +the abstract properties of sequences and lists are the same. They only differ in terms of efficiency +on different function symbols. But the system constraints can only express things about the abstract properties, +not efficiency. + +4. Bypass the SimpleRep pathway, by defining the `(HasSpec T)` instance directly. This requires +defining the associated type family `TypeSpec` and all the methods of the `HasSpec` class. This is the method used +to define instances `(HasSpec [a])`, `(HasSpec (Set a))` , and `(HasSpec (Map k ))`. These are all abstract +types that have many properties not captured by the simple structural properties embedded in the types +constructor functions. Most of these properties are captured by relationships between the different +function symbols supported by that type. This is the most difficult pathway, but it can capture complicated +relationships on the type not possible using the other strategies. This requires a deep understanding of +the type `T`, its function symbols and their relationships, and internals of the system. This is a +complex task, but it gives the system much of its power, as it makes the system extendable, +to additional complex types, simply by adding a new `HasSpec` instance. + +## Strategy 1 using GHC.Generics + +This is the strategy we used in this manual for the types `Three`, `Dimensions`, `Rectangle`, and `Nested`. We will +not give another example here, but reiterate the requirement that the new type -must- have `Generic`, `Eq`, and `Show` instances, +so do not forget to put the deriving clause `deriving (Generic,Eq,Show)` in the data definition. + +## Strategy 2 writing your own SimpleRep instance + +This strategy depends on the provided `(SimpleRep T)` associated type family instance, being an actual Sum-of-Products type. +This requires quite a bit of knowledge about the internals of the system. Let's look closely at the `SimpleRep` class + +``` +class Typeable (SimpleRep a) => HasSimpleRep a where + type SimpleRep :: * -> * + type family SimpleRep a + toSimpleRep :: a -> SimpleRep a + fromSimpleRep :: HasSimpleRep a => SimpleRep a -> a +``` + +What kind of type lends itself to this strategy? +1. A type that has internal structure that enforces some internal invariants. +2. A type that has a -builder- function, that takes simple input, and constructs the internal struture. +3. A type that has an -accessor- function, that takes the internal structure, and returns the simple input. +4. A type where the -simple input- has a Sum-of-Products representation. + +Often the -builder- function is implemented as a Haskell Pattern. Here is an example that come from the Cardano Ledger. +A lot of complicated stuff is not fully describe here, but the example gives an overview of how it works. + + +``` +-- NOTE: this is a representation of the `ShelleyTxOut` type. You can't +-- simply use the generics to derive the `SimpleRep` for `ShelleyTxOut` +-- because the type is memoized (i.e. it stores a hidden copy of the actual bytes that +-- were deserialized when the TxOut was transmitted over the network). So instead, +-- we say that the representation is the same as what you would get from +-- using the `ShelleyTxOut` pattern. +type ShelleyTxOutTypes era = + '[ Addr + , Value era + ] +instance (Era era, Val (Value era)) => HasSimpleRep (ShelleyTxOut era) where + type TheSop (ShelleyTxOut era) = '["ShelleyTxOut" ::: ShelleyTxOutTypes era] + toSimpleRep (ShelleyTxOut addr val) = + inject @"ShelleyTxOut" @'["ShelleyTxOut" ::: ShelleyTxOutTypes era] + addr + val + fromSimpleRep rep = + algebra @'["ShelleyTxOut" ::: ShelleyTxOutTypes era] rep ShelleyTxOut + +instance (EraTxOut era, HasSpec (Value era)) => HasSpec (ShelleyTxOut era) +``` + +TODO add more explanation about the types. Much of the example of depends on properties of TxOut +that is not explained. + +## Strategy 3 defining the SimpleRep instance in terms of another type with a SimpleRep instance + +The type `Coin` is a defined as + +``` +newtype Coin = Coin {unCoin :: Integer} +``` +The operations on Coin ensure the invariant that one cannot build a `Coin` with a negative value. +We can enforce these invariants in a `SimpleRep` instance by making the `SimpleRep` type be a `Natural`. +`Natural` is one of the numeric types, and has its own `SimpleRep` instance, so it is a good choice. +Here is the `SimpleRep` instance, and the `HasSpec` instance defined using that representation. + +``` +instance HasSimpleRep Coin where + type SimpleRep Coin = Natural + toSimpleRep (Coin i) = case integerToNatural i of + Nothing -> error $ "The impossible happened in toSimpleRep for (Coin " ++ show i ++ ")" + Just w -> w + fromSimpleRep = naturalToCoin + +instance HasSpec Coin + +integerToNatural:: Integer -> Maybe Natural +integerToNatural c + | c < 0 = Nothing + | otherwise = Just $ fromIntegral c + +naturalToCoin :: Natural -> Coin +naturalToCoin = Coin . fromIntegral +``` + +To write a `Specification` for `Coin` we simply `match` against it, and then use the operations on the underlying +type `Word64` to constraint it. + +``` +ex34 :: Specification Coin +ex34 = constrained $ \coin -> + match coin $ \ nat -> [nat >=. lit 100, nat <=. lit 200] + -- Note that the match brings into scope the variable nat + -- which has type Natural, because the SimpleRep of Coin is Natural, +``` + +Here we use `debugSpec` to get a sample solution + +``` +ghci> debugSpec ex34 +constrained $ \ v_1 -> + [to v_1/v_0]{assert $ v_0 >=. 100 + assert $ v_0 <=. 200} +Coin {unCoin = 119} +``` + +## Strategy 4, bypassing SimpleRep, and write the HasSpec instance by Hand + +The `HasSpec` class has an associated type family and many methods. Here is a summary of some of it. + +``` +class (Typeable a, Eq a, Show a, Show (TypeSpec a), Typeable (TypeSpec a)) => HasSpec a where + -- | The `TypeSpec a` is the type-specific `Specification a`. + type TypeSpec a + + -- the default TypeSpec type, if one is not given. This is how + -- SimpleRep is used to automatically make a HasSpec instance. + type TypeSpec a = TypeSpec (SimpleRep a) + + -- `TypeSpec` behaves sort-of like a monoid with a neutral + -- element `anySpec` and a `combineSpec` for combining + -- two `TypeSpec a`. However, in order to provide flexibilty + -- `combineSpec` takes two `TypeSpec` and constucts a `Specification`. This + -- avoids e.g. having to have a separate implementation of `ErrorSpec` + -- and `MemberSpec` in `TypeSpec`. + + anySpec :: TypeSpec a + combineSpec :: TypeSpec a -> TypeSpec a -> Specification a + + -- | Generate a value that satisfies the `TypeSpec`. + -- The key property for this generator is soundness: + -- ∀ a ∈ genFromTypeSpec spec. a `conformsTo` spec + genFromTypeSpec :: (HasCallStack, MonadGenError m) => TypeSpec a -> GenT m a + + -- | Check conformance to the spec. + conformsTo :: HasCallStack => a -> TypeSpec a -> Bool + + -- | Shrink an `a` with the aide of a `TypeSpec` + shrinkWithTypeSpec :: TypeSpec a -> a -> [a] + + -- | Convert a spec to predicates: + -- The key property here is: + -- ∀ a. a `conformsTo` spec == a `conformsTo` constrained (\t -> toPreds t spec) + toPreds :: Term a -> TypeSpec a -> Pred + + -- | Compute an upper and lower bound on the number of solutions genFromTypeSpec might return + cardinalTypeSpec :: TypeSpec a -> Specification Integer +``` + +Writing HasSpec instance this way has many nuances. If you want to write your +own, then you should read the document, +[DesignPrinciples.md](https://github.com/input-output-hk/constrained-generators/blob/master/docs/DesignPrinciples.md) +and once you have mastered that, study the instances found in the source code +for the Constrained Generators System.