-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
513 lines (401 loc) · 23 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
# Haskell Theory Exploration #
Contact: chriswarbo@gmail.com
Homepage: http://chriswarbo.net/git/haskell-te
Mirrors:
- /ipns/QmWv958VzBJQchGjYKiSaxLC9ugrjvXkqMpVrmjp9AonXq
- https://github.com/Warbo/haskell-te
This program is free software: you can redistribute it and/or modify it under
the terms of the GNU Affero General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option) any
later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along
with this program. If not, see <http://www.gnu.org/licenses/>.
## Introduction: A Wrapper Around QuickSpec ##
This repository provides commands for performing "theory exploration", mostly in
the Haskell programming language. Theory exploration describes the task of
taking in some function definitions and outputting conjectures about those
functions. For example, given the list functions `append` and `nil`, we might
get the following conjectures:
append v0 nil = v0
append nil v0 = v0
append v0 (append v1 v2) = append (append v0 v1) v2
In which case, we'd find that `append` and `nil` form a monoid for lists.
Currently, we're limited to finding equations. We rely on the excellent
QuickSpec library to do most of the work, but we also use and provide many
complementary tools. We use Nix to tame the menagerie of languages and systems
'under the hood'.
## Overview ##
We can think of these tools as "dynamic analysers" (as opposed to static
analysers), since we discover properties of the given code by *running it*, over
and over, in many combinations. Haskell is well suited to this, since its
"purity" prevents functions relying on any ambient 'context': everything a
Haskell function needs should be taken in as an argument. The type system tells
us which functions we can combine together, and what arguments they should be
given.
QuickSpec uses this knowledge, along with data generators for the popular test
framework QuickCheck, to find combinations of the given functions which seem to
behave the same way on many inputs. We conjecture that such expressions are
(beta) equivalent.
Whilst QuickSpec is very useful, it has a few weaknesses which we seek to
address:
- We provide a wrapper script around the QuickSpec library, allowing it to be
run as a standalone program.
- We provide a tool to extract all functions from a Haskell package, and
generate a suitable signature for QuickSpec, so the user doesn't have to.
- To generate our signatures, we automatically pick suitable names, types,
arities, data generators, comparison functions and free variables for all
relevant functions.
- We call out to Nix to ensure that all necessary packages (QuickSpec and
whatever is to be explored) are available.
- We replace QuickSpec's custom pretty-printer with a JSON format which is more
amenable to subsequent processing.
- We split apart QuickSpec's exploration phase from its 'filter out redundant
conjectures' phase, in case users want to explore without filtering, or wants
to filter some separate data.
## Commands ##
Here, we use "Haskell package" to mean a directory containing a Haskell project,
with a `.cabal` file, etc.
The Nix package defined in `default.nix` provides the following commands:
- `quickspec` takes the path to a Haskell package as an argument, explores the
code it finds inside, and writes the conjectures it discovers to stdout.
- `renderEqs` reads conjectures from stdin and pretty-prints a more
human-friendly syntax to stdout.
- `reduce-equations` reads conjectures from stdin and writes to stdout only
those which aren't "redundant" (i.e. it removes anything which is just a
special case of some other conjecture).
- `haskellPkgToAsts` takes the path to a Haskell package as an argument and
outputs a JSON array detailing the function definitions it finds inside.
- `quickspecAsts` will explore the equations described by a JSON array given on
stdin, in the same format outputted by `haskellPkgToAsts`. The relevant
Haskell packages should also be given as an argument, so GHC knows where to
find them.
- `concurrentQuickspec` takes a "2D array" of JSON on stdin, where each element
of the outer array is a JSON array as per `quickspecAsts`. A QuickSpec
process is launched for each, and the conjectures from all of them are
pooled.
The intermediate JSON data is useful if we want to remove particular functions
from being explored, or if we want to explore functions from multiple packages
together. Tools like `jq` are handy for manipulating this JSON.
## Examples ##
We can start a shell with these commands in scope as follows:
$ nix-shell --show-trace -p 'import ./. {}'
After much building and testing, this should drop us to a "nix shell" with an
augmented `PATH`:
[nix-shell]$ command -v quickspecAsts
/nix/store/...-haskell-theory-exploration/bin/quickspecAsts
Now we need some code to explore; you'll probably be using your own projects,
but here we'll fetch a package from Hackage:
[nix-shell]$ cabal get dlist
Downloading dlist-0.8.0.2...
Unpacking to dlist-0.8.0.2/
First let's try exploring it, by passing the directory path to `quickspec`:
[nix-shell]$ quickspec dlist-0.8.0.2 | tee dlist.eqs
...the usual messages from nix, haddock, and cabal...
Checking module availability
Found modules
Building type-extraction command
Extracting types
Building scope-checking command
Checking scope
Outputting JSON
Finished output
Set DEBUG=1 if you want to see gory GHCi output.
This stage is tricky. Set DEBUG=1 to see the debug info.
Getting ASTs
Getting types
Getting scope
Tagging
Tagged
...more messages from nix, haddock and cabal...
[
{
"relation": "~=",
...
},
...
]
This can take a while, and you might want to keep an eye on GHC's memory usage.
The messages in the middle are from our AST extraction system as it finds
functions definitions in the `dlist` packages, along with their types, arities,
etc.
The result is a bunch of conjectured equations involving functions from the
`dlist` package. These are in our JSON format, suitable for piping into other
tools. Since we only want to read them, we'll pipe them into `renderEqs`:
[nix-shell]$ renderEqs < dlist.eqs
((foldr v0) v1) empty ~= v1
(cons v0) empty ~= singleton v0
(map v0) empty ~= empty
(snoc empty) v0 ~= singleton v0
(apply empty) v0 ~= v0
head empty ~= undefined
tail empty ~= undefined
(append empty) empty ~= empty
(snoc ((replicate v0) v1)) v1 ~= (cons v1) ((replicate v0) v1)
(snoc (singleton v0)) v1 ~= (cons v0) (singleton v1)
head (singleton v0) ~= v0
tail (singleton v0) ~= empty
toList (fromList v0) ~= v0
(append ((unfoldr v0) v1)) (singleton v2) ~= (snoc ((unfoldr v0) v1)) v2
(append ((replicate v0) v1)) (singleton v2) ~= (snoc ((replicate v0) v1)) v2
(append ((replicate v0) v1)) ((replicate v2) v1) ~= (append ((replicate v2) v1)) ((replicate v0) v1)
(append (singleton v0)) ((unfoldr v1) v2) ~= (cons v0) ((unfoldr v1) v2)
(append (singleton v0)) ((replicate v1) v2) ~= (cons v0) ((replicate v1) v2)
(append empty) ((unfoldr v0) v1) ~= (unfoldr v0) v1
(append empty) ((replicate v0) v1) ~= (replicate v0) v1
(append ((unfoldr v0) v1)) empty ~= (unfoldr v0) v1
(append ((replicate v0) v1)) empty ~= (replicate v0) v1
(append (singleton v0)) (singleton v1) ~= (cons v0) (singleton v1)
(append (singleton v0)) (fromList v1) ~= (cons v0) (fromList v1)
(append (fromList v0)) (singleton v1) ~= (snoc (fromList v0)) v1
(append empty) (singleton v0) ~= singleton v0
(append empty) (fromList v0) ~= fromList v0
(append (singleton v0)) empty ~= singleton v0
(append (fromList v0)) empty ~= fromList v0
fromList (toList empty) ~= empty
(apply ((unfoldr v0) v1)) (toList empty) ~= toList ((unfoldr v0) v1)
(apply ((replicate v0) v1)) (toList empty) ~= toList ((replicate v0) v1)
(apply (singleton v0)) (toList empty) ~= toList (singleton v0)
(apply (fromList v0)) (toList empty) ~= v0
Here we've found that `toList` is the inverse of `fromList`, that `head` is the
inverse of `singleton`, etc.
We use the symbol `~=` to indicate that these are not definitions or proven
theorems, they're just conjectures based on many empirical tests. The terms
`v0`, `v1`, etc. are free variables, which are numbered from left to right per
equation; the prefix contains enough `v`s to prevent clashing with any function
name (e.g. if we had a function called `verbose`, or even `v0`, the variables
would render as `vv0`, `vv1`, etc.)
Notice that there are several equations of the form `(append empty) x = x`, but
that more general statement itself wasn't conjectured. The reason is that `x`
would be a variable of type `DList a` (in fact `DList Integer`, since we
monomorphise), but the `dlist` package doesn't expose a `DList a` instance of
QuickCheck's `Arbitrary` type class. Without this, we can't test equations
involving such variables, so we don't include them and hence the general
equation wasn't found.
If we want more control over what gets explored, we can use `haskellPkgToAsts`
to dump out a package's function definitions as JSON, without exploring them
immediately:
[nix-shell]$ haskellPkgToAsts dlist-0.8.0.2 | tee dlist.asts
...output from nix, haddock, cabal, ...
[{"name":"replicate","version":"0.8.0.2",...},...]
Tools like `jq` can manipulate this JSON. Here we'll take functions from the
`LazyLength` module of the `list-extras` package, and functions from `dlist`
which have `List` in their name, and combine them into one array:
[nix-shell]$ cabal get list-extras
[nix-shell]$ haskellPkgToAsts list-extras-0.4.1.4 > list-extras.asts
[nix-shell]$ jq -n --argfile le list-extras.asts \
--argfile dl dlist.asts \
'($dl | map(select(.name | contains("List")))) +
($le | map(select(.module | contains("LazyLength"))))' |
tee combined.asts
We can explore these together by using `quickspecAsts`. Note that we provide
*both* package directories as arguments, so GHC can find all of the necessary
code:
[nix-shell]$ quickspecAsts dlist-0.8.0.2 \
list-extras-0.4.1.4 < combined.asts | renderEqs
(lengthCompare v0) v0 ~= (lengthCompare v1) v1
v0 ~= toList (fromList v0)
In this case we didn't find any equations involving *both* `lengthCompare` and
`toList`/`fromList`, so there's no need to explore them together. Instead, we
could have explored them in separate processes using `concurrentQuickspec`.
We need a different `jq` command, since the one above uses `(...) + (...)` to
append two arrays together, resulting in:
[{"name":"lengthCompare",...},{"name":"toList",...},{"name":"fromList",...}]
This time we want multiple arrays, so we use `[...] + [...]` instead to get:
[[{"name":"lengthCompare",...}],
[{"name":"toList",...},{"name":"fromList",...}]]
Here's the call we use:
[nix-shell]$ jq -n --argfile le list-extras.asts \
--argfile dl dlist.asts \
'[$dl | map(select(.name | contains("List")))] +
[$le | map(select(.module | contains("LazyLength")))]' |
tee separate.asts
We send these nested arrays to `concurrentQuickspec` instead of `quickspecAsts`,
remembering to provide both directory paths as before. We also send the
resulting conjectures through `reduce-equations` to remove duplicates and
redundancies (`quickspec` and `quickspecAsts` currently do this automatically):
[nix-shell]$ concurrentQuickspec dlist-0.8.0.2 \
list-extras-0.4.1.4 < separate.asts |
reduce-equations | renderEqs
(lengthCompare v0) v0 ~= (lengthCompare v1) v1
v0 ~= toList (fromList v0)
The idea of `concurrentQuickspec` is to work around QuickSpec's exponential
worst-case complexity: turning O(2^N) into O(k*2^(N/k)). How we should divide up
one big array into multiple smaller ones, without losing good conjectures in the
process, is currently an open question!
## Design Goals ##
The design of this codebase is influenced by the following considerations:
*Exploring arbitrary (Haskell) code*
To be as useful as possible, we want to accept as much "normal" Haskell code as
we can, rather than placing too many requirements on users. Since the vast
majority of Haskell code only works with GHC, due to language extensions,
dependencies, etc. we're pretty much forced to use it.
Likewise, most code won't build without taking specific instructions into
account, like ensuring dependencies are available, preprocessors are executed,
commandline flags are provided, etc. This forces us to use a build system, so
we've opted for Cabal since it's been around for long enough to "just work" in
most situations, especially with Nix.
We're also forced to use an "eval" mechanism, since we need to run code supplied
by the user at runtime, but GHC is an ahead-of-time compiler. We've opted for
`nix-eval`, which spawns `runghc` in a subprocess and pipes in the given code.
As the name suggests, `nix-eval` runs this subprocess in an environment provided
by Nix, which lets us fetch, build and install any packages needed by the user's
code.
*Measuring performance*
We want a way to reliably measure and compare different exploration algorithms
and approaches. This requires the code to be:
- Highly automated, so many repetitions can be executed, against many different
inputs, with as little human intervention as possible.
- Modular, so that algorithms can be swapped out and separate components can be
run independently: in particular so we can time the exploration without
including things like function extraction.
- Deterministic and reproducible, as far as possible, so experiments can be
replicated and results can be compared.
- Fast at running a "happy path", allowing us to perform any expensive setup
(like compiling dependencies and creating environments) beforehand, so they
don't contribute to the measured runtime. In particular, we need a way to
prevent `nix-eval` from compile dependencies during an `eval` call.
## Caveats ##
We abstract over a lot of implementation details. These abstractions are leaky,
so you may encounter problems.
Barring dependency clashes, the packages most likely to work are "helper
libraries" for built-in types, like `list-extras`, since they can use the
`Arbitrary` instances bundled with QuickCheck (`Bool`, `[a]`, `Maybe a`, etc.).
Packages which define combinator libraries and simple 'sums of products'
datatypes should usually work, but you may need to define/expose some
`Arbitrary` instances, and maybe be selective about which functions get explored
together to avoid running out of memory.
Libraries making heavy use of type classes and records may struggle, as will
those whose "data" can't be serialised or compared (e.g. functions).
It goes without saying, but anything which invokes `IO` may cause damage to your
system. This should only be possible with `unsafePerformIO` and friends, but
don't blindly trust third-party code. A welcome feature would be to enforce
GHC's "safe haskell" mode when exploring!
Here are some known difficulties and workarounds:
*Function extraction uses a GHC compiler plugin*
This means we must be able to compile your code with GHC. We do this by invoking
Cabal, with specially crafted arguments in a specially crafted environment. If
your code doesn't have an accompanying `.cabal` file we *cannot* extract
function information from it. We have no plans to support alternative build
tools at this time.
Note that it is still be possible to *explore* non-Cabal packages (e.g. with
`quickspecAsts` or `concurrentQuickspec`), but `haskellPkgToAsts` cannot be used
to extract the function information; you'll have to provide it some other way.
*Packages must be installable by Nix*
We use Nix to set up our working environments, e.g. for compiling, for querying
GHCi, for exploring, etc. If your package can't be built by Nix then *it will
not work*. Unlike the Cabal requirement, this cannot be worked around.
There are two ways to make a package work with Nix. The easiest is to have a
working `.cabal` file, in which case we will automatically run `cabal2nix` to
convert it to Nix. If this doesn't work, or you don't have a `.cabal` file, then
you can provide your own definition in a `default.nix` file instead. This should
follow the same convention as those produced by `cabal2nix`.
*Dependency hell*
We use a few Haskell packages internally. Some, like `reduce-equations`, provide
standalone commands which don't interfere with the packages being explored.
Others, like `QuickSpec` and `AstPlugin`, must be installed into the same
`ghc-pkg` database as the packages being explored, so there is potential for
dependencies to clash.
Clashes with `AstPlugin` can be worked around by avoiding `quickspec` and
`haskellPkgToAsts`, and instead providing your own JSON to `quickspecAsts` or
`concurrentQuickspec`.
Clashes with packages used during exploration (e.g. `QuickSpec`) cannot be
worked around, due to the nature of the algorithm: functions from one must be
invoked by the other, so their dependencies must be compatible.
*Overrides*
We prefer to hard-code our dependency versions, to make results more
deterministic and reproducible, but this may prevent some user packages from
building, so we provide the following override mechanisms:
- Our package accepts overridable parameters, including `stable` which defaults
to `true`. If set to `false`, we will check out the latest revision of each
git repository we access, rather than the known-good revision we've
hard-coded. For example: `nix-shell -p 'import ./. { stable = false; }'`
- Some dependencies can be given in the `NIX_PATH` environment variable, e.g.
most of the custom Haskell packages in `nix-support/hsOverride.nix`.
- If the `HASKELL_PACKAGES` environment variable is set to a `.nix` file, that
will be used as the Haskell package set when exploring. Note that this will
not affect things *other* than exploration, like function extraction or setup
code.
- Note that we use the `nix-eval` package, but we *don't* expose its
`NIX_EVAL_HASKELL_PKGS` override. That's because we need to use it ourselves,
but the `HASKELL_PACKAGES` we provide acts in a similar way.
*Extraction caveats*
`haskellPkgToAsts` looks at all of the functions defined in the given package.
In particular, it *does not* extract anything which is re-exported from a
*different* package, and it does not include constuctors or type class methods.
If a package defines things in a "private" (non-exposed) module, those functions
may be deemed unexplorable, since we can't import the module; *even if* an
exposed module happens to re-export them elsewhere.
We can work around these problems by defining wrapper functions (e.g. in a
"helper" package). For example:
-- Wrap constructors in functions
nil = []
cons = (:)
-- Wrap methods in functions
listEq :: Eq a => [a] -> [a] -> Bool
listEq = (==)
-- Wrap unexposed definitions in functions
publicFoo = privateFoo
-- Wrap re-exported names in functions
definedInPublic = reexportedPrivateName
There is also an edge case if we try to explore a function whose module doesn't
expose the relevant types. We won't be able to add variables for these types,
since the type names won't be in scope. Unfortunately this will die ungracefully
with a GHC error.
We can work around this by re-exporting the desired types from a module which we
know will get imported. For example, we could wrap the relevant functions, and
also expose the types:
module ExplorationHelper (MyType1, MyType2, myFunc) where
import MyPkg.Types (MyType1, MyType2)
import qualified MyPkg.Funcs (myFunc)
myFunc = MyPkg.Funcs.myFunc
*Polymorphism, higher kinds, Arbitrary, Ord, etc.*
QuickSpec's exploration algorithm relies on the use of free variables, and
testing by plugging many different values into these variables to see what
happens.
To do this, we need instances of QuickCheck's `Arbitrary` typeclass for each
function's inputs, so we can generate random values. We also need the outputs to
be comparable, either by having an instance of `Ord` (we could use `Eq`, but it
would require O(n^2) comparisons) or an instance of `Serialize`.
We use the latter if possible, since serialised data can be easily hashed to a
fixed-size value, which may save time and memory as our function outputs grow.
Another consideration is that we need code to be monomorphic: we can't generate
random inputs if we don't know which particular type to generate values of. For
example, to explore `id :: (Arbitrary a) => a -> a`, how do we choose which `a`
to use?
We solve this by 'monomorphising': replacing type variables with a particular
concrete type. Currently, we replace variables of kind `*` with `Integer` and
those of kind `* -> *` with `[]` (i.e. lists). If the variable has constraints,
and these aren't satisfied by `Integer` or `[]`, then we use Template Haskell to
look up a valid instance in the module's scope; if there are multiple
possibilities, one is chosen non-deterministically.
These instance-finding mechanisms are rather basic and could certainly be
improved, but more sophisticated approaches (e.g. logic programming) would be
better off being developed as a separate project.
It's easy to work around most of these mechanisms' failures, by simply writing
down monomorphic versions of your functions, e.g.
myCrazyFunc :: Heavy Wizardry..
myCrazyFunc = ...
myMonoFunc :: Bool -> String
myMonoFunc = myCrazyFunc
One annoyance is that many projects use QuickCheck "internally", and hence their
test suites define `Arbitrary` instances, but these aren't provided by their
library interface, so we can't use them.
*Resource usage*
We use QuickSpec version 1, which has a rather slow algorithm. In particular,
memory usage may explode when exploring some functions and not for others.
Characterising this behaviour is itself an interesting question.
To prevent GHC eating all of your resources, it can be killed after a time limit
by setting the `MAX_SECS` env var, or memory limit by setting `MAX_KB`.
## Other Details ##
- Tested on NixOS Linux, but other Linux systems should also work
- The main "user facing" functionality is the package provided by `default.nix`
- The main "researcher facing" functionality is the benchmark suite, which is
explained in more detail by `benchmarks/README`
- We track issues using [Artemis](http://www.mrzv.org/software/artemis/). Issue
data is stored in `.issues` and managed by Git.
- Bug reports, fixes and enhancements are welcome. These can be emailed to the
author for wider dissemination. Note that Git can be used to email changes,
if you like.