QUICr is a collection of abstract domains for sets with a common interface.
Currently, it is focused on symbolic sets with singletons. Therefore, it does not (currently) support the content reasoning originally offered by QUICr. The original version of QUICr that offers content reasoning is available at http://pl.cs.colorado.edu/projects/quicgraphs/
QUICr is a common domain interface for multiple abstractions. It is designed to support many underlying abstractions and reductions between them under a single common interface. Thus any abstract interpreter that utilizes the interface can swap in and out any of the included set abstractions.
The following abstractions intend to be explored (though they many not all be included at this time).
These abstractions use binary decision diagrams as the underlying data structure. This means that they have a normal form that may not be optimal, but they are quite general and precise. In this space, we plan to implement the following abstractions:
This directly implements a set abstraction using a BDDs and, or, and not operations to represent set intersection, union, and complement respectively. As a result, this does not necessarily offer a compact representation for operations like disjoint union, which requires encoding in the BDD the pair-wise disjointness constraints.
This abstraction does not encode disjointness constraints. They could be added externally through a domain combinator, but are not supported in the BDD. This attempts to reduce the overhead of keeping track of these disjointness constraints from the BDD. Consequently all disjoint union operations are over-approximated as union operations, dropping the disjointness constraint.
This abstraction is a BDD-based set abstraction without the global memoization table. This is a persistent version of the simple BDD-based set abstraction.
These are set abstractions that are based on theorem provers or theorem-prover-like data structures. Because these are designed around search, they do not maintain normal forms, but they can, in theory be precise and flexible for representing set constraints.
Use a SAT solver to solve set constraints and for the inclusion check proofs. Essentially this abstraction accumulates formulas representing the constraints. When an inclusion check is started, the formula is converted into a SAT problem and solved. Join and widening are implemented as disjunction.
Set constraints can be represented in a sequent calculus and can thus use sequent-calculus-theorem-prover techniques to solve inclusion check. As opposed to the SAT based techniques, these could provide on-the-fly formula simplifications and perform more incremental inclusion checks, possibly saving significant work.
The original QUIC graphs is a variant of a sequent calculus representation. However, it is not utilizing proof search techniques such as focusing. It uses a simple cut-based technique, combined with a heuristic join and widening algorithm.
Use the Espresso logic minimizer to perform on-the-fly simplifications of the current representation. This efficiently reduces sequent-calculus-based representations and may improve performance of possible join algorithms.
A hardware model checking engine can be used to infer search for optimal union-only constraints in a QUIC-graph-like structure. It can be use for directly implementing join and may serve as a good mechanism for pairing with other theorem-prover-based techniques, which typically only optimize for inclusion check.
These techniques are more targeted at specific applications. They are designed for use for solving certain kinds of problems and thus they can have data structures designed for the application-specific predicates.
This is a set abstraction built around the idea of linear set constraints such as:
X = {a, b, c} ⊎ Y ⊎ Z
It is restricted to support only disjoint union and limited use of subset constraints. Join and inclusion check operations are heuristic and tuned for the intended application.
This is an abstraction that extends linear set constraints with difference constraints to eliminate case splits that may occur when materializing elements from a set.
The FixBag abstraction uses saturation based inclusion check and join based on a limited set of propagation an simplification rules. It uses a syntactic representation of logical formulas as its data structure.
TBD
This package will also define the SDSL, a domain specific language for manipulating symbolic sets. The language has the following syntax:
k ::= x = e
| x = choose e // overwrite x with an element of e
| k; k // sequencing two commands
| branch { k } else { k } // non-deterministic branch
| both { k } and { k } // execute both branches in parallel
| if ( c ) { k } // conditional execution
| if ( c ) { k } else { k }
| while ( c ) { k } // looping
| loop { k } // non-deterministic loop
| for ( x in e ) { k } // loop over elements of e (e is mutable)
| kill x // project out variable
| rename x y // rename the variable x to be variable y
| assume ( c ) // make an assumption about state
| assert ( c ) // check property
e ::= {} // empty set
| v // set variable
| { v } // singleton set
| e U e // union
| e U+ e // disjoint union
| e \ e // set difference
| e ^ e // intersection
| ~e // set complement
c ::= e = e // set equality
| e <= e // subset or equality
| v in e // element containment
Examples of this language are contained in the tests
directory. These are
sample programs that exercise the built-in abstract interpreter.
The included SDSL abstract interpreter is contained in the sdsl
directory and
the corresponding SDSL
module. The static analyzer provides a number of
printing and usability options.
-step
Prints each step of the program as it is interpreted and prints the abstract state after each step of the interpretation.-final
Prints only the final abstract state-brace
Prints braces around the abstract states (useful if abstract states require multiple to represent)-color <color>
Shows abstract states in color to ease reading. The<color>
argument supports the following colorsblack
,red
,green
,yellow
,blue
,magenta
,cyan
, andwhite
. Additionally each color can be converted to it's "bright" version by appending it with!
. For example,red!
would be bright red.-time
Times the analysis (excluding parse/prepreocess time) and reports the total time at the end of the analysis.
Options as above can be set on each command line. However if they should
persist, they can be assigned through an environment variable. Use the
variable SDSLPARAMS
as an additional command-line place. For example to make
step printing and bright red abstract states standard, execute the following
bash command:
$ export SDSLPARAMS='-step -color red!'
The command line implements a stack-based declarative language for selecting and combining abstract domains. By default the stack is empty, so running the tool without specifying a domain will give an error:
$ ./Main.d.byte
Error: Domain stack is empty, please specify an abstract domain
There are two kinds of abstract domains that can be specified: base domains and
domain combinators. A base domain implements a set abstraction without any
external assistance. A domain combinator implements a set abstraction based
on one or more other abstractions. For example, the -bdd-full
domain is a
base domain that is implemented using binary decision diagrams, whereas the
-logger
domains is a domain combinator that logs all domain interaction to a
file.
The way the stack is used is through the order of arguments on the command line. For example, consider the following command line:
$ ./Main.d.byte -bdd-full -bdd-full -logger output.log tests/test1.sdsl
The first argument pushes the BDD domain onto the stack; the stack looks like this:
Stack |
---|
BDD(1) |
The second argument pushes a second BDD domain onto the stack; the stack looks like this:
Stack |
---|
BDD(2) |
BDD(1) |
The third argument pushes the logger combinator onto the stack. This removes the top element and adds a logger wrapper to it. As a result, the stack now looks like this:
Stack |
---|
Log("output.txt", BDD(2)) |
BDD(1) |
At this point the analyzer would be run with the topmost domain Log("output.txt", BDD(2)) and would raise a warning for not using all of the domains on the stack; the domain BDD(1) would remain unused.
To help in comparing abstract domains as used by other tools, the analyzer
supports the ability to replay traces. Traces with the extension *.strace
and have the following format where x
and y
are variables that represent
abstract states, v
, v1
, etc are dimensions (integers) in the abstract
state.
k ::= let x = t // evaluate a transfer function
| le x y // compare abstract states x and y
| is_bottom x // check if abstract state x is bottom
| is_top x // check if abstract state x is top
| sat x c // check if abstract state satisfies constraint
t ::= top // return top
| bottom // return top
| constrain c x // return x constrained by c
| join x y // overapproximate disjunction of abstract states x and y
| widening x y // overapproximate disjunction of abstract states x and y (terminating)
| meet x y // overapproximate conjunction of abstract states x and y
| forget v1 v2 ... x // drop dimensions v1, v2 etc from x
| rename [v1 -> v2; v3 -> v4 ...] x // rename dimension v1 to v2, and v3 to v4 in x
e ::= {} // empty set
| v // set variable
| { v } // singleton set
| e U e // union
| e U+ e // disjoint union
| e \ e // set difference
| e ^ e // intersection
| ~e // set complement
c ::= e = e // set equality
| e <= e // subset or equality
| v in e // element containment
The result of running an analysis on a trace is statistics of the run:
$ ./Main.native -lin tests/test1.strace
sat: 0/1
le : 0/0
bot: 1/1
top: 1/1
This shows the four kinds of query and the ratio of true
results to total
results for those queries.
The benchmarks for the domains are shown here. This table is
generated from the results
python script by running make benchmark
. This
runs every test contained in tests/*.sdsl
or tests/*.strace
using each
configuration given in configurations
.
The configurations
file contains a list of options for the Main.native
tool. Each line is a unique configuration. Lines that have # in the first
column are comments.
The results
script depends on two python libraries:
pip install pexpect tabulate
The SDSL abstract interpreter (and QUICr) is designed to ease the addition of
new abstract domains and domain combinators. To add a new domain or
combinator, the code should implement the interface defined in
Interface.Domain
with integer symbols, LogicSymbolicSet.t
constraints and
output, and LogicSymbolicSet.q
queries. The code should be placed in a new
directory that starts with a lower-case letter (to be properly supported by the
build system). By convention, the main file for the domain should be called
Domain.ml
. A combinator should also be contained in Domain.ml
, but should
define the functor Make
.
A .mlpack
file should be created for the domain. This should have an
upper-case first letter and will represent the name of the domain or combinator
inside the main file. Similarly, the _tags file should be updated to both
indicate any dependencies for the new domain as well as adding an appropriate
for-pack()
line.
Finally, the Main.ml
file should be modified to allow using the domain. For
base domains, this involves adding a new line in the domains
variable. For
combinators, a line should be added to combinators
.
For example, if a new domain were called test-domain, the following changes would be required:
Add the directory test-domain
.
Add the file Domain.ml
to test-domain
.
Add the file TestDomain.mlpack
to .
with the following contents:
test-domain/Domain
Modify the _tags
file to add the following line:
<test-domain/*>: for-pack(TestDomain)
Modify the Main.ml
file to add the following line to the domains
list:
("-test-domain", Arg.Unit (fun () -> push (module TestDomain.Domain)), " Test set domain");
Note that there is a blank space before the description. The first word in the description documents the argument for the domain (if there is one).