Anthony Widjaja Lin edited this page Nov 9, 2017 · 33 revisions

Table of Contents

Sloth - An SMT Solver for String Constraints

About

Sloth is a decision procedure for several relevant fragments of string constraints, including the straight-line fragment and acyclic formulas. In contrast to most other string solvers, Sloth is able to decide satisfiability of constraints combining concatenation, regular expressions, and transduction, all of which are essential in applications. Sloth uses succinct alternating finite-state automata (AFAs) as concise symbolic representations of string constraints, and uses model checking algorithms like IC3 for solving emptiness of the AFA. More details on the algorithm can be found in our POPL'18 paper; a technical report is available here.

Requirements

It is easiest to work with one of the binary releases of sloth. The following packages are needed in order to run Sloth on your system:

  • OpenJDK >= 1.7
  • One of the following model checkers:
    • abc >= 1.0.1 (also requires aiger >= 1.9.4)
    • nuXmv >= 1.1.1

Sloth is heavily based on the Princess theorem prover, which is included in the binary release packages.

Installation as Binary

In order to compile a binary release of Sloth, you have to do following steps:

  1. Download one of these packages and install them as directed:
  2. Copy the files aigtoaig (aiger), abc and nuXmv into /usr/bin/
  3. Download one of the binary releases, unpack it, and call ./sloth
    • Alternatively, you can also invoke sloth using the command java -jar target/scala-2.11/sloth-assembly-1.0.jar

Download and Installation from Sources

In order to compile Sloth from the sources, in addition to one of the model checkers (nuXvm or abc) you need the Scala build system, which is included in common Linux distributions:

  • sbt >= 1.0

The follow the steps:

  1. Clone the repository to the local repository
    $ git clone https://github.com/uuverifiers/sloth.git
  2. Run sbt assembly
  3. Call ./sloth to invoke the tool

Usage

 ./sloth [options] [file]
 Sloth options:
    +model                           print models/satisfying assignments
    +splitOpt                        enable split optimization. This optimization 
                                     sometimes reduces the number of considered cases
                                     significantly
    -modelChecker=abc|nuxmv|simple   choose the model checker, default is nuxmv.
                                     Option "simple" enable our model checker that is 
                                     used on formulae without splitting (recommended
                                     uses with +splitOpt).
    +minimalSuccessors               enable an optimisation that reduces the set of
                                     of considered AFA states (use for abc, nuxmv)
    +assert                          enable runtime assertions in Sloth

Input language

Sloth processes constraints in the SMT-LIB 2.6 format, extended with a theory of strings. Sloth is mostly compatible with input language from CVC4 (http://cvc4.cs.stanford.edu/wiki/Strings) and supports a large subset of the CVC4 string operations; a summary of supported operations is given below. For examples, we recommend to have a look at the tests directory.

SMTLIB language
String Sort String
String literals "abcdef"
String equality (= s1 s2)
Concatenation (str.++ s1 ... sn)
Single replacement (str.replace s t1 t2)
Replacement (str.replaceall s t1 t2)
Membership in regular expression (str.in.re s r)
String to regular expression (str.to.re s)
Regular expression concatenation (re.++ r1 ... rn)
Regular expression union (re.union r1 ... rn)
Regular expression intersection (re.inter r1 ... rn)
Regular expression Kleene star (re.* r)
Regular expression plus (re.+ r)
Regular expression character range (re.range s t)

String literals

At the moment Sloth only supports 256 character (8-bit) Extended ASCII alphabets. String literals are composed from printable characters (value between 0x20 and 0x7e) in double quotes. Escape sequences are supported too. They are interpreted in the usual way, as shown in the following table.

\0 ... \9 represents ASCII character 0 … 9, respectively
\a, \b, \e, \f, \n,\r, \t, \v represents its corresponding ASCII character
\ooo encodes a single ASCII character where ooo consists of exactly three digits in the octal encoding of the character (from 0 to 377)
\xNN encodes a single ASCII character, where NN is two-digit hexadecimal constant
The backslash character (\) is silently ignored when it is followed by a sequence of characters not recognized as an escape sequence. Non-printable/extended ASCII character is printed in the exadecimal format \xNN, except for the character denoted by the escape sequences \a, \b, \e, \f, \n, \r, \t, \v, which are printed using those escape sequences.

Most Important Operations on Strings

To define a string variable/constant:
   (declare-fun x () String)
or
   (declare-const x String)

String Concatenation:
   (str.++ s1 ... sn)
where s1, ..., sn are string terms. String concatenation takes at least 2 arguments.

String Single Replacement:
   (str.replace s t1 t2)
where s, t1 and t2 are string terms, t1 is non-empty. This function searches t1 in s, and returns the string in which the first occurrence of t1 is replaced by the string t2.

String Replacement:
   (str.replaceall s t1 t2)
where s, t1 and t2 are string terms, t1 is non-empty. This function searches t1 the s, and returns a new string where all occurrences of t1 are replaced by string t2.

Symbolic Regular Expression

Membership Constraint:
   (str.in.re s r)
where s is a string term and r is a regular expression.

String to Regular Expression Conversion:
   (str.to.re s)
where s is a string term. The statement turns a regular expression that only contains a string s.

Regular Expression Concatenation:
   (re.++ r_1 ... r_n)
where r_1, ..., r_n are regular expressions.

Regular Expression Alternation:
   (re.union r_1 ... r_n)
where r_1, ..., r_n are regular expressions.

Regular Expression Intersection:
   (re.inter r_1 ... r_n)
where r_1, ..., r_n are regular expressions.

Regular Expression Kleene-Star:
   (re.* r)
where r is a regular expression.

Regular Expression Kleene-Cross:
   (re.+ r)
where r is a regular expression.

Regular Expression Option:
   (re.opt r)
where r is a regular expression.

Regular Expression Range:
   (re.range s t)
where s, t are single characters in double quotes, e.g. "a", "b". It returns a regular expression that contains any character between s and t.

The Empty Regular Expression (recognising the empty set):
   re.nostr

The Regular Expression that contains all characters:
   re.allchar

Transducers

Sloth can also handle string transformations expressed as transducers. We represent transducers as mutually recursive functions with multiple arguments in SMT-LIB. In general, the characteristic function of the language accepted by an transducer can be written as a set of mutually (primitive) recursive functions over strings; the equations defining the functions directly correspond to the transitions of the transducer.

For instance, considering strings over 8-bit bit-vectors, the following recursive function describes the to-uppercase transducer. The transducer definition makes use of the additional operations seq-head to extract the first character of a (non-empty) string, and seq-tail to compute the tail:

 (define-fun-rec toUpper ((x String) (y String)) Bool
    (or (and (= x "") (= y ""))
        (and (= (seq-head y)
                (ite (and (bvule (_ bv97 8) (seq-head x))   ; 'a'
                          (bvule (seq-head x) (_ bv122 8))) ; 'z'
                     (bvsub (seq-head x) (_ bv32 8))        ; 'a' -> 'A', etc.
                     (seq-head x)))
             (toUpper (seq-tail x) (seq-tail y)))))

This function can then be used to formulate string constraints, for instance:

  (declare-fun x0 () String)
  (declare-fun x1 () String)
  (declare-fun x2 () String)
  (declare-fun x3 () String)
  (declare-fun s0 () String) ; source variable
  ;
  (assert (toUpper x2 x3))
  (assert (= x0 (str.++ "he" s0 "o")))
  (assert (= x2 (str.++ x0 x1)))
  ;
  (check-sat)
  (get-model)

It is also possible to work with non-length-preserving transducers, i.e., transducers that map a string to a new string of potentially different length. For instance, the following transducer extracts the string in between the 0th and 1st occurrence of '=':

 (define-funs-rec ((extract1st ((x String) (y String)) Bool)
                   (extract1st_2 ((x String) (y String)) Bool)
                   (extract1st_3 ((x String) (y String)) Bool)) (
    ;
    ; extract1st
    (or (and (= x "") (= y ""))
        (and (not (= (seq-head x) (_ bv61 8)))  ; not '='
             (extract1st (seq-tail x) y))
        (and (= (seq-head x) (_ bv61 8))        ; '='
             (extract1st_2 (seq-tail x) y)))
    ;
    ; extract1st_2
    (or (and (= x "") (= y ""))
        (and (= (seq-head x) (seq-head y))
             (not (= (seq-head x) (_ bv61 8)))  ; not '='
             (extract1st_2 (seq-tail x) (seq-tail y)))
        (and (= (seq-head x) (_ bv61 8))        ; '='
             (extract1st_3 (seq-tail x) y)))
    ;
    ; extract1st_3
    (or (and (= x "") (= y ""))
        (extract1st_3 (seq-tail x) y))
    ;
  ))
Clone this wiki locally
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.