# Symbolic Fuzzing

One of the problems with traditional methods of fuzzing is that they fail to penetrate deeply into the program. Quite often, a specific branch is executed only for very specific inputs, which may represent an extremely small fraction of the input space. Traditional fuzzing methods rely on chance to produce the inputs they need. However, relying on randomness to generate the values we want is a bad idea when the space to be explored is large. For example, a function that accepts a string already has $2^{80}$ possible inputs even if one considers only its first $10$ characters (at 8 bits per character). If one is looking for a specific string, random generation of values will take a few thousand years even on a supercomputer.
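The size of this search space can be checked with a few lines of arithmetic. The sketch below is only an illustration: the throughput `CANDIDATES_PER_SECOND` is a hypothetical figure chosen for the example, not a measurement of any real machine.

```python
# Rough estimate of how long blind enumeration of all 10-character strings takes.
# ASSUMPTION: the throughput below is a made-up figure used only for illustration.
BITS_PER_CHAR = 8
LENGTH = 10
CANDIDATES_PER_SECOND = 10 ** 13  # hypothetical rate, not a benchmark

search_space = 2 ** (BITS_PER_CHAR * LENGTH)  # 2^80 possible strings
seconds = search_space / CANDIDATES_PER_SECOND
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = seconds / SECONDS_PER_YEAR

print(f"search space: {search_space:.3e} strings")
print(f"time to enumerate: {years:,.0f} years")
```

Under this assumed rate the estimate lands in the range of a few thousand years; a different rate shifts the figure proportionally.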

Symbolic execution is a way out of this problem. A program is a computation that can be treated as a system of equations relating the output values to the given inputs. Executing the program symbolically -- that is, solving these equations mathematically -- together with a specified objective, such as covering a particular branch or obtaining a particular output, yields inputs that accomplish this objective. In this chapter, we investigate how _symbolic execution_ can be implemented, and how it can be used to obtain interesting values for fuzzing.

**Prerequisites**

* You should have read the [chapter on coverage](Coverage.ipynb).
* Some knowledge of inheritance in Python is required.
* Familiarity with the [chapter on search-based fuzzing](SearchBasedFuzzer.ipynb) would be useful.

## The motivation for Symbolic Execution

In the chapter on [parsing and recombining inputs](SearchBasedFuzzer.ipynb), we saw how difficult it was to generate inputs for `process_vehicle()` -- a simple function that accepts a string. The solution given there was to rely on preexisting sample inputs. However, this solution is inadequate, as it assumes that sample inputs exist. What if there are no sample inputs at hand?

For a simpler example, let us consider the following function. Can we generate inputs to cover all the paths?

In [None]:
def check_triangle(a, b, c):
    if a == b:
        if a == c:
            if b == c:
                return "Equilateral"
            else:
                return "Isosceles"
        else:
            return "Isosceles"
    else:
        if b != c:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"
        else:
            return "Isosceles"
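To get a feeling for how chance fares on this function, one can feed it random integers and count the results. The following is a minimal sketch, not part of the original chapter; the 31-bit value range, the seed, and the number of trials are arbitrary assumptions. Since two random integers from a large range are rarely equal, one would expect almost every call to end in the `Scalene` branch.

```python
import random
from collections import Counter


def check_triangle(a, b, c):
    # Same function as above, repeated so that this sketch is self-contained.
    if a == b:
        if a == c:
            if b == c:
                return "Equilateral"
            else:
                return "Isosceles"
        else:
            return "Isosceles"
    else:
        if b != c:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"
        else:
            return "Isosceles"


random.seed(0)  # fixed seed so that the run is repeatable
TRIALS = 1000   # arbitrary number of random inputs

# Count how often each result is produced for random triples.
counts = Counter(
    check_triangle(*(random.randint(0, 2 ** 31 - 1) for _ in range(3)))
    for _ in range(TRIALS)
)
print(counts)
```

Narrowing the value range makes the equality branches more likely, but that only helps when one already knows which values matter.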

The possible execution paths traced by the program can be represented as follows.

In [None]:
import fuzzingbook_utils

In [None]:
from GrammarFuzzer import display_tree, dot_escape, unicode_escape

In [None]:
def display_annotated_tree(tree, a_nodes, a_edges, log=False):
    def graph_attr(dot):
        dot.attr('node', shape='oval')
        #dot.graph_attr['rankdir'] = 'LR'

    def annotate_node(dot, nid, symbol, ann):
        if nid in a_nodes:
            dot.node(repr(nid), "%s (%s)" % (dot_escape(unicode_escape(symbol)), a_nodes[nid]))
        else:
            dot.node(repr(nid), dot_escape(unicode_escape(symbol)))

    def annotate_edge(dot, start_node, stop_node):
        if (start_node, stop_node) in a_edges:
            dot.edge(repr(start_node), repr(stop_node),
                     a_edges[(start_node, stop_node)])
        else:
            dot.edge(repr(start_node), repr(stop_node))

    display_tree(tree, log=log,
                 node_attr=annotate_node,
                 edge_attr=annotate_edge,
                 graph_attr=graph_attr)

In [None]:
graph = ('1: (a == b)', [('2: (b != c)', [('Isosceles', []), ('3: (a == c)', [('Scalene', []), ('Isosceles', [])])]), ('4: (a == c)', [('Isosceles', []), ('5: (b == c)', [('Isosceles', []), ('Equilateral', [])])])])

In [None]:
display_annotated_tree(graph, {}, {(0,1):'F', (0, 6):'T', (1,2):'F', (6,7):'F', (8,9):'F'}, log=False)

The function takes three parameters, and the possible execution paths are the following.

```python
1: [1, 2, Isosceles]
2: [1, 2, 3, Scalene]
3: [1, 2, 3, Isosceles]
4: [1, 4, Isosceles]
5: [1, 4, 5, Isosceles]
6: [1, 4, 5, Equilateral]
```
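The node numbers in these paths can be tied back to concrete inputs by instrumenting the function. The helper `trace_triangle()` below is hypothetical (it does not appear in the chapter); it mirrors `check_triangle()` and records each numbered decision node it visits, using the numbering from the tree above.

```python
def trace_triangle(a, b, c):
    """Hypothetical helper: return the visited decision nodes, ending with the result."""
    path = [1]                      # node 1: a == b
    if a == b:
        path.append(4)              # node 4: a == c
        if a == c:
            path.append(5)          # node 5: b == c
            path.append("Equilateral" if b == c else "Isosceles")
        else:
            path.append("Isosceles")
    else:
        path.append(2)              # node 2: b != c
        if b != c:
            path.append(3)          # node 3: a == c
            path.append("Isosceles" if a == c else "Scalene")
        else:
            path.append("Isosceles")
    return path


# Hand-picked inputs, one per path that they reach.
for args in [(1, 2, 2), (1, 2, 3), (1, 2, 1), (1, 1, 2), (1, 1, 1)]:
    print(args, trace_triangle(*args))
```

These five hand-picked inputs reach five different paths; none of them reaches `[1, 4, 5, Isosceles]`.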

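To cover path <1>, both conditions along the way must evaluate to false: `a == b` does not hold, and neither does `b != c`. We therefore need to solve the following constraints.

In [None]:
import z3

In [None]:
a, b, c = z3.Ints('a b c')

In [None]:
z3.solve(z3.Not(a == b), z3.Not(b != c))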

Similarly, path <2> requires `a == b` to be false, `b != c` to be true, and `a == c` to be false:

In [None]:
a, b, c = z3.Ints('a b c')

In [None]:
z3.solve(z3.Not(a == b), b != c, z3.Not(a == c))

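However, when we attempt path <5>, we get a surprise.

In [None]:
a, b, c = z3.Ints('a b c')

In [None]:
z3.solve(a == b, a == c, z3.Not(b == c))

That is, there is no input that satisfies the path condition for <5>: once `a == b` and `a == c` hold, `b == c` must hold as well, so the `Isosceles` branch at node 5 is infeasible.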
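As a cross-check that does not depend on a solver, the path conditions can also be tested by bounded exhaustive enumeration. This is a different and much weaker technique than constraint solving: finding no witness in a small range proves nothing about the integers outside it, whereas the solver reasons about all integers. The sketch below writes the six path conditions, read off the decision tree, as plain Python predicates; `PATH_CONDITIONS`, `find_witness()`, and the range `-2..2` are assumptions made for this illustration.

```python
from itertools import product

# Path conditions read off the decision tree, keyed by path number.
PATH_CONDITIONS = {
    1: lambda a, b, c: a != b and b == c,             # [1, 2, Isosceles]
    2: lambda a, b, c: a != b and b != c and a != c,  # [1, 2, 3, Scalene]
    3: lambda a, b, c: a != b and b != c and a == c,  # [1, 2, 3, Isosceles]
    4: lambda a, b, c: a == b and a != c,             # [1, 4, Isosceles]
    5: lambda a, b, c: a == b and a == c and b != c,  # [1, 4, 5, Isosceles]
    6: lambda a, b, c: a == b and a == c and b == c,  # [1, 4, 5, Equilateral]
}


def find_witness(condition, domain=range(-2, 3)):
    """Return the first (a, b, c) in domain^3 satisfying condition, or None."""
    for candidate in product(domain, repeat=3):
        if condition(*candidate):
            return candidate
    return None


witnesses = {n: find_witness(cond) for n, cond in PATH_CONDITIONS.items()}
for n, witness in witnesses.items():
    print(n, witness if witness is not None else "no witness in the searched range")
```

Only the conjunction of `a == b`, `a == c`, and `b != c` is left without a witness, which agrees with the solver finding that combination unsatisfiable.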

## Symbolic Execution

Explanandum and explanans.

## Lessons Learned

* One can use symbolic execution to generate inputs that cover specific paths of a program, augmenting the inputs found by random fuzzing.

## Next Steps

_Link to subsequent chapters (notebooks) here:_

## Background

\cite{KLEE}

## Exercises
