# Grammar Coverage

In this chapter, we explore how to systematically cover elements of a grammar, as well as element combinations.  \todo{Work in progress.}

**Prerequisites**

* You should have read the [chapter on grammars](Grammars.ipynb).
* You should have read the [chapter on efficient grammar fuzzing](GrammarFuzzer.ipynb).

## Covering Grammar Elements

[Producing inputs from grammars](GrammarFuzzer.ipynb) gives all possible expansions of a rule the same likelihood.  For producing a comprehensive test suite, however, it makes more sense to maximize _variety_ – for instance, by avoiding repeating the same expansions over and over again.  To achieve this, we can track the _coverage_ of individual expansions: If we have seen some expansion already, we can prefer other possible expansions in the future.  The idea of ensuring that each expansion in the grammar is used at least once goes back to Paul Purdom \cite{purdom1972}.

As an example, consider the grammar

```grammar
<start> ::= <digit><digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
```

Let us assume we have already produced a `0` in the first expansion of `<digit>`.  When it comes to expanding the next digit, we would mark the `0` expansion as already covered and choose one of the yet uncovered alternatives.  Only when we have covered all alternatives would we go back and consider expansions covered before.
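
The following is a minimal, standalone sketch of this idea, independent of the classes developed later in this chapter.  The function name `pick_preferring_uncovered()` and the fixed random seed are assumptions made for illustration only.

```python
import random

# Alternatives of the <digit> rule from the grammar above
DIGIT_ALTERNATIVES = [str(d) for d in range(10)]


def pick_preferring_uncovered(alternatives, covered, rng):
    """Pick an alternative not in `covered` if one exists; otherwise pick any."""
    uncovered = [a for a in alternatives if a not in covered]
    choice = rng.choice(uncovered if uncovered else alternatives)
    covered.add(choice)
    return choice


rng = random.Random(0)   # fixed seed, only to make the run repeatable
covered = set()
picks = [pick_preferring_uncovered(DIGIT_ALTERNATIVES, covered, rng)
         for _ in range(10)]

# Ten picks suffice to see every digit exactly once
print("".join(picks))
print(sorted(covered) == DIGIT_ALTERNATIVES)
```

Once all ten digits are covered, further picks fall back to a plain random choice among all alternatives.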

### Tracking Grammar Coverage

This concept of coverage is very easy to implement.  We introduce a class `TrackingGrammarCoverageFuzzer` that keeps track of the current grammar coverage achieved:

In [None]:
import fuzzingbook_utils

In [None]:
from Grammars import DIGIT_GRAMMAR, EXPR_GRAMMAR, CGI_GRAMMAR, URL_GRAMMAR, START_SYMBOL

In [None]:
from GrammarFuzzer import GrammarFuzzer, all_terminals, nonterminals, display_tree

In [None]:
import random

In [None]:
class TrackingGrammarCoverageFuzzer(GrammarFuzzer):
    def __init__(self, *args, **kwargs):
        # invoke superclass __init__(), passing all arguments
        super().__init__(*args, **kwargs)
        self.reset_coverage()

    def reset_coverage(self):
        self.covered_expansions = set()

    def expansion_coverage(self):
        return self.covered_expansions

In this set `covered_expansions`, we store individual expansions seen as pairs of (_symbol_, _expansion_), using the method `expansion_key()` to generate a string representation for the pair.

In [None]:
class TrackingGrammarCoverageFuzzer(TrackingGrammarCoverageFuzzer):
    def expansion_key(self, symbol, expansion):
        """Convert (symbol, expansion) into a key.  `expansion` can be an expansion string or a list of children (derivation trees)."""
        if not isinstance(expansion, str):
            children = expansion
            expansion = all_terminals((symbol, children))
        return symbol + " -> " + expansion

In [None]:
f = TrackingGrammarCoverageFuzzer(EXPR_GRAMMAR)
f.expansion_key(START_SYMBOL, EXPR_GRAMMAR[START_SYMBOL][0])

Instead of _expansion_, we can also pass a list of children as argument, which will then automatically be converted into a string.

In [None]:
children = [("<expr>", None), (" + ", []), ("<term>", None)]
f.expansion_key("<expr>", children)

We can compute the set of possible expansions in a grammar by enumerating all expansions:

In [None]:
class TrackingGrammarCoverageFuzzer(TrackingGrammarCoverageFuzzer):
    def max_expansion_coverage(self):
        """Return set of all expansions in a grammar"""
        expansions = set()
        for nonterminal in self.grammar:
            for expansion in self.grammar[nonterminal]:
                expansions.add(self.expansion_key(nonterminal, expansion))
        return expansions

In [None]:
f = TrackingGrammarCoverageFuzzer(DIGIT_GRAMMAR)
f.max_expansion_coverage()
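
To see what such a set of keys looks like without running the fuzzer classes, here is a standalone sketch.  It assumes the two-rule digit grammar from the beginning of this chapter, written as a plain dictionary; the names `TWO_DIGIT_GRAMMAR` and `all_expansion_keys()` are made up for this illustration.

```python
# The two-rule digit grammar from the beginning of this chapter,
# written as a dict from nonterminal to a list of expansion strings.
TWO_DIGIT_GRAMMAR = {
    "<start>": ["<digit><digit>"],
    "<digit>": [str(d) for d in range(10)],
}


def all_expansion_keys(grammar):
    """Return the set of "symbol -> expansion" keys for every rule alternative."""
    return {symbol + " -> " + expansion
            for symbol, expansions in grammar.items()
            for expansion in expansions}


keys = all_expansion_keys(TWO_DIGIT_GRAMMAR)
print(len(keys))          # 1 alternative for <start> + 10 for <digit> = 11
print(sorted(keys)[:3])
```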

During expansion, we can keep track of expansions seen.  To do so, we hook into the method `choose_node_expansion()`, expanding a single node in our [Grammar fuzzer](GrammarFuzzer.ipynb).

In [None]:
class TrackingGrammarCoverageFuzzer(TrackingGrammarCoverageFuzzer):
    def add_coverage(self, symbol, new_children):
        key = self.expansion_key(symbol, new_children)

        if self.log and key not in self.covered_expansions:
            print("Now covered:", key)
        self.covered_expansions.add(key)

    def choose_node_expansion(self, node, possible_children):
        (symbol, children) = node
        index = super().choose_node_expansion(node, possible_children)
        self.add_coverage(symbol, possible_children[index])
        return index

With this, we can now systematically check which expansions already have been covered – and which ones still have to be covered.

In [None]:
f = TrackingGrammarCoverageFuzzer(DIGIT_GRAMMAR, log=True)
f.fuzz()

In [None]:
f.fuzz()

In [None]:
f.fuzz()

Here's the set of covered expansions so far:

In [None]:
f.expansion_coverage()

On average, how many characters do we have to produce until all expansions are covered?

In [None]:
def average_length_until_full_coverage(fuzzer):
    """Return the average number of characters produced until full coverage."""
    trials = 50

    total = 0  # avoid shadowing the built-in `sum`
    for trial in range(trials):
        fuzzer.reset_coverage()
        while len(fuzzer.max_expansion_coverage() - fuzzer.expansion_coverage()) > 0:
            s = fuzzer.fuzz()
            total += len(s)

    return total / trials

In [None]:
average_length_until_full_coverage(TrackingGrammarCoverageFuzzer(EXPR_GRAMMAR))
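
For a rough sense of scale, consider a much simpler setting than the expression grammar: a single rule with `n` equally likely alternatives, where each step draws one alternative at random.  Under this assumption (which does not model `EXPR_GRAMMAR`), the expected number of draws until every alternative has appeared is the classic coupon-collector value, $n \cdot (1 + 1/2 + \dots + 1/n)$.  The helper name below is hypothetical.

```python
from fractions import Fraction


def expected_draws_until_all_seen(n):
    """Expected number of uniform random draws until all n alternatives have appeared."""
    # Classic coupon-collector result: n * (1/1 + 1/2 + ... + 1/n)
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


expected = expected_draws_until_all_seen(10)
print(float(expected))   # roughly 29.3 draws for ten digits
```

A strategy that always prefers uncovered alternatives would need only ten draws in this setting, which hints at the savings to expect from guiding the choice.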

### Covering Grammar Expansions

Let us now not only track coverage, but actually _produce_ coverage.  The idea is as follows:

1. We determine children yet uncovered (in `uncovered_children`)
2. If all children are covered, we fall back to the original method (i.e., choosing one expansion randomly)
3. Otherwise, we select a child from the uncovered children and mark it as covered.

To this end, we introduce a new fuzzer `SimpleGrammarCoverageFuzzer` that implements this strategy in the `choose_node_expansion()` method.

In [None]:
class SimpleGrammarCoverageFuzzer(TrackingGrammarCoverageFuzzer):
    def choose_node_expansion(self, node, possible_children):
        # Prefer uncovered expansions
        (symbol, children) = node
        uncovered_children = [(i, c) for (i, c) in enumerate(possible_children)
                              if self.expansion_key(symbol, c) not in self.covered_expansions]

        if len(uncovered_children) == 0:
            # All expansions covered - use superclass method
            if self.log:
                print("All", symbol, "alternatives covered")
            return super().choose_node_expansion(node, possible_children)

        # Select a random uncovered expansion
        index = random.randrange(len(uncovered_children))
        (new_children_index, new_children) = uncovered_children[index]

        # Save the expansion as covered
        self.add_coverage(symbol, new_children)

        return new_children_index

Since the fuzzer retains the set of expansions covered so far, we can invoke it multiple times, each time adding to the grammar coverage.  With the `DIGIT_GRAMMAR` grammar, for instance, this lets the fuzzer produce one digit after the other:

In [None]:
f = SimpleGrammarCoverageFuzzer(DIGIT_GRAMMAR, log=True)
f.fuzz()

In [None]:
f.fuzz()

In [None]:
f.fuzz()

Here's the set of covered expansions so far:

In [None]:
f.expansion_coverage()

Let us fuzz some more. We see that with each iteration, we cover another expansion:

In [None]:
for i in range(7):
    f.fuzz()

At the end, all expansions are covered:

In [None]:
f.max_expansion_coverage() - f.expansion_coverage()

Let us apply this to a more complex grammar, for example the expression grammar.  We see that after a few iterations, we cover each and every digit, operator, and expansion:

In [None]:
f = SimpleGrammarCoverageFuzzer(EXPR_GRAMMAR)
for i in range(10):
    print(f.fuzz())

Again, all expansions are covered:

In [None]:
f.max_expansion_coverage() - f.expansion_coverage()

We see that our strategy is much more effective than random choice in achieving coverage:

In [None]:
average_length_until_full_coverage(SimpleGrammarCoverageFuzzer(EXPR_GRAMMAR))

## Deep Foresight

Selecting expansions for individual rules is a good start; however, it is not sufficient, as the following example shows.  We apply our coverage fuzzer on the CGI grammar:

In [None]:
f = SimpleGrammarCoverageFuzzer(CGI_GRAMMAR)

In [None]:
for i in range(10):
    print(f.fuzz())

After 10 iterations, we still have a number of expansions uncovered:

In [None]:
f.max_expansion_coverage() - f.expansion_coverage()

Why is that so?  The problem is that in the CGI grammar, the largest number of variations to be covered occurs in the `<hexdigit>` rule.  However, we first need to _reach_ this expansion.  When expanding a `<letter>` symbol, we have the choice between three possible expansions:

In [None]:
CGI_GRAMMAR["<letter>"]

If all three expansions are covered already, then `choose_node_expansion()` above will choose one randomly – even if there may be more expansions to cover when choosing `<percent>`.

What we need is a better strategy that will pick `<percent>` if there are more uncovered expansions following – even if `<percent>` is covered.

### Determining Maximum per-Symbol Coverage

To address this problem, we introduce a new class `GrammarCoverageFuzzer` that builds on `SimpleGrammarCoverageFuzzer`, but with a better strategy.  First, we need to compute the _maximum set of expansions_ that can be reached from a particular symbol. The idea is to later compute the _difference_ between this set and the expansions already covered, such that we can favor those expansions for which this difference is non-empty.

Our method `max_symbol_expansion_coverage()` computes this maximum set of expansions.  The helper method `_max_symbol_expansion_coverage()` does the heavy lifting, iterating through the grammar up to a given depth and tracking which symbols (`symbols_seen`) and which expansions (`cov`) have already been seen:

In [None]:
class GrammarCoverageFuzzer(SimpleGrammarCoverageFuzzer):
    def _max_symbol_expansion_coverage(
            self, symbol, max_depth, cov, symbols_seen):
        """Return set of all expansions in a grammar starting with `symbol`"""
        if max_depth <= 0:
            return (cov, symbols_seen)

        symbols_seen.add(symbol)
        for expansion in self.grammar[symbol]:
            key = self.expansion_key(symbol, expansion)
            if key in cov:
                continue

            cov.add(key)
            for s in nonterminals(expansion):
                if s in symbols_seen:
                    continue
                new_cov, new_symbols_seen = (
                    self._max_symbol_expansion_coverage(s, max_depth - 1, cov, symbols_seen))
                cov |= new_cov
                symbols_seen |= new_symbols_seen

        return (cov, symbols_seen)

The main method `max_symbol_expansion_coverage()` simply returns the coverage that can be achieved:

In [None]:
class GrammarCoverageFuzzer(GrammarCoverageFuzzer):
    def max_symbol_expansion_coverage(self, symbol, max_depth=float('inf')):
        cov, symbols_seen = self._max_symbol_expansion_coverage(
            symbol, max_depth, set(), set())
        return cov

With this, we can compute the possible expansions for every symbol:

In [None]:
f = GrammarCoverageFuzzer(EXPR_GRAMMAR)
f.max_symbol_expansion_coverage('<integer>')

In [None]:
f.max_symbol_expansion_coverage('<digit>')

The maximum coverage achievable in a grammar equals the maximum coverage reachable from the start symbol:

In [None]:
assert f.max_expansion_coverage() == f.max_symbol_expansion_coverage(START_SYMBOL)
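
The same reachability idea can be sketched without recursion, using a worklist.  The block below is a standalone illustration, not the chapter's implementation: `TOY_GRAMMAR` is a hypothetical, simplified grammar in the spirit of the CGI example, and `NONTERMINAL` and `reachable_expansions()` are names introduced here.

```python
import re

# Hypothetical, simplified grammar in the spirit of the CGI example
TOY_GRAMMAR = {
    "<start>": ["<letter>", "<letter><start>"],
    "<letter>": ["<plus>", "<percent>", "<other>"],
    "<plus>": ["+"],
    "<percent>": ["%<hexdigit><hexdigit>"],
    "<hexdigit>": list("0123456789abcdef"),
    "<other>": ["a", "b", "-", "_"],
}

NONTERMINAL = re.compile(r"<[^<> ]+>")


def reachable_expansions(grammar, symbol):
    """Return all "symbol -> expansion" keys reachable from `symbol`."""
    keys = set()
    seen = set()
    worklist = [symbol]
    while worklist:
        current = worklist.pop()
        if current in seen:
            continue
        seen.add(current)
        for expansion in grammar[current]:
            keys.add(current + " -> " + expansion)
            # Queue every nonterminal occurring in this expansion
            worklist.extend(NONTERMINAL.findall(expansion))
    return keys


print(len(reachable_expansions(TOY_GRAMMAR, "<plus>")))     # 1
print(len(reachable_expansions(TOY_GRAMMAR, "<percent>")))  # 17
print(len(reachable_expansions(TOY_GRAMMAR, "<start>")))    # 27
```

In this toy grammar, `<percent>` leads to 17 expansions while `<plus>` leads to only one, which is why a strategy that looks beyond the immediate expansion would favor `<percent>`.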

### Determining Children with New Coverage

The definition of `max_symbol_expansion_coverage()` allows us to determine the _new_ coverage for each child.  To this end, we _subtract_ the coverage already seen (`expansion_coverage()`) from the coverage that could be obtained.

In [None]:
class GrammarCoverageFuzzer(GrammarCoverageFuzzer):
    def _new_child_coverage(self, children, max_depth):
        new_cov = set()
        for (c_symbol, _) in children:
            if c_symbol in self.grammar:
                new_cov |= self.max_symbol_expansion_coverage(
                    c_symbol, max_depth)
        return new_cov

    def new_child_coverage(self, symbol, children, max_depth=float('inf')):
        """Return new coverage that would be obtained by expanding (symbol, children)"""
        new_cov = self._new_child_coverage(children, max_depth)
        # The expansion itself counts once, regardless of the number of children
        new_cov.add(self.expansion_key(symbol, children))
        new_cov -= self.expansion_coverage()   # set subtraction
        return new_cov

Let us illustrate `new_child_coverage()`.  We again start fuzzing, choosing expansions randomly.

In [None]:
f = GrammarCoverageFuzzer(DIGIT_GRAMMAR, log=True)
f.fuzz()

This is our current coverage:

In [None]:
f.expansion_coverage()

When we go through the individual expansion possibilities for `START_SYMBOL`, we see that all expansions offer additional coverage, _except_ for the one we have just seen.

In [None]:
for expansion in DIGIT_GRAMMAR[START_SYMBOL]:
    children = f.expansion_to_children(expansion)
    print(expansion, f.new_child_coverage(START_SYMBOL, children))

This means that whenever choosing an expansion, we can make use of `new_child_coverage()` and choose among the expansions that offer the greatest new (unseen) coverage.
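
The selection step itself boils down to a set subtraction followed by a maximum.  Here is a standalone sketch with made-up data: the dictionary `reachable` and the set `covered` are hypothetical values chosen for illustration, not output of the fuzzer.

```python
# Hypothetical data: what each candidate expansion could still reach
reachable = {
    "<plus>": {"<plus> -> +"},
    "<percent>": {"<percent> -> %<hexdigit><hexdigit>",
                  "<hexdigit> -> 0", "<hexdigit> -> 1", "<hexdigit> -> a"},
    "<other>": {"<other> -> a", "<other> -> b"},
}
covered = {"<plus> -> +", "<percent> -> %<hexdigit><hexdigit>",
           "<other> -> a", "<other> -> b", "<hexdigit> -> 0"}

# New coverage per candidate = reachable expansions minus those already covered
new_coverage = {candidate: keys - covered for candidate, keys in reachable.items()}

best = max(len(keys) for keys in new_coverage.values())
best_candidates = [c for c, keys in new_coverage.items() if len(keys) == best]
print(best_candidates)   # ['<percent>']
```

Although `<percent>` itself is already covered, it is the only candidate that still leads to uncovered expansions.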

### Adaptive Lookahead

When choosing a child, we do not look out for the maximum overall coverage to be obtained, as this would result in expansions with many uncovered possibilities totally dominating other expansions.  Instead, we aim for a _breadth-first_ strategy, first covering all expansions up to a given depth, and only then looking for a greater depth.  The method `new_coverages()` is at the heart of this strategy: Starting with a maximum depth (`max_depth`) of zero, it increases the depth until it finds at least one uncovered expansion.

In [None]:
class GrammarCoverageFuzzer(GrammarCoverageFuzzer):
    def new_coverages(self, node, possible_children):
        """Return coverage to be obtained for each child at minimum depth"""
        (symbol, children) = node
        for max_depth in range(len(self.grammar)):
            new_coverages = [
                self.new_child_coverage(
                    symbol, c, max_depth) for c in possible_children]
            max_new_coverage = max(len(new_coverage)
                                   for new_coverage in new_coverages)
            if max_new_coverage > 0:
                # Uncovered node found
                return new_coverages

        # All covered
        return None
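
To make the effect of increasing the depth tangible, here is a standalone sketch of the same idea on a hypothetical toy grammar.  The names `TOY_GRAMMAR`, `keys_within_depth()`, and `first_depth_with_new_coverage()` are assumptions for this illustration, and the depth bookkeeping is simplified compared to the methods above.

```python
import re

NONTERMINAL = re.compile(r"<[^<> ]+>")

# Hypothetical toy grammar; names are for illustration only
TOY_GRAMMAR = {
    "<letter>": ["<plus>", "<percent>", "<other>"],
    "<plus>": ["+"],
    "<percent>": ["%<hexdigit>"],
    "<hexdigit>": ["0", "1", "2"],
    "<other>": ["a", "b"],
}


def keys_within_depth(grammar, symbol, depth):
    """Return expansion keys reachable from `symbol` in at most `depth` steps."""
    if depth <= 0:
        return set()
    keys = set()
    for expansion in grammar[symbol]:
        keys.add(symbol + " -> " + expansion)
        for child in NONTERMINAL.findall(expansion):
            keys |= keys_within_depth(grammar, child, depth - 1)
    return keys


def first_depth_with_new_coverage(grammar, symbol, covered):
    """Increase the lookahead depth until some alternative offers uncovered keys."""
    for depth in range(len(grammar) + 1):
        new_per_alternative = []
        for expansion in grammar[symbol]:
            keys = {symbol + " -> " + expansion}
            for child in NONTERMINAL.findall(expansion):
                keys |= keys_within_depth(grammar, child, depth)
            new_per_alternative.append(keys - covered)
        if any(new_per_alternative):
            return depth, new_per_alternative
    return None   # everything reachable is already covered


all_keys = {s + " -> " + e for s, es in TOY_GRAMMAR.items() for e in es}
covered = all_keys - {"<hexdigit> -> 1", "<hexdigit> -> 2"}

depth, new_sets = first_depth_with_new_coverage(TOY_GRAMMAR, "<letter>", covered)
print(depth)                       # 2
print([len(s) for s in new_sets])  # [0, 2, 0]
```

Only at depth 2 does the `<percent>` alternative reveal the two remaining `<hexdigit>` expansions; shallower lookaheads see nothing new.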

### All Together

We can now define `choose_node_expansion()` to make use of this strategy: First, we determine the possible coverages to be obtained (using `new_coverages()`); then we (randomly) select among the children which sport the maximum coverage.

In [None]:
class GrammarCoverageFuzzer(GrammarCoverageFuzzer):
    def choose_node_expansion(self, node, possible_children):
        (symbol, children) = node
        new_coverages = self.new_coverages(node, possible_children)

        if new_coverages is None:
            # All expansions covered - fall back to the random choice of GrammarFuzzer
            return GrammarFuzzer.choose_node_expansion(self, node, possible_children)

        max_new_coverage = max(len(cov) for cov in new_coverages)
        children_with_max_new_coverage = [(i, c) for (i, c) in enumerate(possible_children)
                                          if len(new_coverages[i]) == max_new_coverage]

        # select a random expansion
        new_children_index, new_children = random.choice(
            children_with_max_new_coverage)

        # Save the expansion as covered
        key = self.expansion_key(symbol, new_children)

        if self.log:
            print("Now covered:", key)
        self.covered_expansions.add(key)

        return new_children_index

Our fuzzer is now complete.  Let us apply it to a series of examples.  On expressions, it quickly covers all digits and operators:

In [None]:
f = GrammarCoverageFuzzer(EXPR_GRAMMAR, min_nonterminals=3)
f.fuzz()

In [None]:
f.max_expansion_coverage() - f.expansion_coverage()

On average, it is again faster than the simple strategy:

In [None]:
average_length_until_full_coverage(GrammarCoverageFuzzer(EXPR_GRAMMAR))

On the CGI grammar, it takes but a few iterations to cover all letters and digits:

In [None]:
f = GrammarCoverageFuzzer(CGI_GRAMMAR, min_nonterminals=5)
while len(f.max_expansion_coverage() - f.expansion_coverage()) > 0:
    print(f.fuzz())

This improvement can also be seen in comparing the random, expansion-only, and deep foresight strategies on the CGI grammar:

In [None]:
average_length_until_full_coverage(TrackingGrammarCoverageFuzzer(CGI_GRAMMAR))

In [None]:
average_length_until_full_coverage(SimpleGrammarCoverageFuzzer(CGI_GRAMMAR))

In [None]:
average_length_until_full_coverage(GrammarCoverageFuzzer(CGI_GRAMMAR))

## Grammar Coverage and Code Coverage

\todo{Add.}

## Advanced Grammar Coverage Metrics

\todo{Expand.}

## Lessons Learned

* _Lesson one_
* _Lesson two_
* _Lesson three_

## Next Steps

_Link to subsequent chapters (notebooks) here, as in:_

* [use _mutations_ on existing inputs to get more valid inputs](MutationFuzzer.ipynb)
* [use _grammars_ (i.e., a specification of the input format) to get even more valid inputs](Grammars.ipynb)
* [reduce _failing inputs_ for efficient debugging](Reducing.ipynb)


## Exercises

Close the chapter with a few exercises such that people have things to do.  In Jupyter Notebook, use the `exercise2` nbextension to add solutions that can be interactively viewed or hidden:

* Mark the _last_ cell of the exercise (this should be a _text_ cell) as well as _all_ cells of the solution.  (Use the `rubberband` nbextension and use Shift+Drag to mark multiple cells.)
* Click on the `solution` button at the top.

(Alternatively, just copy the exercise and solution cells below with their metadata.)

### Exercise 1

_Text of the exercise_

In [None]:
# Some code that is part of the exercise
pass

_Some more text for the exercise_

_Some text for the solution_

In [None]:
# Some code for the solution
2 + 2

_Some more text for the solution_

### Exercise 2

_Text of the exercise_

_Solution for the exercise_