# Statistical Debugging

In this chapter, we introduce _statistical debugging_ – the idea that specific events during execution could be _statistically correlated_ with failures. We start with coverage of individual lines and then proceed towards further execution features.

In [None]:
import bookutils

## Synopsis
<!-- Automatically generated. Do not edit. -->

To [use the code provided in this chapter](Importing.ipynb), write

```python
>>> from debuggingbook.StatisticalDebugger import <identifier>
```

and then make use of the following features.


_For those only interested in using the code in this chapter (without wanting to know how it works), give an example.  This will be copied to the beginning of the chapter (before the first section) as text with rendered input and output._

For instance, this is what we get for `x=1`:

You can use `int_fuzzer()` as:

```python
>>> print(2 + 2)
4
```


## Introduction

The idea behind _statistical debugging_ is fairly simple. We have a program that sometimes passes and sometimes fails. This outcome can be _correlated_ with events that precede it – properties of the input, properties of the execution, properties of the program state. If we, for instance, can find that "the program always fails when Line 123 is executed, and it always passes when Line 123 is _not_ executed", then we have a strong correlation between Line 123 being executed and failure.

Such _correlation_ does not necessarily mean _causation_. For this, we would have to prove that executing Line 123 _always_ leads to failure, and that _not_ executing it does not lead to (this) failure. Also, a correlation (or even a causation) does not mean that Line 123 contains the defect – for this, we would have to show that it actually is an error. Still, correlations make excellent hints when it comes to searching for failure causes – in all generality, if you let your search be guided by _events that correlate with failures_, you are more likely to find _important hints on how the failure comes to be_.
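
To make the "Line 123" intuition concrete, here is a minimal sketch over made-up data. The run records and the helper name `perfectly_correlated_lines` are assumptions for illustration only; they are not part of this chapter's code.

```python
# Hypothetical data: each run is (set of executed lines, outcome)
runs = [
    ({1, 2, 3}, 'PASS'),
    ({1, 2, 4}, 'PASS'),
    ({1, 2, 123}, 'FAIL'),
    ({1, 123, 4}, 'FAIL'),
]


def perfectly_correlated_lines(runs):
    """Return lines executed in every failing run and in no passing run."""
    failing = [lines for lines, outcome in runs if outcome == 'FAIL']
    passing = [lines for lines, outcome in runs if outcome == 'PASS']
    if not failing:
        return set()

    in_all_failing = set.intersection(*failing)
    in_any_passing = set().union(*passing)
    return in_all_failing - in_any_passing


print(perfectly_correlated_lines(runs))
```

Real executions rarely correlate this cleanly, which is why the remainder of the chapter works with fractions of runs rather than "always" and "never".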

## Collecting Events

How can we determine events that correlate with failure? We start with a general mechanism to actually _collect_ events during execution. The abstract `Collector` class provides

* a `collect()` method made for collecting events, called from the `traceit()` tracer; and
* an `events()` method made for retrieving these events.

Both of these are _abstract_ and will be defined further in subclasses.

In [None]:
from Tracer import Tracer

In [None]:
class Collector(Tracer):
    """A class to record events during execution."""

    def collect(self, frame, event, arg):
        """Collecting function. To be overridden in subclasses."""
        pass

    def events(self):
        """Return a collection of events. To be overridden in subclasses."""
        return set()

    def traceit(self, frame, event, arg):
        self.collect(frame, event, arg)
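
`Tracer` comes from an earlier chapter and is not shown here. As a rough idea of what such a class may do under the hood, the following self-contained sketch uses Python's `sys.settrace()` directly. The class `MiniCollector` and the function `sample()` are hypothetical stand-ins, not the chapter's `Tracer` or `Collector`.

```python
import sys


class MiniCollector:
    """Hypothetical stand-in: record (function name, relative line) pairs."""

    def __init__(self):
        self.seen = set()

    def _trace(self, frame, event, arg):
        if event == 'line':
            code = frame.f_code
            # Store line offsets relative to the function's first line,
            # so the result does not depend on where the code is placed
            self.seen.add((code.co_name, frame.f_lineno - code.co_firstlineno))
        return self._trace  # keep tracing inside this frame

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._trace)
        return self

    def __exit__(self, exc_type, exc_value, tb):
        sys.settrace(self._previous)
        return False  # do not swallow exceptions


def sample(x):
    if x > 0:
        return 'positive'
    return 'non-positive'


with MiniCollector() as collector:
    sample(1)

print(sorted(offset for name, offset in collector.seen if name == 'sample'))
```

The trace function is called for every frame entered while tracing is active, including the collector's own `__exit__()` method; this is why the sketch filters by function name before printing.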

A `Collector` class is used like `Tracer`, using a `with` statement. Let us apply it to the buggy variant of `remove_html_markup()` from the [Introduction to Debugging](Intro_Debugging.ipynb):

In [None]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [None]:
c = Collector()
with c:
    out = remove_html_markup('"foo"')
out

There's not much we can do with our collector, as the `collect()` and `events()` methods are still empty. However, we can introduce an `id()` method which returns a string identifying the collector. This string is derived from the _first function call_ encountered.

In [None]:
class Collector(Collector):
    def __init__(self):
        self._id = None

    def traceit(self, frame, event, arg):
        if self._id is None and event == 'call':
            # Save ID
            function = frame.f_code.co_name
            local_vars = frame.f_locals  # avoid shadowing the built-in locals()
            args = ", ".join([f"{var}={repr(local_vars[var])}" for var in local_vars])
            self._id = f"{function}({args})"

        self.collect(frame, event, arg)

    def id(self):
        return self._id

In [None]:
c = Collector()
with c:
    remove_html_markup('abc')
c.id()

## Collecting Coverage

So far, our `Collector` class does not collect any events. Let us extend it such that it collects _coverage_ information – that is, the set of lines executed. To this end, we introduce a `CoverageCollector` subclass which saves the coverage in a set:

In [None]:
class CoverageCollector(Collector):
    """A class to record covered lines during execution."""

    def __init__(self):
        super().__init__()
        self.coverage = set()

    def collect(self, frame, event, arg):
        self.coverage.add(frame.f_lineno)

We also override `events()` such that it returns the set of covered lines.

In [None]:
class CoverageCollector(CoverageCollector):
    def events(self):
        """Return the set of line numbers covered during the execution"""
        return self.coverage

Here is how we can use `CoverageCollector` to determine the lines executed during a run of `remove_html_markup()`:

In [None]:
c = CoverageCollector()
with c:
    remove_html_markup('abc')
print(c.events())

Sets of line numbers alone are not too revealing. They provide more insights if we actually list the code, highlighting these numbers:

In [None]:
import inspect

In [None]:
from bookutils import getsourcelines    # like inspect.getsourcelines(), but in color

In [None]:
def list_with_coverage(function, coverage):
    source_lines, starting_line_number = \
       getsourcelines(function)

    line_number = starting_line_number
    for line in source_lines:
        marker = '*' if line_number in coverage else ' '
        print(f"{line_number:4} {marker} {line}", end='')
        line_number += 1

In [None]:
list_with_coverage(remove_html_markup, c.coverage)

Remember that the input `s` was `"abc"`? In this listing, we can see which lines were covered and which lines were not. From the coverage alone, we can already tell that `s` has neither tags nor quotes.

Such coverage computation plays a big role in _testing_, as one wants tests to cover as many different aspects of program execution (and notably code) as possible. But also during debugging, code coverage is essential: If some code was not even executed in the failing run, then any change to it will have no effect.

In [None]:
from bookutils import quiz

In [None]:
quiz("Let the input be <samp>&quot;&lt;b&gt;Don't do this!&lt;/b&gt;&quot;</samp>. "
     "Which of these lines are executed? Use the code to find out!",
     [
         "<samp>tag = True</samp>",
         "<samp>tag = False</samp>",
         "<samp>quote = not quote</samp>",
         "<samp>out = out + c</samp>"
     ], [ord(c) - ord('a') - 1 for c in 'cdf'])

To find the solution, try this out yourself:

In [None]:
c = CoverageCollector()
with c:
    remove_html_markup("<b>Don't do this!</b>")
# list_with_coverage(remove_html_markup, c.coverage)

## Computing Differences

Let us get back to the idea that we want to _correlate_ events with passing and failing outcomes. For this, we need to examine events in both _passing_ and _failing_ runs, and determine their _differences_ – since it is these differences we want to associate with their respective outcome.

### A Base Class for Statistical Debugging

The `StatisticalDebugger` base class takes a collector class (such as `CoverageCollector`). Its `collect()` method creates a new collector of that very class, which will be maintained by the debugger. As argument, `collect()` takes a string characterizing the outcome (such as `'PASS'` or `'FAIL'`). This is how one would use it:

```python
debugger = StatisticalDebugger(CoverageCollector)
with debugger.collect('PASS'):
    some_passing_run()
with debugger.collect('PASS'):
    another_passing_run()
with debugger.collect('FAIL'):
    some_failing_run()
```

Let us implement `StatisticalDebugger`. The base class gets a collector class as argument:

In [None]:
class StatisticalDebugger():
    """A class to collect events for multiple outcomes."""

    def __init__(self, collector_class):
        self.collector_class = collector_class
        self.collectors = {}

The `collect()` method creates (and stores) a collector for the given outcome, using the given outcome to characterize the run. Any additional arguments are passed to the collector.

In [None]:
class StatisticalDebugger(StatisticalDebugger):
    def collect(self, outcome, *args):
        collector = self.collector_class(*args)
        if outcome not in self.collectors:
            self.collectors[outcome] = []
        self.collectors[outcome].append(collector)
        return collector

Here's a simple example of `StatisticalDebugger` in action:

In [None]:
s = StatisticalDebugger(CoverageCollector)
with s.collect('PASS'):
    remove_html_markup("abc")
with s.collect('PASS'):
    remove_html_markup('<b>abc</b>')
with s.collect('FAIL'):
    remove_html_markup('"foo"')

The attribute `collectors` maps outcomes to lists of collectors:

In [None]:
s.collectors

Here's the collector of the first passing run:

In [None]:
s.collectors['PASS'][0].id()

In [None]:
s.collectors['PASS'][0].events()

To better highlight the differences between the collected events, we introduce a method `event_table()` that prints out whether an event took place in a run.

### Excursion: Printing an Event Table

In [None]:
from IPython.display import display, Markdown, HTML

In [None]:
class StatisticalDebugger(StatisticalDebugger):
    def event_table(self, show_ids=False):
        sep = ' | '

        all_events = set()
        for name in self.collectors:
            for collector in self.collectors[name]:
                all_events.update(collector.events())

        longest_event = max(len(f"{event}") for event in all_events)

        out = ""

        # Header
        if show_ids:
            out += '| ' + ' ' * longest_event + sep
            for name in self.collectors:
                for collector in self.collectors[name]:
                    out += '`' + collector.id() + '`' + sep
            out += '\n'
        else:
            out += '| ' + ' ' * longest_event + sep
            for name in self.collectors:
                for i in range(len(self.collectors[name])):
                    out += name + sep
            out += '\n'

        out += '| ' + '-' * longest_event + sep
        for name in self.collectors:
            for i in range(len(self.collectors[name])):
                out += '-' * len(name) + sep
        out += '\n'

        # Data
        for event in all_events:
            out += f"| {repr(event).rjust(longest_event)}" + sep
            for name in self.collectors:
                for collector in self.collectors[name]:
                    out += ' ' * (len(name) - 1)
                    if event in collector.events():
                        out += "X"
                    else:
                        out += "-"
                    out += sep
            out += '\n'

        return Markdown(out)

### End of Excursion

In [None]:
s = StatisticalDebugger(CoverageCollector)
with s.collect('PASS'):
    remove_html_markup("abc")
with s.collect('PASS'):
    remove_html_markup('<b>abc</b>')
with s.collect('FAIL'):
    remove_html_markup('"foo"')
s.event_table(show_ids=True)

In [None]:
quiz("How many lines are executed in the failing run only?",
    ["One", "Two", "Three"], int(chr(49)))

Lines that are executed only in failing runs are exactly the kind of correlation we are looking for.

### Collecting Passing and Failing Runs

While our `StatisticalDebugger` class allows arbitrary outcomes, we are typically only interested in two outcomes, namely _passing_ vs. _failing_ runs. We therefore introduce a specialized `DifferenceDebugger` class that provides customized methods to collect and access passing and failing runs.

In [None]:
class DifferenceDebugger(StatisticalDebugger):
    """A class to collect events for passing and failing outcomes."""

    PASS = 'PASS'
    FAIL = 'FAIL'

    def collect_pass(self):
        return self.collect(self.PASS)
    def collect_fail(self):
        return self.collect(self.FAIL)

    def pass_collectors(self):
        return self.collectors[self.PASS]
    def fail_collectors(self):
        return self.collectors[self.FAIL]

Here's how to use `DifferenceDebugger`:

In [None]:
def test_debugger_html(debugger):
    with debugger.collect_pass():
        remove_html_markup('abc')
    with debugger.collect_pass():
        remove_html_markup('<b>abc</b>')
    with debugger.collect_fail():
        remove_html_markup('"foo"')
    return debugger

In [None]:
debugger = test_debugger_html(DifferenceDebugger(CoverageCollector))

Since events come back as _sets_, we can compute _unions_ and _differences_ between these sets. For instance, we can compute which lines were executed in _any_ of the passing runs:

In [None]:
pass_1_events = debugger.pass_collectors()[0].events()

In [None]:
pass_2_events = debugger.pass_collectors()[1].events()

In [None]:
in_any_pass = pass_1_events | pass_2_events
in_any_pass

Likewise, we can determine which lines were _only_ executed in the failing run:

In [None]:
fail_events = debugger.fail_collectors()[0].events()

In [None]:
only_in_fail = fail_events - in_any_pass
only_in_fail

And we see that the "failing" run is characterized by processing quotes:

In [None]:
list_with_coverage(remove_html_markup, only_in_fail)
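
For readers following along without running the notebook, the same set operations can be tried on made-up line numbers. The numbers below are invented for illustration and do not correspond to the actual lines of `remove_html_markup()`.

```python
# Hypothetical coverage sets, one per run
pass_1 = {1, 2, 3, 4, 8, 9, 11}
pass_2 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11}
fail_1 = {1, 2, 3, 4, 6, 8, 9, 10, 11}

any_pass = pass_1 | pass_2     # union: executed in at least one passing run
fail_only = fail_1 - any_pass  # difference: executed in the failing run only
pass_only = any_pass - fail_1  # difference: never executed in the failing run

print(sorted(fail_only))
print(sorted(pass_only))
```

In this invented data, only line 10 is specific to the failing run, while lines 5 and 7 are specific to passing runs.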

Let us add a few helper methods that return computations such as the above. The `all_events()` method produces a union of all events. If an outcome is given, it produces a union of all events with that outcome:

In [None]:
class DifferenceDebugger(DifferenceDebugger):
    def all_events(self, outcome=None):
        in_any = set()
        if outcome:
            for collector in self.collectors[outcome]:
                in_any.update(collector.events())
        else:
            for outcome in self.collectors:
                for collector in self.collectors[outcome]:
                    in_any.update(collector.events())
        return in_any

    def all_fail(self):
        return self.all_events(self.FAIL)

    def all_pass(self):
        return self.all_events(self.PASS)

We can now introduce helper methods that show the events occurring only in passing and failing runs, respectively:

In [None]:
class DifferenceDebugger(DifferenceDebugger):
    def only_fail(self):
        return self.all_fail() - self.all_pass()

    def only_pass(self):
        return self.all_pass() - self.all_fail()

In [None]:
debugger = test_debugger_html(DifferenceDebugger(CoverageCollector))

In [None]:
debugger.all_events()

These are the lines executed only in the failing run:

In [None]:
debugger.only_fail()

These are the lines executed only in the passing runs:

In [None]:
debugger.only_pass()

Again, having these lines individually is neat, but things become much more interesting if we can see the associated code lines as well. That's what we will do in the next section.

## Visualizing Differences

To show correlations of line coverage in context, we introduce a number of _visualization_ techniques that _highlight_ code with different colors.

### Discrete Spectrum

The first idea is to use a _discrete_ spectrum of three colors:

* _red_ for code executed in failing runs only
* _green_ for code executed in passing runs only
* _yellow_ for code executed in both passing and failing runs.

Code that is not executed stays unhighlighted.

Our `DiscreteSpectrumDebugger` subclass provides a `color()` method that returns one of these three colors depending on the line number:

In [None]:
class DiscreteSpectrumDebugger(DifferenceDebugger):
    def color(self, line_number):
        passing = self.all_pass()
        failing = self.all_fail()

        if line_number in passing and line_number in failing:
            return 'lightyellow'
        elif line_number in failing:
            return 'mistyrose'
        elif line_number in passing:
            return 'honeydew'
        else:
            return None

The `list_with_spectrum()` method takes a function and shows each of its source code lines using the given spectrum, using HTML markup:

In [None]:
import html

class DiscreteSpectrumDebugger(DiscreteSpectrumDebugger):
    def list_with_spectrum(self, function):
        source_lines, starting_line_number = \
           inspect.getsourcelines(function)

        line_number = starting_line_number
        out = ""
        for line in source_lines:
            # Escape '<', '>', and '&' so that the source renders verbatim
            line = html.escape(line.rstrip())
            if line == '':
                line = '&nbsp;'

            line = str(line_number).rjust(4) + ' ' + line

            color = self.color(line_number)
            if color:
                line = f'<pre style="background-color:{color}">' \
                        f'{line}</pre>'
            else:
                line = f'<pre>{line}</pre>'

            out += line
            line_number += 1

        return HTML(out)

This is what the `only_pass()` and `only_fail()` sets look like when visualized with code. The "culprit" line is well highlighted:

In [None]:
debugger = test_debugger_html(DiscreteSpectrumDebugger(CoverageCollector))

In [None]:
debugger.list_with_spectrum(remove_html_markup)

We can clearly see that the failure is correlated with the presence of quotes in the input string (which is an important hint!). But does this also show us _immediately_ where the defect to be fixed is?

In [None]:
quiz("Does the line <samp>quote = not quote</samp> actually contain the defect?",
    [
        "Yes, it should be fixed",
        "No, the defect is elsewhere"
    ],
     164 * 2 % 326
    )

Indeed, it is the preceding condition that is wrong. In order to fix a program, we have to find a location that

1. _causes_ the failure (i.e., it can be changed to make the failure go away); and
2. is a _defect_ (i.e., contains an error).

In our example above, the highlighted code line is a _symptom_ of the error. To some extent, it is also a _cause_, since, say, commenting it out would also resolve the given failure, at the cost of causing other failures. However, the preceding condition also is a cause, as is the presence of quotes in the input.

Only one of these also is a _defect_, though, and that is the preceding condition. Hence, while correlations can provide important hints, they do not necessarily locate defects.

### Continuous Spectrum

In [None]:
remove_html_markup('<b link="blue"></b>')

In [None]:
debugger = test_debugger_html(DiscreteSpectrumDebugger(CoverageCollector))
with debugger.collect_pass():
    remove_html_markup('<b link="blue"></b>')

In [None]:
debugger.list_with_spectrum(remove_html_markup)

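A discrete spectrum loses information as soon as a line is executed in both passing and failing runs: after adding the passing run above, `quote = not quote` is merely shown as executed in both. The _Tarantula_ method instead uses a continuous spectrum, coloring each line by how often it is executed in passing runs relative to failing runs. The color is defined as follows: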

$$\textit{color}(\textit{line}) = \textit{low color(red)} + \frac{\%\textit{passed}(\textit{line})}{\%\textit{passed}(\textit{line}) + \%\textit{failed}(\textit{line})} \times \textit{color range}$$
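
As a worked example of this formula, consider a line executed in one of five passing runs and in the only failing run. The function `tarantula_hue()` below is a standalone sketch with a name of our own choosing; mapping the resulting fraction onto 0 (red) to 120 (green) degrees is an assumption about the color range.

```python
def tarantula_hue(passed_fraction, failed_fraction):
    """Return passed / (passed + failed), or None if the line was never executed."""
    total = passed_fraction + failed_fraction
    if total == 0:
        return None
    return passed_fraction / total


# Line executed in 1 of 5 passing runs and in 1 of 1 failing runs
hue = tarantula_hue(1 / 5, 1 / 1)
print(round(hue, 3))     # fraction of the color range
print(round(hue * 120))  # degrees, assuming red = 0 and green = 120
```

The fraction is 0.2 / (0.2 + 1.0), or about 0.167, which lands close to the red end of the range.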

In [None]:
class ContinuousSpectrumDebugger(DiscreteSpectrumDebugger):
    def event_fraction(self, event, category):
        all_runs = self.collectors[category]
        runs_with_event = set(collector for collector in all_runs 
                              if event in collector.events())
        fraction = len(runs_with_event) / len(all_runs)
        # print(f"%{category}({event}) = {fraction}")
        return fraction

    def passed(self, line_number):
        return self.event_fraction(line_number, self.PASS)

    def failed(self, line_number):
        return self.event_fraction(line_number, self.FAIL)

    def hue(self, line_number):
        passed = self.passed(line_number)
        failed = self.failed(line_number)
        if passed + failed > 0:
            return passed / (passed + failed)
        else:
            return None

In [None]:
debugger = test_debugger_html(ContinuousSpectrumDebugger(CoverageCollector))

In [None]:
for line in debugger.only_fail():
    print(line, debugger.hue(line))

In [None]:
for line in debugger.only_pass():
    print(line, debugger.hue(line))

The brightness is defined as follows:

$$\textit{bright}(line) = \max(\%\textit{passed}(\textit{line}), \%\textit{failed}(\textit{line}))$$
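
Brightness separates lines that share a hue but differ in how often they are executed. In the sketch below (with made-up fractions and a helper name of our own), both lines get the same hue, yet the rarely executed one gets a much lower brightness.

```python
def brightness(passed_fraction, failed_fraction):
    """Return max(%passed, %failed) for a line."""
    return max(passed_fraction, failed_fraction)


# Hypothetical (passed, failed) fractions for two lines
always_executed = (1.0, 1.0)
rarely_executed = (0.2, 0.2)

for passed, failed in (always_executed, rarely_executed):
    hue = passed / (passed + failed)
    print(hue, brightness(passed, failed))
```

Both lines sit in the middle of the spectrum (hue 0.5), but the first is backed by all runs and the second by only a fifth of them.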

In [None]:
class ContinuousSpectrumDebugger(ContinuousSpectrumDebugger):
    def bright(self, line):
        return max(self.passed(line), self.failed(line))

In [None]:
debugger = test_debugger_html(ContinuousSpectrumDebugger(CoverageCollector))
for line in debugger.only_fail():
    print(line, debugger.bright(line))

In [None]:
class ContinuousSpectrumDebugger(ContinuousSpectrumDebugger):
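    def color(self, line):
        # Use `self`, not the global `debugger`, so that the method
        # works on any instance
        hue = self.hue(line)
        if hue is None:
            return None
        saturation = self.bright(line)

        # HSL color values are specified with:
        # hsl(hue, saturation, lightness).
        # A hue of 0 is red; a hue of 120 is green.
        return f"hsl({hue * 120}, {saturation * 100}%, 80%)"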

In [None]:
debugger = test_debugger_html(ContinuousSpectrumDebugger(CoverageCollector))

In [None]:
for line in debugger.only_fail():
    print(line, debugger.color(line))

In [None]:
for line in debugger.only_pass():
    print(line, debugger.color(line))

In [None]:
debugger.list_with_spectrum(remove_html_markup)

In [None]:
with debugger.collect_pass():
    out = remove_html_markup('<b link="blue"></b>')
out

In [None]:
debugger.list_with_spectrum(remove_html_markup)

Here's another example (taken from the Tarantula paper):

In [None]:
def middle(x, y, z):
    if y < z:
        if x < y:
            return y
        elif x < z:
            return y
    else:
        if x > y:
            return y
        elif x > z:
            return x
    return z

In [None]:
def test_debugger_middle(debugger):
    with debugger.collect_pass():
        middle(3, 3, 5)
    with debugger.collect_pass():
        middle(1, 2, 3)
    with debugger.collect_pass():
        middle(3, 2, 1)
    with debugger.collect_pass():
        middle(5, 5, 5)
    with debugger.collect_pass():
        middle(5, 3, 4)
    with debugger.collect_fail():
        middle(2, 1, 3)
    return debugger

Note that in order to collect data from multiple function invocations, you need to have a separate `with` clause for every invocation. The following will _not_ work correctly:

```python
    with debugger.collect_pass():
        middle(3, 3, 5)
        middle(1, 2, 3)
        ...
```
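
The reason is that each call to `collect_pass()` creates exactly one collector, so everything executed inside one `with` block ends up in a single event set. The following sketch mimics this effect with plain sets; the coverage values and the helper `fraction_of_runs_with()` are invented for illustration.

```python
# Hypothetical line coverage of two separate invocations
coverage_call_1 = {1, 2, 3}
coverage_call_2 = {1, 4, 5}

# Separate `with` blocks: one event set per run
separate_runs = [coverage_call_1, coverage_call_2]

# One `with` block around both calls: a single, merged event set
merged_run = [coverage_call_1 | coverage_call_2]


def fraction_of_runs_with(event, runs):
    """Return the fraction of runs whose event set contains `event`."""
    return sum(event in run for run in runs) / len(runs)


print(fraction_of_runs_with(4, separate_runs))  # line 4 occurs in half of the runs
print(fraction_of_runs_with(4, merged_run))     # merged: seems to occur in every run
```

With merged runs, the per-run fractions that the continuous spectrum relies on no longer reflect individual invocations.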

In [None]:
debugger = test_debugger_middle(ContinuousSpectrumDebugger(CoverageCollector))

In [None]:
debugger.event_table()

In [None]:
debugger.list_with_spectrum(middle)

## Ranking Lines by Suspiciousness

### The Tarantula Metric
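
One common formulation of Tarantula suspiciousness is the share of failing runs that execute a line, divided by the sum of the shares of failing and passing runs that execute it. This is the complement of the hue defined above. The sketch below is an illustration under that assumption; the function name and the run counts are hypothetical and not part of this chapter's classes.

```python
def tarantula_suspiciousness(passed, total_passed, failed, total_failed):
    """Return a score in [0, 1]; higher means more suspicious.

    `passed` and `failed` count the passing and failing runs that
    executed the line; `total_passed` and `total_failed` count all runs.
    """
    passed_fraction = passed / total_passed if total_passed else 0.0
    failed_fraction = failed / total_failed if total_failed else 0.0
    if passed_fraction + failed_fraction == 0:
        return None  # line never executed
    return failed_fraction / (passed_fraction + failed_fraction)


# Hypothetical line: executed in 1 of 5 passing runs and in 1 of 1 failing runs
print(round(tarantula_suspiciousness(1, 5, 1, 1), 3))
```

Sorting lines by this score in descending order would yield a ranking with the most suspicious lines first.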

### The Ochiai Metric
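
The Ochiai metric is usually given as the number of failing runs executing a line, divided by the square root of the total number of failing runs times the number of all runs executing the line. The following sketch assumes that formulation; the function name and the counts are hypothetical.

```python
import math


def ochiai_suspiciousness(passed, failed, total_failed):
    """Return failed / sqrt(total_failed * (failed + passed)), or None."""
    denominator = math.sqrt(total_failed * (failed + passed))
    if denominator == 0:
        return None  # no failing runs, or line never executed
    return failed / denominator


# Hypothetical line: executed in 1 passing run and in the only failing run
print(round(ochiai_suspiciousness(passed=1, failed=1, total_failed=1), 3))
```

Unlike the Tarantula formulation, this score does not depend on the total number of passing runs, only on how many of them executed the line.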

### How Effective is Ranking?

## Other Events besides Coverage

Our framework allows for tracking arbitrary events, not just coverage.

In [None]:
class ValueCollector(Collector):
    def __init__(self):
        super().__init__()
        self.vars = set()

    def collect(self, frame, event, arg):
        local_vars = frame.f_locals
        for var in local_vars:
            value = local_vars[var]
            self.vars.add((var, value))

    def events(self):
        return self.vars

In [None]:
debugger = test_debugger_html(DifferenceDebugger(ValueCollector))
debugger.event_table()

In [None]:
debugger.only_fail()

In [None]:
debugger = test_debugger_middle(DifferenceDebugger(ValueCollector))
debugger.event_table()

In [None]:
debugger.only_fail()

### Training a Classifier

In [None]:
from sklearn import tree

In [None]:
class ClassifyingDebugger(DifferenceDebugger):
    PASS_VALUE = +1
    FAIL_VALUE = -1

    def samples(self):
        samples = {}
        for collector in self.pass_collectors():
            samples[collector.id()] = self.PASS_VALUE
        for collector in self.fail_collectors():
            samples[collector.id()] = self.FAIL_VALUE
        return samples

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
debugger.samples()

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def features(self):
        features = {}
        for collector in self.pass_collectors():
            features[collector.id()] = collector.events()
        for collector in self.fail_collectors():
            features[collector.id()] = collector.events()
        return features

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
debugger.features()

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def feature_names(self):
        return [repr(feature) for feature in self.all_events()]

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
debugger.feature_names()

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def shape(self, sample):
        x = []
        features = self.features()
        for f in self.all_events():
            if f in features[sample]:
                x += [+1]
            else:
                x += [-1]
        return x

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
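# Look up the ID rather than hard-coding it: the argument order in an ID
# follows the order of `frame.f_locals`
debugger.shape(debugger.pass_collectors()[0].id())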

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def X(self):
        X = []
        samples = self.samples()
        for key in samples:
            X += [self.shape(key)]
        return X

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
debugger.X()

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def Y(self):
        Y = []
        samples = self.samples()
        for key in samples:
            Y += [samples[key]]
        return Y

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
debugger.Y()
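
To see what `X()` and `Y()` amount to, here is a standalone sketch of the same encoding on invented data: each run becomes a row of +1/-1 entries (event present or absent), and each outcome becomes a +1/-1 label. The run names and events are hypothetical.

```python
# Hypothetical runs: name -> (set of events, outcome)
runs = {
    'run_a': ({10, 11}, 'PASS'),
    'run_b': ({10, 12}, 'PASS'),
    'run_c': ({10, 11, 13}, 'FAIL'),
}

# Fix one column order for all rows
all_events = sorted(set().union(*(events for events, _ in runs.values())))

X = [[+1 if event in events else -1 for event in all_events]
     for events, _ in runs.values()]
Y = [+1 if outcome == 'PASS' else -1 for _, outcome in runs.values()]

print(all_events)
for row, label in zip(X, Y):
    print(row, label)
```

Sorting the events is a design choice of this sketch: it keeps the columns of `X` aligned with the feature names regardless of set iteration order.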

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def classifier(self):
        classifier = tree.DecisionTreeClassifier()
        classifier = classifier.fit(self.X(), self.Y())
        return classifier

In [None]:
import graphviz

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def show_classifier(self, classifier):
        dot_data = tree.export_graphviz(classifier, out_file=None, 
                         filled=False, rounded=True,
                         feature_names=self.feature_names(),
                                class_names=["fail", "pass"],
                                impurity=False,
                         special_characters=True)
        dot_data = dot_data.replace('&le; 0.0', ': no')
        dot_data = dot_data.replace('&ge; 0.0', ': yes')

        return graphviz.Source(dot_data)

This is the tree we get. A decision like `* <= 0` (shown as `*: no` in the rendered tree) means that the event `*` did not occur in the run.

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
classifier = debugger.classifier()
debugger.show_classifier(classifier)

In [None]:
class ClassifyingDebugger(ClassifyingDebugger):
    def predict(self, classifier, sample):
        return classifier.predict([self.shape(sample)])

In [None]:
debugger = test_debugger_middle(ClassifyingDebugger(CoverageCollector))
# debugger.predict(classifier, debugger.pass_collectors()[0].id())

## Lessons Learned

* _Lesson one_
* _Lesson two_
* _Lesson three_

## Next Steps

_Link to subsequent chapters (notebooks) here, as in:_

* [use _mutations_ on existing inputs to get more valid inputs](MutationFuzzer.ipynb)
* [use _grammars_ (i.e., a specification of the input format) to get even more valid inputs](Grammars.ipynb)
* [reduce _failing inputs_ for efficient debugging](Reducer.ipynb)


## Background

_Cite relevant works in the literature and put them into context, as in:_

The idea of ensuring that each expansion in the grammar is used at least once goes back to Burkhardt \cite{Burkhardt1967}, to be later rediscovered by Paul Purdom \cite{Purdom1972}.

## Exercises

_Close the chapter with a few exercises such that people have things to do.  To make the solutions hidden (to be revealed by the user), have them start with_

```
**Solution.**
```

_Your solution can then extend up to the next title (i.e., any markdown cell starting with `#`)._

_Running `make metadata` will automatically add metadata to the cells such that the cells will be hidden by default, and can be uncovered by the user.  The button will be introduced above the solution._

### Exercise 1: _Title_

_Text of the exercise_

In [None]:
# Some code that is part of the exercise
pass

_Some more text for the exercise_

**Solution.** _Some text for the solution_

In [None]:
# Some code for the solution
2 + 2

_Some more text for the solution_

### Exercise 2: _Title_

_Text of the exercise_

**Solution.** _Solution for the exercise_