# Failure-Inducing Changes

"Yesterday, my program worked. Today, it does not. Why?" In debugging, as elsewhere in software development, code keeps on changing. Thus, it can happen that a piece of code that yesterday was working perfectly, now no longer runs – because we (or others) have made some changes to it that cause it to fail. The good news is that for debugging, we can actually _exploit_ this version history to narrow down _the change that caused the failure_.

In [1]:
from bookutils import YouTubeVideo
YouTubeVideo("w4u5gCgPlmg")

**Prerequisites**

* You should have read the [Chapter on Delta Debugging](DeltaDebugger.ipynb).
* Knowledge of version control systems (notably git) will be useful.

In [2]:
import bookutils

In [3]:
from bookutils import quiz, print_file, print_content

## Synopsis
<!-- Automatically generated. Do not edit. -->

To [use the code provided in this chapter](Importing.ipynb), write

```python
>>> from debuggingbook.ChangeDebugger import <identifier>
```

and then make use of the following features.


This chapter introduces a class `ChangeDebugger` that automatically determines failure-inducing code changes.

### High-Level Interface

Given two source files `source_pass` and `source_fail`, where `failing_function()` raises an exception in `source_fail`, but not in `source_pass`, you can use `ChangeDebugger` as follows:

```python
with ChangeDebugger(source_pass, source_fail) as cd:
    failing_function()
cd
```

This will produce the failure-inducing change between `source_pass` and `source_fail`.

```python
>>> print(version_1)
def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

>>> print(version_2)
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

>>> with ChangeDebugger(version_1, version_2) as cd:
>>>     test_remove_html_markup()
>>> cd
@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif
```
A programmatic interface is also available. The method `min_patches()` returns a triple (`pass_patches`, `fail_patches`, `diffs`) where

* applying `pass_patches` causes the call to pass
* applying `fail_patches` causes the call to fail
* `diffs` is the (minimal) difference between the two.

```python
>>> pass_patches, fail_patches, diffs = cd.min_patches()
>>> for p in diffs:
>>>     print(urllib.parse.unquote(str(p)))
@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

```
### Supporting Functions

`ChangeDebugger` relies on the lower-level `patch()` and `diff()` functions.

To apply patch objects to source code, use the `patch()` function. It takes the source code as a string and a list of patches to be applied.

```python
>>> print(patch(version_1, diffs))
def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

```
Conversely, the `diff()` function computes patches between two texts. It returns a list of patch objects that can be applied to text.

```python
>>> for p in diff(version_1, version_2):
>>>     print(urllib.parse.unquote(str(p)))
@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

     

@@ -146,48 +146,45 @@
 rue

-        elif c == '>':  # end of markup

+        elif c == '>' and not quote:

     

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

```
The `ChangeDebugger` class uses [Delta Debugging](DeltaDebugger.ipynb) to determine minimal differences in patches applied.



## A Version History

We start by creating a little version history. (If you do not use version control for your projects, you are in debugging hell. Go and set it up now.)

The versions we commit are the `remove_html_markup()` variants from [the introduction to debugging](Intro_Debugging.ipynb) and [the chapter on assertions](Assertions.ipynb).

### Create a Working Directory

In [4]:
PROJECT = 'my_project'

In [5]:
import os
import shutil

In [6]:
try:
    shutil.rmtree(PROJECT)
except FileNotFoundError:
    pass
os.mkdir(PROJECT)

In [7]:
import sys

In [8]:
sys.path.append(os.getcwd())
os.chdir(PROJECT)

### Initialize git

In [9]:
!git init

Initialized empty Git repository in /Users/zeller/Projects/debuggingbook/notebooks/my_project/.git/


In [10]:
!git config advice.detachedHead False

In [11]:
def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [12]:
import inspect

In [13]:
def write_source(fun, filename=None):
    if filename is None:
        filename = fun.__name__ + '.py'
    with open(filename, 'w') as fh:
        fh.write(inspect.getsource(fun))

In [14]:
write_source(remove_html_markup)

In [15]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [16]:
!git add remove_html_markup.py

In [17]:
!git commit -m "First version"

[master (root-commit) ca1b5bf] First version
 1 file changed, 13 insertions(+)
 create mode 100644 remove_html_markup.py


In [18]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [19]:
write_source(remove_html_markup)

We can inspect the differences between the previously committed version and the current one.

In [20]:
!git diff remove_html_markup.py

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [21]:
!git commit -m "Second version" remove_html_markup.py

[master 6cfef1a] Second version
 1 file changed, 5 insertions(+), 2 deletions(-)


We create a few more revisions.

### Excursion: More Revisions

In [22]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        print("c =", repr(c), "tag =", tag, "quote =", quote)

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [23]:
write_source(remove_html_markup)

In [24]:
!git commit -m "Third version (with debugging output)" remove_html_markup.py

[master 7821ef4] Third version (with debugging output)
 1 file changed, 2 insertions(+)


In [25]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':  # and not quote:
            tag = True
        elif c == '>':  # and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [26]:
write_source(remove_html_markup)

In [27]:
!git commit -m "Fourth version (clueless)" remove_html_markup.py

[master 34371e6] Fourth version (clueless)
 1 file changed, 2 insertions(+), 4 deletions(-)


In [28]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        assert not tag  # <=== Just added

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [29]:
write_source(remove_html_markup)

In [30]:
!git commit -m "Fifth version (with assert)" remove_html_markup.py

[master 8e44657] Fifth version (with assert)
 1 file changed, 4 insertions(+), 2 deletions(-)


In [31]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            assert False  # <=== Just added
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [32]:
write_source(remove_html_markup)

In [33]:
!git commit -m "Sixth version (with another assert)" remove_html_markup.py

[master 3e857d2] Sixth version (with another assert)
 1 file changed, 1 insertion(+), 2 deletions(-)


In [34]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif (c == '"' or c == "'") and tag:  # <-- FIX
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [35]:
write_source(remove_html_markup)

In [36]:
!git commit -m "Seventh version (fixed)" remove_html_markup.py

[master 45184b8] Seventh version (fixed)
 1 file changed, 1 insertion(+), 2 deletions(-)


### End of Excursion

Here comes the last version:

In [37]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    # postcondition
    assert '<' not in out and '>' not in out

    return out

In [38]:
write_source(remove_html_markup)

In [39]:
!git commit -m "Eighth version (with proper assertion)" remove_html_markup.py

[master 4647a54] Eighth version (with proper assertion)
 1 file changed, 4 insertions(+), 1 deletion(-)


We find that the latest version has an error.

In [40]:
from ExpectError import ExpectError

In [41]:
with ExpectError():
    assert remove_html_markup('"foo"') == '"foo"'

Traceback (most recent call last):
  File "<ipython-input-41-839bf760c807>", line 2, in <module>
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


When did the error occur?

## Accessing Versions

We can look up the individual versions.

In [42]:
!git log --pretty=oneline

4647a54bf606de10d15f61e7662b8d348b84bab2 (HEAD -> master) Eighth version (with proper assertion)
45184b8f110f505d3e765f8e7051be77b35446b0 Seventh version (fixed)
3e857d29eac4da5cd9e0419da39e353b4ed2346f Sixth version (with another assert)
8e4465730efb38b8c94dac5f904f6c828a509260 Fifth version (with assert)
34371e649e9deb7c4d9815a0ee05f310cfc62ffe Fourth version (clueless)
7821ef43833bbd77c3c3f8e48b0f4ddee5d7e7ce Third version (with debugging output)
6cfef1a6ec450231c68f7a57f316ce545c807066 Second version
ca1b5bf3523d12e72c973429da8b95bbcdcebaeb First version


In [43]:
import subprocess

In [44]:
def get_output(command):
    result = subprocess.run(command, 
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout

In [45]:
log = get_output(['git', 'log', '--pretty=oneline'])
print(log)

4647a54bf606de10d15f61e7662b8d348b84bab2 Eighth version (with proper assertion)
45184b8f110f505d3e765f8e7051be77b35446b0 Seventh version (fixed)
3e857d29eac4da5cd9e0419da39e353b4ed2346f Sixth version (with another assert)
8e4465730efb38b8c94dac5f904f6c828a509260 Fifth version (with assert)
34371e649e9deb7c4d9815a0ee05f310cfc62ffe Fourth version (clueless)
7821ef43833bbd77c3c3f8e48b0f4ddee5d7e7ce Third version (with debugging output)
6cfef1a6ec450231c68f7a57f316ce545c807066 Second version
ca1b5bf3523d12e72c973429da8b95bbcdcebaeb First version



In [46]:
versions = [line.split()[0] for line in log.split('\n') if line]
versions.reverse()

We can check out the first version:

In [47]:
!git checkout {versions[0]}

HEAD is now at ca1b5bf First version


In [48]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [49]:
exec(open('remove_html_markup.py').read())

In [50]:
remove_html_markup('"foo"')

'"foo"'

... and the last one:

In [51]:
!git checkout {versions[7]}

Previous HEAD position was ca1b5bf First version
HEAD is now at 4647a54 Eighth version (with proper assertion)


In [52]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    # postcondition
    assert '<' not in out and '>' not in out

    return out

This is the version that no longer works.

In [53]:
exec(open('remove_html_markup.py').read())

In [54]:
remove_html_markup('"foo"')

'foo'

## Manual Bisecting

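Bisecting identifies the commit that caused the failure by binary search over the version history: we mark one version as good and one as bad, and git repeatedly checks out a version in between for us to test.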
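The search itself can be sketched in a few lines of Python. This is only an illustration of the idea, not how git implements it: `first_bad()`, `history`, and the `is_bad` predicate are hypothetical names, and the sketch assumes a linear history in which the first version is good, the last is bad, and every version after the first bad one is bad as well.

```python
def first_bad(history, is_bad):
    """Return the index of the first bad version in `history`.

    Assumes history[0] is good, history[-1] is bad, and that
    all versions after the first bad one are bad, too.
    """
    good, bad = 0, len(history) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(history[mid]):
            bad = mid    # failure was introduced at or before `mid`
        else:
            good = mid   # failure was introduced after `mid`
    return bad


# Hypothetical stand-in for a real test: everything from "v2" on fails
history = [f"v{i}" for i in range(1, 9)]
bad_history = set(history[1:])

print(history[first_bad(history, lambda v: v in bad_history)])  # v2
```

With eight versions, the loop needs only a handful of tests instead of one per version.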

In [55]:
!git bisect start

In [56]:
!git bisect good {versions[0]}

In [57]:
!git bisect bad {versions[7]}

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[34371e649e9deb7c4d9815a0ee05f310cfc62ffe] Fourth version (clueless)


In [58]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':  # and not quote:
            tag = True
        elif c == '>':  # and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [59]:
exec(open('remove_html_markup.py').read())

In [60]:
remove_html_markup('"foo"')

'foo'

In [61]:
!git bisect bad

Bisecting: 0 revisions left to test after this (roughly 1 step)
[7821ef43833bbd77c3c3f8e48b0f4ddee5d7e7ce] Third version (with debugging output)


In [62]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        print("c =", repr(c), "tag =", tag, "quote =", quote)

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [63]:
exec(open('remove_html_markup.py').read())

In [64]:
remove_html_markup('"foo"')

c = '"' tag = False quote = False
c = 'f' tag = False quote = True
c = 'o' tag = False quote = True
c = 'o' tag = False quote = True
c = '"' tag = False quote = True


'foo'

In [65]:
!git bisect bad

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6cfef1a6ec450231c68f7a57f316ce545c807066] Second version


In [66]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [67]:
exec(open('remove_html_markup.py').read())

In [68]:
remove_html_markup('"foo"')

'foo'

We got the failure-inducing change:

In [69]:
!git bisect bad

6cfef1a6ec450231c68f7a57f316ce545c807066 is the first bad commit
commit 6cfef1a6ec450231c68f7a57f316ce545c807066
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 19:23:44 2020 +0100

    Second version

 remove_html_markup.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


... and this is what it does – introducing `quote` handling:

In [70]:
!git diff HEAD^

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [71]:
!git bisect view

commit 6cfef1a6ec450231c68f7a57f316ce545c807066 (HEAD, refs/bisect/bad)
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 19:23:44 2020 +0100

    Second version


In [72]:
!git bisect reset

Previous HEAD position was 6cfef1a Second version
HEAD is now at 4647a54 Eighth version (with proper assertion)


## Automatic Bisecting

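We can write a test script to automate bisecting. Its exit code tells `git bisect run` the test outcome: 0 means good (pass), 125 means the version cannot be tested and should be skipped, and any other code from 1 to 127 means bad (fail).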

In [73]:
# ignore
with open('test.py', 'w') as test_file:
    test_file.write('''#!/usr/bin/env python

from remove_html_markup import remove_html_markup
import sys

result = remove_html_markup('"foo"')
if result == '"foo"':
    sys.exit(0)  # good/pass
elif result == 'foo':
    sys.exit(1)  # bad/fail
else:
    sys.exit(125)  # unresolved
''')

In [74]:
print_file('test.py')

#!/usr/bin/env python

from remove_html_markup import remove_html_markup
import sys

result = remove_html_markup('"foo"')
if result == '"foo"':
    sys.exit(0)  # good/pass
elif result == 'foo':
    sys.exit(1)  # bad/fail
else:
    sys.exit(125)  # unresolved
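The exit codes in `test.py` follow the convention that `git bisect run` expects. As a rough sketch of that convention (the function name `bisect_verdict()` is made up for this illustration and is not part of git or of this chapter's code):

```python
def bisect_verdict(exit_code):
    """Classify a test script's exit code following the `git bisect run` convention."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"    # version cannot be tested
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"       # any other code stops the bisect run


for code in (0, 1, 125, 128):
    print(code, bisect_verdict(code))
```

Note that an uncaught exception in the function under test makes Python exit with code 1, so such versions count as bad as well.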

Right now, we are in the "fail" state:

In [75]:
!python ./test.py; echo $?

1


In [76]:
!git bisect start

In [77]:
!git bisect good {versions[0]}

In [78]:
!git bisect bad {versions[7]}

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[34371e649e9deb7c4d9815a0ee05f310cfc62ffe] Fourth version (clueless)


Here comes the automatic part:

In [79]:
!git bisect run python test.py

running python test.py
Bisecting: 0 revisions left to test after this (roughly 1 step)
[7821ef43833bbd77c3c3f8e48b0f4ddee5d7e7ce] Third version (with debugging output)
running python test.py
c = '"' tag = False quote = False
c = 'f' tag = False quote = True
c = 'o' tag = False quote = True
c = 'o' tag = False quote = True
c = '"' tag = False quote = True
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6cfef1a6ec450231c68f7a57f316ce545c807066] Second version
running python test.py
6cfef1a6ec450231c68f7a57f316ce545c807066 is the first bad commit
commit 6cfef1a6ec450231c68f7a57f316ce545c807066
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 19:23:44 2020 +0100

    Second version

 remove_html_markup.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
bisect run success


Again, we obtain the failure-inducing change:

In [80]:
!git diff HEAD^

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [81]:
!git bisect reset

Previous HEAD position was 6cfef1a Second version
HEAD is now at 4647a54 Eighth version (with proper assertion)


## Computing and Applying Patches

Our commit consists of a number of changes. Can we break this down further? Delta Debugging (on changes) to the rescue!

For this, though, we first need means to compute and apply patches.

In [82]:
version_1 = get_output(['git', 'show', 
                            f'{versions[0]}:remove_html_markup.py'])

In [83]:
print_content(version_1, '.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [84]:
version_2 = get_output(['git', 'show', 
                            f'{versions[1]}:remove_html_markup.py'])

In [85]:
print_content(version_2, '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [86]:
!git diff {versions[0]} {versions[1]}

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


We use Google's [diff-match-patch library](https://github.com/google/diff-match-patch).

In [87]:
from diff_match_patch import diff_match_patch

The `diff()` function computes a set of patches (changes, diffs) between the two texts `s1` and `s2`:

In [88]:
def diff(s1, s2, mode='lines'):
    dmp = diff_match_patch()
    if mode == 'lines':
        (text1, text2, linearray) = dmp.diff_linesToChars(s1, s2)
        diffs = dmp.diff_main(text1, text2)
        dmp.diff_charsToLines(diffs, linearray)
        return dmp.patch_make(diffs)

    if mode == 'chars':
        diffs = dmp.diff_main(s1, s2)
        return dmp.patch_make(s1, diffs)

    raise ValueError("mode must be 'lines' or 'chars'")
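For comparison, line-level differences can also be inspected with Python's standard `difflib` module. This is only a side-by-side illustration and not what this chapter uses: `difflib.unified_diff()` yields text lines addressed by line numbers, whereas diff-match-patch returns patch objects addressed by character offsets, which can later be applied individually. The names `old_text` and `new_text` are made up for this sketch.

```python
import difflib

old_text = 'tag = False\nout = ""\n'
new_text = 'tag = False\nquote = False\nout = ""\n'

# Compute a unified diff over the lines of both texts
diff_lines = list(difflib.unified_diff(old_text.splitlines(),
                                       new_text.splitlines(),
                                       fromfile='version_1',
                                       tofile='version_2',
                                       lineterm=''))

for line in diff_lines:
    print(line)
```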

In [89]:
patches = diff(version_1, version_2)
patches

[<diff_match_patch.diff_match_patch.patch_obj at 0x7faedab90588>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedab90780>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedab90438>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedab904a8>]

Here's how to inspect these patches:

In [90]:
import urllib.parse

In [91]:
def patch_string(p):
    return urllib.parse.unquote(str(p).strip())
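The call to `unquote()` is needed because diff-match-patch percent-encodes the patch text when a patch is converted to a string; for instance, `"` becomes `%22` and a newline becomes `%0A`. A small illustration on a hand-written (hypothetical) encoded line:

```python
import urllib.parse

# Hand-written example of a percent-encoded patch line (not actual library output)
encoded_line = "+    elif c == %22%3C%22:%0A"
decoded_line = urllib.parse.unquote(encoded_line)

print(repr(decoded_line))
```

The decoded newline at the end of each line is likely also the reason why the printed patches contain blank lines.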

In [92]:
for p in patches:
    print(patch_string(p))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""
@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

@@ -146,48 +146,45 @@
 rue

-        elif c == '>':  # end of markup

+        elif c == '>' and not quote:

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif


Conversely, the `patch()` function applies patches.

In [93]:
def patch(s, patches):
    dmp = diff_match_patch()
    text, success = dmp.patch_apply(patches, s)
    assert all(success)
    return text

If we apply _all_ patches, we get version 2:

In [94]:
print_content(patch(version_1, patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [95]:
assert patch(version_1, patches) == version_2

Applying _no_ patch leaves the content unchanged.

In [96]:
assert patch(version_1, []) == version_1

However, one can also apply _partial_ sets of patches:

In [97]:
print(patch_string(patches[0]))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""


In [98]:
print_content(patch(version_1, [patches[0]]), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [99]:
print_content(patch(version_1, [patches[1]]), '.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

## Delta Debugging on Patches

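We now have everything we need: `diff()` yields the individual patches, and `patch()` applies any subset of them to `version_1`. What remains is to let Delta Debugging find out which of the patches matter.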
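
To get a feel for the search space, the following sketch enumerates every subset of the four changes and classifies the resulting program. It does not use `diff_match_patch`: the names `VERSION_1`, `CHANGES`, `apply_changes()`, and `outcome()` are hypothetical stand-ins, with each change modeled as a plain (old text, new text) replacement, and the label `UNRESOLVED` is chosen for this sketch only.

```python
from itertools import combinations

VERSION_1 = '''def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out
'''

# Hypothetical stand-ins for the four patches: (old text, new text) pairs
CHANGES = [
    ('    tag = False\n    out = ""',
     '    tag = False\n    quote = False\n    out = ""'),
    ("if c == '<':    # start of markup", "if c == '<' and not quote:"),
    ("elif c == '>':  # end of markup", "elif c == '>' and not quote:"),
    ("        elif not tag:",
     "        elif c == '\"' or c == \"'\" and tag:\n"
     "            quote = not quote\n"
     "        elif not tag:"),
]

def apply_changes(source, changes):
    """Apply each (old, new) replacement once; the changes are independent."""
    for old, new in changes:
        assert old in source
        source = source.replace(old, new, 1)
    return source

def outcome(source):
    """Run the chapter's test on `source` in a fresh namespace."""
    namespace = {}
    exec(source, namespace)
    try:
        result = namespace['remove_html_markup']('"foo"')
    except Exception:
        return 'UNRESOLVED'  # the program crashed instead of giving a result
    return 'PASS' if result == '"foo"' else 'FAIL'

counts = {'PASS': 0, 'FAIL': 0, 'UNRESOLVED': 0}
for size in range(len(CHANGES) + 1):
    for subset in combinations(CHANGES, size):
        counts[outcome(apply_changes(VERSION_1, subset))] += 1

print(counts)
```

With four changes there are 16 subsets. In this sketch, 8 pass, 4 fail, and 4 crash because `quote` is toggled without ever having been initialized. Exhaustive enumeration doubles with every additional change, which is why a systematic search such as Delta Debugging is needed.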

In [100]:
from DeltaDebugger import DeltaDebugger

In [101]:
def test_remove_html_markup(patches):
    new_version = patch(version_1, patches)
    exec(new_version, globals())
    assert remove_html_markup('"foo"') == '"foo"'

In [102]:
test_remove_html_markup([])

In [103]:
with ExpectError():
    test_remove_html_markup(patches)

Traceback (most recent call last):
  File "<ipython-input-103-9f9aae950c60>", line 2, in <module>
    test_remove_html_markup(patches)
  File "<ipython-input-101-4525dbd53ca7>", line 4, in test_remove_html_markup
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


In [104]:
with DeltaDebugger() as dd:
    test_remove_html_markup(patches)

In [105]:
reduced_patches = dd.min_args()['patches']
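
Conceptually, `min_args()` searches for a small subset of patches that still produces the failure. The following is a simplified sketch of such a reduction, using only the "remove a chunk and retest" step; it is not the implementation in `DeltaDebugger`. The function `ddmin()` and the predicate `fails()` are hypothetical, and the integers 0 to 3 merely stand in for the four patches, under the assumption that the failure needs patches 0 and 3 together.

```python
def ddmin(changes, fails):
    """Return a sublist of `changes` that still fails, but from which no
    single element can be removed without losing the failure
    (assuming the empty list passes)."""
    assert fails(changes), "the full set of changes must fail"
    n = 2  # number of chunks to split `changes` into
    while len(changes) >= 2:
        chunk = max(len(changes) // n, 1)
        reduced = False
        for start in range(0, len(changes), chunk):
            # Try the list without the chunk starting at `start`
            complement = changes[:start] + changes[start + chunk:]
            if fails(complement):
                changes = complement
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(changes):
                break  # every single element is needed
            n = min(n * 2, len(changes))  # retry with finer chunks
    return changes

def fails(subset):
    # Assumption for this sketch: the failure needs changes 0 and 3
    return 0 in subset and 3 in subset

minimal = ddmin([0, 1, 2, 3], fails)
print(minimal)
```

Under this assumption, the sketch ends up with changes 0 and 3, which mirrors the two patches printed below.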

In [106]:
for p in reduced_patches:
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif



In [107]:
print_content(patch(version_1, reduced_patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

Can we narrow this down even further? Yes!

In [108]:
pass_patches, fail_patches, diffs = (arg['patches'] for arg in dd.min_arg_diff())

In [109]:
print_content(patch(version_1, pass_patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [110]:
print_content(patch(version_1, fail_patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

The difference between the passing and the failing version is a single patch:

In [111]:
for p in diffs:
    print(urllib.parse.unquote(str(p)))

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif



We learn that a single change introduced the failure: the new `elif` branch that toggles `quote`.
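
The idea of a minimal difference between a passing and a failing set of changes can be illustrated with a much simpler search than Delta Debugging. The sketch below adds changes one at a time until the test fails for the first time. The name `isolate_difference()` and the predicate `fails()` are hypothetical, the integers stand in for patches, and the sketch assumes that every intermediate version can be built and tested.

```python
def isolate_difference(changes, fails):
    """Return (passing, failing, difference), where `failing` extends
    `passing` by exactly the one change in `difference`."""
    assert not fails([]), "the unchanged version must pass"
    assert fails(changes), "the fully changed version must fail"
    passing = []
    for change in changes:
        candidate = passing + [change]
        if fails(candidate):
            # First failure: `change` is the failure-inducing difference
            return passing, candidate, [change]
        passing = candidate
    raise AssertionError("unreachable: the full set was checked to fail")

def fails(subset):
    # Assumption for this sketch: the failure needs changes 0 and 3
    return 0 in subset and 3 in subset

result = isolate_difference([0, 1, 2, 3], fails)
print(result)
```

The passing and failing sets found this way may differ from those reported by `min_arg_diff()`, but the difference is again a single change. A linear scan needs one test per change, whereas Delta Debugging can also cope with larger histories and with subsets that cannot be tested.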

## A ChangeDebugger class

Let us put all this together in a single class.

In [112]:
from DeltaDebugger import DeltaDebugger, NotFailingError

In [113]:
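class ChangeDebugger(DeltaDebugger):
    """Determine failure-inducing changes between two versions of a source."""

    def __init__(self, pass_source, fail_source, **ddargs):
        """`pass_source` is the source code of the passing version,
        `fail_source` the source code of the failing version.
        Other arguments are passed on to `DeltaDebugger`."""
        super().__init__(**ddargs)
        self._pass_source = pass_source
        self._fail_source = fail_source
        self._patches = diff(pass_source, fail_source)

    def pass_source(self):
        """Return the source code of the passing version."""
        return self._pass_source

    def fail_source(self):
        """Return the source code of the failing version."""
        return self._fail_source

    def patches(self):
        """Return the list of patches between the two versions."""
        return self._patches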

In [114]:
def test_remove_html_markup():
    assert remove_html_markup('"foo"') == '"foo"'

In [115]:
with ChangeDebugger(version_1, version_2) as cd:
    test_remove_html_markup()

In [116]:
with ExpectError(AssertionError):
    cd.call()

Traceback (most recent call last):
  File "<ipython-input-116-1cdc3d223902>", line 2, in <module>
    cd.call()
  File "/Users/zeller/Projects/debuggingbook/notebooks/DeltaDebugger.ipynb", line 162, in call
    return self.function()(**args)
  File "<ipython-input-114-e1b019d948aa>", line 2, in test_remove_html_markup
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


In [117]:
print_content(cd.pass_source(), '.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [118]:
print_content(cd.fail_source(), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [119]:
cd.patches()

[<diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc7ba20>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc7bcc0>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc7b2b0>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc7b0f0>]

Now for the testing:

In [120]:
class ChangeDebugger(ChangeDebugger):
    def test_patches(self, patches):
        new_version = patch(self.pass_source(), patches)
        exec(new_version, globals())
        self.call()
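
The `exec(new_version, globals())` call is what makes the patched code visible to the test: it redefines the function in the global namespace, and the test function looks up that name anew on each call. A minimal sketch of this mechanism, with the hypothetical names `greeting()` and `test_greeting()`:

```python
def greeting():
    return "old"

def test_greeting():
    # `greeting` is looked up in the global namespace at call time
    return greeting()

before = test_greeting()

# Redefine `greeting` from source text, as `test_patches()` does
exec("def greeting():\n    return 'new'\n", globals())

after = test_greeting()
print(before, after)
```

The flip side of this design is that every tested version overwrites the previous definition, so whichever version was executed last stays in place after the debugging session.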

In [121]:
with ChangeDebugger(version_1, version_2, log=True) as cd:
    test_remove_html_markup()

Observed test_remove_html_markup() raising AssertionError


In [122]:
cd.function()

<function __main__.test_remove_html_markup()>

In [123]:
cd.test_patches([])

In [124]:
with ExpectError(AssertionError):
    cd.test_patches(cd.patches())

Traceback (most recent call last):
  File "<ipython-input-124-27b59634b5f9>", line 2, in <module>
    cd.test_patches(cd.patches())
  File "<ipython-input-120-c2b7a799cf41>", line 5, in test_patches
    self.call()
  File "/Users/zeller/Projects/debuggingbook/notebooks/DeltaDebugger.ipynb", line 162, in call
    return self.function()(**args)
  File "<ipython-input-114-e1b019d948aa>", line 2, in test_remove_html_markup
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


Here's where `ChangeDebugger` applies the `DeltaDebugger` functionality to its own `test_patches()` method:

In [125]:
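class ChangeDebugger(ChangeDebugger):
    def min_patches(self):
        """Return a triple (`pass_patches`, `fail_patches`, `diff_patches`)."""
        patches = self.patches()
        with self:
            self.test_patches(patches)
        # Query this debugger itself, not some global `dd` object
        return tuple(arg['patches'] for arg in self.min_arg_diff())

    def __repr__(self):
        """Return the failure-inducing change(s) in readable patch format."""
        pass_patches, fail_patches, diff_patches = self.min_patches()
        return "".join(urllib.parse.unquote(str(p)) for p in diff_patches)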

In [126]:
with ChangeDebugger(version_1, version_2) as cd:
    test_remove_html_markup()

In [127]:
cd.patches()

[<diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc8d2b0>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc8d358>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc8d1d0>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7faedbc8d048>]

In [128]:
pass_patches, fail_patches, diffs = cd.min_patches()
diffs

[<diff_match_patch.diff_match_patch.patch_obj at 0x7faedab904a8>]

In [129]:
cd

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

Success!
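
Why would this one change break the test? In Python, `and` binds more tightly than `or`, so the new condition is parsed as `c == '"' or (c == "'" and tag)`: a double quote toggles `quote` even outside of tags. The sketch below contrasts the condition as written with a parenthesized variant; the helper names are hypothetical, and the parenthesized version is only an assumption about what was intended.

```python
def is_quote_as_written(c, tag):
    # Parsed as: c == '"' or (c == "'" and tag)
    return c == '"' or c == "'" and tag

def is_quote_parenthesized(c, tag):
    # Assumed intent: either quote character counts, but only inside a tag
    return (c == '"' or c == "'") and tag

outside_tag = False
as_written = is_quote_as_written('"', outside_tag)
parenthesized = is_quote_parenthesized('"', outside_tag)
print(as_written, parenthesized)
```

Outside a tag, the condition as written treats `"` as a quote delimiter, so the character is consumed instead of being copied to the output. This is what the test input `"foo"` exercises.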

Does this also work for longer change histories? Let's take the very first and the very last version.

In [130]:
version_8 = get_output(['git', 'show',
                        f'{versions[7]}:remove_html_markup.py'])

In [131]:
with ChangeDebugger(version_1, version_8) as cd:
    test_remove_html_markup()

In [132]:
for p in cd.patches():
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

     

@@ -146,48 +146,45 @@
 rue

-        elif c == '>':  # end of markup

+        elif c == '>' and not quote:

     

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

@@ -324,16 +324,82 @@
 out + c

+
    # postcondition
    assert '<' not in out and '>' not in out

 
    ret



Out of these five patches, `ChangeDebugger` again isolates the one that induces the failure:

In [133]:
cd

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

What happens if the function does not fail? Then the checks inherited from `DeltaDebugger` take over and raise a `NotFailingError`.

In [134]:
with ExpectError(NotFailingError):
    with ChangeDebugger(version_1, version_2) as cd:
        remove_html_markup("foo")

Traceback (most recent call last):
  File "<ipython-input-134-99eea347b65d>", line 3, in <module>
    remove_html_markup("foo")
  File "/Users/zeller/Projects/debuggingbook/notebooks/DeltaDebugger.ipynb", line 149, in __exit__
    self.after_collection()
  File "/Users/zeller/Projects/debuggingbook/notebooks/DeltaDebugger.ipynb", line 518, in after_collection
    raise NotFailingError(f"{self.format_call()} did not raise an exception")
DeltaDebugger.NotFailingError: remove_html_markup(s='foo') did not raise an exception (expected)
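
The traceback shows that the error is raised when the `with` block is left. As a rough illustration of that pattern only, here is a toy context manager that complains in `__exit__` if its body did not fail. The names `NotFailing` and `FailureRecorder` are hypothetical; the actual `DeltaDebugger` does considerably more, such as recording the call and its arguments.

```python
class NotFailing(Exception):
    """Hypothetical stand-in for DeltaDebugger's NotFailingError."""

class FailureRecorder:
    """Toy context manager that expects its body to raise an exception."""

    def __init__(self):
        self.exception = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            # Nothing failed, so there is nothing to debug
            raise NotFailing("the body did not raise an exception")
        self.exception = exc_value
        return True  # the failure is recorded; do not propagate it

# A failing body: the exception is recorded and suppressed
with FailureRecorder() as rec:
    raise AssertionError("test failed")

# A passing body: leaving the block raises NotFailing
message = None
try:
    with FailureRecorder():
        pass
except NotFailing as exc:
    message = str(exc)
```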


## Synopsis

This chapter introduces a class `ChangeDebugger` that automatically determines failure-inducing code changes.

### High-Level Interface

Given two versions of a source code, `source_pass` and `source_fail`, where `failing_function()` passes with `source_pass` but raises an exception with `source_fail`, you can use `ChangeDebugger` as follows:

```python
with ChangeDebugger(source_pass, source_fail) as cd:
    failing_function()
cd
```

This will produce the failure-inducing change between `source_pass` and `source_fail`.

In [135]:
print(version_1)

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out



In [136]:
print(version_2)

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out



In [137]:
with ChangeDebugger(version_1, version_2) as cd:
    test_remove_html_markup()
cd

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif

A programmatic interface is also available. The method `min_patches()` returns a triple (`pass_patches`, `fail_patches`, `diffs`) where

* applying `pass_patches` causes the call to pass,
* applying `fail_patches` causes the call to fail, and
* `diffs` is the (minimal) difference between the two.

In [138]:
pass_patches, fail_patches, diffs = cd.min_patches()

In [139]:
for p in diffs:
    print(urllib.parse.unquote(str(p)))

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif



### Supporting Functions

`ChangeDebugger` relies on the lower-level functions `patch()` and `diff()`.

To apply patch objects to source code, use the `patch()` function. It takes the source code as a string, together with a list of patches to be applied, and returns the patched source code.

In [140]:
print(patch(version_1, diffs))

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out



Conversely, the `diff()` function computes patches between two texts. It returns a list of patch objects that can be applied on text.

In [141]:
for p in diff(version_1, version_2):
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

     

@@ -146,48 +146,45 @@
 rue

-        elif c == '>':  # end of markup

+        elif c == '>' and not quote:

     

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif
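
For comparison only, Python's standard library can also compute differences between two texts. The sketch below uses `difflib.unified_diff()` on two short strings; it is not what `diff()` in this chapter is based on. One visible difference: `difflib` works on whole lines, whereas the patches above operate on character ranges, as the partial context ` rue` in the output suggests.

```python
import difflib

old = 'tag = False\nout = ""\n'
new = 'tag = False\nquote = False\nout = ""\n'

# unified_diff() expects sequences of lines and yields diff lines lazily
diff_lines = list(difflib.unified_diff(
    old.splitlines(keepends=True),
    new.splitlines(keepends=True),
    fromfile="version_1", tofile="version_2"))

print("".join(diff_lines))
```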



The `ChangeDebugger` class uses [Delta Debugging](DeltaDebugger.ipynb) to determine minimal differences in patches applied.

## Background

The approach of using Delta Debugging to isolate failure-inducing code changes goes back to \cite{Zeller1999}.
