# Failure-Inducing Changes

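This chapter shows how to find out which change made a program fail. Given a version history in which an early version passes a test and the latest version fails it, we first use `git bisect` to locate the first failing version. We then apply delta debugging to the patches between two versions to isolate the failure-inducing change.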

In [1]:
from bookutils import YouTubeVideo
YouTubeVideo("w4u5gCgPlmg")

**Prerequisites**

* _Refer to earlier chapters as notebooks here, as here:_ [Earlier Chapter](Fuzzer.ipynb).

In [2]:
import bookutils

In [3]:
from bookutils import quiz, print_file, print_content

## Synopsis
<!-- Automatically generated. Do not edit. -->

To [use the code provided in this chapter](Importing.ipynb), write

```python
>>> from debuggingbook.ChangeDebugger import <identifier>
```

and then make use of the following features.


_For those only interested in using the code in this chapter (without wanting to know how it works), give an example.  This will be copied to the beginning of the chapter (before the first section) as text with rendered input and output._

You can use `int_fuzzer()` as:

```python
>>> print(int_fuzzer())
76.5
```


## A Version History

Using the `remove_html_markup()` versions from [the introduction to debugging](Intro_Debugging.ipynb) and [the chapter on assertions](Assertions.ipynb), we create a little version history.

### Create a Working Directory

In [4]:
PROJECT = 'my_project'

In [5]:
import os
import shutil

In [6]:
try:
    shutil.rmtree(PROJECT)
except FileNotFoundError:
    pass
os.mkdir(PROJECT)

In [7]:
import sys

In [8]:
sys.path.append(os.getcwd())
os.chdir(PROJECT)

### Initialize git

In [9]:
!git init

Initialized empty Git repository in /Users/zeller/Projects/debuggingbook/notebooks/my_project/.git/


In [10]:
!git config advice.detachedHead False

In [11]:
def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [12]:
import inspect

In [13]:
def write_source(fun, filename=None):
    if filename is None:
        filename = fun.__name__ + '.py'
    with open(filename, 'w') as fh:
        fh.write(inspect.getsource(fun))

In [14]:
write_source(remove_html_markup)

In [15]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [16]:
!git add remove_html_markup.py

In [17]:
!git commit -m "First version"

[master (root-commit) 6a89ebb] First version
 1 file changed, 13 insertions(+)
 create mode 100644 remove_html_markup.py


In [18]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [19]:
write_source(remove_html_markup)

In [20]:
!git diff remove_html_markup.py

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [21]:
!git commit -m "Second version" remove_html_markup.py

[master aa2ff76] Second version
 1 file changed, 5 insertions(+), 2 deletions(-)


In [22]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        print("c =", repr(c), "tag =", tag, "quote =", quote)

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [23]:
write_source(remove_html_markup)

In [24]:
!git commit -m "Third version (with debugging output)" remove_html_markup.py

[master 27256a0] Third version (with debugging output)
 1 file changed, 2 insertions(+)


In [25]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':  # and not quote:
            tag = True
        elif c == '>':  # and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [26]:
write_source(remove_html_markup)

In [27]:
!git commit -m "Fourth version (clueless)" remove_html_markup.py

[master 437cf32] Fourth version (clueless)
 1 file changed, 2 insertions(+), 4 deletions(-)


In [28]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        assert not tag  # <=== Just added

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [29]:
write_source(remove_html_markup)

In [30]:
!git commit -m "Fifth version (with assert)" remove_html_markup.py

[master fad2251] Fifth version (with assert)
 1 file changed, 4 insertions(+), 2 deletions(-)


In [31]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            assert False  # <=== Just added
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [32]:
write_source(remove_html_markup)

In [33]:
!git commit -m "Sixth version (with another assert)" remove_html_markup.py

[master 9e08431] Sixth version (with another assert)
 1 file changed, 1 insertion(+), 2 deletions(-)


In [34]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif (c == '"' or c == "'") and tag:  # <-- FIX
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [35]:
write_source(remove_html_markup)

In [36]:
!git commit -m "Seventh version (fixed)" remove_html_markup.py

[master d63e37e] Seventh version (fixed)
 1 file changed, 1 insertion(+), 2 deletions(-)


In [37]:
def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    # postcondition
    assert '<' not in out and '>' not in out

    return out

In [38]:
write_source(remove_html_markup)

In [39]:
!git commit -m "Eighth version (with proper assertion)" remove_html_markup.py

[master 0ddc57a] Eighth version (with proper assertion)
 1 file changed, 4 insertions(+), 1 deletion(-)


The latest version has an error: double quotes in plain text input are stripped from the output.

In [40]:
from ExpectError import ExpectError

In [41]:
with ExpectError():
    assert remove_html_markup('"foo"') == '"foo"'

Traceback (most recent call last):
  File "<ipython-input-41-839bf760c807>", line 2, in <module>
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


Which version introduced this error?

## Accessing Versions

In [42]:
!git log --pretty=oneline

0ddc57aa0f166b2bbb2d27ad4eff1088dacf57ce (HEAD -> master) Eighth version (with proper assertion)
d63e37e873ae6bde6327c1fd4255e9b8afdf810b Seventh version (fixed)
9e08431e857fcec624fcf3910c01a7d39cb6da4b Sixth version (with another assert)
fad2251458dba82168d553348fb78513f6867820 Fifth version (with assert)
437cf325fc775b632301755150d5f79117e7cb20 Fourth version (clueless)
27256a09550f1e5c167227df98909663522ab882 Third version (with debugging output)
aa2ff76422281874ae6881359c04f3ee3cda38f1 Second version
6a89ebba324d9695660b7b11586e42d09063aa55 First version


In [43]:
import subprocess

In [44]:
def get_output(command):
    result = subprocess.run(command, 
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout

In [45]:
log = get_output(['git', 'log', '--pretty=oneline'])
print(log)

0ddc57aa0f166b2bbb2d27ad4eff1088dacf57ce Eighth version (with proper assertion)
d63e37e873ae6bde6327c1fd4255e9b8afdf810b Seventh version (fixed)
9e08431e857fcec624fcf3910c01a7d39cb6da4b Sixth version (with another assert)
fad2251458dba82168d553348fb78513f6867820 Fifth version (with assert)
437cf325fc775b632301755150d5f79117e7cb20 Fourth version (clueless)
27256a09550f1e5c167227df98909663522ab882 Third version (with debugging output)
aa2ff76422281874ae6881359c04f3ee3cda38f1 Second version
6a89ebba324d9695660b7b11586e42d09063aa55 First version



In [46]:
versions = [line.split()[0] for line in log.split('\n') if line]
versions.reverse()

In [47]:
!git checkout {versions[0]}

HEAD is now at 6a89ebb First version


In [48]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [49]:
exec(open('remove_html_markup.py').read())

In [50]:
remove_html_markup('"foo"')

'"foo"'

In [51]:
!git checkout {versions[7]}

Previous HEAD position was 6a89ebb First version
HEAD is now at 0ddc57a Eighth version (with proper assertion)


In [52]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    # postcondition
    assert '<' not

In [53]:
exec(open('remove_html_markup.py').read())

In [54]:
remove_html_markup('"foo"')

'foo'
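
Each `exec()` call above redefines `remove_html_markup()` in the notebook's global namespace, so only one version is available at a time. As a minimal sketch of an alternative, one could execute each version in its own namespace and keep several versions side by side. The helper `load_function()` and the two toy sources below are hypothetical and only stand in for files checked out from git.

```python
def load_function(source, name):
    """Execute `source` in a fresh namespace and return the object called `name`.

    Hypothetical helper; not part of this chapter's code.
    """
    namespace = {}
    exec(source, namespace)
    return namespace[name]


# Two toy "versions" standing in for files checked out from git
OLD_SOURCE = "def f(x):\n    return x\n"
NEW_SOURCE = "def f(x):\n    return x.strip('\"')\n"

old_f = load_function(OLD_SOURCE, 'f')
new_f = load_function(NEW_SOURCE, 'f')

print(old_f('"foo"'))  # the old version keeps the quotes
print(new_f('"foo"'))  # the new version strips them
```

Both functions remain callable at the same time, which makes it easy to compare their results on the same input.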

## Manual Bisecting

In [55]:
!git bisect start

In [56]:
!git bisect good {versions[0]}

In [57]:
!git bisect bad {versions[7]}

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[437cf325fc775b632301755150d5f79117e7cb20] Fourth version (clueless)


In [58]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':  # and not quote:
            tag = True
        elif c == '>':  # and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [59]:
exec(open('remove_html_markup.py').read())

In [60]:
remove_html_markup('"foo"')

'foo'

In [61]:
!git bisect bad

Bisecting: 0 revisions left to test after this (roughly 1 step)
[27256a09550f1e5c167227df98909663522ab882] Third version (with debugging output)


In [62]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        print("c =", repr(c), "tag =", tag, "quote =", quote)

        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            

In [63]:
exec(open('remove_html_markup.py').read())

In [64]:
remove_html_markup('"foo"')

c = '"' tag = False quote = False
c = 'f' tag = False quote = True
c = 'o' tag = False quote = True
c = 'o' tag = False quote = True
c = '"' tag = False quote = True


'foo'

In [65]:
!git bisect bad

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[aa2ff76422281874ae6881359c04f3ee3cda38f1] Second version


In [66]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [67]:
exec(open('remove_html_markup.py').read())

In [68]:
remove_html_markup('"foo"')

'foo'

In [69]:
!git bisect bad

aa2ff76422281874ae6881359c04f3ee3cda38f1 is the first bad commit
commit aa2ff76422281874ae6881359c04f3ee3cda38f1
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 17:04:52 2020 +0100

    Second version

 remove_html_markup.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


In [70]:
!git diff HEAD^

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [71]:
!git bisect view

commit aa2ff76422281874ae6881359c04f3ee3cda38f1 (HEAD, refs/bisect/bad)
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 17:04:52 2020 +0100

    Second version


In [72]:
!git bisect reset

Previous HEAD position was aa2ff76 Second version
HEAD is now at 0ddc57a Eighth version (with proper assertion)
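
What `git bisect` does above is essentially a binary search over the version history. The following is a minimal sketch of that search, assuming a linear history in which the first version is good, the last is bad, and every version after the first bad one is bad as well. The function `first_bad_version()` and the toy history are illustrative and not part of git or of this chapter's code; git may also pick different versions to test along the way.

```python
def first_bad_version(versions, is_bad):
    """Return the index of the first version for which `is_bad` holds.

    Assumes versions[0] is good, versions[-1] is bad, and that every
    version after the first bad one is bad as well.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid    # the first bad version is at `mid` or earlier
        else:
            good = mid   # the first bad version comes after `mid`
    return bad


# Toy history: the defect enters with the second version and stays
history = ['v1', 'v2', 'v3', 'v4', 'v5', 'v6', 'v7', 'v8']
defective = {'v2', 'v3', 'v4', 'v5', 'v6', 'v7', 'v8'}

tested = []  # records which versions were actually tested


def is_bad(version):
    tested.append(version)
    return version in defective


print(history[first_bad_version(history, is_bad)])  # v2
print(tested)                                       # ['v4', 'v2']
```

With eight versions, the sketch needs only two tests to find the first bad version; in general, the number of tests grows logarithmically with the length of the history.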


## Automatic Bisecting

In [73]:
with open('test.py', 'w') as f:
    f.write('''#!/usr/bin/env python

from remove_html_markup import remove_html_markup
import sys

result = remove_html_markup('"foo"')
if result == '"foo"':
    sys.exit(0)  # good/pass
elif result == 'foo':
    sys.exit(1)  # bad/fail
else:
    sys.exit(125)  # unresolved
''')
print_file('test.py')

#!/usr/bin/env python

from remove_html_markup import remove_html_markup
import sys

result = remove_html_markup('"foo"')
if result == '"foo"':
    sys.exit(0)  # good/pass
elif result == 'foo':
    sys.exit(1)  # bad/fail
else:
    sys.exit(125)  # unresolved

In [74]:
!python ./test.py; echo $?

1


In [75]:
!git bisect start

In [76]:
!git bisect good {versions[0]}

In [77]:
!git bisect bad {versions[7]}

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[437cf325fc775b632301755150d5f79117e7cb20] Fourth version (clueless)


In [78]:
!git bisect run python test.py

running python test.py
Bisecting: 0 revisions left to test after this (roughly 1 step)
[27256a09550f1e5c167227df98909663522ab882] Third version (with debugging output)
running python test.py
c = '"' tag = False quote = False
c = 'f' tag = False quote = True
c = 'o' tag = False quote = True
c = 'o' tag = False quote = True
c = '"' tag = False quote = True
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[aa2ff76422281874ae6881359c04f3ee3cda38f1] Second version
running python test.py
aa2ff76422281874ae6881359c04f3ee3cda38f1 is the first bad commit
commit aa2ff76422281874ae6881359c04f3ee3cda38f1
Author: Andreas Zeller <zeller@cispa.saarland>
Date:   Sun Dec 6 17:04:52 2020 +0100

    Second version

 remove_html_markup.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
bisect run success


In [79]:
print_file('remove_html_markup.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [80]:
!git diff HEAD^

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


In [81]:
!git bisect reset

Previous HEAD position was aa2ff76 Second version
HEAD is now at 0ddc57a Eighth version (with proper assertion)
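
The exit codes in `test.py` follow the convention of `git bisect run`: 0 marks a version as good, 125 marks it as untestable (it is skipped), any other code from 1 to 127 marks it as bad, and all remaining codes abort the bisection. As a small sketch of this mapping (the function `classify_exit_code()` is hypothetical and only restates the convention):

```python
def classify_exit_code(code):
    """Map the exit code of a `git bisect run` script to a verdict."""
    if code == 0:
        return 'good'
    if code == 125:
        return 'skip'   # this version cannot be tested
    if 1 <= code <= 127:
        return 'bad'
    return 'abort'      # any other code stops the bisection


for code in (0, 1, 125, 128):
    print(code, classify_exit_code(code))
```

This is why `test.py` exits with 125 for any result other than `'"foo"'` or `'foo'`: such a version neither passes nor shows the failure we are looking for.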


## Delta Debugging on Changes

In [82]:
from DeltaDebugger import DeltaDebugger

In [83]:
version_1 = get_output(['git', 'show', 
                            f'{versions[0]}:remove_html_markup.py'])

In [84]:
print_content(version_1, '.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [85]:
version_2 = get_output(['git', 'show', 
                            f'{versions[1]}:remove_html_markup.py'])

In [86]:
print_content(version_2, '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [87]:
!git diff {versions[0]} {versions[1]}

diff --git a/remove_html_markup.py b/remove_html_markup.py
index 759df23..6e41b5b 100644
--- a/remove_html_markup.py
+++ b/remove_html_markup.py
@@ -1,12 +1,15 @@
 def remove_html_markup(s):
     tag = False
+    quote = False
     out = ""
 
     for c in s:
-        if c == '<':    # start of markup
+        if c == '<' and not quote:
             tag = True
-        elif c == '>':  # end of markup
+        elif c == '>' and not quote:
             tag = False
+        elif c == '"' or c == "'" and tag:
+            quote = not quote
         elif not tag:
             out = out + c
 


We use Google's [diff-match-patch library](https://github.com/google/diff-match-patch) as a [Python module](https://github.com/JoshData/diff_match_patch-python).

In [88]:
from diff_match_patch import diff_match_patch

In [89]:
def diff(s1, s2, mode='lines'):
    dmp = diff_match_patch()
    if mode == 'lines':
        (text1, text2, linearray) = dmp.diff_linesToChars(s1, s2)
        diffs = dmp.diff_main(text1, text2)
        dmp.diff_charsToLines(diffs, linearray)
        return dmp.patch_make(diffs)

    if mode == 'chars':
        diffs = dmp.diff_main(s1, s2)
        return dmp.patch_make(s1, diffs)
        
    raise ValueError("mode must be 'lines' or 'chars'")

In [90]:
patches = diff(version_1, version_2)
patches

[<diff_match_patch.diff_match_patch.patch_obj at 0x7fa530e15f98>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7fa530e15e10>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7fa530e15f60>,
 <diff_match_patch.diff_match_patch.patch_obj at 0x7fa530c915c0>]

In [91]:
import urllib.parse

In [92]:
def patch_string(p):
    return urllib.parse.unquote(str(p).strip())

In [93]:
for p in patches:
    print(patch_string(p))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""
@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

@@ -146,48 +146,45 @@
 rue

-        elif c == '>':  # end of markup

+        elif c == '>' and not quote:

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif


In [94]:
def patch(s, patches):
    dmp = diff_match_patch()
    text, success = dmp.patch_apply(patches, s)
    assert all(success)
    return text

In [95]:
print_content(patch(version_1, patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

In [96]:
assert patch(version_1, patches) == version_2

In [97]:
assert patch(version_1, []) == version_1

In [98]:
print(patch_string(patches[0]))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""


In [99]:
print_content(patch(version_1, [patches[0]]), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [100]:
print_content(patch(version_1, [patches[1]]), '.py')

def remove_html_markup(s):
    tag = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif not tag:
            out = out + c

    return out

In [101]:
def test_remove_html_markup(patches):
    new_version = patch(version_1, patches)
    exec(new_version, globals())
    assert remove_html_markup('"foo"') == '"foo"'

In [102]:
test_remove_html_markup([])

In [103]:
with ExpectError():
    test_remove_html_markup(patches)

Traceback (most recent call last):
  File "<ipython-input-103-9f9aae950c60>", line 2, in <module>
    test_remove_html_markup(patches)
  File "<ipython-input-101-4525dbd53ca7>", line 4, in test_remove_html_markup
    assert remove_html_markup('"foo"') == '"foo"'
AssertionError (expected)


In [104]:
with DeltaDebugger() as dd:
    test_remove_html_markup(patches)

In [105]:
reduced_patches = dd.min_args()['patches']

In [106]:
for p in reduced_patches:
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif



In [107]:
print_content(patch(version_1, reduced_patches), '.py')

def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<':    # start of markup
            tag = True
        elif c == '>':  # end of markup
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c

    return out

Can we narrow this down even further? Yes!

In [108]:
pass_patches, fail_patches = (arg['patches'] for arg in dd.min_arg_diff())

In [109]:
for p in pass_patches:
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

     



In [110]:
for p in fail_patches:
    print(urllib.parse.unquote(str(p)))

@@ -32,24 +32,42 @@
 tag = False

+    quote = False

     out = ""

@@ -88,50 +88,43 @@
  s:

-        if c == '<':    # start of markup

+        if c == '<' and not quote:

     

@@ -199,24 +199,97 @@
 tag = False

+        elif c == '"' or c == "'" and tag:
            quote = not quote

         elif
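
The reduction of `patches` shown in this section can be approximated by the following simplified sketch of delta debugging. It is not the implementation used by `DeltaDebugger`: it only tries removing chunks of changes, the change names are toy stand-ins for patch objects, and `fails()` simply assumes that the failure occurs whenever both the change that adds the quote flag and the change that toggles it are applied.

```python
def ddmin(changes, fails):
    """Return a sublist of `changes` that still fails, such that removing
    any single change makes the failure go away.

    Simplified sketch; assumes `fails(changes)` is True.
    """
    assert fails(changes)
    n = 2  # number of chunks the list is split into
    while len(changes) >= 2:
        chunk_size = len(changes) // n
        reduced = False
        start = 0
        while start < len(changes):
            # Try the list without the chunk starting at `start`
            complement = changes[:start] + changes[start + chunk_size:]
            if fails(complement):
                changes = complement
                n = max(n - 1, 2)
                reduced = True
                break
            start += chunk_size
        if not reduced:
            if n == len(changes):
                break  # already tried removing every single change
            n = min(n * 2, len(changes))
    return changes


# Toy stand-ins for the four patches between the first and second version
all_changes = ['add quote flag', "check quote on '<'",
               "check quote on '>'", 'toggle quote']


def fails(changes):
    # Assumption: the failure needs both of these changes
    return {'add quote flag', 'toggle quote'} <= set(changes)


print(ddmin(all_changes, fails))  # ['add quote flag', 'toggle quote']
```

As in the run above, the two changes that only add `and not quote` to the `<` and `>` checks turn out to be irrelevant for the failure.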



## A ChangeDebugger Class

```python
with ChangeDebugger(source, patches):
    foo()
```

```python
exec(source); foo()  # -> PASS
bad_source = patch(source, patches); exec(bad_source); foo()  # -> FAIL
```

or better:

```python
with ChangeDebugger(pass_source, fail_source):
    foo()
```

Implement general dd() here. Consider extending DeltaDebugger such that it not only has `reduced_args()`, but also `maximized_args()`, and `minimal_arg_diff()`.

### Excursion: All the Details

This text will only show up on demand (HTML) or not at all (PDF). This is useful for longer implementations, or repetitive, or specialized parts.

### End of Excursion

## _Section 3_

\todo{Add}

_If you want to introduce code, it is helpful to state the most important functions, as in:_

* `random.randrange(start, end)` - return a random integer from the half-open interval [`start`, `end`), that is, excluding `end`
* `range(start, end)` - create a sequence of integers from `start` to `end` - 1.  Typically used in iterations.
* `for elem in list: body` - execute `body` in a loop with `elem` taking each value from `list`.
* `for i in range(start, end): body` - execute `body` in a loop with `i` from `start` to `end` - 1.
* `chr(n)` - return the character with Unicode code point `n` (for `n` < 128, this is its ASCII code)
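
As a quick check of these functions, the following sketch shows that both `range()` and `random.randrange()` exclude their upper bound; the bounds 1 and 5 are arbitrary values chosen for illustration.

```python
import random

# range() excludes its upper bound
print(list(range(1, 5)))  # [1, 2, 3, 4]

# randrange() draws from the same half-open interval
samples = [random.randrange(1, 5) for _ in range(1000)]
print(min(samples) >= 1 and max(samples) <= 4)  # True

# chr() maps a code point to its character
print(chr(65))  # A
```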

In [111]:
import random

In [112]:
def int_fuzzer():
    """A simple function that returns a random number of the form n + 0.5"""
    return random.randrange(1, 100) + 0.5

In [113]:
# More code
pass

## _Section 4_

\todo{Add}

## Synopsis

_For those only interested in using the code in this chapter (without wanting to know how it works), give an example.  This will be copied to the beginning of the chapter (before the first section) as text with rendered input and output._

You can use `int_fuzzer()` as:

In [114]:
print(int_fuzzer())

76.5


## Lessons Learned

* _Lesson one_
* _Lesson two_
* _Lesson three_

## Next Steps

_Link to subsequent chapters (notebooks) here, as in:_

* [use _mutations_ on existing inputs to get more valid inputs](MutationFuzzer.ipynb)
* [use _grammars_ (i.e., a specification of the input format) to get even more valid inputs](Grammars.ipynb)
* [reduce _failing inputs_ for efficient debugging](Reducer.ipynb)


## Background

_Cite relevant works in the literature and put them into context, as in:_

The idea of ensuring that each expansion in the grammar is used at least once goes back to Burkhardt \cite{Burkhardt1967}, to be later rediscovered by Paul Purdom \cite{Purdom1972}.

## Exercises

_Close the chapter with a few exercises such that people have things to do.  To make the solutions hidden (to be revealed by the user), have them start with_

```
**Solution.**
```

_Your solution can then extend up to the next title (i.e., any markdown cell starting with `#`)._

_Running `make metadata` will automatically add metadata to the cells such that the cells will be hidden by default, and can be uncovered by the user.  The button will be introduced above the solution._

### Exercise 1: _Title_

_Text of the exercise_

In [115]:
# Some code that is part of the exercise
pass

_Some more text for the exercise_

**Solution.** _Some text for the solution_

In [116]:
# Some code for the solution
2 + 2

4

_Some more text for the solution_

### Exercise 2: _Title_

_Text of the exercise_

**Solution.** _Solution for the exercise_