# Information Flow

In this chapter, we explore in depth how to track information flows in Python by tainting input strings and tracking the taint across string operations.

Some material on `eval` exploitation is adapted from the excellent [blog post](https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html) by Ned Batchelder.

**Prerequisites**

* You should have read the [chapter on coverage](Coverage.ipynb).
* Some knowledge of inheritance in Python is required.

We first set up our infrastructure so that we can make use of previously defined functions.

In [None]:
import fuzzingbook_utils

In [None]:
from ExpectError import ExpectError

In [None]:
import inspect
import enum

Say we want to implement a *calculator* service in Python. A rather easy way to do that is to rely on the `eval()` function in Python. However, unrestricted `eval()` can be used by users to execute arbitrary commands. Since we want to restrict our users to using only the *calculator* functionality, and do not want the users to trash our server, we use `eval()` with empty `locals` and `globals` (as recommended [elsewhere](https://www.programiz.com/python-programming/methods/built-in/eval)).

In [None]:
def my_calculator(my_input):
    result = eval(my_input, {}, {})
    print("The result of %s was %d" % (my_input, result))

It works as expected:

In [None]:
my_calculator('1+2')

Does it?

In [None]:
with ExpectError():
    my_calculator('__import__("os").popen("ls").read()')

As you can see from the error, `eval()` completed successfully, with the system command `ls` executing successfully. It is easy enough for the user to see the output if needed.

In [None]:
my_calculator("1 if __builtins__['print'](__import__('os').popen('pwd').read()) else 0")

The problem is that the Python `__builtins__` is [inserted by default](https://docs.python.org/3/library/functions.html#eval) when one uses `eval()`. We can avoid this by restricting `__builtins__` in `eval` explicitly (again as recommended [elsewhere](http://lybniz2.sourceforge.net/safeeval.html)).

In [None]:
def my_calculator(my_input):
    result = eval(my_input, {"__builtins__":None}, {})
    print("The result of %s was %d" % (my_input, result))

Does it help?

In [None]:
with ExpectError():
    my_calculator("1 if __builtins__['print'](__import__('os').popen('pwd').read()) else 0")

But does it actually?

In [None]:
my_calculator("1 if [x['print'](x['__import__']('os').popen('pwd').read()) for x in ([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Sized'][0].__len__.__globals__['__builtins__'],)] else 0")

The problem here is that when the user has a way to inject **uninterpreted strings** that can reach a dangerous routine such as `eval()` or `exec()`, they can inject dangerous code. What we need is a way to prevent uninterpreted input string fragments from reaching dangerous portions of code.

## A Simple Taint Tracker

For capturing information flows we need a new string class. The idea is to use the new tainted string class `tstr` as a wrapper on the original `str` class. However, `str` is an *immutable* class: its value is fixed in `__new__()`, before `__init__()` runs, and `str.__new__()` receives every constructor argument. Extra arguments such as a taint or a parent are rejected there, so a subclass of `str` cannot accept them through `__init__()` alone. If we want our own initialization routine with additional arguments, we need to [hook into `__new__`](https://docs.python.org/3/reference/datamodel.html#basic-customization) and return an instance of our own class.

We need to write the `__new__()` method because we want to track the parent object responsible for the taint during our initialization `tstr.__init__()`. Hence, we define a class `tstr_` that subclasses `str`, and enables its subclasses to initialize using `__init__()`.

In [None]:
class tstr_(str):
    def __new__(cls, value, *args, **kw):
        return super(tstr_, cls).__new__(cls, value)
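
A minimal sketch of what the `__new__()` hook buys us. The class names `NoHook` and `WithHook` are hypothetical and only serve this illustration; the extra `taint` argument stands in for whatever additional state a subclass wants to accept.

```python
class NoHook(str):
    # Only __init__() is overridden, so str.__new__() still sees all arguments.
    def __init__(self, value, taint=None):
        self.taint = taint


class WithHook(str):
    # __new__() passes only the string value on to str.__new__().
    def __new__(cls, value, *args, **kw):
        return super().__new__(cls, value)

    def __init__(self, value, taint=None):
        self.taint = taint


try:
    NoHook('hello', 0)
    rejected = False
except TypeError:
    # str.__new__() does not know what to do with the extra argument
    rejected = True

hooked = WithHook('hello', 0)
print(rejected, hooked, hooked.taint)
```

With the hook in place, the extra argument reaches `__init__()` and the string value is still set correctly.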

There are various levels of taint tracking that one can perform. The simplest is to track that a string fragment originated in an untrusted environment, and has not undergone a taint removal process. For this, we simply need to wrap the original string in the untrusted environment with `tstr`, and produce `tstr` instances on each operation that results in another string fragment. Distinguishing various untrusted sources may be accomplished by tainting each source separately (called *colors* in dynamic taint research). You will see an instance of this technique in the chapter on [Grammar Mining](GrammarMining.ipynb).

In this chapter, we carry *character level* taints. That is, given a fragment that resulted from a portion of the original tainted string, one will be able to tell which portion of the input string the fragment was taken from. In essence, each input character index from a tainted source gets its own color.
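
As a rough illustration of character-level taints, independent of the `tstr` class defined below, one can think of a plain list of input indexes that is kept in lockstep with the string. The variable names here are made up for the example.

```python
source = 'hello world'
# One taint index ("color") per input character.
taint = list(range(len(source)))

# Taking a fragment slices the string and its taint list in lockstep.
fragment, fragment_taint = source[6:], taint[6:]

# The taint tells us which input positions the fragment came from.
origin = ''.join(source[i] for i in fragment_taint)
print(fragment_taint, origin)
```

The rest of the chapter packages this bookkeeping into a string subclass so that it happens automatically.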

More complex tainting such as *bitmap taints* is possible, where a single character may result from multiple tainted character indexes (as in *checksum* operations on strings). We do not consider these in this chapter.

We now define our initialization code in `__init__()`.

The variable `taint` contains non-overlapping taints mapped to the original string. The variable `parent` holds a reference to the `tstr` instance from which this instance was derived.

In [None]:
class tstr(tstr_):
    def __init__(self, value, taint=None, parent=None, **kwargs):
        self.parent = parent
        l = len(self)
        if not taint:
            taint = 0
        self.taint = list(range(taint, taint + l)) if isinstance(taint, int) else taint
        assert len(self.taint) == l

    def __repr__(self):
        return str.__repr__(self)

    def __str__(self):
        return str.__str__(self)

For example, if we wrap `hello` in `tstr`, then we should be able to access its taint at indices `0..4`.

In [None]:
t = tstr('hello')
t.taint

We can also specify the starting taint, which here yields the indices `6..10`.

In [None]:
t = tstr('world', taint = 6)
t.taint

`repr()` and `str()` return untainted `str` instances.

In [None]:
type(str(t))

By default, when we wrap a string, it is tainted. Hence we also need a way to untaint the string. One way is to simply return a `str` instance as above. However, one may sometimes wish to remove taint from an existing instance. This is accomplished with `untaint()`. During `untaint()`, we simply set the taint indexes to `-1`. This method comes with a companion method `has_taint()`, which checks whether a `tstr` instance is currently tainted.

In [None]:
class tstr(tstr):
    def untaint(self):
        self.taint = [-1] * len(self)
        return self

    def has_taint(self):
        return any(True for i in self.taint if i >= 0)

In [None]:
t = tstr('hello world')
t.untaint()
t.has_taint()

While the basic tainted string creation works, we have not completed the taint transition. For example, getting a substring does not transfer taint from the original string.

In [None]:
with ExpectError():
    t = tstr('hello world')
    t[0:5].has_taint()

In Python, substring extraction as shown above is implemented through `slice` objects passed to `__getitem__()`. We implement this next.

### Create

We need to create new substrings that are wrapped in `tstr`. However, we also want to allow our subclasses to create their own instances. Hence we provide a `create()` method that produces a new `tstr` instance.

In [None]:
class tstr(tstr):
    def create(self, res, taint):
        return tstr(res, taint, self)

In [None]:
hello = tstr('hello')
world = hello.create('world', 6)
world.parent.taint, world.taint

### Index

In Python, indexing is provided through `__getitem__()`. Indexing on positive integers is simple enough. However, it has two additional wrinkles. The first is that a negative index counts backwards from the end of the string, which lies just after the last character. That is, the last character has the index `-1`.

In [None]:
class tstr(tstr):
    def __getitem__(self, key):
        res = super().__getitem__(key)
        if type(key) == int:
            key = len(self) + key if key < 0 else key
            return self.create(res, [self.taint[key]])
        elif type(key) == slice:
            return self.create(res, self.taint[key])
        else:
            assert False

In [None]:
hello = tstr('hello')
hello[0], hello[-1]

The other wrinkle is that `__getitem__()` can accept a slice. We discuss this next.

### Slice

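Slicing with `[n:m]` is already covered by `__getitem__()` above, which receives a `slice` object as its key. Iteration is a different matter: the built-in `str` iterator yields plain `str` characters, which would lose the taint. Hence, we define the `__iter__()` method, which returns a custom `iterator`.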

In [None]:
class tstr(tstr):
    def __iter__(self):
        return tstr_iterator(self)

#### The iterator class
The `__iter__()` method requires a supporting `iterator` object. The `iterator` is used to save the state of the current iteration, which it does by keeping a reference to the original `tstr`, and the current index of iteration `_str_idx`.

In [None]:
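class tstr_iterator():
    def __init__(self, tstr):
        self._tstr = tstr
        self._str_idx = 0

    def __iter__(self):
        # an iterator is itself iterable
        return self

    def __next__(self):
        if self._str_idx == len(self._tstr):
            raise StopIteration
        # indexing goes through tstr.__getitem__(), so c keeps its taint
        c = self._tstr[self._str_idx]
        # isinstance() also admits subclasses of tstr
        assert isinstance(c, tstr)
        self._str_idx += 1
        return c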
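
The following sketch shows why a custom iterator is needed at all. It uses two hypothetical classes, `OnlyGetitem` and `WithIter`, that are not part of the chapter's code: overriding `__getitem__()` alone changes indexing and slicing, but the built-in iteration of `str` still hands out plain characters.

```python
class OnlyGetitem(str):
    # Indexing and slicing return instances of our class...
    def __getitem__(self, key):
        return OnlyGetitem(super().__getitem__(key))


class WithIter(OnlyGetitem):
    # ...but iteration needs its own hook to go through __getitem__().
    def __iter__(self):
        return (self[i] for i in range(len(self)))


sliced = OnlyGetitem('hello')[0:2]
plain_chars = [type(c) for c in OnlyGetitem('hi')]
wrapped_chars = [type(c) for c in WithIter('hi')]
print(type(sliced), plain_chars, wrapped_chars)
```

The sketch uses a generator expression in place of a separate iterator class; the explicit class used in this chapter makes the iteration state easier to inspect.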

Bringing all these together:

In [None]:
t = tstr('hello world')
t[0:5].has_taint()

### Concatenation

If two tainted strings are concatenated together, it may be desirable to transfer the taints from each to the corresponding portion of the resulting string. The concatenation of strings is accomplished by overriding `__add__()`.

In [None]:
class tstr(tstr):
    def __add__(self, other):
        # isinstance() also recognizes subclasses of tstr
        if isinstance(other, tstr):
            return self.create(str.__add__(self, other), (self.taint + other.taint))
        else:
            return self.create(str.__add__(self, other), (self.taint + [-1 for i in other]))

Testing concatenations between two `tstr` instances:

In [None]:
my_str1 = tstr("hello")
my_str2 = tstr("world", taint=6)
v = my_str1 + my_str2
print(v.taint)

What if a `tstr` is concatenated with a `str`?

In [None]:
my_str3 = "bye"
w = my_str1 + my_str3 + my_str2
print(w.taint)

One wrinkle here is that when adding a `tstr` and a `str`, the user may place the `str` first, in which case the `__add__()` method will be called on the `str` instance, not on the `tstr` instance. However, Python provides a solution. Because `tstr` is a subclass of `str`, if one defines `__radd__()` on `tstr`, that method will be called rather than `str.__add__()`.

In [None]:
class tstr(tstr):
    def __radd__(self, other):
        taint = other.taint if type(other) is tstr else [-1 for i in other]
        return self.create(str.__add__(other, self), (taint + self.taint))

We test it out:

In [None]:
my_str1 = "hello"
my_str2 = tstr("world")
v = my_str1 + my_str2
v.taint
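
To make the dispatch order visible, here is a small sketch with a hypothetical class `marked` (not part of the chapter's code) whose operators simply report which method was invoked.

```python
class marked(str):
    def __add__(self, other):
        return ('__add__', str(self), str(other))

    def __radd__(self, other):
        return ('__radd__', str(other), str(self))


left_first = marked('a') + 'b'    # marked is the left operand
right_first = 'a' + marked('b')   # plain str is the left operand
print(left_first, right_first)
```

In both cases the subclass gets to handle the concatenation, which is what allows `tstr` to preserve taints regardless of operand order.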

These two operations, slicing and concatenation, are sufficient to implement other string methods that result in a string and do not change the characters underneath (i.e., no case change). Hence, we look at a helper method next.

### Extract tainted string

Given a specific input index, the method `x()` extracts the corresponding tainted portion from a `tstr`. As a convenience it supports `slices` along with `ints`.

In [None]:
class tstr(tstr):
    class TaintException(Exception):
        pass

    def x(self, i=0):
        if not self.taint:
            # TaintException is defined on the class, not on a name `taint`
            raise tstr.TaintException('Invalid request idx')
        if isinstance(i, int):
            return [self[p] for p in [k for k,j in enumerate(self.taint) if j == i]]
        elif isinstance(i, slice):
            r = range(i.start or 0, i.stop or len(self), i.step or 1)
            return [self[p] for p in [k for k,j in enumerate(self.taint) if j in r]]

In [None]:
my_str = tstr('abcdefghijkl', taint=100)

In [None]:
my_str.x(101)

In [None]:
my_str.x(slice(101,105))

### Replace

The `replace()` method replaces a portion of the string with another.

In [None]:
class tstr(tstr):
    def replace(self, a, b, n=None):
        old_taint = self.taint
        b_taint = b.taint if isinstance(b, tstr) else [-1] * len(b)
        mystr = str(self)
        i = 0
        start = 0  # resume searching after the previous replacement
        while True:
            if n and i >= n: break
            idx = mystr.find(a, start)
            if idx == -1: break
            last = idx + len(a)
            mystr = mystr[:idx] + str(b) + mystr[last:]
            partA, partB = old_taint[0:idx], old_taint[last:]
            old_taint = partA + b_taint + partB
            # skip the inserted text so that a `b` containing `a` is not rescanned
            start = idx + len(b)
            i += 1
        return self.create(mystr, old_taint)

In [None]:
my_str = tstr("aa cde aa")
res = my_str.replace('aa', 'bb')
res, res.taint
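
The taint bookkeeping behind `replace()` can also be sketched on plain strings and lists. The helper `replace_with_taint()` below is hypothetical and deliberately simplified: it assumes a non-empty search string and marks all replacement characters as untainted (`-1`).

```python
def replace_with_taint(s, taint, old, new):
    """Replace each `old` in `s` by `new`; inserted characters get taint -1."""
    assert old, 'this sketch expects a non-empty search string'
    pieces, new_taint, pos = [], [], 0
    while True:
        idx = s.find(old, pos)
        if idx == -1:
            break
        # keep the text (and taint) between the previous match and this one
        pieces.append(s[pos:idx])
        new_taint.extend(taint[pos:idx])
        # splice in the replacement, which did not come from the input
        pieces.append(new)
        new_taint.extend([-1] * len(new))
        pos = idx + len(old)
    pieces.append(s[pos:])
    new_taint.extend(taint[pos:])
    return ''.join(pieces), new_taint


text, text_taint = replace_with_taint('aa cde aa', list(range(9)), 'aa', 'b')
print(text, text_taint)
```

The surviving characters keep their original input indexes even though their positions in the result have shifted.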

### Split

We essentially have to re-implement the split operations, and splitting on whitespace is slightly different from other splits.

In [None]:
class tstr(tstr):
    def _split_helper(self, sep, splitted):
        result_list = []
        last_idx = 0
        first_idx = 0
        sep_len = len(sep)

        for s in splitted:
            last_idx = first_idx + len(s)
            item = self[first_idx:last_idx]
            result_list.append(item)
            first_idx = last_idx + sep_len
        return result_list

    def _split_space(self, splitted):
        result_list = []
        last_idx = 0
        first_idx = 0
        sep_len = 0
        for s in splitted:
            last_idx = first_idx + len(s)
            item = self[first_idx:last_idx]
            result_list.append(item)
            v = str(self[last_idx:])
            # sep=None splits on any whitespace, not only on spaces
            sep_len = len(v) - len(v.lstrip())
            first_idx = last_idx + sep_len
        return result_list

    def rsplit(self, sep=None, maxsplit=-1):
        splitted = super().rsplit(sep, maxsplit)
        if not sep:
            return self._split_space(splitted)
        return self._split_helper(sep, splitted)

    def split(self, sep=None, maxsplit=-1):
        splitted = super().split(sep, maxsplit)
        if not sep:
            return self._split_space(splitted)
        return self._split_helper(sep, splitted)

In [None]:
my_str = tstr('ab cdef ghij kl')
ab, cdef, ghij, kl = my_str.rsplit(sep=' ')
print(ab.taint, cdef.taint, ghij.taint, kl.taint)

my_str = tstr('ab   cdef ghij    kl', taint=100)
ab, cdef, ghij, kl = my_str.rsplit()
print(ab.taint, cdef.taint, ghij.taint, kl.taint)

In [None]:
my_str = tstr('ab cdef ghij kl', taint=list(range(0, 15)))
ab, cdef, ghij, kl = my_str.split(sep=' ')
print(ab.taint, cdef.taint, ghij.taint, kl.taint)

my_str = tstr('ab   cdef ghij    kl', taint=list(range(0, 20)))
ab, cdef, ghij, kl = my_str.split()
print(ab.taint, cdef.taint, ghij.taint, kl.taint)
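
For whitespace splitting, an alternative way to recover the position of each piece is to let a regular expression report the spans of the non-whitespace runs. This is only a sketch: the helper `split_with_taint()` is hypothetical, it works on a plain string plus a taint list, and it assumes that `\S+` agrees with `str.split()` on what counts as whitespace for the inputs at hand.

```python
import re


def split_with_taint(s, taint):
    """Return (piece, taint of piece) for each whitespace-separated piece."""
    return [(m.group(), taint[m.start():m.end()])
            for m in re.finditer(r'\S+', s)]


s = 'ab   cdef ghij    kl'
pieces = split_with_taint(s, list(range(100, 100 + len(s))))
for piece, piece_taint in pieces:
    print(piece, piece_taint)
```

Unlike the index arithmetic above, this variant does not support `maxsplit`.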

### Strip

In [None]:
class tstr(tstr):
    def strip(self, cl=None):
        return self.lstrip(cl).rstrip(cl)

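    def lstrip(self, cl=None):
        res = super().lstrip(cl)
        # res is a suffix of self, so it starts at len(self) - len(res);
        # this also holds when everything is stripped and res is empty
        i = len(self) - len(res)
        return self[i:]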

    def rstrip(self, cl=None):
        res = super().rstrip(cl)
        return self[0:len(res)]


In [None]:
my_str1 = tstr("  abc  ")
v = my_str1.strip()
v, v.taint

In [None]:
my_str1 = tstr("  abc  ")
v = my_str1.lstrip()
v, v.taint

In [None]:
my_str1 = tstr("  abc  ")
v = my_str1.rstrip()
v, v.taint

### Expand Tabs

In [None]:
class tstr(tstr):
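    def expandtabs(self, n=8):
        parts = self.split('\t')
        res = super().expandtabs(n)
        all_parts = []
        for i, p in enumerate(parts):
            all_parts.extend(p.taint)
            if i < len(parts) - 1:
                # a tab pads up to the next multiple of n
                l = n - (len(all_parts) % n)
                # an empty part has no last character to inherit a taint from
                t = p.taint[-1] if p.taint else -1
                all_parts.extend([t] * l)
        return self.create(res, all_parts)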

In [None]:
my_tstr = tstr("ab\tcd")
my_str = str("ab\tcd")
v1 = my_str.expandtabs(4)
v2 = my_tstr.expandtabs(4)
print(len(v1), repr(my_tstr), repr(v2), v2.taint)

### Join

In [None]:
class tstr(tstr):
    def join(self, iterable):
        mystr = ''
        mytaint = []
        sep_taint = self.taint
        lst = list(iterable)
        for i, s in enumerate(lst):
            staint = s.taint if isinstance(s, tstr) else [-1] * len(s)
            mytaint.extend(staint)
            mystr += str(s)
            if i < len(lst)-1:
                mytaint.extend(sep_taint)
                mystr += str(self)
        # use lst: a generator passed as iterable is already exhausted here
        res = super().join(lst)
        assert len(res) == len(mystr)
        return self.create(res, mytaint)

In [None]:
my_str = tstr("ab cd", taint=100)
(v1, v2), v3 = my_str.split(), 'ef'
print(v1.taint, v2.taint)
v4 = tstr('').join([v2,v3,v1])
print(v4, v4.taint)

In [None]:
my_str = tstr("ab cd", taint=100)
(v1, v2), v3 = my_str.split(), 'ef'
print(v1.taint, v2.taint)
v4 = tstr(',').join([v2,v3,v1])
print(v4, v4.taint)

### Partitions

In [None]:
class tstr(tstr):
    def partition(self, sep):
        partA, sep, partB = super().partition(sep)
        return (
            self.create(partA, self.taint[0:len(partA)]), self.create(sep, self.taint[len(partA): len(partA) + len(sep)]), self.create(partB, self.taint[len(partA) + len(sep):]))

    def rpartition(self, sep):
        partA, sep, partB = super().rpartition(sep)
        return (self.create(partA, self.taint[0:len(partA)]), self.create(sep, self.taint[len(partA): len(partA) + len(sep)]), self.create(partB, self.taint[len(partA) + len(sep):]))

### Justify

In [None]:
class tstr(tstr):
    def ljust(self, width, fillchar=' '):
        res = super().ljust(width, fillchar)
        # ljust() pads on the right
        padding = len(res) - len(self)
        if isinstance(fillchar, tstr):
            t = fillchar.taint[0]
        else:
            t = -1
        return self.create(res, self.taint + [t] * padding)

    def rjust(self, width, fillchar=' '):
        res = super().rjust(width, fillchar)
        # rjust() pads on the left
        padding = len(res) - len(self)
        if isinstance(fillchar, tstr):
            t = fillchar.taint[0]
        else:
            t = -1
        return self.create(res, [t] * padding + self.taint)

### String methods that do not change taint

In [None]:
def make_str_wrapper_eq_taint(fun):
    def proxy(*args, **kwargs):
        res = fun(*args, **kwargs)
        return args[0].create(res, args[0].taint)
    return proxy

for name, fn in inspect.getmembers(str, callable):
    if name in ['swapcase', 'upper', 'lower', 'capitalize', 'title']:
        setattr(tstr, name, make_str_wrapper_eq_taint(fn))


In [None]:
a = tstr('aa', taint=100).upper()
a, a.taint

### General wrappers

These are not strictly needed for operation, but can be useful for tracing.

In [None]:
def make_str_wrapper(fun):
    def proxy(*args, **kwargs):
        res = fun(*args, **kwargs)
        return res
    return proxy

import types
tstr_members = [name for name, fn in inspect.getmembers(tstr, callable)
                if type(fn) == types.FunctionType and fn.__qualname__.startswith('tstr')]

for name, fn in inspect.getmembers(str, callable):
    if name not in set(['__class__', '__new__', '__str__', '__init__',
                        '__repr__','__getattribute__']) | set(tstr_members):
        setattr(tstr, name, make_str_wrapper(fn))
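
As a sketch of how such wrappers could be used for tracing, the following variant records the name of each wrapped method when it is called. The names `make_tracing_wrapper`, `traced`, and `calls` are hypothetical, and only two methods are wrapped to keep the example small.

```python
import inspect

calls = []


def make_tracing_wrapper(name, fun):
    def proxy(*args, **kwargs):
        # record the call, then delegate to the original str method
        calls.append(name)
        return fun(*args, **kwargs)
    return proxy


class traced(str):
    pass


for name, fn in inspect.getmembers(str, callable):
    if name in ('upper', 'startswith'):
        setattr(traced, name, make_tracing_wrapper(name, fn))

s = traced('abc')
upper_result = s.upper()
starts = s.startswith('a')
print(calls)
```

Passing `name` into the wrapper factory avoids the usual late-binding pitfall of closures created in a loop.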

### Methods yet to be translated

These methods generate strings from other strings. However, we do not have the right implementations for any of these. Hence these are marked as dangerous until we can generate the right translations.

In [None]:
def make_str_abort_wrapper(fun):
    def proxy(*args, **kwargs):
        raise tstr.TaintException('%s Not implemented in TSTR' % fun.__name__)
    return proxy

for name, fn in inspect.getmembers(str, callable):
    if name in ['__format__', '__rmod__', '__mod__', 'format_map', 'format',
               '__mul__','__rmul__','center','zfill', 'decode', 'encode', 'splitlines']:
        setattr(tstr, name, make_str_abort_wrapper(fn))

## EOF Tracker

Sometimes we want to know where an empty string came from. That is, if an empty string is the result of operations on a tainted string, we want to know the best guess as to what the taint index of the preceding character is.

### Slice


To detect EOF, an empty string needs to carry a *cursor*. The cursor holds the taint index of the position at which the empty string was produced, that is, of the character that would come next.

In [None]:
class eoftstr(tstr):
    def create(self, res, taint):
        return eoftstr(res, taint, self)
    
    def __getitem__(self, key):
        def get_interval(key):
            return ((0 if key.start is None else key.start),
                    (len(res) if key.stop is None else key.stop))

        res = super().__getitem__(key)
        if type(key) == int:
            key = len(self) + key if key < 0 else key
            return self.create(res, [self.taint[key]])
        elif type(key) == slice:
            if res:
                return self.create(res, self.taint[key])
            # Result is an empty string
            t = self.create(res, self.taint[key])
            key_start, key_stop = get_interval(key)
            cursor = 0
            if key_start < len(self):
                assert key_stop < len(self)
                cursor = self.taint[key_stop]
            else:
                if len(self) == 0:
                    # if the original string was empty, we assume that any
                    # empty string produced from it should carry the same taint.
                    cursor = getattr(self, '_tcursor', 0)
                else:
                    # Key start was not in the string. We can reply only
                    # if the key start was just outside the string, in
                    # which case, we guess.
                    if key_start != len(self):
                        raise tstr.TaintException('Can\'t guess the taint')
                    cursor = self.taint[len(self) - 1] + 1
            # _tcursor gets created only for empty strings.
            t._tcursor = cursor
            return t

        else:
            assert False

We add an additional method `t()` that takes a character index and returns the taint at that index. For an empty string, it returns the cursor, that is, the probable location of that empty string.

In [None]:
class eoftstr(eoftstr):
    def t(self, i=0):
        if self.taint:
            return self.taint[i]
        else:
            if i != 0:
                raise tstr.TaintException('Invalid request idx')
            # self._tcursor gets created only for empty strings.
            # use the exception to determine which ones need it.
            return self._tcursor

In [None]:
t = eoftstr('hello world')
print(repr(t[11:]))
print(t[11:].taint, t[11:].t())

## A Comparison Tracker

Sometimes, we also want to know what each character in an input was compared to.

### Operators

In [None]:
class Op(enum.Enum):
    LT = 0
    LE = enum.auto()
    EQ = enum.auto()
    NE = enum.auto()
    GT = enum.auto()
    GE = enum.auto()
    IN = enum.auto()
    NOT_IN = enum.auto()
    IS = enum.auto()
    IS_NOT = enum.auto()
    FIND_STR = enum.auto()


COMPARE_OPERATORS = {
    Op.EQ: lambda x, y: x == y,
    Op.NE: lambda x, y: x != y,
    Op.IN: lambda x, y: x in y,
    Op.NOT_IN: lambda x, y: x not in y,
    Op.FIND_STR: lambda x, y: x.find(y)
}

Comparisons = []

### Instructions

In [None]:
class Instr:
    def __init__(self, o, a, b):
        self.opA = a
        self.opB = b
        self.op = o

    def o(self):
        if self.op == Op.EQ:
            return 'eq'
        elif self.op == Op.NE:
            return 'ne'
        else:
            return '?'

    def opS(self):
        if not self.opA.has_taint() and isinstance(self.opB, tstr):
            return (self.opB, self.opA)
        else:
            return (self.opA, self.opB)

    @property
    def op_A(self):
        return self.opS()[0]

    @property
    def op_B(self):
        return self.opS()[1]

    def __repr__(self):
        return "%s,%s,%s" % (self.o(), repr(self.opA), repr(self.opB))

    def __str__(self):
        if self.op == Op.EQ:
            if str(self.opA) == str(self.opB):
                return "%s = %s" % (repr(self.opA), repr(self.opB))
            else:
                return "%s != %s" % (repr(self.opA), repr(self.opB))
        elif self.op == Op.NE:
            if str(self.opA) == str(self.opB):
                return "%s = %s" % (repr(self.opA), repr(self.opB))
            else:
                return "%s != %s" % (repr(self.opA), repr(self.opB))
        elif self.op == Op.IN:
            if str(self.opA) in str(self.opB):
                return "%s in %s" % (repr(self.opA), repr(self.opB))
            else:
                return "%s not in %s" % (repr(self.opA), repr(self.opB))
        elif self.op == Op.NOT_IN:
            if str(self.opA) in str(self.opB):
                return "%s in %s" % (repr(self.opA), repr(self.opB))
            else:
                return "%s not in %s" % (repr(self.opA), repr(self.opB))
        else:
            assert False

### Equivalence

In [None]:
class ctstr(eoftstr):
    def create(self, res, taint):
        o = ctstr(res, taint, self)
        o.comparisons = self.comparisons
        return o
    
    def with_comparisons(self, comparisons):
        self.comparisons = comparisons
        return self

In [None]:
class ctstr(ctstr):
    def __eq__(self, other):
        if len(self) == 0 and len(other) == 0:
            self.comparisons.append(Instr(Op.EQ, self, other))
            return True
        elif len(self) == 0:
            self.comparisons.append(Instr(Op.EQ, self, other[0]))
            return False
        elif len(other) == 0:
            self.comparisons.append(Instr(Op.EQ, self[0], other))
            return False
        elif len(self) == 1 and len(other) == 1:
            self.comparisons.append(Instr(Op.EQ, self, other))
            return super().__eq__(other)
        else:
            if not self[0] == other[0]:
                return False
            return self[1:] == other[1:]

In [None]:
t = ctstr('hello world', taint=100).with_comparisons([])
print(t.comparisons)
t == 'hello'
for c in t.comparisons:
    print(repr(c))
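
The mechanism behind this tracking can be sketched in isolation. The class `logged` and the list `log` below are hypothetical; they show that a `str` subclass overriding `__eq__()` sees the comparison whichever side of `==` it appears on.

```python
log = []


class logged(str):
    def __eq__(self, other):
        # record the operands, then compare the plain string values
        log.append((str(self), str(other)))
        return str(self) == str(other)

    # Defining __eq__() would otherwise make the class unhashable.
    __hash__ = str.__hash__


left = logged('a') == 'a'
right = 'b' == logged('a')
print(left, right, log)
```

In the second comparison, Python gives the subclass's method priority over `str.__eq__()`, so the tainted operand is listed first in the log.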

In [None]:
class ctstr(ctstr):
    def __ne__(self, other):
        return not self.__eq__(other)

In [None]:
t = ctstr('hello', taint=100).with_comparisons([])
print(t.comparisons)
t != 'bye'
for c in t.comparisons:
    print(repr(c))

In [None]:
class ctstr(ctstr):
    def __contains__(self, other):
        self.comparisons.append(Instr(Op.IN, self, other))
        return super().__contains__(other)

In [None]:
class ctstr(ctstr):
    def find(self, sub, start=None, end=None):
        # default to the whole string when start or end is not given
        start_val = 0 if start is None else start
        end_val = len(self) if end is None else end
        self.comparisons.append(Instr(Op.IN, self[start_val:end_val], sub))
        return super().find(sub, start, end)

### In

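The expression `c in s` invokes `s.__contains__(c)`, that is, the method of the container `s`. If `s` is a plain `str`, a tainted `c` never gets to see the comparison. Intercepting it requires some surgery on the module: we rewrite the source so that `c in s` becomes `__fn(c).in_(s)`.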
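
A small sketch of why overriding `__contains__()` is not enough for the `in` operator. The class `probe` and the list `seen` are hypothetical names used only for this illustration.

```python
seen = []


class probe(str):
    def __contains__(self, item):
        seen.append(str(item))
        return super().__contains__(item)


probe('C') in 'ABCD'     # handled by the plain str on the right
after_left = list(seen)
'C' in probe('ABCD')     # handled by probe.__contains__()
after_right = list(seen)
print(after_left, after_right)
```

Only the second membership test is observed, because `in` always dispatches to the container on its right-hand side.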

In [None]:
def substrings(s, l):
    for i in range(len(s)-(l-1)):
        yield s[i:i+l]

class ctstr(ctstr):
    def in_(self, s):
        # c in '0123456789'
        # to
        # __fn(c).in_('0123456789')
        # ensure that all characters are compared
        result = [self == c for c in substrings(s, len(self))]
        return any(result)

In [None]:
def my_fn(c, s):
    if (c in s):
        return c
    else:
        return s

class __fn:
    def __init__(self, s):
        self.s = s

    def in_(self, v):
        if isinstance(self.s, ctstr):
            return self.s.in_(v)
        else:
            return self.s in v

In [None]:
import ast
import inspect

#### Get the source code

In [None]:
# from fuzzingbook_utils import unparse
pass

In [None]:
class InRewrite(ast.NodeTransformer):
    def visit_Compare(self, tree_node):
        left = tree_node.left
        if not tree_node.ops or not isinstance(tree_node.ops[0], ast.In):
            return tree_node
        mod_val = ast.Call(
            func=ast.Attribute(
                value=ast.Call(
                    func=ast.Name(id='__fn', ctx=ast.Load()), args=[left], keywords=[]),
                attr='in_',
                ctx=ast.Load()),
            args=tree_node.comparators,
            keywords=[])
        return mod_val


def rewrite_in(fn):
    fn_ast = ast.parse(inspect.getsource(fn))
    return compile(ast.fix_missing_locations(InRewrite().visit(fn_ast)), filename='', mode='exec')
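
To inspect what such a rewrite produces, one can turn the transformed tree back into source text. The sketch below is a standalone variant with a hypothetical transformer name `InToCall`; it restricts itself to single `in` comparisons and assumes Python 3.9 or later, where `ast.unparse()` is available.

```python
import ast


class InToCall(ast.NodeTransformer):
    def visit_Compare(self, node):
        # rewrite nested comparisons first
        self.generic_visit(node)
        if len(node.ops) != 1 or not isinstance(node.ops[0], ast.In):
            return node
        # a in b   becomes   __fn(a).in_(b)
        wrapped = ast.Call(func=ast.Name(id='__fn', ctx=ast.Load()),
                           args=[node.left], keywords=[])
        return ast.Call(
            func=ast.Attribute(value=wrapped, attr='in_', ctx=ast.Load()),
            args=node.comparators, keywords=[])


tree = ast.parse("c in '0123456789'")
tree = ast.fix_missing_locations(InToCall().visit(tree))
rewritten = ast.unparse(tree)
print(rewritten)
```

Printing the rewritten source is a convenient way to check a transformer before compiling and executing its output.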

In [None]:
my_new_fn = rewrite_in(my_fn)
exec(my_new_fn)

In [None]:
abcd =  'ABCD'
c = ctstr('C').with_comparisons([])

In [None]:
c.comparisons

In [None]:
my_fn(c,abcd)

In [None]:
c.comparisons

## Lessons Learned

* One can track the information flow from input to the internals of a system.

## Background

\cite{Lin2008}
