# Carving Unit Tests

So far, we have always generated _system input_, i.e. data that the program as a whole obtains via its input channels.  If we are interested in testing only a small set of functions, having to go through the system can be very inefficient.  This chapter introduces a technique known as _carving_, which, given a system test, automatically extracts a set of _unit tests_ that replicate the calls seen during the system test.  The key idea is to _record_ such calls so that we can _replay_ them later, as a whole or selectively.

In [None]:
from fuzzingbook_utils import YouTubeVideo
YouTubeVideo("Ty0ktPXJ23c")

**Prerequisites**

* Carving makes use of dynamic traces of function calls and variables, as introduced in the [chapter on configuration fuzzing](ConfigurationFuzzer.ipynb).

## System Tests vs Unit Tests

Remember the URL grammar introduced for [grammar fuzzing](Grammars.ipynb)?  With such a grammar, we can happily test a Web browser again and again, checking how it reacts to arbitrary page requests.

Let us define a very simple "web browser" that goes and downloads the content given by the URL.

In [None]:
import urllib.parse, urllib.request

In [None]:
def webbrowser(url):
    """Download the http/https resource given by the URL"""
    response = urllib.request.urlopen(url)
    contents = b""  # returned as-is if the status code is not 200
    if response.getcode() == 200:
        contents = response.read()
    return contents

Let us apply this to [fuzzingbook.org](https://www.fuzzingbook.org/) and measure the time, using the [Timer class](Timer.ipynb):

In [None]:
from Timer import Timer

In [None]:
with Timer() as webbrowser_timer:
    fuzzingbook_contents = webbrowser("http://www.fuzzingbook.org/html/Fuzzer.html")

print("Downloaded %d bytes in %.2f seconds" % (len(fuzzingbook_contents), webbrowser_timer.elapsed_time()))

In [None]:
fuzzingbook_contents[:100]

Having to start a whole browser (or having it render a Web page) again and again means lots of overhead, though, especially if we want to test only a subset of its functionality.  In particular, after a change in the code, we would prefer to test only the subset of functions that is affected by the change, rather than running the well-tested functions again and again.

Let us assume we change the function that takes care of parsing the given URL and decomposing it into its individual elements: the scheme (`"http"`), the network location (`"www.fuzzingbook.org"`), and the path (`"/html/Fuzzer.html"`).  This function is named `urlparse()`:

In [None]:
from urllib.parse import urlparse

In [None]:
urlparse('https://www.fuzzingbook.com/html/APIFuzzer.html')

You see how the individual elements of the URL are all properly identified: the _scheme_ (`"https"`), the _network location_ (`"www.fuzzingbook.com"`), and the _path_ (`"/html/APIFuzzer.html"`).  Other elements (like `params`, `query`, or `fragment`) are empty, because they were not part of our input.
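
As a small illustration, the result of `urlparse()` is a named tuple, so each element can also be read as an attribute.  The following sketch only uses the standard `urllib.parse` module and the same URL as above:

```python
from urllib.parse import urlparse

parts = urlparse('https://www.fuzzingbook.com/html/APIFuzzer.html')

# Each component of the result is available as an attribute
print(parts.scheme)
print(parts.netloc)
print(parts.path)

# Components that are absent from the input are empty strings
print(repr(parts.query), repr(parts.fragment))
```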

The interesting thing is that executing only `urlparse()` is orders of magnitude faster than running all of `webbrowser()`.  Let us measure the factor:

In [None]:
runs = 1000
with Timer() as urlparse_timer:
    for i in range(runs):
        urlparse('https://www.fuzzingbook.com/html/APIFuzzer.html')

avg_urlparse_time = urlparse_timer.elapsed_time() / runs
avg_urlparse_time

Compare this to the time required by `webbrowser()`:

In [None]:
webbrowser_timer.elapsed_time()

The difference in time is huge:

In [None]:
webbrowser_timer.elapsed_time() / avg_urlparse_time

Hence, in the time it takes to run `webbrowser()` once, we can have _hundreds of thousands_ of executions of `urlparse()`, and this does not even take into account the time it takes the browser to render the downloaded HTML, to run the included scripts, and whatever else happens when a Web page is loaded.  Strategies that allow us to test at the _unit_ level are therefore very promising, as they can save lots of overhead.

## Carving Unit Tests

Testing methods and functions at the unit level requires a very good understanding of the individual units to be tested as well as their interplay with other units.  Setting up an appropriate infrastructure and writing unit tests by hand thus is demanding, yet rewarding.  There is, however, an interesting alternative to writing unit tests by hand.  The technique of _carving_ automatically _converts system tests into unit tests_ by means of recording and replaying function calls:

1. During a system test (given or generated), we _record_ all calls into a function, including all arguments and other variables the function reads.
2. From these, we synthesize a self-contained _unit test_ that reconstructs the function call with all arguments.
3. This unit test can be executed (replayed) at any time with high efficiency.

Let us explore these three steps.

### Recording Calls

Our first challenge is to record function calls together with their arguments.  (In the interest of simplicity, we restrict ourselves to arguments, ignoring any global variables or other non-arguments that are read by the function.)  To record calls and arguments, we use the mechanism [we introduced for coverage](Coverage.ipynb): By setting up a tracer function, we track all calls into individual functions, also saving their arguments.  Just like `Coverage` objects, we want `Carver` objects to be usable in conjunction with the `with` statement, such that we can trace a particular code block:

```python
with Carver() as carver:
    function_to_be_traced()
c = carver.calls()
```

The initial definition supports this construct:

In [None]:
import sys
import inspect

In [None]:
class Carver(object):
    def __init__(self):
        self._calls = {}

    # Start of `with` block
    def __enter__(self):
        self.original_trace_function = sys.gettrace()
        sys.settrace(self.traceit)
        return self

    # End of `with` block
    def __exit__(self, exc_type, exc_value, tb):
        sys.settrace(self.original_trace_function)

The actual work takes place in the `traceit()` method, which records all calls in the `_calls` attribute:

In [None]:
def qualified_name(code):
    name = code.co_name
    module = inspect.getmodule(code)
    if module is not None:
        name = module.__name__ + "." + name
    return name

class Carver(Carver):
    # Tracking function: Record all calls and all args
    def traceit(self, frame, event, arg):
        if event != "call":
            return None
        
        code = frame.f_code
        function_name = qualified_name(code)

        # When called, all arguments are local variables
        arguments = [(var, frame.f_locals[var]) for var in frame.f_locals]
        arguments.reverse()  # Want same order as call

        if function_name not in self._calls:
            self._calls[function_name] = []
        if arguments not in self._calls[function_name]:
            self._calls[function_name].append(arguments)

        # Some tracking
        # print(call_with_args(function_name, arguments))

        return None
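
The order in which `frame.f_locals` lists the arguments is an implementation detail.  As a hedged alternative, the following self-contained sketch records calls using `inspect.getargvalues()`, which reports the argument names in declaration order.  The names `record_calls`, `recorded`, and `add` are hypothetical and not part of the `Carver` class above:

```python
import sys
import inspect

recorded = []  # entries of the form (function name, [(argument name, value), ...])

def record_calls(frame, event, arg):
    """Trace function: on each call, store the callee's name and arguments."""
    if event == "call":
        info = inspect.getargvalues(frame)
        arguments = [(name, info.locals[name]) for name in info.args]
        recorded.append((frame.f_code.co_name, arguments))
    return None  # no line-level tracing inside the callee

def add(a, b):  # hypothetical function under test
    return a + b

original_trace_function = sys.gettrace()
sys.settrace(record_calls)
try:
    add(1, b=2)
finally:
    # Always restore the previous tracer, even if the traced code raises
    sys.settrace(original_trace_function)

print(recorded)
```

Note that this sketch only covers named parameters; values passed via `*args` or `**kwargs` would have to be read from `info.varargs` and `info.keywords` separately.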

Finally, we need some convenience functions to access the calls:

In [None]:
class Carver(Carver):
    def calls(self):
        """Return a dictionary of all calls traced."""  
        return self._calls
    
    def arguments(self, function_name):
        """Return a list of all arguments of the given function
        as (VAR, VALUE) pairs.
        Raises an exception if the function was not traced."""
        return self._calls[function_name]
    
    def called_functions(self):
        """Return all functions called."""
        return self._calls.keys()

Let us see how these work.

### Recording my_sqrt()

Let's try out our new `Carver` class – first on a very simple function:

In [None]:
from Intro_Testing import my_sqrt

In [None]:
with Carver() as sqrt_carver:
    my_sqrt(2)
    my_sqrt(4)

We can retrieve all calls seen...

In [None]:
sqrt_carver.calls()

... as well as the arguments of a particular function:

In [None]:
sqrt_carver.arguments("my_sqrt")

We define a convenience function for nicer printing of these lists:

In [None]:
# Return function_name(var1=value1, var2=value2, ...) as a string
def call_with_args(function_name, argument_list):
    return function_name + "(" + \
        ", ".join([var + "=" + repr(value) for (var, value) in argument_list]) + ")"

In [None]:
for function_name in sqrt_carver.called_functions():
    for argument_list in sqrt_carver.arguments(function_name):
        print(call_with_args(function_name, argument_list))

This is a syntax we can directly use to invoke `my_sqrt()` again:

In [None]:
eval("my_sqrt(x=2)")
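
Going through `eval()` is not the only way to replay a call.  As a minimal sketch, assuming that all recorded arguments can be passed by keyword, the `(VAR, VALUE)` pairs can be turned into a dictionary and unpacked into the call.  Here, `replay` and `my_root` are hypothetical names; `my_root` merely stands in for `my_sqrt()` so that the example is self-contained:

```python
import math

def replay(function, argument_list):
    """Call `function` with the recorded (name, value) pairs as keyword arguments."""
    return function(**dict(argument_list))

def my_root(x):  # hypothetical stand-in for my_sqrt()
    return math.sqrt(x)

# Argument lists in the format returned by Carver.arguments()
recorded_arguments = [[("x", 2)], [("x", 4)]]
results = [replay(my_root, argument_list) for argument_list in recorded_arguments]
print(results)
```

This approach would not work for positional-only parameters, which cannot be passed by keyword.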

### Carving urlparse()

What happens if we apply this to `webbrowser()`?

In [None]:
with Carver() as webbrowser_carver:
    webbrowser("http://www.example.com")

We see that retrieving a URL from the Web requires quite some functionality:

In [None]:
webbrowser_carver.called_functions()

Among several other functions, we also have a call to `urlparse()`:

In [None]:
urlparse_argument_list = webbrowser_carver.arguments("urllib.parse.urlparse")
urlparse_argument_list

Again, we can convert this into a well-formatted call:

In [None]:
urlparse_call = call_with_args("urlparse", urlparse_argument_list[0])
urlparse_call

We can re-execute this call as well:

In [None]:
eval(urlparse_call)

We now have successfully carved the call to `urlparse()` out of the `webbrowser()` execution.

## Replaying Calls

Replaying calls in their entirety and in all generality is tricky, as there are several challenges to be addressed.  These include:

1. We need to be able to _access_ individual functions.  If we access a function by name, the name must be in scope.  If the name is not visible (for instance, because it is a name internal to the module), we must make it visible.

2. Any resources accessed outside of arguments must be recorded and reconstructed for replay as well.  This can be difficult if variables refer to external resources such as files or network resources.

3. Complex objects must be reconstructed as well.  This can be difficult if they provide no means to do so.

These constraints make carving hard if the function to be tested interacts heavily with its environment.  In this section, we provide some mechanisms to make carving possible even under difficult circumstances.
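
For the first challenge, one possible approach is to look up the function through its module rather than relying on the name being in scope.  The sketch below uses the standard `importlib` module; the helper name `resolve` is hypothetical, and it assumes that the recorded name has the form `module.function`:

```python
import importlib

def resolve(qualified_name):
    """Return the object named by a dotted name such as 'urllib.parse.urlparse'."""
    module_name, _, attribute = qualified_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)

urlparse_function = resolve("urllib.parse.urlparse")
result = urlparse_function(url="http://www.example.com")
print(result.netloc)
```

This only works for module-level functions.  A method recorded without its class name cannot be resolved this way, as the module has no attribute of that name.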

To illustrate these issues, consider the `email.parser.parse()` method that is invoked in `webbrowser()`:

In [None]:
email_parse_argument_list = webbrowser_carver.arguments("email.parser.parse")

Calls to this method look like this:

In [None]:
email_parse_call = call_with_args("email.parser.parse", email_parse_argument_list[0])
email_parse_call

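We see that `email.parser.parse()` is part of an `email.parser.Parser` object and that it gets a `StringIO` object.  Both are non-primitive values, so their `repr()` is not Python code that would recreate them.  One way to reconstruct such objects is to serialize them with the `pickle` module: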

In [None]:
import pickle    

In [None]:
parser_object = email_parse_argument_list[0][0][1]

In [None]:
pickled = pickle.dumps(parser_object)
pickled

In [None]:
def call_with_pickled_args(function_name, argument_list):
    return function_name + "(" + \
        ", ".join([var + "=pickle.loads(" + repr(pickle.dumps(value)) + ")" for (var, value) in argument_list]) + ")"

In [None]:
call = call_with_pickled_args("email.parser.parse", email_parse_argument_list[0])
print(call)

In [None]:
obj = pickle.loads(b'\x80\x03cemail.parser\nParser\nq\x00)\x81q\x01}q\x02(X\x06\x00\x00\x00_classq\x03chttp.client\nHTTPMessage\nq\x04X\x06\x00\x00\x00policyq\x05cemail._policybase\nCompat32\nq\x06)\x81q\x07ub.')

In [None]:
import io
obj.parse(io.StringIO("foo"))
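
The same round trip can be written without a hard-coded byte string.  The following sketch, which only uses the standard library, pickles a freshly created `Parser`, restores it, and invokes `parse()` on the restored object; the message text is made up for illustration:

```python
import io
import pickle
from email.parser import Parser

original_parser = Parser()

# Serialize the object and reconstruct an equivalent copy from the bytes
restored_parser = pickle.loads(pickle.dumps(original_parser))

# The restored object can be used like the original
message = restored_parser.parse(io.StringIO("Subject: carving\n\nbody text\n"))
print(message["Subject"])
print(message.get_payload())
```

Not every object can be handled this way: objects that wrap external resources, such as open files, cannot be pickled, so such arguments would need a different reconstruction strategy.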

## Next Steps

Traditionally, tests at the unit level are written by humans just as production code is.  Since this book is about generating software tests, we also [cover techniques that generate unit tests](APIFuzzer.ipynb).

