# Getting Coverage

Let us obtain the coverage of a simple function.  To this end, we use a trace function that tracks each line executed.



## Tracing Functions

This implementation uses the following Python features and functions:

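* `sys.settrace(f)` - set `f()` as the tracing function. Python calls `f()` whenever a new function is entered; the function that `f()` returns is then called for each line executed in that function.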



In [None]:
import sys
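To see what a trace function actually receives, the following minimal sketch logs every event for a small helper. The names `record_events()` and `add_one()` are hypothetical and exist only for this illustration.

```python
import sys

events = []  # (event, function name) pairs, in the order they were reported


def record_events(frame, event, arg):
    """Hypothetical tracer: log every event it is called with."""
    events.append((event, frame.f_code.co_name))
    return record_events  # keep tracing inside the newly entered scope


def add_one(x):
    y = x + 1
    return y


sys.settrace(record_events)
try:
    add_one(1)
finally:
    sys.settrace(None)  # always switch tracing off again

print(events)
```

The log should contain one `call` event, one `line` event per executed line, and a final `return` event. Only the `line` events matter for line coverage.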

As `f`, we define a function `traceit()`, which accesses the current function name and current line number, as shown below, and stores them in a global map `coverage`.  Note that depending on your Python setup, `traceit()` may also be called for other, internal functions, which then also show up in `coverage`.

In [None]:
coverage = {}

In [None]:
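def traceit(frame, event, arg):
    """Trace function: record each executed line under its function's name."""
    global coverage
    if event == "line":
        # The frame describes the code currently being executed
        function_name = frame.f_code.co_name
        lineno = frame.f_lineno
        if function_name not in coverage:
            coverage[function_name] = set()
        coverage[function_name].add(lineno)
    # Returning the tracer keeps line tracing active in the new scope
    return traceit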

To try out our coverage setup, we use the `cgi_decode()` function (after Pezzè and Young), which takes a CGI-encoded string (say, `"a+b"`) and returns the decoded variant (say, `"a b"`).

In [None]:
def cgi_decode(s):
    """Decode the CGI-encoded string `s`:
       * replace "+" by " "
       * replace "%xx" by the character with hex number xx.
       Return the decoded string, or None for invalid inputs."""

    # Mapping of hex digits to their integer values
    hex_values = {
        '0': 0,
        '1': 1,
        '2': 2,
        '3': 3,
        '4': 4,
        '5': 5,
        '6': 6,
        '7': 7,
        '8': 8,
        '9': 9,
        'a': 10,
        'b': 11,
        'c': 12,
        'd': 13,
        'e': 14,
        'f': 15,
        'A': 10,
        'B': 11,
        'C': 12,
        'D': 13,
        'E': 14,
        'F': 15,
    }

    t = ""
    i = 0
    while i < len(s):
        c = s[i]
        if c == '+':
            t = t + ' '
        elif c == '%':
            if i + 2 >= len(s):
                return None  # truncated escape such as "%2"
            digit_high, digit_low = s[i + 1], s[i + 2]
            i = i + 2
            if digit_high in hex_values and digit_low in hex_values:
                v = hex_values[digit_high] * 16 + hex_values[digit_low]
                t = t + chr(v)
            else:
                return None
        else:
            t = t + c
        i = i + 1
    return t


# A few unit tests
assert cgi_decode('+') == ' '
assert cgi_decode('%20') == ' '
assert cgi_decode('abc') == 'abc'
assert cgi_decode('%?a') is None

We can now obtain the coverage for a run:

In [None]:
coverage = {}
sys.settrace(traceit)
x = cgi_decode('abc')
sys.settrace(None)
abc_coverage = coverage['cgi_decode']
print(abc_coverage)
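If the traced function raises an exception, the `sys.settrace(None)` call above is never reached and tracing stays switched on. One possible way to guard against this is a small helper using `try`/`finally`. The names `line_coverage()` and `sign()` are hypothetical, and the helper reports line offsets relative to the `def` line rather than absolute line numbers.

```python
import sys


def line_coverage(function, *args):
    """Run function(*args); return the executed line offsets within `function`.

    Offset 0 is the `def` line. Hypothetical helper for illustration only.
    """
    code = function.__code__
    covered = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None          # ignore all other functions
        if event == "line":
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()    # remember any tracer already installed
    sys.settrace(tracer)
    try:
        function(*args)
    finally:
        sys.settrace(previous)   # restore even if function() raised
    return covered


def sign(x):
    if x < 0:
        return -1
    return 1


print(line_coverage(sign, 5))
print(line_coverage(sign, -5))
```

Filtering on the code object also keeps internal functions out of the result, at the price of tracing only one function per run.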

## Comparing Coverages

Different inputs cause different coverages:

In [None]:
coverage = {}
sys.settrace(traceit)
x = cgi_decode('a+b')
sys.settrace(None)
a_plus_b_coverage = coverage['cgi_decode']
print(a_plus_b_coverage)

We see that the input `"a+b"` covers one more line than `"abc"`, namely the code line where `'+'` is processed:

In [None]:
print(a_plus_b_coverage - abc_coverage)
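Since each coverage is a plain Python set, the usual set operators apply when comparing more than two runs. The sketch below assumes a hypothetical function `classify()` and a hypothetical helper `offsets()` that returns line offsets relative to the `def` line; neither is part of the code above.

```python
import sys


def offsets(function, arg):
    """Hypothetical helper: line offsets (0 = `def` line) executed by function(arg)."""
    code = function.__code__
    seen = set()

    def tracer(frame, event, _arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(arg)
    finally:
        sys.settrace(previous)
    return seen


def classify(c):
    if c == '+':
        return "plus"
    elif c == '%':
        return "percent"
    return "other"


runs = {arg: offsets(classify, arg) for arg in ['+', '%', 'x']}

union = set.union(*runs.values())                 # covered by at least one input
common = set.intersection(*runs.values())         # covered by every input
only_percent = runs['%'] - runs['+'] - runs['x']  # reached by '%' alone

print(union, common, only_percent)
```

The union shows what a whole test suite reaches, while the differences point to the inputs that are responsible for individual lines.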

With this tool, we can now:

* _assess_ coverage to evaluate how good our test generators are
* _leverage_ coverage to guide our test generators towards uncovered code.
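As a rough illustration of the second point, the sketch below keeps a randomly mutated input only if it reaches lines that no earlier input reached. Everything in it (`line_offsets()`, `count_kinds()`, the mutation strategy, the seed, and the iteration count) is an assumption made for this example, not a method prescribed by the text.

```python
import random
import sys


def line_offsets(function, arg):
    """Hypothetical helper: line offsets (0 = `def` line) executed by function(arg)."""
    code = function.__code__
    seen = set()

    def tracer(frame, event, _arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(arg)
    finally:
        sys.settrace(previous)
    return seen


def count_kinds(s):
    """Hypothetical target with several branches."""
    digits = letters = others = 0
    for c in s:
        if c.isdigit():
            digits += 1
        elif c.isalpha():
            letters += 1
        else:
            others += 1
    return digits, letters, others


rng = random.Random(0)                   # fixed seed, so runs are repeatable
population = ["a"]                       # inputs that contributed new coverage
total = line_offsets(count_kinds, "a")   # lines covered so far

for _ in range(200):
    parent = rng.choice(population)
    position = rng.randrange(len(parent) + 1)
    # Mutation: insert one random printable ASCII character
    candidate = parent[:position] + chr(rng.randrange(32, 127)) + parent[position:]
    new_lines = line_offsets(count_kinds, candidate) - total
    if new_lines:                        # keep only inputs that cover new lines
        population.append(candidate)
        total |= new_lines

print(population)
```

Inputs that add nothing new are discarded, so the population stays small while the covered set only grows.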


## Exercises

1. Measure the _branch coverage_ in a function by storing _pairs_ of lines executed one after the other.
2. Generalize your solution such that it also works with function calls and returns, including recursive calls.  See the documentation of `sys.settrace()` on how to track `call` and `return` events.
