# Fuzzing APIs

So far, we have always generated _system input_, i.e. data that the program as a whole obtains via its input channels.  However, we can also generate input that goes directly into individual functions, gaining flexibility and speed in the process.  In this chapter, we explore the use of grammars to synthesize code for function calls, which allows you to generate _program code that very efficiently invokes functions directly._  On top of that, we also explore how such API grammars can be synthesized from existing executions; this means that we can _synthesize API tests without having to write a grammar at all._

**Prerequisites**

* You have to know how grammar fuzzing works, e.g. from the [chapter on grammars](Grammars.ipynb).
* To synthesize API grammars, we make use of [recorded ("carved") function calls](Carver.ipynb).

## Fuzzing a Function

Let us start with our first problem: How do we fuzz a given function?  For an interpreted language like Python, this is pretty straightforward.  All we need to do is to generate _calls_ to the function(s) we want to test.  This is something we can easily do with a grammar.

### Testing a URL Parser

As an example, consider the `urlparse()` function from the Python library.  `urlparse()` takes a URL and decomposes it into its individual components.

In [None]:
import fuzzingbook_utils

In [None]:
from urllib.parse import urlparse

In [None]:
urlparse('https://www.fuzzingbook.com/html/APIFuzzer.html')

You see how the individual elements of the URL – the _scheme_ (`"https"`), the _network location_ (`"www.fuzzingbook.com"`), or the path (`"/html/APIFuzzer.html"`) – are all properly identified.  Other elements (like `params`, `query`, or `fragment`) are empty, because they were not part of our input.

To test `urlparse()`, we'd want to feed it a large set of different URLs.  We can obtain these from the URL grammar we had defined in the ["Grammars"](Grammars.ipynb) chapter.

In [None]:
from Grammars import URL_GRAMMAR, is_valid_grammar, START_SYMBOL, new_symbol
from GrammarFuzzer import GrammarFuzzer, display_tree, all_terminals

In [None]:
url_fuzzer = GrammarFuzzer(URL_GRAMMAR)

In [None]:
for i in range(10):
    url = url_fuzzer.fuzz()
    print(urlparse(url))

This way, we can easily test any Python function – by setting up a scaffold that runs it.  How would we proceed, though, if we wanted to have a test that can be re-run again and again, without having to generate new calls every time?

### Synthesizing Code

The "scaffolding" method, as sketched above, has an important downside: It couples test generation and test execution into a single unit, which rules out running the two at different times or in different languages.  To decouple the two, we take another approach: Rather than generating inputs and immediately feeding them into a function, we _synthesize code_ instead that invokes functions with a given input.

For instance, if we generate the string

In [None]:
call = "urlparse('http://www.example.com/')"

we can execute this string as a whole (and thus run the test) at any time:

In [None]:
eval(call)
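Because a synthesized call is just text, generation and execution can happen at different times.  The following sketch is not part of the chapter's code; `synthesize_calls()` and the sample URLs are hypothetical names chosen for illustration.  It first produces the call strings and only later evaluates them.  It embeds each URL via `repr()`, so that quotes or backslashes in an input cannot break the generated code, and it passes an explicit namespace to `eval()` so the calls see only `urlparse`.

```python
from urllib.parse import urlparse

def synthesize_calls(urls):
    """Return one call string per URL; repr() takes care of quoting."""
    return ["urlparse(" + repr(url) + ")" for url in urls]

# Step 1: generate the tests (they could also be written to a file)
calls = synthesize_calls([
    "http://www.example.com/",
    "https://user@host:80/a?b=c#d",
])

# Step 2: at any later time, execute them in an explicit namespace
namespace = {"urlparse": urlparse}
results = [eval(call, dict(namespace)) for call in calls]

for call, result in zip(calls, results):
    print(call, "->", result.netloc)
```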

To systematically generate such calls, we can again use a grammar:

In [None]:
URLPARSE_GRAMMAR = {
    "<call>":
        ['urlparse("<url>")']
}

This grammar creates calls in the form `urlparse("<url>")`, where `<url>` is yet to be defined; the idea is to create many of these calls and to feed them into the Python interpreter.

Let us add definitions for `<url>` from the previously defined URL grammar:

In [None]:
URLPARSE_GRAMMAR.update(URL_GRAMMAR)

In [None]:
URLPARSE_GRAMMAR["<start>"] = ["<call>"]

In [None]:
assert is_valid_grammar(URLPARSE_GRAMMAR)

In [None]:
URLPARSE_GRAMMAR

We can now use this grammar for fuzzing and synthesizing calls to `urlparse()`:

In [None]:
urlparse_fuzzer = GrammarFuzzer(URLPARSE_GRAMMAR)
urlparse_fuzzer.fuzz()

Just as above, we can immediately execute these calls.  To better see what is happening, we define a small helper function:

In [None]:
# Evaluate a call given as a string, such as "function_name(arg0, arg1, ...)";
# print the call and its result, and return the result
def do_call(call_string):
    print(call_string)
    result = eval(call_string)
    print("\t= " + repr(result))
    return result

In [None]:
call = urlparse_fuzzer.fuzz()
do_call(call)

If `urlparse()` were a C function, for instance, we could embed its call into some (also generated) C function:

In [None]:
URLPARSE_C_GRAMMAR = {
    "<cfile>": ["<cheader><cfunction>"],
    "<cheader>": ['#include "urlparse.h"\n\n'],
    "<cfunction>": ["void test() {\n<calls>}\n"],
    "<calls>": ["<call>", "<calls><call>"],
    "<call>": ['    urlparse("<url>");\n']
}

In [None]:
URLPARSE_C_GRAMMAR.update(URL_GRAMMAR)

In [None]:
URLPARSE_C_GRAMMAR["<start>"] = ["<cfile>"]

In [None]:
assert is_valid_grammar(URLPARSE_C_GRAMMAR)

In [None]:
urlparse_fuzzer = GrammarFuzzer(URLPARSE_C_GRAMMAR)
print(urlparse_fuzzer.fuzz())

Note that both the Python as well as the C variant only check for _generic_ errors in `urlparse()`; that is, they only detect fatal errors and exceptions.  To also check the _result_ of `urlparse()`, see the [exercise on synthesizing oracles](#Exercise-1:-Synthesizing-Oracles).
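Detecting such generic errors amounts to running each synthesized call and recording any exception it raises.  Here is a minimal sketch, assuming the calls are Python expressions as above; `run_calls()` is a hypothetical helper, not part of the chapter's code, and the second URL is hand-picked because its unbalanced bracket makes `urlparse()` raise a `ValueError`.

```python
from urllib.parse import urlparse

def run_calls(calls, namespace):
    """Evaluate each call string; collect (call, exception) pairs for failures."""
    failures = []
    for call in calls:
        try:
            eval(call, dict(namespace))
        except Exception as exc:
            failures.append((call, exc))
    return failures

calls = [
    "urlparse('http://www.example.com/')",
    "urlparse('http://[invalid/')",   # unbalanced bracket in the host part
]
failures = run_calls(calls, {"urlparse": urlparse})

for call, exc in failures:
    print(call, "raised", type(exc).__name__)
```

A harness like this only tells us _that_ a call failed; deciding whether the exception is the intended reaction to an invalid input still requires an oracle.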

### Synthesizing Data

For `urlparse()`, we have used a very specific grammar that would be useful for this function only.  For functions that take more general arguments, we can set up reusable grammars for common data types such as integers, floats, strings, and lists.

#### Integers

In [None]:
from Grammars import convert_ebnf_grammar, crange

In [None]:
from ProbabilisticGrammarFuzzer import ProbabilisticGrammarFuzzer, opts

In [None]:
import copy

In [None]:
INT_EBNF_GRAMMAR = {
    "<start>": ["<int>"],
    "<int>": ["<sign>?<leaddigit><digit>*"],
    "<sign>": ["+", "-"],
    "<leaddigit>": crange('1', '9'),
    "<digit>": crange('0', '9')
}

assert is_valid_grammar(INT_EBNF_GRAMMAR)

In [None]:
INT_GRAMMAR = convert_ebnf_grammar(INT_EBNF_GRAMMAR)

In [None]:
int_fuzzer = GrammarFuzzer(INT_GRAMMAR)
print([int_fuzzer.fuzz() for i in range(10)])
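To see what the EBNF operators `?` and `*` stand for, here is a hand-written BNF equivalent of the integer grammar, together with a tiny random expander.  This is only a sketch: the symbol names `<opt-sign>` and `<digits>` are made up for illustration, and `convert_ebnf_grammar()` may well name its helper symbols differently.  Likewise, `expand()` is a hypothetical stand-in for `GrammarFuzzer`, without any of its size controls.

```python
import random
import re

INT_BNF = {
    "<start>": ["<int>"],
    "<int>": ["<opt-sign><leaddigit><digits>"],
    "<opt-sign>": ["", "<sign>"],            # <sign>?
    "<digits>": ["", "<digit><digits>"],     # <digit>*
    "<sign>": ["+", "-"],
    "<leaddigit>": list("123456789"),
    "<digit>": list("0123456789"),
}

# Assumed shape of a nonterminal: angle brackets around a name without spaces
NONTERMINAL = re.compile(r"<[^<> ]+>")

def expand(grammar, start="<start>", rng=random):
    """Repeatedly replace the leftmost nonterminal by a random expansion."""
    term = start
    while True:
        match = NONTERMINAL.search(term)
        if match is None:
            return term
        expansion = rng.choice(grammar[match.group(0)])
        term = term[:match.start()] + expansion + term[match.end():]

rng = random.Random(0)
samples = [expand(INT_BNF, rng=rng) for _ in range(10)]
print(samples)
```

Every sample is a string that `int()` accepts, which makes such a grammar suitable for synthesizing integer arguments.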

#### Floats

In [None]:
FLOAT_EBNF_GRAMMAR = {
    "<start>": ["<float>"],
    "<float>": ["<int>(.<digit>+)?<exp>?", "inf", "NaN"],
    "<exp>": ["e<int>"]
}
FLOAT_EBNF_GRAMMAR.update(INT_EBNF_GRAMMAR)
FLOAT_EBNF_GRAMMAR["<start>"] = ["<float>"]

assert is_valid_grammar(FLOAT_EBNF_GRAMMAR)

In [None]:
FLOAT_GRAMMAR = convert_ebnf_grammar(FLOAT_EBNF_GRAMMAR)

In [None]:
float_fuzzer = GrammarFuzzer(FLOAT_GRAMMAR)
print([float_fuzzer.fuzz() for i in range(10)])

#### Strings

In [None]:
ASCII_STRING_EBNF_GRAMMAR = {
    "<start>": ["<ascii-string>"],
    "<ascii-string>": ['"<ascii-chars>"'],
    "<ascii-chars>": [
        ("", opts(prob=0.05)),
        "<ascii-chars><ascii-char>"
    ],
    "<ascii-char>": crange(" ", "!") + [r'\"'] + crange("#", "~")
}

assert is_valid_grammar(ASCII_STRING_EBNF_GRAMMAR)

In [None]:
ASCII_STRING_GRAMMAR = convert_ebnf_grammar(ASCII_STRING_EBNF_GRAMMAR)

In [None]:
string_fuzzer = ProbabilisticGrammarFuzzer(ASCII_STRING_GRAMMAR)
print([string_fuzzer.fuzz() for i in range(10)])
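The probability annotation controls how long the generated strings tend to become.  Under the simplifying assumption that every expansion of `<ascii-chars>` independently picks the empty alternative with probability $p = 0.05$, the number of characters $N$ follows a geometric distribution with $P(N = k) = (1 - p)^k \, p$ and $E[N] = (1 - p) / p = 19$.  The actual fuzzer may deviate from this idealized figure, since it also limits the size of the derivation trees it builds.  The sketch below simulates only the idealized rule; `chars_length()` is a hypothetical helper.

```python
import random

def chars_length(p_empty, rng):
    """Count recursive expansions until the empty alternative is picked."""
    n = 0
    while rng.random() >= p_empty:
        n += 1
    return n

rng = random.Random(42)
lengths = [chars_length(0.05, rng) for _ in range(20000)]
mean_length = sum(lengths) / len(lengths)

# The sample mean should be close to (1 - 0.05) / 0.05 = 19
print(round(mean_length, 1))
```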

#### Lists

In [None]:
LIST_EBNF_GRAMMAR = {
    "<start>": ["<list>"],
    "<list>": [
        ("[]", opts(prob=0.05)),
        "[<list-objects>]"
    ],
    "<list-objects>": [
        ("<list-object>", opts(prob=0.2)),
        "<list-object>, <list-objects>"
    ],
    "<list-object>": ["0"],
}

assert is_valid_grammar(LIST_EBNF_GRAMMAR)

In [None]:
LIST_GRAMMAR = convert_ebnf_grammar(LIST_EBNF_GRAMMAR)

In [None]:
int_list_grammar = copy.deepcopy(LIST_GRAMMAR)
int_list_grammar.update(INT_GRAMMAR)
int_list_grammar["<start>"] = ["<list>"]
int_list_grammar["<list-object>"] = ["<int>"]
assert is_valid_grammar(int_list_grammar)

In [None]:
int_list_fuzzer = ProbabilisticGrammarFuzzer(int_list_grammar)
[int_list_fuzzer.fuzz() for i in range(10)]

## Mining API Grammars

While it is relatively straightforward to write a grammar that produces a sequence of function calls, we can again try to automate this further by automatically _mining_ function calls from a given execution of a program.  To this end, we make use of [our _carving_ infrastructure introduced in the previous chapter](Carver.ipynb) to _record_ function calls and their arguments, from which we can create a grammar that can then combine arbitrary arguments into new calls.

The general idea is as follows:

1. First, we record all calls of a specific function from a given execution of the program.
2. Second, we create a grammar that incorporates all these calls, with separate rules for each argument and alternatives for each value found; this allows us to produce calls that arbitrarily _recombine_ these arguments.

Let us explore these steps in the following sections.

### From Calls to Grammars

Let us start with an example.  The `power(x, y)` function returns $x^y$; it is but a wrapper around the equivalent `math.pow()` function.  (Since `power()` is defined in Python, we can trace it – in contrast to `math.pow()`, which is implemented in C.)

In [None]:
import math

In [None]:
def power(x, y):
    return math.pow(x, y)

Let us invoke `power()` while recording its arguments:

In [None]:
from Carver import CallCarver, call_value, call_string

In [None]:
with CallCarver() as power_carver:
    z = power(1, 2)
    z = power(3, 4)

In [None]:
power_carver.arguments("power")
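`CallCarver` is defined in the chapter on carving.  As a rough idea of what such recording involves, the following sketch uses Python's `sys.settrace()` hook to note the argument names and values of every Python function called.  It is an illustration with assumed names (`tracer`, `recorded`), not the actual `CallCarver` implementation.

```python
import math
import sys

recorded = {}  # function name -> list of [(argument name, value), ...]

def tracer(frame, event, arg):
    """Global trace hook: on each call, record the arguments of the callee."""
    if event == "call":
        code = frame.f_code
        names = code.co_varnames[:code.co_argcount]
        arguments = [(name, frame.f_locals[name]) for name in names]
        recorded.setdefault(code.co_name, []).append(arguments)
    return None  # no line-level tracing needed

def power(x, y):
    return math.pow(x, y)

previous_tracer = sys.gettrace()
sys.settrace(tracer)
try:
    power(1, 2)
    power(3, 4)
finally:
    sys.settrace(previous_tracer)  # always restore the earlier hook

print(recorded["power"])
```

Each recorded call is a list of `(name, value)` pairs, one list per invocation.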

From this list of recorded arguments, we could now create a grammar for the `power()` call, with `x` and `y` expanding into the values seen:

In [None]:
POWER_GRAMMAR = {
    "<start>": ["power(<x>, <y>)"],
    "<x>": ["1", "3"],
    "<y>": ["2", "4"]
}

In [None]:
assert is_valid_grammar(POWER_GRAMMAR)

When fuzzing with this grammar, we then get arbitrary combinations of `x` and `y`; aiming for coverage will ensure that all values are actually tested at least once:

In [None]:
from GrammarCoverageFuzzer import GrammarCoverageFuzzer

In [None]:
power_fuzzer = GrammarCoverageFuzzer(POWER_GRAMMAR)
[power_fuzzer.fuzz() for i in range(5)]
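Since this grammar is finite, its language can also be enumerated directly.  The sketch below is plain Python, independent of the fuzzer classes; it lists every call the grammar can produce and separates the new recombinations from the two calls that were actually recorded.

```python
from itertools import product

POWER_GRAMMAR = {
    "<start>": ["power(<x>, <y>)"],
    "<x>": ["1", "3"],
    "<y>": ["2", "4"],
}

recorded_calls = {"power(1, 2)", "power(3, 4)"}

# Substitute every combination of <x> and <y> values into the call template
template = POWER_GRAMMAR["<start>"][0]
all_calls = [
    template.replace("<x>", x).replace("<y>", y)
    for x, y in product(POWER_GRAMMAR["<x>"], POWER_GRAMMAR["<y>"])
]
new_calls = [call for call in all_calls if call not in recorded_calls]

print(all_calls)
print(new_calls)
```

With two values per argument, the grammar yields four calls, two of which never occurred in the recorded run.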

What we need is a method to automatically convert the arguments as seen in `power_carver` to the grammar as seen in `POWER_GRAMMAR`.  This is what we define in the next section.

### A Grammar Miner for Calls

We introduce a class `CallGrammarMiner`, which, given a `Carver`, automatically produces a grammar from the calls seen.  To initialize, we pass the carver object:

In [None]:
class CallGrammarMiner(object):
    def __init__(self, carver, log=False):
        self.carver = carver
        self.log = log

#### Initial Grammar

The initial grammar produces a single call.  The possible `<call>` expansions are to be constructed later:

In [None]:
import copy 

In [None]:
class CallGrammarMiner(CallGrammarMiner):
    CALL_SYMBOL = "<call>"

    def initial_grammar(self):
        return copy.deepcopy(
            {START_SYMBOL: [self.CALL_SYMBOL],
                self.CALL_SYMBOL: []
             })

In [None]:
m = CallGrammarMiner(power_carver)
initial_grammar = m.initial_grammar()
initial_grammar

#### A Grammar from Arguments

Let us start by creating a grammar from a list of arguments.  The method `mine_arguments_grammar()` creates a grammar for the arguments seen during carving, such as these:

In [None]:
arguments = power_carver.arguments("power")
arguments

The `mine_arguments_grammar()` method iterates through the variables seen and creates a mapping `variables` of variable names to a set of values seen (as strings, going through `call_value()`).  In a second step, it then creates a grammar with a rule for each variable name, expanding into the values seen.

In [None]:
class CallGrammarMiner(CallGrammarMiner):
    def var_symbol(self, function_name, var, grammar):
        return new_symbol(grammar, "<" + function_name + "-" + var + ">")

    def mine_arguments_grammar(self, function_name, arguments, grammar):
        var_grammar = {}

        variables = {}
        for argument_list in arguments:
            for (var, value) in argument_list:
                value_string = call_value(value)
                if self.log:
                    print(var, "=", value_string)

                if value_string.find("<") >= 0:
                    var_grammar["<langle>"] = ["<"]
                    value_string = value_string.replace("<", "<langle>")

                if var not in variables:
                    variables[var] = set()
                variables[var].add(value_string)

        var_symbols = []
        for var in variables:
            var_symbol = self.var_symbol(function_name, var, grammar)
            var_symbols.append(var_symbol)
            var_grammar[var_symbol] = list(variables[var])

        return var_grammar, var_symbols

In [None]:
m = CallGrammarMiner(power_carver)
var_grammar, var_symbols = m.mine_arguments_grammar(
    "power", arguments, initial_grammar)

In [None]:
var_grammar

The additional return value `var_symbols` is a list of argument symbols in the call:

In [None]:
var_symbols
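The `<langle>` replacement in `mine_arguments_grammar()` deserves a closer look: a recorded value that itself contains `<` could otherwise be mistaken for a nonterminal when the grammar is expanded.  The sketch below demonstrates the effect in isolation.  The regular expression is an assumption about what counts as a nonterminal, and `escape_value()` and `expand_all()` are hypothetical helpers.

```python
import re

# Assumed shape of a nonterminal: angle brackets around a name without spaces
NONTERMINAL = re.compile(r"<[^<> ]+>")

def escape_value(value_string):
    """Hide literal '<' characters behind the <langle> symbol."""
    return value_string.replace("<", "<langle>")

def expand_all(term, grammar):
    """Replace every nonterminal by its first (here: only) expansion."""
    return NONTERMINAL.sub(lambda m: grammar[m.group(0)][0], term)

grammar = {"<langle>": ["<"]}

value = repr("<b>bold</b>")
unescaped_symbols = NONTERMINAL.findall(value)     # HTML tags look like symbols
escaped = escape_value(value)
escaped_symbols = NONTERMINAL.findall(escaped)     # only <langle> remains
restored = expand_all(escaped, grammar)

print(unescaped_symbols)
print(escaped_symbols)
print(restored == value)
```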

#### A Grammar from Calls

To get the grammar for a single function (`mine_function_grammar()`), we add a call to the function:

In [None]:
class CallGrammarMiner(CallGrammarMiner):
    def function_symbol(self, function_name, grammar):
        return new_symbol(grammar, "<" + function_name + ">")

    def mine_function_grammar(self, function_name, grammar):
        arguments = self.carver.arguments(function_name)

        if self.log:
            print(function_name, arguments)

        var_grammar, var_symbols = self.mine_arguments_grammar(
            function_name, arguments, grammar)

        function_grammar = var_grammar
        function_symbol = self.function_symbol(function_name, grammar)

        if len(var_symbols) > 0 and var_symbols[0].find("-self") >= 0:
            # Method call
            function_grammar[function_symbol] = [
                var_symbols[0] + "." + function_name + "(" + ", ".join(var_symbols[1:]) + ")"]
        else:
            function_grammar[function_symbol] = [
                function_name + "(" + ", ".join(var_symbols) + ")"]

        if self.log:
            print(function_symbol, "::=", function_grammar[function_symbol])

        return function_grammar, function_symbol

In [None]:
m = CallGrammarMiner(power_carver)
function_grammar, function_symbol = m.mine_function_grammar(
    "power", initial_grammar)
function_grammar

The additionally returned `function_symbol` holds the name of the function call just added:

In [None]:
function_symbol
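The distinction between plain function calls and method calls in `mine_function_grammar()` can be isolated in a few lines.  In this sketch, `call_expansion()` is a hypothetical helper mirroring the branch above, and the symbol names passed to it are made up for illustration.

```python
def call_expansion(function_name, var_symbols):
    """Build the expansion string for a call from its argument symbols."""
    if var_symbols and "-self" in var_symbols[0]:
        # The first argument is the receiver: <obj>.method(<args>)
        receiver, args = var_symbols[0], var_symbols[1:]
        return receiver + "." + function_name + "(" + ", ".join(args) + ")"
    # Plain function: function(<args>)
    return function_name + "(" + ", ".join(var_symbols) + ")"

function_call = call_expansion("power", ["<power-x>", "<power-y>"])
method_call = call_expansion("geturl", ["<geturl-self>"])

print(function_call)
print(method_call)
```

Note that the check relies on the naming convention that the receiver argument is called `self`; a method using a different name for its first parameter would be rendered as a plain function call.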

#### A Grammar from all Calls

Let us now repeat the above for all function calls seen during carving.  To this end, we simply iterate over all function calls seen:

In [None]:
power_carver.called_functions()

In [None]:
class CallGrammarMiner(CallGrammarMiner):
    def mine_call_grammar(self, function_list=None, qualified=False):
        grammar = self.initial_grammar()
        fn_list = function_list
        if function_list is None:
            fn_list = self.carver.called_functions(qualified=qualified)

        for function_name in fn_list:
            if function_list is None and (function_name.startswith("_") or function_name.startswith("<")):
                continue  # Internal function

            # Ignore errors with mined functions
            try:
                function_grammar, function_symbol = self.mine_function_grammar(
                    function_name, grammar)
            except Exception:
                if function_list is not None:
                    raise
                continue  # Skip functions that cannot be mined

            if function_symbol not in grammar[self.CALL_SYMBOL]:
                grammar[self.CALL_SYMBOL].append(function_symbol)
            grammar.update(function_grammar)

        assert is_valid_grammar(grammar)
        return grammar

The method `mine_call_grammar()` is the one that clients can and should use – first for mining...

In [None]:
m = CallGrammarMiner(power_carver)
power_grammar = m.mine_call_grammar()
power_grammar

...and then for fuzzing:

In [None]:
power_fuzzer = GrammarCoverageFuzzer(power_grammar)
[power_fuzzer.fuzz() for i in range(5)]

With this, we have successfully extracted a grammar from a recorded execution; in contrast to "simple" carving, our grammar allows us to _recombine_ arguments and thus to fuzz at the API level.

## Fuzzing Web Functions

Let us now apply our grammar miner on a larger API – the `urlparse` API we already encountered during carving.

In [None]:
from Carver import webbrowser

In [None]:
with CallCarver() as webbrowser_carver:
    webbrowser("http://www.fuzzingbook.org")
    webbrowser("http://www.example.com")

We can mine a grammar from the calls encountered:

In [None]:
m = CallGrammarMiner(webbrowser_carver)
webbrowser_grammar = m.mine_call_grammar()

This is a rather large grammar:

In [None]:
print(webbrowser_grammar['<call>'])

Here's the rule for the `urlsplit()` function:

In [None]:
webbrowser_grammar["<urlsplit>"]

Here are the arguments.  Note that although we only passed `http://www.fuzzingbook.org` as a parameter, we also see the `https:` variant.  That is because opening the `http:` URL automatically redirects to the `https:` URL, which is then also processed by `urlsplit()`.

In [None]:
webbrowser_grammar["<urlsplit-url>"]

There also is some variation in the `scheme` argument:

In [None]:
webbrowser_grammar["<urlsplit-scheme>"]

If we now apply a fuzzer on these rules, we systematically cover all variations of arguments seen, including, of course, combinations not seen during carving.  Again, we are fuzzing at the API level here.

In [None]:
urlsplit_fuzzer = GrammarCoverageFuzzer(
    webbrowser_grammar, start_symbol="<urlsplit>")
for i in range(5):
    print(urlsplit_fuzzer.fuzz())

Just [as seen with carving](Carver.ipynb), running tests at the API level is orders of magnitude faster than executing system tests.  Hence, this calls for means to fuzz at the method level:

In [None]:
from urllib.parse import urlsplit

In [None]:
from Timer import Timer

In [None]:
with Timer() as urlsplit_timer:
    urlsplit('http://www.fuzzingbook.org/', 'http', True)
urlsplit_timer.elapsed_time()

In [None]:
with Timer() as webbrowser_timer:
    webbrowser("http://www.fuzzingbook.org")
webbrowser_timer.elapsed_time()

In [None]:
webbrowser_timer.elapsed_time() / urlsplit_timer.elapsed_time()
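A single measurement is subject to noise.  As a sketch of a more robust per-call estimate, one can time many calls with the standard `time.perf_counter()` function instead of the `Timer` class.  Only `urlsplit()` is timed here, since it needs no network access; the URLs are distinct so that any result caching inside `urllib.parse` does not dominate the measurement.  The number of calls and the URL pattern are arbitrary choices.

```python
import time
from urllib.parse import urlsplit

# Distinct URLs, so that repeated identical calls are not served from a cache
urls = [f"http://www.fuzzingbook.org/page{i}" for i in range(10_000)]

start = time.perf_counter()
for url in urls:
    urlsplit(url, 'http', True)
elapsed = time.perf_counter() - start

per_call = elapsed / len(urls)
print(f"{per_call * 1e6:.2f} microseconds per call")
```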

But then again, the caveats encountered during carving apply, notably the requirement to recreate the original function environment.  If we also alter or recombine arguments, we get the additional risk of _violating an implicit precondition_ – that is, invoking a function with arguments the function was never designed for.  Such _false alarms_, resulting from incorrect invocations rather than incorrect implementations, must then be identified (typically manually) and weeded out (for instance, by altering or constraining the grammar).  The huge speed gains at the API level, however, may well justify this additional investment.
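As a concrete illustration of such a violated precondition, consider the `power()` wrapper from earlier with two hypothetical recorded calls, `power(-8, 3)` and `power(4, 0.5)`; these values are chosen for this sketch and are not taken from the chapter's recordings.  Both calls succeed, but recombining their arguments produces `power(-8, 0.5)`, which `math.pow()` rejects because a negative base with a non-integer exponent has no real result.

```python
import math

def power(x, y):
    return math.pow(x, y)

recorded = [(-8, 3), (4, 0.5)]            # both recorded calls succeed
xs = sorted({x for x, _ in recorded})
ys = sorted({y for _, y in recorded})

# Try every recombination of the recorded argument values
outcomes = {}
for x in xs:
    for y in ys:
        try:
            outcomes[(x, y)] = power(x, y)
        except ValueError as exc:
            outcomes[(x, y)] = exc

for args, outcome in outcomes.items():
    print(args, "->", repr(outcome))
```

Whether the resulting `ValueError` points to a bug or merely to an invocation the function was never meant to handle is exactly the kind of judgment that has to be made when triaging such alarms.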

## Lessons Learned

* To fuzz individual functions, one can easily set up grammars that produce function calls.
* Such grammars can also be mined (carved) from given executions.
* Fuzzing at the API level can be much faster than fuzzing at the system level, but brings the risk of false alarms by violating implicit preconditions.

## Next Steps

To extend the techniques just introduced, you can

* [use _search-based testing_ to guide test generation towards specific goals](SearchBasedFuzzer.ipynb)
* [use _constraints_ (i.e., a specification of the input format) to get even more valid inputs](ConstraintGrammarFuzzer.ipynb)


## Background

The combination of carving and fuzzing at the API level was first conducted by Alexander Kampmann in his PhD work.

## Exercises

The exercises for this chapter combine the above techniques with fuzzing techniques introduced earlier.

### Exercise 1: Covering Argument Combinations

In the chapter on [configuration testing](ConfigurationFuzzer.ipynb), we also discussed _combinatorial testing_ – that is, systematic coverage of _sets_ of configuration elements.  Implement a scheme that, by changing the grammar, allows all _pairs_ of argument values to be covered.

**Solution.** Left to the reader.

### Exercise 2: Mutating Arguments

To widen the range of arguments to be used during testing, apply the _mutation schemes_ introduced in [mutation fuzzing](MutationFuzzer.ipynb) – for instance, flip individual bytes or delete characters from strings.  Apply this either during grammar inference or as a separate step when invoking functions.

**Solution.** Left to the reader.

### Exercise 3: Abstracting Arguments

Set up an abstraction scheme to widen the range of arguments to be used during testing.  If the values for an argument all conform to some type `T`, abstract it into `<T>`.  For instance, if calls to `foo(1)`, `foo(2)`, `foo(3)` have been seen, the grammar should abstract its calls into `foo(<int>)`, with `<int>` being appropriately defined.

Do this for a number of common types: integers, positive numbers, floating-point numbers, host names, URLs, mail addresses, and more.

**Solution.** Left to the reader.