# Fuzzing Configurations

In this chapter, we explore how to systematically cover software configurations, that is, the settings that govern how a program executes on its (regular) input data. By _automatically inferring configuration options_, we can apply these techniques out of the box, without having to write a grammar by hand.

**Prerequisites**

* You should have read the [chapter on grammars](Grammars.ipynb).
* You should have read the [chapter on grammar coverage](GrammarCoverage.ipynb).

## Configuration Options


In [None]:
import fuzzingbook_utils

In [None]:
import argparse

In [None]:
def process_numbers(args=[]):
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('integers', metavar='N', type=int, nargs='+',
                        help='an integer for the accumulator')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--sum', dest='accumulate', action='store_const',
                        const=sum,
                        help='sum the integers')
    group.add_argument('--min', dest='accumulate', action='store_const',
                        const=min,
                        help='compute the minimum')
    group.add_argument('--max', dest='accumulate', action='store_const',
                        const=max,
                        help='compute the maximum')

    args = parser.parse_args(args)
    print(args.accumulate(args.integers))

In [None]:
process_numbers(["--min", "100", "200", "300"])

In [None]:
process_numbers(["--sum", '1', '2', '3'])
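
Not every combination of options is a valid configuration. The following is a minimal sketch, not part of the chapter's code base, that rebuilds the same parser and checks whether `argparse` accepts a given argument list. The helper names `build_parser()` and `is_accepted()` are hypothetical and introduced only for illustration.

```python
import argparse


def build_parser():
    """Rebuild the parser used by process_numbers() (help texts omitted)."""
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('integers', metavar='N', type=int, nargs='+')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--sum', dest='accumulate', action='store_const', const=sum)
    group.add_argument('--min', dest='accumulate', action='store_const', const=min)
    group.add_argument('--max', dest='accumulate', action='store_const', const=max)
    return parser


def is_accepted(argv):
    """Return True if argparse accepts argv, and False if it rejects it.

    On a rejected command line, argparse prints a usage message to stderr
    and raises SystemExit, which we catch here.
    """
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        return False
```

With this helper, `is_accepted(["--sum", "1", "2"])` holds, whereas omitting the operator or passing two operators at once is rejected. A configuration grammar should ideally produce only invocations of the first kind.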

## A Grammar for Configurations

In [None]:
from Grammars import crange, srange, convert_ebnf_grammar, is_valid_grammar, START_SYMBOL, new_symbol

In [None]:
PROCESS_NUMBERS_GRAMMAR_EBNF = {
    "<start>": ["<operator> <integers>"],
    "<operator>": ["--sum", "--min", "--max"],
    "<integers>": ["<integer>", "<integers> <integer>"],
    "<integer>": ["<digit>+"],
    "<digit>": crange('0', '9')
}

assert is_valid_grammar(PROCESS_NUMBERS_GRAMMAR_EBNF)

In [None]:
PROCESS_NUMBERS_GRAMMAR = convert_ebnf_grammar(PROCESS_NUMBERS_GRAMMAR_EBNF)

In [None]:
from GrammarCoverageFuzzer import GrammarCoverageFuzzer

In [None]:
f = GrammarCoverageFuzzer(PROCESS_NUMBERS_GRAMMAR, min_nonterminals=10)
for i in range(3):
    print(f.fuzz())
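
To see what producing invocations from such a grammar involves without relying on the `Grammars` and `GrammarCoverageFuzzer` modules, here is a stand-alone sketch. It is an assumption-laden simplification: it uses plain random expansion with a depth limit rather than coverage-guided expansion, and it spells out `<integer>` in plain BNF instead of using the `+` operator. The names `GRAMMAR`, `NONTERMINAL`, and `expand()` are hypothetical.

```python
import random
import re

# Plain-BNF variant of the grammar above (no EBNF operators).
# The first alternative of every rule is non-recursive.
GRAMMAR = {
    "<start>": ["<operator> <integers>"],
    "<operator>": ["--sum", "--min", "--max"],
    "<integers>": ["<integer>", "<integers> <integer>"],
    "<integer>": ["<digit>", "<integer><digit>"],
    "<digit>": [str(d) for d in range(10)],
}

NONTERMINAL = re.compile(r"<[^<> ]+>")


def expand(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a string of terminals by random choice.

    Beyond `max_depth`, always pick the first (non-recursive) alternative
    so that the expansion is guaranteed to terminate.
    """
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        expansion = alternatives[0]
    else:
        expansion = rng.choice(alternatives)
    # Recursively replace each nonterminal in the chosen expansion
    return NONTERMINAL.sub(
        lambda match: expand(match.group(0), rng, depth + 1, max_depth),
        expansion)
```

Calling `expand("<start>", random.Random(0))` yields one command line. Unlike the coverage-based fuzzer, this sketch makes no attempt to exercise every operator systematically.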

## Mining Configuration Options


In [None]:
import sys

In [None]:
import string

In [None]:
class ParseInterrupt(Exception):
    pass

In [None]:
class ConfigurationGrammarMiner(object):
    def __init__(self, function, log=False):
        self.function = function    # FIXME: Should this be a runner?
        self.log = log

In [None]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    OPTION_SYMBOL = "<option>"
    ARGUMENTS_SYMBOL = "<arguments>" 
    def mine_ebnf_grammar(self):
        self.grammar = { 
            START_SYMBOL: [ "(" + self.OPTION_SYMBOL + ")*" + self.ARGUMENTS_SYMBOL],
            self.OPTION_SYMBOL: [], 
            self.ARGUMENTS_SYMBOL: []
        }
        self.current_group = self.OPTION_SYMBOL

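        # sys.settrace() returns None, so fetch the previous tracer explicitly
        old_trace = sys.gettrace()
        sys.settrace(self.traceit)
        try:
            self.function()
        except ParseInterrupt:
            pass
        finally:
            # Restore the previous tracer even if the function fails
            sys.settrace(old_trace)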
        
        return self.grammar
    
    def mine_grammar(self):
        return convert_ebnf_grammar(self.mine_ebnf_grammar())

In [None]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def traceit(self, frame, event, arg):
        if event != "call":
            return

        if "self" not in frame.f_locals:
            return
        self_var = frame.f_locals["self"]

        method_name = frame.f_code.co_name

        if method_name == "add_argument":
            in_group = repr(type(self_var)).find("Group") >= 0
            self.process_argument(frame.f_locals, in_group)
            
        if method_name == "add_mutually_exclusive_group":
            self.add_group(frame.f_locals, exclusive=True)

        if method_name == "add_argument_group":
            # self.add_group(frame.f_locals, exclusive=False)
            pass
    
        if method_name == "parse_args":
            raise ParseInterrupt

        return None
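
The tracing mechanism can also be tried in isolation. The following sketch, under the assumption that the program under test configures a standard `argparse` parser, records the arguments of every `add_argument()` call and stops the program as soon as `parse_args()` is entered. `StopBeforeParsing`, `record_add_argument_calls()`, and `sample_program()` are hypothetical names used only here.

```python
import argparse
import sys


class StopBeforeParsing(Exception):
    """Hypothetical marker exception, playing the role of ParseInterrupt."""


def record_add_argument_calls(function):
    """Run `function` and return the (args, kwargs) of each add_argument() call."""
    calls = []

    def tracer(frame, event, arg):
        if event != "call":
            return None
        name = frame.f_code.co_name
        if name == "add_argument" and "self" in frame.f_locals:
            # add_argument(self, *args, **kwargs): both are local variables
            calls.append((frame.f_locals.get("args", ()),
                          dict(frame.f_locals.get("kwargs", {}))))
        elif name == "parse_args":
            # All options are declared by now; abort before actual parsing
            raise StopBeforeParsing
        return None

    old_trace = sys.gettrace()
    sys.settrace(tracer)
    try:
        function()
    except StopBeforeParsing:
        pass
    finally:
        sys.settrace(old_trace)
    return calls


def sample_program():
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("files", nargs="+")
    parser.parse_args([])  # interrupted by the tracer before it can fail
```

Note that the recorded calls also include the `-h`/`--help` option, which `ArgumentParser` adds by itself on construction.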

In [None]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def process_argument(self, locals, in_group):
        args = locals["args"]
        kwargs = locals["kwargs"]

        if self.log:
            print(args)
            print(kwargs)
            print()

        for arg in args:
            if arg.startswith('-'):
                if not in_group:
                    target = self.OPTION_SYMBOL
                else:
                    target = self.current_group
                metavar = None
                arg = " " + arg
            else:
                target = self.ARGUMENTS_SYMBOL
                metavar = arg
                arg = ""

            if "nargs" in kwargs:
                nargs = kwargs["nargs"]
            else:
                nargs = 1
            
            if "action" in kwargs:
                # No argument
                param = ""
                nargs = 0
            else:
                # argparse receives the type object itself (e.g. `int`), not an instance
                if "type" in kwargs and kwargs["type"] is int:
                    type_ = "int"
                else:
                    type_ = "str"

                if metavar is None and "metavar" in kwargs:
                    metavar = kwargs["metavar"]
                    
                if metavar is not None:
                    self.grammar["<" + metavar + ">"] = ["<" + type_ + ">"]
                else:
                    metavar = type_
                    
                if type_ == "int":
                    self.grammar["<int>"] = ["(-)?<digit>+"]
                    self.grammar["<digit>"] = crange('0', '9')
                    param = " <" + metavar + ">"
                else:
                    self.grammar["<str>"] = ["<char>+"]
                    self.grammar["<char>"] = srange(string.digits + string.ascii_letters + string.punctuation)
                    param = " <" + metavar + ">"

            if isinstance(nargs, int):
                for i in range(nargs):
                    arg += param
            else:
                assert nargs in "?+*"
                arg += '(' + param + ')' + nargs
                    
            # Options, group members, and positional arguments are appended alike
            self.grammar[target].append(arg)

In [None]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def add_group(self, locals, exclusive):
        kwargs = locals["kwargs"]
        if self.log:
            print(kwargs)

        required = kwargs.get("required", False)
        group = new_symbol(self.grammar, "<group>")

        if required and exclusive:
            group_expansion = group
        if required and not exclusive:
            group_expansion = group + "+"
        if not required and exclusive:
            group_expansion = group + "?"
        if not required and not exclusive:
            group_expansion = group + "*"

        self.grammar[START_SYMBOL][0] = group_expansion + self.grammar[START_SYMBOL][0]
        self.grammar[group] = []
        self.current_group = group
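
The four cases in `add_group()` map a group's `required` and `exclusive` flags to an EBNF repetition operator. As a compact restatement, here is a sketch with a hypothetical helper `group_suffix()` that returns only the operator appended to the group symbol.

```python
def group_suffix(required, exclusive):
    """Return the EBNF operator for a group, mirroring the cases in add_group().

    exclusive + required:          exactly one member   -> no operator
    exclusive + not required:      at most one member   -> "?"
    not exclusive + required:      at least one member  -> "+"
    not exclusive + not required:  any number           -> "*"
    """
    if exclusive:
        return "" if required else "?"
    return "+" if required else "*"
```

For the required, mutually exclusive group in `process_numbers()`, the operator is empty, so the group symbol is prepended to the start expansion as is.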

In [None]:
miner = ConfigurationGrammarMiner(process_numbers, log=True)
grammar_ebnf = miner.mine_ebnf_grammar()
print(grammar_ebnf)

In [None]:
assert is_valid_grammar(grammar_ebnf)

In [None]:
grammar = convert_ebnf_grammar(grammar_ebnf)
assert is_valid_grammar(grammar)

In [None]:
f = GrammarCoverageFuzzer(grammar)
for i in range(10):
    print(f.fuzz())

## Complex Args

In [None]:
!autopep8 --help

In [None]:
import os

In [None]:
def find_executable(name):
    for path in os.get_exec_path():
        qualified_name = os.path.join(path, name)
        if os.path.exists(qualified_name):
            return qualified_name
    return None
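
As an alternative, the standard library offers `shutil.which()`, which performs a similar search along the executable path and additionally checks that the file is actually executable. The wrapper below is a sketch; `locate()` and its `search_path` parameter are hypothetical names, and the chapter's code continues to use `find_executable()`.

```python
import shutil


def locate(name, search_path=None):
    """Return the full path of executable `name`, or None if it is not found.

    `search_path` is an optional string of directories separated by
    os.pathsep; if it is None, shutil.which() consults the PATH
    environment variable.
    """
    return shutil.which(name, path=search_path)
```

Like `find_executable()`, `locate()` returns `None` when no matching file exists, so callers can check for a missing tool in the same way.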

In [None]:
find_executable("autopep8")

In [None]:
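def autopep8():
    executable = find_executable("autopep8")
    # Read the script once and close the file again
    with open(executable) as script:
        contents = script.read()
    # Only Python scripts can be run in-process via exec()
    first_line = contents.split("\n", 1)[0]
    assert "python" in first_line
    exec(contents)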

In [None]:
miner = ConfigurationGrammarMiner(autopep8, log=True)

In [None]:
grammar = miner.mine_ebnf_grammar()
print(grammar["<option>"])

In [None]:
# Convert the EBNF grammar just mined from autopep8
grammar = convert_ebnf_grammar(grammar)
assert is_valid_grammar(grammar)
print(grammar["<option>"])

In [None]:
grammar["<arguments>"] = [" foo.py"]
f = GrammarCoverageFuzzer(grammar, max_nonterminals=3)
for i in range(20):
    print(f.fuzz())

In [None]:
def create_foo_py():
    # Write a small sample file for autopep8 to work on
    with open("foo.py", "w") as source_file:
        source_file.write("""
def twice(x):
    return x+x
""")

In [None]:
create_foo_py()

In [None]:
print(open("foo.py").read())

In [None]:
from Fuzzer import ProgramRunner

In [None]:
f = GrammarCoverageFuzzer(grammar, max_nonterminals=5)
for i in range(20):
    invocation = "autopep8" + f.fuzz()
    print("$ " + invocation)
    args = invocation.split()
    autopep8 = ProgramRunner(args)
    result, outcome = autopep8.run()
    if result.stderr != "":
        print(result.stderr, end="")

In [None]:
os.remove("foo.py")

## Putting it all Together

In [None]:
class ConfigurationRunner(ProgramRunner):
    def __init__(self, program, arguments=None):
        if isinstance(program, str):
            self.base_executable = program
        else:
            self.base_executable = program[0]

        self.find_contents()
        self.find_grammar()
        if arguments is not None:
            self.set_arguments(arguments)
        super().__init__(program)

    def find_contents(self):
        self._executable = find_executable(self.base_executable)
        # Read the script once; only Python scripts can be run via exec()
        with open(self._executable) as script:
            self.contents = script.read()
        first_line = self.contents.split("\n", 1)[0]
        assert "python" in first_line

    def invoker(self):
        exec(self.contents)
    
    def find_grammar(self):
        miner = ConfigurationGrammarMiner(self.invoker)
        self._grammar = miner.mine_grammar()

    def grammar(self):
        return self._grammar

    def executable(self):
        return self._executable

    def set_arguments(self, args):
        self._grammar["<arguments>"] = [" " + args]
        
    def set_invocation(self, program):
        self.program = program

In [None]:
conf_runner = ConfigurationRunner("autopep8", "foo.py")

In [None]:
conf_runner.grammar()["<option>"]

In [None]:
class ConfigurationFuzzer(GrammarCoverageFuzzer):
    def __init__(self, runner, *args, **kwargs):
        self.runner = runner
        grammar = runner.grammar()
        super().__init__(grammar, *args, **kwargs)

    def run(self, runner=None, inp=""):
        if runner is None:
            runner = self.runner
        invocation = runner.executable() + " " + self.fuzz()
        runner.set_invocation(invocation.split())
        return runner.run(inp)

In [None]:
conf_fuzzer = ConfigurationFuzzer(conf_runner, max_nonterminals=5)

In [None]:
conf_fuzzer.fuzz()

In [None]:
conf_fuzzer.run(conf_runner)

## MyPy

In [None]:
mypy = ConfigurationRunner("mypy", "foo.py")
print(mypy.grammar()["<option>"])

In [None]:
mypy_fuzzer = ConfigurationFuzzer(mypy, max_nonterminals=3)
for i in range(10):
    print(mypy_fuzzer.fuzz())

In [None]:
notedown = ConfigurationRunner("notedown")
print(notedown.grammar()["<option>"])

In [None]:
notedown_fuzzer = ConfigurationFuzzer(notedown, max_nonterminals=3)
for i in range(10):
    print(notedown_fuzzer.fuzz())

## Combinatorial Testing

\todo{Add}

## Lessons Learned

* _Lesson one_
* _Lesson two_
* _Lesson three_

## Next Steps

_Link to subsequent chapters (notebooks) here, as in:_

* [use _mutations_ on existing inputs to get more valid inputs](MutationFuzzer.ipynb)
* [use _grammars_ (i.e., a specification of the input format) to get even more valid inputs](Grammars.ipynb)
* [reduce _failing inputs_ for efficient debugging](Reducing.ipynb)


## Background

_Cite relevant works in the literature and put them into context, as in:_

The idea of ensuring that each expansion in the grammar is used at least once goes back to Burkhardt \cite{Burkhardt1967}, to be later rediscovered by Paul Purdom \cite{Purdom1972}.

## Exercises

_Close the chapter with a few exercises such that people have things to do.  To make the solutions hidden (to be revealed by the user), have them start with_

```markdown
**Solution.**
```

_Your solution can then extend up to the next title (i.e., any markdown cell starting with `#`)._

_Running `make metadata` will automatically add metadata to the cells such that the cells will be hidden by default, and can be uncovered by the user.  The button will be introduced above the solution._

### Exercise 1: _Title_

_Text of the exercise_

In [None]:
# Some code that is part of the exercise
pass

_Some more text for the exercise_

**Solution.** _Some text for the solution_

In [None]:
# Some code for the solution
2 + 2

_Some more text for the solution_

### Exercise 2: _Title_

_Text of the exercise_

**Solution.** _Solution for the exercise_