# Fuzzing Configurations

In this chapter, we explore how to systematically cover software configurations, that is, the settings that govern how a program executes on its (regular) input data. By _automatically inferring configuration options_ from the program itself, we can apply these techniques out of the box, without having to write a grammar by hand.

**Prerequisites**

* You should have read the [chapter on grammars](Grammars.ipynb).
* You should have read the [chapter on grammar coverage](GrammarCoverage.ipynb).

## Configuration Options


In [1]:
import fuzzingbook_utils

In [2]:
import argparse

In [3]:
def process_numbers(args=[]):
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('integers', metavar='N', type=int, nargs='+',
                        help='an integer for the accumulator')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--sum', dest='accumulate', action='store_const',
                        const=sum,
                        help='sum the integers')
    group.add_argument('--min', dest='accumulate', action='store_const',
                        const=min,
                        help='compute the minimum')
    group.add_argument('--max', dest='accumulate', action='store_const',
                        const=max,
                        help='compute the maximum')

    args = parser.parse_args(args)
    print(args.accumulate(args.integers))

In [4]:
process_numbers(["--min", "100", "200", "300"])

100


In [5]:
process_numbers(["--sum", '1', '2', '3'])

6


## A Grammar for Configurations

In [6]:
from Grammars import crange, srange, convert_ebnf_grammar, is_valid_grammar, START_SYMBOL, new_symbol

In [7]:
PROCESS_NUMBERS_GRAMMAR_EBNF = {
    "<start>": ["<operator> <integers>"],
    "<operator>": ["--sum", "--min", "--max"],
    "<integers>": ["<integer>", "<integers> <integer>"],
    "<integer>": ["<digit>+"],
    "<digit>": crange('0', '9')
}

assert is_valid_grammar(PROCESS_NUMBERS_GRAMMAR_EBNF)

In [8]:
PROCESS_NUMBERS_GRAMMAR = convert_ebnf_grammar(PROCESS_NUMBERS_GRAMMAR_EBNF)

In [9]:
from GrammarCoverageFuzzer import GrammarCoverageFuzzer

In [10]:
f = GrammarCoverageFuzzer(PROCESS_NUMBERS_GRAMMAR, min_nonterminals=10)
for i in range(3):
    print(f.fuzz())

--max 9 5 8 210 80 9756431
--sum 9 4 99 1245 612370
--min 2 3 0 46 15798 7570926
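
To make the step from grammar to command line concrete without the book's modules, here is a minimal, self-contained sketch of a random grammar expander. It is not the `GrammarCoverageFuzzer` used above and makes no attempt to cover all expansions. The names `expand` and `MAX_DEPTH` are hypothetical, and the `<integer>` rule is spelled out in plain BNF by hand instead of being converted with `convert_ebnf_grammar()`.

```python
import random
import re

# Plain-BNF version of the grammar above; "<digit>+" is written out by hand.
GRAMMAR = {
    "<start>": ["<operator> <integers>"],
    "<operator>": ["--sum", "--min", "--max"],
    "<integers>": ["<integer>", "<integers> <integer>"],
    "<integer>": ["<digit>", "<integer><digit>"],
    "<digit>": list("0123456789"),
}

NONTERMINAL = re.compile(r"<[^<> ]+>")
MAX_DEPTH = 6  # hypothetical cutoff that keeps the recursion finite


def expand(symbol, rng, depth=0):
    """Expand `symbol` into a string by choosing random alternatives."""
    expansions = GRAMMAR[symbol]
    if depth >= MAX_DEPTH:
        # Past the cutoff, take the alternative with the fewest nonterminals
        expansions = [min(expansions, key=lambda e: len(NONTERMINAL.findall(e)))]
    chosen = rng.choice(expansions)
    # Replace every nonterminal in the chosen alternative by its own expansion
    return NONTERMINAL.sub(lambda m: expand(m.group(0), rng, depth + 1), chosen)


rng = random.Random(0)
samples = [expand("<start>", rng) for _ in range(3)]
for sample in samples:
    print(sample)
```

Each sample is one operator followed by one or more integers, which is the shape `process_numbers()` expects.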


## Mining Configuration Options


In [11]:
import sys

In [12]:
import string

In [13]:
class ParseInterrupt(Exception):
    pass

In [14]:
class ConfigurationGrammarMiner(object):
    def __init__(self, function, log=False):
        self.function = function    # FIXME: Should this be a runner?
        self.log = log

In [15]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    OPTION_SYMBOL = "<option>"
    ARGUMENTS_SYMBOL = "<arguments>" 
    def mine_ebnf_grammar(self):
        self.grammar = { 
            START_SYMBOL: [ "(" + self.OPTION_SYMBOL + ")*" + self.ARGUMENTS_SYMBOL],
            self.OPTION_SYMBOL: [], 
            self.ARGUMENTS_SYMBOL: []
        }
        self.current_group = self.OPTION_SYMBOL

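        # sys.settrace() returns None, so fetch the previous tracer explicitly
        old_trace = sys.gettrace()
        sys.settrace(self.traceit)
        try:
            self.function()
        except ParseInterrupt:
            pass
        finally:
            # Restore the previous tracer even if self.function() raises
            sys.settrace(old_trace)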
        
        return self.grammar
    
    def mine_grammar(self):
        return convert_ebnf_grammar(self.mine_ebnf_grammar())

In [16]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def traceit(self, frame, event, arg):
        if event != "call":
            return

        if "self" not in frame.f_locals:
            return
        self_var = frame.f_locals["self"]

        method_name = frame.f_code.co_name

        if method_name == "add_argument":
            in_group = repr(type(self_var)).find("Group") >= 0
            self.process_argument(frame.f_locals, in_group)
            
        if method_name == "add_mutually_exclusive_group":
            self.add_group(frame.f_locals, exclusive=True)

        if method_name == "add_argument_group":
            # self.add_group(frame.f_locals, exclusive=False)
            pass
    
        if method_name == "parse_args":
            raise ParseInterrupt

        return None
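
The tracing mechanism can be tried out in isolation. The following is a minimal sketch, not the miner itself: it only records the positional and keyword arguments of every `add_argument()` call made while a function runs. The names `record_add_argument_calls` and `build_parser` are hypothetical. It relies on `add_argument()` being a Python-level method with `*args` and `**kwargs` parameters, which is the case in the standard `argparse` module.

```python
import argparse
import sys


def record_add_argument_calls(function):
    """Run `function` and return (args, kwargs) for each add_argument() call."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call" and frame.f_code.co_name == "add_argument":
            local_vars = frame.f_locals
            calls.append((tuple(local_vars["args"]), dict(local_vars["kwargs"])))
        return None  # no line-level tracing inside the called function

    old_trace = sys.gettrace()
    sys.settrace(tracer)
    try:
        function()
    finally:
        sys.settrace(old_trace)
    return calls


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("files", nargs="*")


recorded_calls = record_add_argument_calls(build_parser)
for args, kwargs in recorded_calls:
    print(args, kwargs)
```

The recorded calls also include the `-h`/`--help` option, which `ArgumentParser` adds on its own when it is constructed.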

In [17]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def process_argument(self, locals, in_group):
        args = locals["args"]
        kwargs = locals["kwargs"]

        if self.log:
            print(args)
            print(kwargs)
            print()

        for arg in args:
            if arg.startswith('-'):
                if not in_group:
                    target = self.OPTION_SYMBOL
                else:
                    target = self.current_group
                metavar = None
                arg = " " + arg
            else:
                target = self.ARGUMENTS_SYMBOL
                metavar = arg
                arg = ""

            if "nargs" in kwargs:
                nargs = kwargs["nargs"]
            else:
                nargs = 1
            
            if "action" in kwargs:
                # No argument
                param = ""
                nargs = 0
            else:
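                # FIXME: kwargs["type"] is the class itself (e.g. `int`), so this
                # isinstance() test is False for it and the parameter is mined
                # as <str>; `kwargs["type"] is int` would select the <int> branch.
                if "type" in kwargs and isinstance(kwargs["type"], int):
                    type_ = "int"
                else:
                    type_ = "str"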

                if metavar is None and "metavar" in kwargs:
                    metavar = kwargs["metavar"]
                    
                if metavar is not None:
                    self.grammar["<" + metavar + ">"] = ["<" + type_ + ">"]
                else:
                    metavar = type_
                    
                if type_ == "int":
                    self.grammar["<int>"] = ["(-)?<digit>+"]
                    self.grammar["<digit>"] = crange('0', '9')
                    param = " <" + metavar + ">"
                else:
                    self.grammar["<str>"] = ["<char>+"]
                    self.grammar["<char>"] = srange(string.digits + string.ascii_letters + string.punctuation)
                    param = " <" + metavar + ">"

            if isinstance(nargs, int):
                for i in range(nargs):
                    arg += param
            else:
                assert nargs in "?+*"
                arg += '(' + param + ')' + nargs
                    
            # Options, group members, and positional arguments are all appended
            # to the expansions of their respective target symbol
            self.grammar[target].append(arg)

In [18]:
class ConfigurationGrammarMiner(ConfigurationGrammarMiner):
    def add_group(self, locals, exclusive):
        kwargs = locals["kwargs"]
        if self.log:
            print(kwargs)

        required = kwargs.get("required", False)
        group = new_symbol(self.grammar, "<group>")

        if required and exclusive:
            group_expansion = group
        if required and not exclusive:
            group_expansion = group + "+"
        if not required and exclusive:
            group_expansion = group + "?"
        if not required and not exclusive:
            group_expansion = group + "*"

        self.grammar[START_SYMBOL][0] = group_expansion + self.grammar[START_SYMBOL][0]
        self.grammar[group] = []
        self.current_group = group

In [19]:
miner = ConfigurationGrammarMiner(process_numbers, log=True)
grammar_ebnf = miner.mine_ebnf_grammar()
print(grammar_ebnf)

('-h', '--help')
{'action': 'help', 'default': '==SUPPRESS==', 'help': 'show this help message and exit'}

('integers',)
{'metavar': 'N', 'type': <class 'int'>, 'nargs': '+', 'help': 'an integer for the accumulator'}

{'required': True}
('--sum',)
{'dest': 'accumulate', 'action': 'store_const', 'const': <built-in function sum>, 'help': 'sum the integers'}

('--min',)
{'dest': 'accumulate', 'action': 'store_const', 'const': <built-in function min>, 'help': 'compute the minimum'}

('--max',)
{'dest': 'accumulate', 'action': 'store_const', 'const': <built-in function max>, 'help': 'compute the maximum'}

{'<start>': ['<group-1>(<option>)*<arguments>'], '<option>': [' -h', ' --help'], '<arguments>': ['( <integers>)+'], '<integers>': ['<str>'], '<str>': ['<char>+'], '<char>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H

In [20]:
assert is_valid_grammar(grammar_ebnf)

In [21]:
grammar = convert_ebnf_grammar(grammar_ebnf)
assert is_valid_grammar(grammar)

In [22]:
f = GrammarCoverageFuzzer(grammar)
for i in range(10):
    print(f.fuzz())

 --sum -h q
 --max --help --help --help --help ta y 1 Z "
 --min S9
 --max -h p& V MHe
 --min >
 --min U< ~
 --max -h -h K 8N?lj W
 --min X3 J
 --min w%
 --sum --help --help --help -h @
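
Whether such an invocation is accepted can be checked by feeding it back into the parser. The sketch below is self-contained: it repeats the parser from the beginning of this chapter (without help texts) and wraps it in a hypothetical helper, `try_invocation()`, which captures the output and reports whether `argparse` exited. It uses only documented `argparse` behavior: usage errors exit with status 2, and `-h` exits after printing the help text.

```python
import argparse
import contextlib
import io


def process_numbers(args):
    # Same parser as in the first section of this chapter, minus help texts
    parser = argparse.ArgumentParser(description="Process some integers.")
    parser.add_argument("integers", metavar="N", type=int, nargs="+")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--sum", dest="accumulate", action="store_const", const=sum)
    group.add_argument("--min", dest="accumulate", action="store_const", const=min)
    group.add_argument("--max", dest="accumulate", action="store_const", const=max)
    parsed = parser.parse_args(args)
    print(parsed.accumulate(parsed.integers))


def try_invocation(invocation):
    """Return ("ok", output) or ("exit", status) for a command-line string."""
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output), contextlib.redirect_stderr(output):
            process_numbers(invocation.split())
    except SystemExit as exc:
        return "exit", exc.code
    return "ok", output.getvalue().strip()


for invocation in ["--sum 1 2 3", "--min S9", "--sum -h q"]:
    print(repr(invocation), "->", try_invocation(invocation))
```

Arguments such as `S9` are rejected because the parser declares `type=int` for its positional arguments.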


## Complex Args

In [23]:
!autopep8 --help

usage: autopep8 [-h] [--version] [-v] [-d] [-i] [--global-config filename]
                [--ignore-local-config] [-r] [-j n] [-p n] [-a]
                [--experimental] [--exclude globs] [--list-fixes]
                [--ignore errors] [--select errors] [--max-line-length n]
                [--line-range line line] [--hang-closing]
                [files [files ...]]

Automatically formats Python code to conform to the PEP 8 style guide.

positional arguments:
  files                 files to format or '-' for standard in

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v, --verbose         print verbose messages; multiple -v result in more
                        verbose messages
  -d, --diff            print the diff for the fixed source
  -i, --in-place        make changes to files in place
  --global-config filename
                        path to a global pep8 confi

In [24]:
import os

In [25]:
def find_executable(name):
    for path in os.get_exec_path():
        qualified_name = os.path.join(path, name)
        if os.path.exists(qualified_name):
            return qualified_name
    return None
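
The standard library offers a similar lookup in `shutil.which()`. The sketch below wraps it under the hypothetical name `find_executable_with_which()`. Unlike the function above, `shutil.which()` also checks that the file is executable, so the two may disagree on files that exist but cannot be run.

```python
import shutil


def find_executable_with_which(name):
    """Return the full path of `name` on the search path, or None."""
    return shutil.which(name)


# A name that is assumed not to exist on any search path
print(find_executable_with_which("no-such-program-for-this-sketch"))
```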

In [26]:
find_executable("autopep8")

'/Users/zeller/anaconda3/bin/autopep8'

In [27]:
def autopep8():
    executable = find_executable("autopep8")
    with open(executable) as script:
        contents = script.read()
    # Only Python scripts can be run in-process: check the first line
    first_line = contents.split("\n", 1)[0]
    assert "python" in first_line
    exec(contents)

In [28]:
miner = ConfigurationGrammarMiner(autopep8, log=True)

In [29]:
grammar = miner.mine_ebnf_grammar()
print(grammar["<option>"])

('-h', '--help')
{'action': 'help', 'default': '==SUPPRESS==', 'help': 'show this help message and exit'}

('--version',)
{'action': 'version', 'version': '%(prog)s 1.3.4 (pycodestyle: 2.4.0)'}

('-v', '--verbose')
{'action': 'count', 'default': 0, 'help': 'print verbose messages; multiple -v result in more verbose messages'}

('-d', '--diff')
{'action': 'store_true', 'help': 'print the diff for the fixed source'}

('-i', '--in-place')
{'action': 'store_true', 'help': 'make changes to files in place'}

('--global-config',)
{'metavar': 'filename', 'default': '/Users/zeller/.config/pep8', 'help': 'path to a global pep8 config file; if this file does not exist then this is ignored (default: /Users/zeller/.config/pep8)'}

('--ignore-local-config',)
{'action': 'store_true', 'help': "don't look for and apply local config files; if not passed, defaults are updated with any config files in the project's root directory"}

('-r', '--recursive')
{'action': 'store_true', 'help': 'run recursively o

In [30]:
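# FIXME: this converts `grammar_ebnf` (mined from process_numbers() above), not the
# autopep8 grammar just mined into `grammar`; hence the --sum/--min/--max options below.
grammar = convert_ebnf_grammar(grammar_ebnf)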
assert is_valid_grammar(grammar)
print(grammar["<option>"])

[' -h', ' --help']


In [31]:
grammar["<arguments>"] = [" foo.py"]
f = GrammarCoverageFuzzer(grammar, max_nonterminals=3)
for i in range(20):
    print(f.fuzz())

 --min foo.py
 --sum foo.py
 --max foo.py
 --min foo.py
 --min foo.py
 --max foo.py
 --min foo.py
 --sum foo.py
 --min foo.py
 --max foo.py
 --min foo.py
 --sum foo.py
 --sum foo.py
 --sum foo.py
 --max foo.py
 --max foo.py
 --sum foo.py
 --max foo.py
 --sum foo.py
 --min foo.py


In [32]:
def create_foo_py():
    # The `with` block closes (and flushes) the file before it is read back
    with open("foo.py", "w") as f:
        f.write("""
def twice(x):
    return x+x
""")

In [33]:
create_foo_py()

In [34]:
print(open("foo.py").read())


def twice(x):
    return x+x



In [35]:
from Fuzzer import ProgramRunner

In [36]:
f = GrammarCoverageFuzzer(grammar, max_nonterminals=5)
for i in range(20):
    invocation = "autopep8" + f.fuzz()
    print("$ " + invocation)
    args = invocation.split()
    # Named `runner` so as not to shadow the autopep8() function defined above
    runner = ProgramRunner(args)
    result, outcome = runner.run()
    if result.stderr != "":
        print(result.stderr, end="")

$ autopep8 --sum foo.py
usage: autopep8 [-h] [--version] [-v] [-d] [-i] [--global-config filename]
                [--ignore-local-config] [-r] [-j n] [-p n] [-a]
                [--experimental] [--exclude globs] [--list-fixes]
                [--ignore errors] [--select errors] [--max-line-length n]
                [--line-range line line] [--hang-closing]
                [files [files ...]]
autopep8: error: unrecognized arguments: --sum
$ autopep8 --min --help -h foo.py
$ autopep8 --max foo.py
usage: autopep8 [-h] [--version] [-v] [-d] [-i] [--global-config filename]
                [--ignore-local-config] [-r] [-j n] [-p n] [-a]
                [--experimental] [--exclude globs] [--list-fixes]
                [--ignore errors] [--select errors] [--max-line-length n]
                [--line-range line line] [--hang-closing]
                [files [files ...]]
autopep8: error: argument --max-line-length: invalid int value: 'foo.py'
$ autopep8 --max --help -h -h --help foo.py
usage: a

In [37]:
import os

In [38]:
os.remove("foo.py")

## Putting It All Together

In [39]:
class ConfigurationRunner(ProgramRunner):
    def __init__(self, program, arguments=None):
        if isinstance(program, str):
            self.base_executable = program
        else:
            self.base_executable = program[0]

        self.find_contents()
        self.find_grammar()
        if arguments is not None:
            self.set_arguments(arguments)
        super().__init__(program)

    def find_contents(self):
        self._executable = find_executable(self.base_executable)
        with open(self._executable) as script:
            self.contents = script.read()
        # Mining runs the program in-process, so it must be a Python script
        first_line = self.contents.split("\n", 1)[0]
        assert "python" in first_line

    def invoker(self):
        exec(self.contents)
    
    def find_grammar(self):
        miner = ConfigurationGrammarMiner(self.invoker)
        self._grammar = miner.mine_grammar()

    def grammar(self):
        return self._grammar

    def executable(self):
        return self._executable

    def set_arguments(self, args):
        self._grammar["<arguments>"] = [" " + args]
        
    def set_invocation(self, program):
        self.program = program

In [40]:
conf_runner = ConfigurationRunner("autopep8", "foo.py")

In [41]:
conf_runner.grammar()["<option>"]

[' -h',
 ' --help',
 ' --version',
 ' -v',
 ' --verbose',
 ' -d',
 ' --diff',
 ' -i',
 ' --in-place',
 ' --global-config <filename>',
 ' --ignore-local-config',
 ' -r',
 ' --recursive',
 ' -j <n>',
 ' --jobs <n>',
 ' -p <n>',
 ' --pep8-passes <n>',
 ' -a',
 ' --aggressive',
 ' --experimental',
 ' --exclude <globs>',
 ' --list-fixes',
 ' --ignore <errors>',
 ' --select <errors>',
 ' --max-line-length <n>',
 ' --line-range <line> <line>',
 ' --range <line> <line>',
 ' --indent-size <str>',
 ' --hang-closing']

In [42]:
class ConfigurationFuzzer(GrammarCoverageFuzzer):
    def __init__(self, runner, *args, **kwargs):
        self.runner = runner
        grammar = runner.grammar()
        super().__init__(grammar, *args, **kwargs)

    def run(self, runner=None, inp=""):
        if runner is None:
            runner = self.runner
        invocation = runner.executable() + " " + self.fuzz()
        runner.set_invocation(invocation.split())
        return runner.run(inp)

In [43]:
conf_fuzzer = ConfigurationFuzzer(conf_runner, max_nonterminals=5)

In [44]:
conf_fuzzer.fuzz()

' --global-config M! foo.py'

In [45]:
conf_fuzzer.run(conf_runner)

(CompletedProcess(args=['/Users/zeller/anaconda3/bin/autopep8', '--exclude', 'To', '--recursive', '--select', '^c', '--version', 'foo.py'], returncode=0, stdout='autopep8 1.3.4 (pycodestyle: 2.4.0)\n', stderr=''),
 'PASS')

## MyPy

In [46]:
mypy = ConfigurationRunner("mypy", "foo.py")
print(mypy.grammar()["<option>"])

[' -h', ' --help', ' -v', ' --verbose', ' -V', ' --version', ' --config-file <str>', ' --warn-unused-configs', ' --no-warn-unused-configs', ' --ignore-missing-imports', ' --follow-imports <str>', ' --python-executable', ' --no-site-packages', ' --no-silence-site-packages', ' --python-version <x.y>', ' -2', ' --py2', ' --platform', ' --always-true', ' --always-false', ' --disallow-any-unimported', ' --disallow-subclassing-any', ' --allow-subclassing-any', ' --disallow-any-expr', ' --disallow-any-decorated', ' --disallow-any-explicit', ' --disallow-any-generics', ' --disallow-untyped-calls', ' --allow-untyped-calls', ' --disallow-untyped-defs', ' --allow-untyped-defs', ' --disallow-incomplete-defs', ' --allow-incomplete-defs', ' --check-untyped-defs', ' --no-check-untyped-defs', ' --warn-incomplete-stub', ' --no-warn-incomplete-stub', ' --no-implicit-optional', ' --implicit-optional', ' --strict-optional', ' --no-strict-optional', ' --strict-optional-whitelist<symbol-2-1>', ' --warn-redu

In [47]:
mypy_fuzzer = ConfigurationFuzzer(mypy, max_nonterminals=3)
for i in range(10):
    print(mypy_fuzzer.fuzz())

 foo.py
 --no-warn-redundant-casts --allow-untyped-decorators foo.py
 --disallow-untyped-decorators --warn-unused-ignores foo.py
 --check-untyped-defs --stats foo.py
 -V foo.py
 --cache-fine-grained --semantic-analysis-only --no-strict-boolean foo.py
 --show-traceback foo.py
 --verbose foo.py
 --no-warn-return-any foo.py
 --hide-error-context -h foo.py


In [48]:
notedown = ConfigurationRunner("notedown")
print(notedown.grammar()["<option>"])

[' -h', ' --help', ' -o<symbol-2-1>', ' --output<symbol-3-1>', ' --from <str>', ' --to <str>', ' --run', ' --execute', ' --timeout <str>', ' --strip', ' --precode<symbol-4-1>', ' --knit<symbol-5-1>', ' --rmagic', ' --nomagic', ' --render', ' --template <str>', ' --match <str>', ' --examples', ' --version', ' --debug']


In [49]:
notedown_fuzzer = ConfigurationFuzzer(notedown, max_nonterminals=3)
for i in range(10):
    print(notedown_fuzzer.fuzz())

 --strip
 --examples
 --execute
 --run
 --rmagic
 --render
 --help
 -h
 --version
 --nomagic


## Combinatorial Testing

\todo{Add}
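
One possible starting point, sketched here under the assumption that the goal is to exercise every pair of options at least once, is to enumerate all two-element combinations of the mined options. The helper `pairwise_invocations()` is hypothetical. The option list is a hand-picked subset of the parameterless autopep8 options shown above, not the result of mining.

```python
from itertools import combinations

# Hand-picked subset of the autopep8 options listed earlier in this chapter
OPTIONS = ["--verbose", "--diff", "--in-place", "--recursive", "--aggressive"]


def pairwise_invocations(options, arguments):
    """Return one argument string per unordered pair of distinct options."""
    return [f"{first} {second} {arguments}"
            for first, second in combinations(options, 2)]


invocations = pairwise_invocations(OPTIONS, "foo.py")
for invocation in invocations:
    print(invocation)
```

With five options, this yields ten invocations. Covering all subsets of the five options instead would already require 32.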

## Lessons Learned

* _Lesson one_
* _Lesson two_
* _Lesson three_

## Next Steps

_Link to subsequent chapters (notebooks) here, as in:_

* [use _mutations_ on existing inputs to get more valid inputs](MutationFuzzer.ipynb)
* [use _grammars_ (i.e., a specification of the input format) to get even more valid inputs](Grammars.ipynb)
* [reduce _failing inputs_ for efficient debugging](Reducing.ipynb)


## Background

_Cite relevant works in the literature and put them into context, as in:_

The idea of ensuring that each expansion in the grammar is used at least once goes back to Burkhardt \cite{Burkhardt1967}, to be later rediscovered by Paul Purdom \cite{Purdom1972}.
