This repository has been archived by the owner on Mar 29, 2022. It is now read-only.

Tools for fuzzing HTTPolice with american fuzzy lop
vfaronov committed Mar 24, 2017
1 parent 3e9e89a commit 573b7f4
Showing 5 changed files with 104 additions and 1 deletion.
2 changes: 1 addition & 1 deletion pylintrc
@@ -30,5 +30,5 @@ overgeneral-exceptions=

[TYPECHECK]

-ignored-modules=lxml.etree
+ignored-modules=lxml.etree,afl
generated-members=__subclasses__
53 changes: 53 additions & 0 deletions tools/afl/README.rst
@@ -0,0 +1,53 @@
Fuzzing HTTPolice with american fuzzy lop
=========================================

As a program that does complex processing on compact input files,
HTTPolice is amenable to testing with a dumb fuzzer such as
`american fuzzy lop`_.

As of this writing, all the bugs thus found have been discovered while
fuzzing original (valid) examples -- that is, AFL's genetic algorithm
has yet to bear fruit -- but the examples it produces do look interesting,
so perhaps if one were to run AFL seriously for a longer period of time,
more subtle bugs could be uncovered.

.. _american fuzzy lop: http://lcamtuf.coredump.cx/afl/


How to run
----------

Install american fuzzy lop as appropriate for your platform, for example::

$ sudo apt-get install afl

Set up the environment::

$ pip install Cython
$ pip install python-afl
$ fuzz_path=/some/working/directory
$ mkdir -p $fuzz_path/examples

Prepare examples for AFL::

$ tools/afl/prepare_examples.sh -n 100 $fuzz_path/examples/

Run it::

$ AFL_NO_VAR_CHECK=1 \
> py-afl-fuzz -m 1000 -t 1000 \
> -i $fuzz_path/examples/ -o $fuzz_path/results/ -f $fuzz_path/input \
> -d -x tools/afl/http-tweaks.dict \
> -- python tools/afl/harness.py -i combined -o html $fuzz_path/input

Almost all paths are considered 'variable' by AFL. I'm not sure why.
Clearing the parser memo (``Stream._cache``) on every run doesn't help,
but maybe it's the memoization in grammar symbols. Anyway, this variability
doesn't seem to be a problem, and setting ``AFL_NO_VAR_CHECK=1`` greatly
reduces the time spent on calibration.

Remove the ``-d`` option if you have patience.

The crash deduper in AFL is conservative enough that it reports many more
'unique' crashes than are actually unique, so don't be alarmed by numbers
like 50 or 100. Also, a fair number of spurious hangs are detected.
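One way to get a more honest count is to re-run each crashing input through the harness and group by the final traceback line. This is a sketch under assumptions: it reuses the harness invocation from above and assumes AFL names crash files ``id:*`` inside the results directory.

```python
# Hedged sketch: collapse AFL's over-counted 'unique' crashes by grouping
# crash inputs on the last line of the Python traceback they produce.
import collections
import subprocess
import sys
from pathlib import Path

def dedupe_crashes(crash_dir):
    groups = collections.defaultdict(list)
    for path in sorted(Path(crash_dir).glob("id:*")):
        proc = subprocess.run(
            [sys.executable, "tools/afl/harness.py",
             "-i", "combined", "-o", "html", str(path)],
            capture_output=True, text=True)
        # The last stderr line (exception type + message) is the group key.
        lines = proc.stderr.strip().splitlines()
        key = lines[-1] if lines else "no-error"
        groups[key].append(path.name)
    return groups
```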
15 changes: 15 additions & 0 deletions tools/afl/harness.py
@@ -0,0 +1,15 @@
# -*- coding: utf-8; -*-

import os
import sys

import httpolice.cli

args = httpolice.cli.parse_args(sys.argv)

# afl is imported only now, so that the one-time setup above happens
# before AFL's fork server starts and is not repeated per iteration.
import afl
while afl.loop(100):
    httpolice.cli.run_cli(args, sys.stdout, sys.stderr)

# As suggested by python-afl docs.
os._exit(0) # pylint: disable=protected-access
15 changes: 15 additions & 0 deletions tools/afl/http-tweaks.dict
@@ -0,0 +1,15 @@
# This dictionary was created to guide american fuzzy lop towards breaking
# the *existing* examples in ways that yield interesting output from parsers,
# as opposed to just patently invalid files. It seems to work as intended,
# but no bugs have been uncovered so far with these tweaks.

line_terminator="\x0D\x0A"
header_field_separator=": "
space=" "
semicolon=";"
comma=","
double_quote="\""
equals="="
slash="/"
asterisk="*"
zero="0"
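For reference, each entry follows the afl-fuzz ``-x`` dictionary syntax: ``name="value"``, where the value may use ``\xNN`` byte escapes. A small checker (hypothetical, not part of this commit) can catch malformed entries before a long fuzzing run:

```python
# Hypothetical validator for AFL dictionary entries of the form
# name="value", where value may contain \xNN, \" and \\ escapes.
import re

ENTRY = re.compile(r'^(\w+)="((?:\\x[0-9A-Fa-f]{2}|\\"|\\\\|[^"\\])*)"$')

def parse_dict_line(line):
    """Return (name, decoded_value), or None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    m = ENTRY.match(line)
    if not m:
        raise ValueError("bad dictionary entry: %r" % line)
    name, raw = m.groups()
    # Decode \xNN escapes into the literal bytes AFL will splice in.
    value = raw.encode().decode("unicode_escape")
    return name, value
```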
20 changes: 20 additions & 0 deletions tools/afl/prepare_examples.sh
@@ -0,0 +1,20 @@
#!/bin/sh

# Take a sample of test cases from ``test/combined_data/``.
# The default sample size of 100 ensures that we seed the fuzzer with
# examples of many different protocol features (headers, methods etc.).
# Alternatively, maybe one could use a dictionary and try to make AFL
# generate these features on its own, but I don't have the time for that.

n=100
getopts n: OPT && [ "$OPT" = "n" ] && n=$OPTARG
shift $(( OPTIND - 1 ))

output_dir=$1

find test/combined_data/ -type f | shuf -n "$n" | while read -r fn
do
name="$( basename "$fn" )"
test "$( stat -c %s "$fn" )" -lt 1000 || continue # Skip large examples
grep -vE --text '^# ' "$fn" >"$output_dir/$name" # Remove comments
done
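The same sampling could be sketched in Python (a hypothetical equivalent, not part of this commit): draw a random sample, skip files of 1000 bytes or more, and strip ``# ``-prefixed comment lines.

```python
# Hypothetical Python rendering of prepare_examples.sh: sample test cases,
# skip large ones, and copy them with comment lines removed.
import random
from pathlib import Path

def prepare_examples(src_dir, out_dir, n=100, max_size=1000):
    files = [p for p in Path(src_dir).rglob("*") if p.is_file()]
    for p in random.sample(files, min(n, len(files))):
        if p.stat().st_size >= max_size:
            continue                      # skip large examples
        data = p.read_bytes()
        kept = b"".join(line for line in data.splitlines(keepends=True)
                        if not line.startswith(b"# "))  # remove comments
        (Path(out_dir) / p.name).write_bytes(kept)
```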
