Fuzzing hexml.c with afl #6

Closed
thoughtpolice opened this Issue Dec 14, 2016 · 16 comments

@thoughtpolice
thoughtpolice commented Dec 14, 2016 edited

hexml.c seems to have quite a few vulnerabilities that could lead to code execution, according to AFL. Here's how to start with that:


Install afl, and some of the extra tools, like libdislocator (a hardened memory allocator)

$ wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
$ tar xf afl-latest.tgz
$ cd afl-*
$ make
$ cd llvm_mode && make && cd ../
$ sudo make install
$ cd libdislocator
$ make && sudo make install

What this does:

  1. Installs AFL
  2. Builds afl-clang-fast, a tool built on LLVM, which uses a compiler plugin to instrument the compiled code. afl-clang-fast results in much faster code than using the traditional afl-gcc tool, which is vital for fast fuzzing. You will need the LLVM development tools, and clang installed for your distro.
  3. Builds and installs libdislocator.so, which is a tool for hardening memory allocations while you're fuzzing your program to help find more bugs.

Start with this fuzzing harness:

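#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hexml.c"

int main(int ac, char** av)
{
  if (ac < 2) return 1;  // expects the input file as the first argument

#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();
#endif

  while(__AFL_LOOP(1000)) {
    FILE *f = fopen(av[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);
    rewind(f);
    if (fsize < 0) { fclose(f); return 1; }

    // one extra zeroed byte keeps the buffer NUL-terminated
    char* string = malloc((size_t)fsize + 1);
    if (!string) { fclose(f); return 1; }
    memset(string, 0, (size_t)fsize + 1);
    (void)fread(string, fsize, 1, f);
    fclose(f);

    document *doc = document_parse(string, fsize);
    document_free(doc);
    free(string);
  }
  return 0;
}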

And you can compile it:

$ AFL_HARDEN=1 afl-clang-fast -O2 fuzz.c 
afl-clang-fast 2.35b by <lszekeres@google.com>
afl-llvm-pass 2.35b by <lszekeres@google.com>
[+] Instrumented 277 locations (hardened mode, ratio 100%).

AFL_HARDEN=1 makes the compiler add extra hardening options (currently -D_FORTIFY_SOURCE=2 and -fstack-protector-all, for detecting stack overflows and the like). This causes a very minor performance hit, but finds more bugs.


Create the initial corpus. This is what AFL begins with as it tries to find bugs. afl-fuzz needs at least one input file here, but it doesn't need to be anything elaborate; a small valid document is a nice starting point.

$ mkdir corpus
$ echo '<a b="c">d</a>' > corpus/start.xml

Also, grab the XML dictionary to help improve the fuzzing. Fuzzers like AFL can use "language dictionaries", which list the syntax tokens of the file format you're investigating. AFL and other tools then use the dictionary to guide input mutation and generation, so the generated inputs are more likely to contain valid syntax. This makes the fuzzing more effective and finds bugs even faster.

AFL comes with an xml.dict dictionary you can use after you install it.

$ cp /usr/local/share/afl/dictionaries/xml.dict .

Now, begin fuzzing:

$ AFL_PRELOAD=/usr/local/lib/afl/libdislocator.so afl-fuzz -T hexml -x $PWD/xml.dict -i $PWD/corpus -o $PWD/results -- $PWD/a.out @@

What this does:

  1. Tells AFL to load libdislocator.so to harden allocations.
  2. Starts the fuzzer with the given dictionary (xml.dict) using the -x flag.
  3. -i $PWD/corpus tells afl where to look for the initial starting set of inputs.
  4. -o $PWD/results tells afl where to put all the resulting crashes, hangs, etc.
  5. The syntax -- $PWD/a.out @@ means "Invoke a.out and replace @@ with the name of a file afl generates, passed as the first argument." This means afl just creates temporary files and feeds them to the harness.

That should be about it! You should begin getting bugs very, very quickly. I'll also post a quick follow up on minimizing the corpus, but this should be good to start with.

@thoughtpolice

After you've generated a bunch of bugs, you can minimize the corpus. This prunes away any duplicate inputs which result in the same crashes. This tool is called afl-cmin:

$ AFL_PRELOAD=/usr/local/lib/afl/libdislocator.so afl-cmin -i results/crashes/ -o results.shrunk -- $PWD/a.out @@

Same idea as before, except using afl-cmin (the "corpus minimizer").


Next, you should minimize the individual test cases. For each uniquely crashing input, afl-tmin looks for a smaller input that results in the same crash.

Always minimize the corpus before the individual tests. Minimizing the corpus is very efficient, but minimizing individual tests can take much longer, so save yourself the time.

$ mkdir results.min
$ for x in results.shrunk/*; do AFL_PRELOAD=/usr/local/lib/afl/libdislocator.so afl-tmin -i "$x" -o "results.min/$(basename "$x")" -- $PWD/a.out @@; done

Same idea, except afl-tmin requires individual files as the input, not a directory. Hence, the for loop. This improved the size of several tests, e.g. shaving off 20% to 30% of the test case size.


Finally, as a bonus: afl comes with a very useful tool called afl-analyze which can help you pinpoint which bytes in the input are relevant. For example, given an input that triggers a crash:

$ AFL_PRELOAD=/usr/local/lib/afl/libdislocator.so afl-analyze -i results.shrunk/id:000013,sig:11,src:000002,op:havoc,rep:4 -- ./a.out @@

I get:

[image: afl-analyze output]

This basically tries to summarize which parts of the input are relevant. It's more useful if you're exploring foreign code, but at a glance it can help.

@ndmitchell
Owner

Thanks for your excellent instructions. Using that, I was able to spot a few issues. However, AFL is still telling me there are ~30 issues remaining, and I can't see them. There is a test file which contains literally <! alone - a great test case. But I am unable to make that test case fail. I tried:

$ AFL_PRELOAD=/usr/local/lib/afl/libdislocator.so ./a.out results.min/id\:000010\,sig\:06\,src\:000074\,op\:flip4\,pos\:2
$ echo $?
0

So with the same binary I used for fuzzing, this test case seems to work just fine. What step have I missed?

@ndmitchell
Owner
ndmitchell commented Dec 14, 2016 edited

Running the above under valgrind also raises no errors.

@thoughtpolice
thoughtpolice commented Dec 14, 2016 edited

Ah, there seems to be a bug in the test harness that causes some false positive crashes. I'd have to look closer at the hexml.c source to see why, but this introduces unpredictability in the actual fuzzing process.

Try this copy of fuzz.c instead, which gets rid of the "deferred initialization" (__AFL_INIT) and "persistent mode" (__AFL_LOOP) features.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hexml.c"

int main(int ac, char** av)
{
  if (ac < 2) return 1;  // expects the input file as the first argument

//#ifdef __AFL_HAVE_MANUAL_CONTROL
//  __AFL_INIT();
//#endif

//  while(__AFL_LOOP(1000)) {
    FILE *f = fopen(av[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);
    rewind(f);
    if (fsize < 0) { fclose(f); return 1; }

    // one extra zeroed byte keeps the buffer NUL-terminated
    char* string = malloc((size_t)fsize + 1);
    if (!string) { fclose(f); return 1; }
    memset(string, 0, (size_t)fsize + 1);
    (void)fread(string, fsize, 1, f);
    fclose(f);

    document *doc = document_parse(string, fsize);
    document_free(doc);
    free(string);
//  }

  return 0;
}

This will execute at only 50% of the original speed, give or take. But you can also run multiple fuzzers simultaneously to regain performance...

Also keep in mind AFL has memory limits in place via ulimit when it runs these executables, which you may also need to replicate for a test to crash (e.g. if it causes a pathological amount of memory to be allocated): https://github.com/mirrorer/afl/blob/62b46d57f229ba2030a014484e7b1d73e8e02752/docs/README#L239
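
One way to replicate that limit by hand is to lower the virtual-memory ulimit in a subshell before running the test case. A minimal sketch, assuming Linux and a cap of 50 MB (which I believe is the afl-fuzz default on 64-bit unless you pass -m; use whatever you fuzzed with). run_limited is just an illustrative helper name, not part of AFL:

```shell
# run_limited MB CMD [ARGS...]
# Runs CMD in a subshell whose soft virtual-memory limit is MB megabytes,
# so the limit does not leak into the calling shell.
# (run_limited is a hypothetical helper, not an AFL tool.)
run_limited() {
  mb=$1
  shift
  (
    # ulimit -v takes KiB; bail out with 125 if the limit cannot be set
    ulimit -Sv "$((mb * 1024))" || exit 125
    "$@"
  )
}
```

For example: run_limited 50 ./a.out results.min/<testcase>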

@thoughtpolice

From my experiments, with the changed fuzz.c, it seems I can reproduce every single crash it finds by trivially invoking the executable.

Also, when you run the test individually outside of afl-fuzz -- do not use AFL_PRELOAD! Use LD_PRELOAD. The only reason AFL_PRELOAD exists is so that it can map to either LD_PRELOAD or DYLD_INSERT_LIBRARIES depending on the platform, since AFL works on macOS, so it's a consistent interface. But with the example you gave, libdislocator.so will not be loaded.
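
A sketch of what that looks like in practice, assuming Linux (on macOS the variable would be DYLD_INSERT_LIBRARIES instead) and the library path from the install step. repro is a hypothetical helper name:

```shell
# repro LIB BIN FILE...
# Runs BIN on each FILE with LIB preloaded through LD_PRELOAD and prints
# "<exit status> <file>" per input. In the shell, a status above 128 means
# the process died from a signal (status - 128), e.g. 139 = SIGSEGV (11)
# and 134 = SIGABRT (6), matching the sig:NN part of AFL's crash file names.
# (repro is a hypothetical helper, not an AFL tool.)
repro() {
  lib=$1
  bin=$2
  shift 2
  for f in "$@"; do
    rc=0
    LD_PRELOAD="$lib" "$bin" "$f" >/dev/null 2>&1 || rc=$?
    echo "$rc $f"
  done
}
```

For example: repro /usr/local/lib/afl/libdislocator.so ./a.out results.min/*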

@ndmitchell
Owner

Note that I've updated the code based on your initial examples, so some of the errors may have been removed. Running without the server fork loop I'm seeing no crashes thus far, after ~5 mins.

@thoughtpolice

That's good! I just updated my copy of hexml.c from the latest version in the repo, and my own fuzzer instance isn't finding anything else so far, either.

@ndmitchell
Owner

Using a __AFL_INIT version, I find lots of errors quickly, and then when I try to reproduce them (now with LD_PRELOAD) they don't reproduce, even under valgrind. Using a non-__AFL_INIT version I have yet to find any errors - but will keep running.

@thoughtpolice
thoughtpolice commented Dec 14, 2016 edited

Yes, those are probably red herrings. It's unclear why they manifest, but the basic idea is that AFL essentially starts a process before main and then uses the clone syscall to create all the copies to amortize the process creation time. __AFL_INIT says "Do not stop before main, instead, stop and clone from this point". For example, if your program had to parse a configuration file at startup and that was expensive, you could put __AFL_INIT after the configuration setup, so that you don't have to pay for it over and over again. However, this can result in invalid crashes if there's some kind of global state going on.

__AFL_LOOP, on the other hand, makes the fuzzer reuse a single process for N inputs before killing it and restarting. So in the above example, afl would call clone at the __AFL_INIT point at the top of main, and then run the loop body in that single cloned instance 1000 times before killing the process and creating a new clone. This further reduces the need to clone processes.

Together, that's why these both see a big speed boost. However, they can be very sensitive to global state and result in false positives. It would be nice to fix this but removing it, at the moment, is the easiest way to move forward.

FWIW, when you start afl-fuzz on this binary with those features enabled, before throwing up the UI, afl-fuzz says:

[!] WARNING: Instrumentation output varies across runs.

Which means that there's some level of unpredictability. Sometimes this can be due to things like thread scheduling or other stuff, but other times it's an indication the test harness is actually not stable.

Discovering why this is the case is an exercise left to the reader.

@ndmitchell
Owner

Keeping __AFL_INIT and avoiding __AFL_LOOP seems to work, so I've gone with that. I can't see anything that makes __AFL_LOOP look dodgy, but evidence disagrees.

In my most recent run I got 1 hang reported, but the output directory had nothing under hangs/, so I didn't have a clue how to reproduce it. Any ideas?

@ndmitchell
Owner

I reproduced the hang in a run that actually did write out some output. I've got 1 unique hang of 1 total. Running over that input again doesn't produce a hang - which seems like it should be pretty obvious if it was there.

@thoughtpolice
thoughtpolice commented Dec 14, 2016 edited

"Hang" is relative: it doesn't mean it looped forever, it simply means the executable took longer than the timeout for a particular invocation. The timeout is scaled based on observed behavior, and defaults to anywhere from 50ms to 1000ms. If the vast majority of cases crash or pass within like 75ms, one test case taking in the range of like 300ms will qualify as a hang.

You can control the time limit for hangs using the -t option to afl-fuzz. Try setting it to 1000 or something and you probably won't see any more hangs.
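
To tell a genuine infinite loop apart from a merely slow run, one option is to rerun the input under coreutils timeout with a generous limit. A minimal sketch; is_real_hang is a hypothetical helper name, and the 10 seconds in the usage line is an arbitrary choice:

```shell
# is_real_hang SECONDS CMD [ARGS...]
# Succeeds (exit 0) only if CMD was still running after SECONDS and had to be
# killed by timeout(1), which reports that case with exit status 124.
# (is_real_hang is a hypothetical helper, not an AFL tool.)
is_real_hang() {
  secs=$1
  shift
  rc=0
  timeout "$secs" "$@" >/dev/null 2>&1 || rc=$?
  [ "$rc" -eq 124 ]
}
```

For example: is_real_hang 10 ./a.out results/hangs/<testcase> && echo "really hangs"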

@ndmitchell
Owner

Ah, I'm running in an Ubuntu VM with a Windows host - I suspect Windows had a very brief pause and that resulted in it looking like a hang. Seems definitely to be a false positive.

@ndmitchell ndmitchell added a commit that referenced this issue Dec 14, 2016
@ndmitchell #6, update the changelog 9035502
@ndmitchell
Owner

I've released 0.2 which has all the AFL inspired fixes. To check, I did:

  • Added a few more XML documents I had been using for benchmarking and bug fixing to the corpus.
  • Used __AFL_INIT but not __AFL_LOOP.
  • Ran 50M test cases over several hours, without a single failure.
  • Reverted to the code from before the AFL-inspired fixes to check that they would still have been found without __AFL_LOOP, and encountered 13 unique crashes in < 2 mins, which I could reproduce segfaulting normally.

That convinces me that there were issues, AFL caught them, and now there are probably no AFL-findable issues left.

Next step is to integrate it into Travis, roughly following http://stackoverflow.com/questions/32238907/apply-american-fuzzy-lop-as-a-part-of-travis-ci, probably.

@ndmitchell ndmitchell added a commit that referenced this issue Dec 22, 2016
@ndmitchell #6, add a fuzzing testbed d07e252
@ndmitchell
Owner

Doing this in Travis doesn't seem particularly feasible, if the internet is to be believed. I've added afl.sh which automates the standard pieces, so now after any C change I'll go in and do sh afl.sh and hopefully that's enough for now.

@ndmitchell ndmitchell added a commit that referenced this issue Dec 22, 2016
@ndmitchell #6, fix up core pattern f2e8131
@ndmitchell
Owner

It turned out not to be all that hard to run it on Travis - I just do a 5 minute timeout and then grep to check the unique crashes are empty.
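
A minimal sketch of that kind of check, not necessarily what the repo's script does. It assumes the fuzzer_stats file that afl-fuzz writes into its output directory, with a line of the form "unique_crashes : N", and uses awk rather than grep so the number can be compared directly. unique_crashes is a hypothetical helper name:

```shell
# unique_crashes STATS_FILE
# Prints the number on the "unique_crashes" line of an afl-fuzz
# fuzzer_stats file (lines look like "unique_crashes    : 0").
# (unique_crashes is a hypothetical helper, not an AFL tool.)
unique_crashes() {
  awk -F: '$1 ~ /^unique_crashes[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}
```

In CI this could follow a time-boxed run such as timeout 5m afl-fuzz -i corpus -o results -- ./a.out @@, with the build failing unless [ "$(unique_crashes results/fuzzer_stats)" = 0 ] holds.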

@ndmitchell ndmitchell closed this Dec 22, 2016