libFuzzer Tutorial


In this tutorial you will learn how to use libFuzzer -- a coverage-guided in-process fuzzing engine.

You will also learn basics of AddressSanitizer -- a dynamic memory error detector for C/C++.

Prerequisites: experience with C/C++ and Unix shell.

Set up the environment

First, you should prepare the environment. We recommend using a VM on GCE. You may also use your own Linux machine, but YMMV.


  • Log in to your GCE account or create one.
  • Create a new VM and ssh to it
    • Ubuntu 16.04 is recommended, other VMs may or may not work
    • Choose as many CPUs as you can
    • Choose "Access scopes" = "Allow full access to all Cloud APIs"
  • Install dependencies:
# Install git and get this tutorial
sudo apt-get --yes install git
git clone fuzzing

# Get fuzzer-test-suite
git clone FTS

./fuzzing/tutorial/libFuzzer/  # Get deps
./fuzzing/tutorial/libFuzzer/ # Get fresh clang binaries

Verify the setup


clang++ -g -fsanitize=address,fuzzer fuzzing/tutorial/libFuzzer/
./a.out 2>&1 | grep ERROR

and make sure you see something like

==31851==ERROR: AddressSanitizer: heap-buffer-overflow on address...

'Hello world' fuzzer

Definition: a fuzz target is a function that has the following signature and does something interesting with its arguments:

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingWithData(Data, Size);
  return 0;
}

Take a look at an example of such a fuzz target: ./ Can you see the bug?

To build a fuzzer binary for this target you need to compile the source using a recent Clang compiler with the following extra flags:

  • -fsanitize=fuzzer (required): provides in-process coverage information to libFuzzer and links with the libFuzzer runtime.
  • -fsanitize=address (recommended): enables AddressSanitizer
  • -g (recommended): enables debug info, makes the error messages easier to read.

For example:

clang++ -g -fsanitize=address,fuzzer fuzzing/tutorial/libFuzzer/

Now try running it:

./a.out


You will see something like this:

INFO: Seed: 3918206239
INFO: Loaded 1 modules (14 guards): [0x73be00, 0x73be38),
INFO: Loaded 1 PC tables (7 PCs): 7 [0x52f8c8,0x52f938),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#0      READ units: 1
#1      INITED cov: 3 ft: 3 corp: 1/1b exec/s: 0 rss: 26Mb
#8      NEW    cov: 4 ft: 4 corp: 2/29b exec/s: 0 rss: 26Mb L: 28 MS: 2 InsertByte-InsertRepeatedBytes-
#3405   NEW    cov: 5 ft: 5 corp: 3/82b exec/s: 0 rss: 27Mb L: 53 MS: 4 InsertByte-EraseBytes-...
#8664   NEW    cov: 6 ft: 6 corp: 4/141b exec/s: 0 rss: 27Mb L: 59 MS: 3 CrossOver-EraseBytes-...
#272167 NEW    cov: 7 ft: 7 corp: 5/201b exec/s: 0 rss: 51Mb L: 60 MS: 1 InsertByte-
==2335==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000155c13 at pc 0x0000004ee637...
READ of size 1 at 0x602000155c13 thread T0
    #0 0x4ee636 in FuzzMe(unsigned char const*, unsigned long) fuzzing/tutorial/libFuzzer/
    #1 0x4ee6aa in LLVMFuzzerTestOneInput fuzzing/tutorial/libFuzzer/
artifact_prefix='./'; Test unit written to ./crash-0eb8e4ed029b774d80f2b66408203801cb982a60

Do you see a similar output? Congratulations, you have built a fuzzer and found a bug. Let us look at the output.

INFO: Seed: 3918206239

The fuzzer has started with this random seed. Rerun it with -seed=3918206239 to get the same result.

INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus

By default, libFuzzer assumes that all inputs are 4096 bytes or smaller. To change that either use -max_len=N or run with a non-empty seed corpus.

#0      READ units: 1
#1      INITED cov: 3 ft: 3 corp: 1/1b exec/s: 0 rss: 26Mb
#8      NEW    cov: 4 ft: 4 corp: 2/29b exec/s: 0 rss: 26Mb L: 28 MS: 2 InsertByte-InsertRepeatedBytes-
#3405   NEW    cov: 5 ft: 5 corp: 3/82b exec/s: 0 rss: 27Mb L: 53 MS: 4 InsertByte-EraseBytes-...
#8664   NEW    cov: 6 ft: 6 corp: 4/141b exec/s: 0 rss: 27Mb L: 59 MS: 3 CrossOver-EraseBytes-...
#272167 NEW    cov: 7 ft: 7 corp: 5/201b exec/s: 0 rss: 51Mb L: 60 MS: 1 InsertByte-

libFuzzer has tried at least 272167 inputs (#272167) and has discovered 5 inputs of 201 bytes total (corp: 5/201b) that together cover 7 coverage points (cov: 7). You may think of coverage points as basic blocks in the code.
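To make the notion of coverage points concrete, here is a minimal sketch of a hypothetical fuzz target. It is not the tutorial's example file; the name CountMatchedPrefix and the magic bytes are made up for illustration. Each nested check sits in its own basic block, so an input that matches one more byte reaches a block that no earlier input reached. That is the kind of event a coverage-guided engine would typically report as NEW with a higher cov value.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns how many leading bytes match "HI!".
// Every nested `if` body is a separate basic block (coverage point).
int CountMatchedPrefix(const uint8_t *Data, size_t Size) {
  int Matched = 0;
  if (Size > 0 && Data[0] == 'H') {
    Matched = 1;  // Reached only by inputs starting with 'H'.
    if (Size > 1 && Data[1] == 'I') {
      Matched = 2;  // Reached only by inputs starting with "HI".
      if (Size > 2 && Data[2] == '!') {
        Matched = 3;  // Reached only by inputs starting with "HI!".
      }
    }
  }
  return Matched;
}

// Standard libFuzzer entry point wrapping the helper above.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  CountMatchedPrefix(Data, Size);
  return 0;
}
```

Starting from an empty corpus, a fuzzer only needs to guess one byte at a time here, because each correct guess is rewarded with new coverage and kept in the corpus.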

==2335==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000155c13 at pc 0x0000004ee637...
READ of size 1 at 0x602000155c13 thread T0
    #0 0x4ee636 in FuzzMe(unsigned char const*, unsigned long) fuzzing/tutorial/libFuzzer/
    #1 0x4ee6aa in LLVMFuzzerTestOneInput fuzzing/tutorial/libFuzzer/

On one of the inputs AddressSanitizer has detected a heap-buffer-overflow bug and aborted the execution.

artifact_prefix='./'; Test unit written to ./crash-0eb8e4ed029b774d80f2b66408203801cb982a60

Before exiting the process libFuzzer has created a file on disk with the bytes that triggered the crash. Take a look at this file. What do you see? Why did it trigger the crash?

To reproduce the crash again w/o fuzzing, run:

./a.out crash-0eb8e4ed029b774d80f2b66408203801cb982a60


Heartbleed

Let us run something real. Heartbleed (aka CVE-2014-0160) was a critical security bug in the OpenSSL cryptography library. It was discovered in 2014, probably by code inspection. It was later demonstrated that this bug can be easily found by fuzzing.

fuzzer-test-suite contains ready-to-use scripts to build fuzzers for various targets, including openssl-1.0.1f where the 'heartbleed' bug is present.

To build the fuzzer for openssl-1.0.1f execute the following:

mkdir -p ~/heartbleed; rm -rf ~/heartbleed/*; cd ~/heartbleed

This command will download the openssl sources at the affected revision and build the fuzzer for one specific API that has the bug, see openssl-1.0.1f/

Try running the fuzzer:


You should see something like this in a few seconds:

==5781==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000009748 at pc 0x0000004a9817...
READ of size 19715 at 0x629000009748 thread T0
    #0 0x4a9816 in __asan_memcpy (heartbleed/openssl-1.0.1f+0x4a9816)
    #1 0x4fd54a in tls1_process_heartbeat heartbleed/BUILD/ssl/t1_lib.c:2586:3
    #2 0x58027d in ssl3_read_bytes heartbleed/BUILD/ssl/s3_pkt.c:1092:4
    #3 0x585357 in ssl3_get_message heartbleed/BUILD/ssl/s3_both.c:457:7
    #4 0x54781a in ssl3_get_client_hello heartbleed/BUILD/ssl/s3_srvr.c:941:4
    #5 0x543764 in ssl3_accept heartbleed/BUILD/ssl/s3_srvr.c:357:9
    #6 0x4eed3a in LLVMFuzzerTestOneInput FTS/openssl-1.0.1f/

Exercise: run the fuzzer that finds CVE-2016-5180. The experience should be very similar to that of heartbleed.

Seed corpus

So far we have tried several fuzz targets on which a bug can be found w/o much effort. Not all targets are that easy.

One important way to increase fuzzing efficiency is to provide an initial set of inputs, aka a seed corpus. For example, let us try another target: Woff2. Build it like this:

cd; mkdir -p woff; cd woff;

Now run it as you did with the previous fuzz targets:


Most likely you will see that the fuzzer is stuck -- it is running millions of inputs but cannot find many new code paths.

#1      INITED cov: 18 ft: 15 corp: 1/1b exec/s: 0 rss: 27Mb
#15     NEW    cov: 23 ft: 16 corp: 2/5b exec/s: 0 rss: 27Mb L: 4 MS: 4 InsertByte-...
#262144 pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 131072 rss: 45Mb
#524288 pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 131072 rss: 62Mb
#1048576        pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 116508 rss: 97Mb
#2097152        pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 110376 rss: 167Mb
#4194304        pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 107546 rss: 306Mb
#8388608        pulse  cov: 23 ft: 16 corp: 2/5b exec/s: 106184 rss: 584Mb

The first step you should take in such a case is to find some inputs that trigger enough code paths -- the more the better. The woff2 fuzz target consumes web fonts in .woff2 format, so you can just find any such file(s). The build script you have just executed has downloaded a project with some .woff2 files and placed it into the directory ./seeds/. Inspect this directory. What do you see? Are there any .woff2 files?

Now you can use the woff2 fuzzer with a seed corpus. Do it like this:

./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/

When a libFuzzer-based fuzzer is executed with one or more directories as arguments, it will first read files from every directory recursively and execute the target function on all of them. Then, any input that triggers interesting code path(s) will be written back into the first corpus directory (in this case, MY_CORPUS).

Let us look at the output:

INFO: Seed: 3976665814
INFO: Loaded 1 modules   (9611 inline 8-bit counters): 9611 [0x93c710, 0x93ec9b), 
INFO: Loaded 1 PC tables (9611 PCs): 9611 [0x6e8628,0x70ded8), 
INFO:        0 files found in MY_CORPUS/
INFO:       62 files found in seeds/
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 168276 bytes
INFO: seed corpus: files: 62 min: 14b max: 168276b total: 3896056b rss: 37Mb
#63     INITED cov: 632 ft: 1096 corp: 13/766Kb exec/s: 0 rss: 61Mb
        NEW_FUNC[0/1]: 0x5aae80 in TransformDictionaryWord...
#64     NEW    cov: 651 ft: 1148 corp: 14/832Kb exec/s: 0 rss: 63Mb L: 67832/68784 MS: 1 ChangeBinInt-
#535    NEW    cov: 705 ft: 1620 corp: 48/3038Kb exec/s: 0 rss: 162Mb L: 68784/68784 MS: 1 ChangeBinInt-
#288595 NEW    cov: 839 ft: 2909 corp: 489/30Mb exec/s: 1873 rss: 488Mb L: 62832/68784 MS: 1 ShuffleBytes-

As you can see, the initial coverage is much greater than before (INITED cov: 632) and it keeps growing.

The size of the inputs that libFuzzer tries is now limited to 168276 bytes, which is the size of the largest file in the seed corpus. You may change that with -max_len=N.

You may interrupt the fuzzer at any moment and restart it using the same command line. It will start from where it stopped.

How long does it take for this fuzzer to slow down its path discovery (i.e. stop finding new coverage every few seconds)? Did it find any bugs so far?

Parallel runs

Another way to increase the fuzzing efficiency is to use more CPUs. If you run the fuzzer with -jobs=N, it will run N independent fuzzing jobs, but by default no more than half of your cores will be used at once; use -workers=M to set the number of jobs allowed to run in parallel.

cd ~/woff
./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/ -jobs=8

On an 8-core machine this will spawn 4 parallel workers. If one of them dies, another one will be created, up to 8 jobs in total.

Running 4 workers
./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/  > fuzz-0.log 2>&1
./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/  > fuzz-1.log 2>&1
./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/  > fuzz-2.log 2>&1
./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/  > fuzz-3.log 2>&1
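The worker count in the output above follows from simple arithmetic. As a sketch of the rule as this tutorial describes it (the function name DefaultWorkers is hypothetical, and this is not libFuzzer's source code), the default number of parallel workers is the smaller of the job count and half the core count:

```cpp
#include <algorithm>

// Hypothetical helper mirroring the rule described in the text: when
// -workers is not given, run at most `Jobs` processes at once and use
// at most half of the available cores.
unsigned DefaultWorkers(unsigned Jobs, unsigned Cores) {
  return std::min(Jobs, Cores / 2);
}
```

For -jobs=8 on an 8-core machine this gives 4, which matches the "Running 4 workers" line. Passing -workers=8 explicitly overrides the default.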

At this time it would be convenient to have some terminal multiplexer, e.g. GNU screen, or to simply open another terminal window.

Let's look at one of the log files, fuzz-3.log. You will see lines like this:

#17634  RELOAD cov: 864 ft: 2555 corp: 340/20Mb exec/s: 979 rss: 408Mb

Such lines show that this instance of the fuzzer has reloaded the corpus (only the first directory is reloaded) and found some new interesting inputs created by other instances.

If you keep running this target for some time (at the time of writing: 20-60 minutes on 4-8 cores) you will be rewarded with a nice security bug.

If you are both impatient and curious you may feed a provided crash reproducer to see the bug:

./woff2-2016-05-06-fsanitize_fuzzer ../FTS/woff2-2016-05-06/crash-696cb49b6d7f63e153a6605f00aceb0d7738971a

Do you see the same stack trace as in the original bug report?

See also Distributed Fuzzing


Dictionaries

Another important way to improve fuzzing efficiency is to use a dictionary. This works well if the input format being fuzzed consists of tokens or has lots of magic values.

Let's look at an example of such input format: XML.

mkdir -p ~/libxml; rm -rf ~/libxml/*; cd ~/libxml

Now, run the newly built fuzzer for 10-20 seconds with and without a dictionary:

./libxml2-v2.9.2-fsanitize_fuzzer   # Press Ctrl-C in 10-20 seconds
./libxml2-v2.9.2-fsanitize_fuzzer -dict=afl/dictionaries/xml.dict  # Press Ctrl-C in 10-20 seconds

Did you see the difference?

Now create a corpus directory and run for real on all CPUs:

mkdir CORPUS
./libxml2-v2.9.2-fsanitize_fuzzer -dict=afl/dictionaries/xml.dict -jobs=8 -workers=8 CORPUS

How much time did it take to find the bug? What is the bug? How much time will it take to find the bug w/o a dictionary?

Take a look at the file afl/dictionaries/xml.dict (distributed with AFL). It is pretty self-explanatory. The syntax of dictionary files is shared between libFuzzer and AFL.
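To see why tokens matter, consider a minimal sketch of a hypothetical token-driven target. The names ContainsToken and ClassifyInput and the choice of tokens are assumptions made for illustration; this is not libxml2 code. The deepest branch is reached only when two multi-byte tokens are both present, which is unlikely to happen through single-byte mutations alone.

```cpp
#include <algorithm>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Returns true if the bytes of Token occur anywhere in the input.
bool ContainsToken(const uint8_t *Data, size_t Size, const char *Token) {
  const size_t Len = strlen(Token);
  const uint8_t *End = Data + Size;
  return std::search(Data, End, Token, Token + Len) != End;
}

// Hypothetical parser front end: returns how deep the input got.
int ClassifyInput(const uint8_t *Data, size_t Size) {
  if (!ContainsToken(Data, Size, "<!ATTLIST")) return 0;
  if (!ContainsToken(Data, Size, "CDATA")) return 1;
  return 2;  // Deeper parsing logic would live here.
}

// Standard libFuzzer entry point wrapping the classifier above.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  ClassifyInput(Data, Size);
  return 0;
}
```

A dictionary that contains both tokens lets the fuzzer insert them whole, instead of having to assemble each one byte by byte.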


Fuzzing can be used to find bugs other than memory corruption. For example, take a look at the openssl-1.0.2d benchmark. The target function feeds the data to two different functions that are expected to produce the same result and verifies that.

mkdir -p ~/openssl-1.0.2d; rm -rf ~/openssl-1.0.2d/*; cd ~/openssl-1.0.2d
mkdir CORPUS; ./openssl-1.0.2d-fsanitize_fuzzer  -max_len=256 CORPUS -jobs=8 -workers=8

Did it crash? How?
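The general shape of such a cross-checking target can be sketched as follows. This is a hypothetical example, not the openssl-1.0.2d target: SumBytesSimple and SumBytesUnrolled are made-up stand-ins for two implementations that must agree. The target aborts on any mismatch, so the fuzzer treats a logic bug like a crash and saves the offending input.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Reference implementation: sum the bytes one at a time.
uint32_t SumBytesSimple(const uint8_t *Data, size_t Size) {
  uint32_t Sum = 0;
  for (size_t I = 0; I < Size; ++I) Sum += Data[I];
  return Sum;
}

// "Optimized" implementation: four bytes per iteration plus a tail loop.
uint32_t SumBytesUnrolled(const uint8_t *Data, size_t Size) {
  uint32_t Sum = 0;
  size_t I = 0;
  for (; I + 4 <= Size; I += 4)
    Sum += Data[I] + Data[I + 1] + Data[I + 2] + Data[I + 3];
  for (; I < Size; ++I) Sum += Data[I];
  return Sum;
}

// Feed the same input to both implementations and abort if they disagree.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  if (SumBytesSimple(Data, Size) != SumBytesUnrolled(Data, Size))
    abort();
  return 0;
}
```

The same pattern applies whenever two code paths are expected to be equivalent, for example an encoder followed by a decoder, or an old and a new version of the same routine.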

Competing bugs

Sometimes there is one shallow (easy to find) bug in the target that prevents you from finding more bugs. The best approach in such cases is to fix the shallow bug(s) and restart fuzzing. However, you can move forward a bit by simply restarting libFuzzer many times. -jobs=1000 will do this for you.

mkdir -p ~/pcre2 ; rm -rf ~/pcre2/*; cd ~/pcre2
mkdir CORPUS
./pcre2-10.00-fsanitize_fuzzer -jobs=1000 -workers=8 CORPUS

After a minute or two look for the errors in the log files:

grep ERROR *.log | sort -k 3

You will see one particular bug very often (which one?) but occasionally others will occur too.

Minimizing a corpus

The test corpus may grow to large sizes during fuzzing. Or you may be lucky to have a large seed corpus. Either way, you may want to minimize your corpus, that is, to create a subset of the corpus that has the same coverage.

./your-fuzzer NEW_CORPUS OLD_CORPUS -merge=1

Do this with one of the fuzzers you have tried previously.

The same flag can be used to merge new items into your existing corpus. Only the items that generate new coverage will be added.


Minimizing a reproducer

Often it is desirable to have a small reproducer (an input that causes a crash). libFuzzer has a simple built-in minimizer. Try to minimize the crash reproducer provided with the openssl-1.0.2d benchmark.

The command below iteratively minimizes the crash reproducer, applying up to 10000 mutations on every iteration:

cd ~/openssl-1.0.2d
./openssl-1.0.2d-fsanitize_fuzzer \
  -minimize_crash=1 -runs=10000 \

Try this with one of the crashes you have found previously.

Visualizing Coverage

We recommend Clang Coverage to visualize and study your code coverage. A simple example:

# Build your code for Clang Coverage; link it against a standalone driver for running fuzz targets.
svn export Fuzzer
clang -fprofile-instr-generate -fcoverage-mapping ~/fuzzing/tutorial/libFuzzer/ \
mkdir CORPUS # Create an empty corpus dir.
echo -n A > CORPUS/A && ./a.out CORPUS/* && \
             llvm-profdata merge -sparse *.profraw -o default.profdata && \
             llvm-cov show a.out -instr-profile=default.profdata -name=FuzzMe


echo -n AAA > CORPUS/AAA && ./a.out CORPUS/* && ... 


echo -n FAA > CORPUS/FAA && ./a.out CORPUS/* && ... 


echo -n FUA > CORPUS/FUA && ./a.out CORPUS/* && ... 


echo -n FUZA > CORPUS/FUZA && ./a.out CORPUS/* && ... 


Other sanitizers

AddressSanitizer is not the only dynamic testing tool that can be combined with fuzzing. At the very least try UBSan. For example, add -fsanitize=signed-integer-overflow -fno-sanitize-recover=all to the build flags for the pcre2 benchmark and do some more fuzzing. You will see reports like this:

src/pcre2_compile.c:5506:19: runtime error: signed integer overflow: 1111111411 * 10 cannot be represented in type 'int'

In some cases you may want to run fuzzing w/o any additional tool (e.g. a sanitizer). This will allow you to find only the simplest bugs (null dereferences, assertion failures) but will run faster. Later you may run a sanitized build on the generated corpus to find more bugs. The downside is that you may miss some bugs this way.

Other fuzzing engines

Take a look at the fuzz targets that you have experimented with so far: 1, 2, 3, 4, 5, 6, 7.

There is nothing in these fuzz targets that ties them to libFuzzer -- there is just one function that takes an array of bytes as a parameter. And so it is possible, and even desirable, to fuzz the same targets with other fuzzing engines.

For example you may fuzz your target with other guided fuzzing engines, such as AFL (instructions) or honggfuzz. Or even try other approaches, such as un-guided test mutation (e.g. using Radamsa).

When using multiple fuzzing engines, make sure to exchange the corpora between the engines -- this way the engines will help each other. You can do it using libFuzzer's -merge=1 flag.
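Because a fuzz target is just one function over a byte array, it can be driven without any fuzzing engine at all. The sketch below is a hypothetical helper (RunTargetOnFiles is a made-up name, and the stand-in target exists only so that the block compiles on its own). It reads each file named on the command line and passes its bytes to the target, which is roughly what a standalone driver or another engine's adapter has to do.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <fstream>
#include <iterator>
#include <vector>

// Stand-in target for this sketch; in practice you would link your real
// fuzz target instead of defining one here.
static size_t TotalBytesSeen = 0;
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)Data;
  TotalBytesSeen += Size;
  return 0;
}

// Runs the target once per readable file in argv[1..argc-1] and returns
// the number of files that were executed.
int RunTargetOnFiles(int argc, char **argv) {
  int Executed = 0;
  for (int I = 1; I < argc; ++I) {
    std::ifstream In(argv[I], std::ios::binary);
    if (!In) continue;  // Skip files that cannot be opened.
    std::vector<uint8_t> Bytes((std::istreambuf_iterator<char>(In)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
    ++Executed;
  }
  return Executed;
}
```

Calling RunTargetOnFiles(argc, argv) from main gives a small regression runner that replays a corpus or a crash reproducer against a build with no fuzzing instrumentation.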

Distributed Fuzzing

What if I want to fuzz one specific target on more CPUs than any single VM has? That's easy: you may store the corpus on some cloud storage system and synchronize it back and forth.

Example (using GCS):

  • Make sure you've used "Allow full access to all Cloud APIs" when creating your GCE VM. If you didn't, create a new VM.
  • (In the browser) Create a new bucket (let its name be $GCS_BUCKET)
  • Create a directory in your cloud bucket named CORPUS:
  • (In the browser), click 'REFRESH', verify that you see the new directory with EMPTY_FILE in it.
  • Create a local directory named CORPUS and do some fuzzing:
cd ~/pcre2
mkdir CORPUS
./pcre2-10.00-fsanitize_fuzzer CORPUS/ -runs=10000
  • Now CORPUS has some files. Synchronize it with the cloud directory:
gsutil -m rsync  CORPUS  gs://$GCS_BUCKET/CORPUS/
  • Check that you can see the new files:
gsutil ls gs://$GCS_BUCKET/CORPUS/
  • Congratulations, you have just saved your corpus to cloud storage. But this is not all the fun. Now you can synchronize it back to the local disk and fuzz again.
gsutil -m rsync  gs://$GCS_BUCKET/CORPUS/ CORPUS
  • If several VMs do this simultaneously you get distributed fuzzing.

In practice this is slightly more complicated than that. If you blindly synchronize the corpus between workers, the corpus may grow to unmanageable sizes. The simplest suggestion is to first fuzz on a single machine, then minimize the corpus, upload it to the cloud, and only then start fuzzing on many VMs. Even better is to periodically minimize the corpus and update it in the cloud.

Continuous fuzzing

One-off fuzzing might find you some bugs, but unless you make the fuzzing process continuous it will be a wasted effort.

A simple continuous fuzzing system could be written in < 100 lines of bash code. In an infinite loop do the following:

  • Pull the current revision of your code.
  • Build the fuzz target
  • Copy the current corpus from cloud to local disk
  • Fuzz for some time.
    • With libFuzzer, use the flag -max_total_time=N to set the time in seconds.
  • Synchronize the updated corpus back to the cloud
  • Provide the logs, coverage information, crash reports, and crash reproducers via e-mail, web interface, or cloud storage.


Some features (or bugs) of the target code may complicate fuzzing and hide other bugs from you.


Out-of-memory (OOM) bugs slow down in-process fuzzing immensely. By default libFuzzer limits the amount of RAM per process to 2Gb.

Try fuzzing the woff benchmark with an empty seed corpus:

cd ~/woff
./woff2-2016-05-06-fsanitize_fuzzer NEW_CORPUS -jobs=8 -workers=8

Pretty soon you will hit an OOM bug:

==30135== ERROR: libFuzzer: out-of-memory (used: 2349Mb; limit: 2048Mb)
   To change the out-of-memory limit use -rss_limit_mb=<N>

   Live Heap Allocations: 3749936468 bytes from 2254 allocations; showing top 95%
   3747609600 byte(s) (99%) in 1 allocation(s)
   #6 0x62e8f6 in woff2::ConvertWOFF2ToTTF src/
   #7 0x660731 in LLVMFuzzerTestOneInput FTS/woff2-2016-05-06/

The benchmark directory also contains a reproducer for the OOM bug. Find it. Can you reproduce the OOM?

Sometimes using 2Gb per target invocation is not a bug, and so you can use -rss_limit_mb=N to set another limit.
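A single huge allocation like the one in the report above often comes from a size field that is read from the input and trusted. The following is a minimal sketch under that assumption. The 4-byte little-endian header, the name DecodeWithDeclaredSize, and the 1 MiB cap are all made up for illustration and are not taken from woff2.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Arbitrary cap chosen for this sketch; a real decoder would derive its
// limit from the format or from the size of the input.
constexpr uint32_t kMaxDeclaredSize = 1u << 20;

// Hypothetical decoder: the first four bytes declare (little-endian) how
// many output bytes to reserve. Returns the number of bytes allocated, or
// 0 if the input is rejected.
size_t DecodeWithDeclaredSize(const uint8_t *Data, size_t Size) {
  if (Size < 4) return 0;
  const uint32_t Declared = uint32_t(Data[0]) | (uint32_t(Data[1]) << 8) |
                            (uint32_t(Data[2]) << 16) |
                            (uint32_t(Data[3]) << 24);
  // Without this check, a 4-byte input could request close to 4 GiB,
  // the kind of allocation that trips the RSS limit.
  if (Declared > kMaxDeclaredSize) return 0;
  std::vector<uint8_t> Output(Declared);
  return Output.size();
}

// Standard libFuzzer entry point wrapping the decoder above.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DecodeWithDeclaredSize(Data, Size);
  return 0;
}
```

Rejecting implausible sizes early both fixes the bug and keeps the fuzzer fast, since it no longer spends its time on inputs that only exhaust memory.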


Memory leaks are bugs themselves, but if they go undetected they cause OOMs during in-process fuzzing.

When combined with AddressSanitizer or LeakSanitizer, libFuzzer will attempt to find leaks right after every executed input. If a leak is found, libFuzzer will print a warning, save the reproducer on disk, and exit.

However, not all leaks are easily detectable as such and if they evade LeakSanitizer libFuzzer will eventually die with OOM (see above).


Timeouts are equally bad for in-process fuzzing. If some input takes more than 1200 seconds to run libFuzzer will report a "timeout" error and exit, dumping the reproducer on disk. You may change the default timeout with -timeout=N.

Slow inputs

libFuzzer distinguishes between slow and very slow inputs. Very slow inputs will cause timeout failures, while merely slow inputs will be reported during the run (with reproducers dumped on disk) but will not cause the process to exit. Use -report_slow_units=N to set the threshold for merely slow units.

Advanced Topics

Related links