Files in subdirs of $OUT are treated as fuzzers #4575

a1xndr · 2020-10-29T14:07:53Z

Hello,
QEMU fuzz targets are all built into a single binary. To make this work with oss-fuzz we select the fuzz-target through argv0. We are using hard-links to cut down on wasted space. Thus, the resulting build looks something like:

$ ls $OUT
qemu-fuzz-i386-target-generic-fuzz-virtio-blk -> bin/qemu-fuzz-i386
qemu-fuzz-i386-target-generic-fuzz-virtio-balloon -> bin/qemu-fuzz-i386
...

This broke oss-fuzz coverage builds because oss-fuzz treats bin/qemu-fuzz-i386 as a fuzzer, though it cannot run standalone, since the target isn't selected by argv0.
https://oss-fuzz-build-logs.storage.googleapis.com/log-a4f1b2f3-f07f-4ff2-a0af-001cc1cf2058.txt

Is oss-fuzz supposed to be looking for fuzzers in sub-directories of $OUT?
It seems the behavior of get_fuzz_targets() changed in 9ab8312 ("Remove walk from utils.py (#3561)")

Thanks

The text was updated successfully, but these errors were encountered:

inferno-chromium · 2020-10-31T17:02:40Z

That CL didnt change behavior. Both before and after, we still use os.walk, that cl description is wrong and wasnt updated. Also, CF code does traverse inside all dirs. Can you remove the duplicate hard link and keep only original ? You can even move the files to root of $OUT in build.sh ?

a1xndr · 2020-10-31T19:30:17Z

I think it did change the behavior. The original usage of os.walk was buggy:

$ mkdir test_root
$ mkdir test_root/subdir
$ touch test_root/subdir/not-a-fuzzer
$ ln test_root/subdir/not-a-fuzzer test_root/fuzzer-a
$ ln test_root/subdir/not-a-fuzzer test_root/fuzzer-b

$ python3
Python 3.7.5 (default, Oct 27 2019, 15:43:29)
[GCC 9.2.1 20191022] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> path = "./test_root"
>>>
>>> print("BEFORE:")
BEFORE:
>>> for root, _, _ in os.walk(path):
...   for filename in os.listdir(path):
...     file_path = os.path.join(root, filename)
...     print(file_path)
...
./test_root/fuzzer-a
./test_root/subdir
./test_root/fuzzer-b
./test_root/subdir/fuzzer-a
./test_root/subdir/subdir
./test_root/subdir/fuzzer-b
>>> print("AFTER:")
AFTER:
>>> for root, _, fuzzers in os.walk(path):
...   for fuzzer in fuzzers:
...     file_path = os.path.join(root, fuzzer)
...     print(file_path)
...
./test_root/fuzzer-a
./test_root/fuzzer-b
./test_root/subdir/not-a-fuzzer

Though in "BEFORE" we have some bogus paths, none of them are ./test_root/subdir/not-a-fuzzer. In "AFTER", all the paths are valid, but now they include ./test_root/subdir/not-a-fuzzer

By "remove the duplicate hard link and keep only original", do you mean (for the example above), to remove "./test_root/subdir/not-a-fuzzer" and make ./test_root/fuzzer-{b,c,d,...} links to fuzzer-a ?
Thanks

inferno-chromium · 2020-10-31T20:26:58Z

Dont need to touch any non-fuzzer files.
i mean you dont need to create additional links in root dir, just leave the ones in bin/ and it should all auto-work.

a1xndr · 2020-10-31T21:19:30Z

Dont need to touch any non-fuzzer files.
i mean you dont need to create additional links in root dir, just leave the ones in bin/ and it should all auto-work.

Ah but we don't have additional links - Just one base binary in bin/ and ~28 links to it in the root. The 28 links have special names that select and run a fuzz target. The base binary in bin/ doesn't have a special name so it doesn't do any fuzzing. Since it seems that the current behavior of get_fuzz_targets is intended, we can work around this by a.) making copies instead of links, or b.) giving the base binary a special name, so it performs fuzzing.

a1xndr · 2020-11-01T20:41:14Z

I think we will simply rename bin/qemu-fuzz-i386 to bin/qemu-fuzz-i386.base, then it won't pass the check against ALLOWED_FUZZ_TARGET_EXTENSIONS.

Re-opening the issue, since I noticed that the behavior of python infra/helper.py check_build does not match the way OSS-Fuzz finds fuzzers with get_fuzz_targets:

from infra/base-images/base-runner/test_all:

for FUZZER_BINARY in $(find $TMP_FUZZER_DIR -maxdepth 1 -executable -type f); do
...

This doesn't check fuzzers in sub-directories. Thus, as in our case, it is possible to pass all the local checks but end up with crashes and build-failures on OSS-Fuzz

inferno-chromium · 2020-11-01T21:21:16Z

@oliverchang @Dor1s - thoughts on this behavior difference, shouldnt we be traversing sub-dirs for test_all ?

oliverchang · 2020-11-01T22:59:43Z

Yes, we should make this behaviour the same. We also have libClusterFuzz coming which might help -- @Dor1s do you think we can move more of these checks into libClusterFuzz? (Filed #4585)

oliverchang · 2020-11-02T05:22:45Z

Another issue: llvm_libcxx is failing on ClusterFuzz due to the targets having a .cpp extension. This was also not caught by the build infra here due to the behaviour difference.

Filed #4586 for that.

Dor1s · 2020-11-02T18:34:28Z

First of all I'd strongly advise against using a single binary to execute multiple fuzz targets. Fuzz targets should be as much isolated as possible (i.e. less unreacheable coverage and dead code).

Is disk space really a concern for you? Using different binaries for different fuzz targets is a foolproof solution that ensures fuzz targets are well recognized and run by pretty much any infra, be it CF bots, OSS-Fuzz builders / CI bots, CIFuzz, any other system(s) you decide to run you fuzzers on.

As for the overall OSS-Fuzz logic, large +1 to rely on a unified implementation in libClusterFuzz.

Dor1s · 2020-11-02T19:03:41Z

Regarding recursive traversing of the $OUT, OSS-Fuzz documentation says

Resulting binaries should be placed in $OUT.

from https://google.github.io/oss-fuzz/getting-started/new-project-guide/#buildsh

To me that implies that sub-directories are not (or at least "may not") be supported and the binaries should be placed directly into $OUT. Please let us know if that sounds ambiguous -- we can make it more explicit in such case.

The fact that ClusterFuzz has a more sophisticated implementation doesn't mean that OSS-Fuzz would support all of it. ClusterFuzz is widely deployed inside and outside of Google and therefore it has to be robust and flexible. In OSS-Fuzz we tend to keep things strict but sufficient, given that it's mostly a self-service platform. Placing fuzz target binaries in subdirs within $OUT is an undocumented behavior and should be avoided as well.

a1xndr · 2020-11-02T19:07:26Z

It is true that there are a couple unused fuzzers that get linked into the resulting binary, but I haven't found a good way to build them separately that would play nicely with QEMU's build system. In any case, those fuzzers should be a tiny fraction of the ~1200 objects linked into the final binary. For libfuzzer, I don't think this should be a huge issue, since the coverage-bitmap is lossless.
However, most of our fuzzers are based on the exact same fuzzing code. The only difference is that they specify different arguments used to launch the QEMU vm and select the devices to fuzz.
For example, qemu-fuzz-i386-target-generic-fuzz-virtio-net fuzzes a QEMU VM with a virtio-net device attached, while qemu-fuzz-i386-target-generic-fuzz-e1000e fuzzes a VM with an e1000e device attached. In the background, they rely on the same fuzzing code. For now it is probably fine to keep copies of the same binary, however I think space might become a larger concern, when people add more configurations to our generic-fuzzer. Each ASAN binary weighs ~180MB. For 50 configurations, that is a total of 9 GB of duplicate fuzzing binaries. Not only is this a waste of space for OSS-Fuzz, but it also complicates our local testing.

One other option would be to do something like firefox, and make thin setenv+execve wrappers around the fuzzers, that trick OSS-Fuzz into thinking they are real fuzzers, though this solution doesn't seem much prettier to me:
https://github.com/google/oss-fuzz/blob/master/projects/firefox/target.c

Thanks for looking at this

Dor1s · 2020-11-02T20:16:20Z

since the coverage-bitmap is lossless

That's why inflated targets have a worse performance, as libFuzzer iterates through the whole coverage map after every execution.

I understand your argument that the most of the code is the same. There's probably no value in debating how much of the same code is ok. In any case it's not the best practice to use a single binary for multiple fuzz targets.

To reduce disk space, can you use dynamic linking? Supposedly most of the core functionality can be compiled as shared object(s), and fuzz target binaries would be rather small harnesses linked against the rest of the project.

Please check out that section regarding nuances of dynamic linking in OSS-Fuzz: https://google.github.io/oss-fuzz/further-reading/fuzzer-environment/#runtime-dependencies

a1xndr · 2020-11-03T15:06:50Z

We already do the $ORIGIN/-rpath dance for our shared libraries, so I don't think it should be a problem to put most of the code into another shared object. I will look into that option more - Thanks

We switched to hardlinks in a942f64 ("scripts/oss-fuzz: use hardlinks instead of copying") The motivation was to conserve space (50 fuzzers built with ASAN, can weigh close to 9 GB). Unfortunately, OSS-Fuzz (partially) treated the underlying copy of the fuzzer as a standalone fuzzer. To attempt to fix, we tried: f8b8f37 ("scripts/oss-fuzz: rename bin/qemu-fuzz-i386") This was also not a complete fix, because though OSS-Fuzz ignores the renamed fuzzer, the underlying ClusterFuzz, doesn't: https://storage.googleapis.com/clusterfuzz-builds/qemu/targets.list.address https://oss-fuzz-build-logs.storage.googleapis.com/log-9bfb55f9-1c20-4aa6-a49c-ede12864eeb2.txt (clusterfuzz still lists qemu-fuzz-i386.base as a fuzzer) This change keeps the hard-links, but makes them all point to a file with a qemu-fuzz-i386-target-.. name. If we have targets, A, B, C, the result will be: qemu-fuzz-i386-target-A (base file) qemu-fuzz-i386-target-B -> qemu-fuzz-i386-target-A qemu-fuzz-i386-target-C -> qemu-fuzz-i386-target-A The result should be that every file that looks like a fuzzer to OSS-Fuzz/ClusterFuzz, can run as a fuzzer (we don't have a separate base copy). Unfortunately, there is not simple way to test this locally. In the future, it might be worth it to link the majority of QEMU in as a shared-object (see google/oss-fuzz#4575 ) Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Message-Id: <20201108171136.160607-1-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>

Dor1s · 2020-11-18T22:40:53Z

Looks like this can be closed. Migration to libClusterFuzz is tracked as #4585

a1xndr closed this as completed Oct 31, 2020

a1xndr reopened this Nov 1, 2020

inferno-chromium assigned oliverchang and Dor1s Nov 1, 2020

Dor1s closed this as completed Nov 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Files in subdirs of $OUT are treated as fuzzers #4575

Files in subdirs of $OUT are treated as fuzzers #4575

a1xndr commented Oct 29, 2020

inferno-chromium commented Oct 31, 2020

a1xndr commented Oct 31, 2020 •

edited

Loading

inferno-chromium commented Oct 31, 2020

a1xndr commented Oct 31, 2020

a1xndr commented Nov 1, 2020

inferno-chromium commented Nov 1, 2020

oliverchang commented Nov 1, 2020 •

edited

Loading

oliverchang commented Nov 2, 2020 •

edited

Loading

Dor1s commented Nov 2, 2020

Dor1s commented Nov 2, 2020

a1xndr commented Nov 2, 2020 •

edited

Loading

Dor1s commented Nov 2, 2020

a1xndr commented Nov 3, 2020

Dor1s commented Nov 18, 2020

Files in subdirs of $OUT are treated as fuzzers #4575

Files in subdirs of $OUT are treated as fuzzers #4575

Comments

a1xndr commented Oct 29, 2020

inferno-chromium commented Oct 31, 2020

a1xndr commented Oct 31, 2020 • edited Loading

inferno-chromium commented Oct 31, 2020

a1xndr commented Oct 31, 2020

a1xndr commented Nov 1, 2020

inferno-chromium commented Nov 1, 2020

oliverchang commented Nov 1, 2020 • edited Loading

oliverchang commented Nov 2, 2020 • edited Loading

Dor1s commented Nov 2, 2020

Dor1s commented Nov 2, 2020

a1xndr commented Nov 2, 2020 • edited Loading

Dor1s commented Nov 2, 2020

a1xndr commented Nov 3, 2020

Dor1s commented Nov 18, 2020

a1xndr commented Oct 31, 2020 •

edited

Loading

oliverchang commented Nov 1, 2020 •

edited

Loading

oliverchang commented Nov 2, 2020 •

edited

Loading

a1xndr commented Nov 2, 2020 •

edited

Loading