Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix saturated maps & stability cliff in recalibration #425

Merged
merged 1 commit into from
Jun 25, 2020

Conversation

dgmelski
Copy link
Contributor

I have observed two problems:

  1. A sudden "stability cliff" where stability drops precipitously.

  2. A sudden jump to a 100% saturated "density map".

I tracked both issues down to the attempted "recalibrations" of a case
at the beginnings of fuzz_one_original() and mopt_common_fuzzing().
See the comments "CALIBRATION (only if failed earlier on)" in those
functions and the subsequent call to calibrate_case().

At those two calls to calibrate_case(), afl->fsrv.trace_bits holds
trace_bits for a previous run of the SUT for a prior queue entry.
However, calibrate_case() may use the trace_bits as if they apply to
the current queue entry (afl->queue_cur).

Most often this bug causes the "stability cliff". Trace bits are
compared for runs on completely different inputs and there are a lot
of differences. Hence, the sudden drop in stability.

Sometimes it leads to the "map density" saturation problem. This
happens if the trace bits on the previous entry were "simplified" (by
simplify_trace). Simplified traces only contain the values 1 and 128.
They are meant to be compared against virgin_crashes and
virgin_tmouts.

However, this bug causes the (stale) simplified trace to be compared
against virgin_bits (in a call has_new_bits()), which causes every
byte in vigin_bits to be something other than 255. The overall map
density is determined by the percentage of bytes not 255, which will
be 100%. Worse, AFL will be unable to detect novel occurrences of
edge counts 1 and 128 going forward.

This patch avoids this issues by clearing afl->queue_cur->exec_cksum
before the relevant calls to calibrate_case(). As a result,
calibrate_case() assumes it has to do a new run before using
trace_bits. It seems to work, but there may be better ways to address
the issue.

@dgmelski
Copy link
Contributor Author

I hope this pull request makes clear an issue I've observed (and one work around). Should I open an issue as well? Or instead?

As I noted in the commit message, there are likely other ways to handle the issue.

@andreafioraldi
Copy link
Member

Hi, good catch, thank you for the patch!
I'll investigate it ASAP.

@andreafioraldi
Copy link
Member

Should be also in old AFL https://github.com/google/AFL/blob/master/afl-fuzz.c#L5076

@andreafioraldi
Copy link
Member

My proposed fix is to reset q->exec_cksum = 0 when calibration fails checking if q->cal_failed is != 0 here:

abort_calibration:

This bug should be very rare but I also experienced it with a custom instrumentation but I tought was a bug if the instrumentation itself.

@andreafioraldi
Copy link
Member

@vanhauser-thc double check it, my fix should be the correct one IMO

@dgmelski
Copy link
Contributor Author

Ah, I like your fix better. Thank you.

Regarding triggering this bug: I can get it relatively frequently by overloading the system. E.g., disabling CPU affinity and running double the AFL instances as available cores. I think it forces more timeouts and more calibration failures.

@dgmelski
Copy link
Contributor Author

I also suspected this was in old AFL and I verified that the stability cliff happens, there too. I don't recall seeing the saturated density map and I did not test a patch of the old AFL.

@vanhauser-thc
Copy link
Member

vanhauser-thc commented Jun 24, 2020

@dgmelski very good catch!

@andreafioraldi yes your patch is the more precise one, good solution.

@dgmelski do you want to update your commit to reflect that? this way you get the attribution.

looking at the code I would propose one other thing:

--- afl-fuzz-run.c	(revision 5289)
+++ afl-fuzz-run.c	(working copy)
@@ -220,8 +220,7 @@
 u8 calibrate_case(afl_state_t *afl, struct queue_entry *q, u8 *use_mem,
                   u32 handicap, u8 from_queue) {
 
@@ -286,12 +285,6 @@
 
     u64 cksum;
 
-    if (!first_run && !(afl->stage_cur % afl->stats_update_freq)) {
-
-      show_stats(afl);
-
-    }
-
     write_to_testcase(afl, use_mem, q->len);

removing the updating of the stats because a) it is not really needed IMHO and b) messes up the timing of start_us/end_us.

@dgmelski
Copy link
Contributor Author

@vanhauser-thc: I am happy to update. I'm re-running my tests right now with @andreafioraldi's suggested patch. I'd like to give that at least a few hours to check if it works. Assuming it goes well, I'll put in the changes.

I have observed two problems:

  1. A sudden "stability cliff" where stability drops precipitously.

  2. A sudden jump to a 100% saturated "density map".

Both issues are due to attempted "recalibration" of a case at the
beginning of fuzz_one_original() or mopt_common_fuzzing().  See the
comments "CALIBRATION (only if failed earlier on)" in those functions
and the subsequent call to calibrate_case().

At those calls to calibrate_case(), afl->fsrv.trace_bits holds
trace_bits for a run of the SUT on a prior queue entry.  However,
calibrate_case() may use the trace_bits as if they apply to the
current queue entry (afl->queue_cur).

Most often this bug causes the "stability cliff".  Trace bits are
compared for runs on distinct inputs, which can be very different.
The result is a sudden drop in stability.

Sometimes it leads to the "saturated map" problem.  A saturated
density map arises if the trace bits on the previous entry were
"simplified" by simplify_trace().  Simplified traces only contain the
values 1 and 128.  They are meant to be compared against
virgin_crashes and virgin_tmouts.

However, this bug causes the (stale) simplified trace to be compared
against virgin_bits during a call to has_new_bits(), which causes
every byte in vigin_bits to be something other than 255.  The overall
map density is determined by the percentage of bytes not 255, which
will be 100%.  Worse, AFL++ will be unable to detect novel occurrences
of edge counts 1 and 128 going forward.

This patch avoids the above issues by clearing q->exec_cksum when
calibration fails.  Recalibrations are forced to start with a fresh
trace on the queue entry.

Thanks to @andreafioraldi for suggesting the current, improved patch.
@dgmelski
Copy link
Contributor Author

I modified the commit to match @andreafioraldi's suggested change.

My testing shows recalibration occurring safely without triggering either issue.

I did not pursue the changes around first_run. There are two other uses of first_run in calibrate_case. One of them is used to warn the user about inputs that do not add additional coverage. Probably just dropping the status updates would be adequate, but I'll leave that to you to decide.

@vanhauser-thc
Copy link
Member

@andreafioraldi I pushed the removal of the stat update while calibrating. if you think that is not good remove it or tell me to remove it :)

@andreafioraldi
Copy link
Member

So far so good, I merge it.

@andreafioraldi andreafioraldi merged commit 4a3305c into AFLplusplus:dev Jun 25, 2020
andreafioraldi added a commit to andreafioraldi/AFL-1 that referenced this pull request Jun 25, 2020
Dor1s pushed a commit to google/AFL that referenced this pull request Jun 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants