Wrong input size with custom mutator implementing custom_post_process #1397

Closed
agantet opened this issue May 3, 2022 · 13 comments

@agantet

agantet commented May 3, 2022

Environment variables

AFL_CUSTOM_MUTATOR_LIBRARY=AFLplusplusCustomMutatorPoC/libmutator.so
AFL_CUSTOM_MUTATOR_ONLY=1
AFL_DEBUG=1
AFL_DEBUG_CHILD=1
AFL_DISABLE_TRIM=1
AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1
AFL_KILL_SIGNAL=9
AFL_SKIP_CPUFREQ=1
AFL_TMPDIR=/dev/shm
# command line:
'AFLplusplusCustomMutatorPoC/aflplusplus/afl-fuzz' '-i' 'AFLplusplusCustomMutatorPoC/corpus' '-o' 'AFLplusplusCustomMutatorPoC/output' '--' 'AFLplusplusCustomMutatorPoC/target.o' '@@'

Bug description

When using AFL++ with a custom mutator implementing the custom_post_process function, the wrong value for buf_size is sometimes passed.

We observed that custom_post_process is sometimes called twice whereas custom_fuzz is called only once. In the second call to custom_post_process, the value returned by the first call is used in the buf_size argument.
However, this causes an issue if custom_post_process changes the size of the buffer (e.g. when working with a protobuf mutator).
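For reference, a post-process hook of roughly this shape is enough to hit the mismatch. This is a minimal, hypothetical sketch, not the actual PoC code (which lives in the linked repo): it strips a 4-byte header added by afl_custom_fuzz, so the size it returns differs from the buf_size it receives.

/* Hypothetical sketch of a size-changing post process, not the linked PoC. */
#include <stddef.h>
#include <stdint.h>

size_t afl_custom_post_process(void *data, uint8_t *buf, size_t buf_size,
                               uint8_t **out_buf) {

  (void)data;

  if (buf_size <= 4) {

    /* Nothing to strip: pass the buffer through unchanged. */
    *out_buf = buf;
    return buf_size;

  }

  /* Skip the 4-byte header added by afl_custom_fuzz; the returned size no
     longer matches buf_size, which is exactly the situation where the second
     call described above receives the wrong size. */
  *out_buf = buf + 4;
  return buf_size - 4;

}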

To Reproduce
This can be reproduced using the PoC at https://github.com/JRomainG/AFLplusplusCustomMutatorPoC.

Screen output
Output can be found at the end of the README of https://github.com/JRomainG/AFLplusplusCustomMutatorPoC

Idea of fix
By investigating the AFL++ code, we found a fix which works for our case:

diff --git a/src/afl-fuzz-run.c b/src/afl-fuzz-run.c
index ffba3475..79151faa 100644
--- a/src/afl-fuzz-run.c
+++ b/src/afl-fuzz-run.c
@@ -967,7 +967,7 @@ common_fuzz_stuff(afl_state_t *afl, u8 *out_buf, u32 len) {
 
   u8 fault;
 
-  len = write_to_testcase(afl, out_buf, len, 0);
+  write_to_testcase(afl, out_buf, len, 0);
 
   fault = fuzz_run_target(afl, &afl->fsrv, afl->fsrv.exec_tmout);

This seems to have been introduced by d5b9cd4. Is there perhaps a misunderstanding on our part on how to use this feature?

@vanhauser-thc
Member

d5b9cd4 not only added a feature, it also provided a fix for post process: if a post process module changed the size of the mutated data, that change was not picked up and hence a queue input with the wrong length was written.

The issue in your case is actually this:

The afl_custom_post_process function is called twice.

which is not true :) it is only called once per fuzz. But what is really happening is that during calibration an input is taken and replayed X times, to see how variable the coverage and timing results are.
So for every calibration fuzz attempt post process is called again and again, but no mutation is performed because it is calibration, not fuzzing.

There is nothing wrong with afl-fuzz IMHO, it is rather your setup that does not fit the structure we provide.

I see the following options:

  1. move all code from post process to the mutate function.
  2. have a signature in the mutator-generated data, and only perform the transformation in post process if that signature is there, and otherwise silently do nothing instead of reporting a problem (see the sketch below).
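A minimal sketch of option 2, assuming a hypothetical 4-byte magic value written by afl_custom_fuzz (the magic bytes and the decoding step are placeholders, not something from this thread):

/* Sketch of option 2: gate the transformation on a signature. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint8_t kMagic[4] = {0xDE, 0xAD, 0xBE, 0xEF}; /* assumed marker */

size_t afl_custom_post_process(void *data, uint8_t *buf, size_t buf_size,
                               uint8_t **out_buf) {

  (void)data;

  if (buf_size < sizeof(kMagic) || memcmp(buf, kMagic, sizeof(kMagic))) {

    /* No signature: this buffer was not produced by our mutator in this
       round (e.g. a calibration replay), so silently return it untouched. */
    *out_buf = buf;
    return buf_size;

  }

  /* Signature present: strip it and decode the payload here (the actual
     protobuf decoding is omitted from this sketch). */
  *out_buf = buf + sizeof(kMagic);
  return buf_size - sizeof(kMagic);

}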

@agantet
Author

agantet commented May 4, 2022

Thanks for the quick feedback and explanations.
We tried your suggestions out, but didn't find a proper way of solving our issue:

1. move all code from post process to the mutate function.

Our understanding is that this would only work until a test case generated by the mutator is reused as an entry for the mutator.
For example, for a protobuf mutator, we would have:

corpus (protobuf) -> afl_custom_fuzz -> test case (plaintext) -> target

If the test case is later reused as part of the corpus, then we would have:

previous test case (plaintext) -> afl_custom_fuzz

This was implemented in the no_afl_custom_post_process branch of the repo containing the original PoC.
This is why we decided to use afl_custom_post_process, as suggested in the documentation, which explicitly mentions protobuf. Perhaps this part of the documentation is outdated, however.

An alternative would be to decode the protobuf data in the target and completely remove the afl_custom_post_process function from our mutator.
We believe this may not always be possible in binary-only mode however. Perhaps this would also behave strangely in cases where multiple custom mutators are used?

We could probably also go the other way and have the corpus in "plaintext", convert it to protobuf at the start of afl_custom_fuzz, and convert it back to plaintext at the end, but we are afraid this would negatively impact performance.

2. have a signature in the mutator-generated data, and only perform the transformation in post process if that signature is there

Still using the above example, the issue here would be that, in the call to afl_custom_post_process, buf is always encoded using protobuf (even when the size is wrong during the calibration phase).
However, when calibration starts, buf_size contains the size of the plaintext buffer instead of the size of buf.

Using the logs from the repo mentioned in the original issue, the call to afl_custom_post_process right after the mutation shows:

[mutator] Called afl_custom_post_process with buffer of size 16

  0000  00 00 00 0c 74 65 73 74 30 31 32 33 34 35 36 37  ....test01234567  <-- encoded value

[mutator] Generated post-process data of size 12

  0000  74 65 73 74 30 31 32 33 34 35 36 37              test01234567      <-- decoded value

The output data is then passed to the target, and the first stage of calibration starts. afl_custom_post_process is thus called again without a new mutation, as you explain, and the logs show:

[mutator] Called afl_custom_post_process with buffer of size 12

  0000  00 00 00 0c 74 65 73 74 30 31 32 33              ....test0123      <-- truncated encoded value

[mutator] Buffer size doesn't match encoded size: expected 16 but got 12

[mutator] According to the encoding, the buffer looks like:

  0000  00 00 00 0c 74 65 73 74 30 31 32 33 34 35 36 37  ....test01234567  <-- actual encoded value

As you can see, the buffer contains encoded data, so it does have to be decoded, but the size of the buffer is wrong.
The way we understand it, the issue isn't knowing whether the data is encoded, but rather knowing the actual size of the buffer. Perhaps storing the actual size of the buffer in a header and using that instead of the value in buf_size would work, but it seems like a pretty dirty workaround, and we aren't sure we wouldn't read out of bounds of buf.

The reasoning behind d5b9cd4's fix for the wrong-length queue input does make sense.
Isn't there a confusion between the length of the queue input and the length of the buffer passed to afl_custom_post_process, though?

@vanhauser-thc
Member

The reasoning behind d5b9cd4's fix for the wrong-length queue input does make sense. Isn't there a confusion between the length of the queue input and the length of the buffer passed to afl_custom_post_process, though?

a buffer+len that is returned from afl_custom_post_process has to be what is written into the queue if it reaches new coverage.

if you have multiple post process functions active, of course they stack, and each receives the (potentially changed) input and (potentially changed) length.

In your initial post you didn't say anything about protobuf. If I understand correctly, your queue items are protobuf-serialized inputs that you then mutate further and keep (or rewrite again into) protobuf, and then transform into the real input data to send to the target. Is that correct? I will wait for your answer/explanation here, because it makes a huge difference whether this is the case or not.

An alternative would be to decode the protobuf data in the target and completely remove the afl_custom_post_process function from our mutator. We believe this may not always be possible in binary-only mode however. Perhaps this would also behave strangely in cases where multiple custom mutators are used?

yeah doing that in the target is wrong :)

@agantet
Author

agantet commented May 5, 2022

In your initial post you didn't say anything about protobuf. If I understand correctly, your queue items are protobuf-serialized inputs that you then mutate further and keep (or rewrite again into) protobuf, and then transform into the real input data to send to the target. Is that correct?

Sorry that the initial post wasn't clear enough on that aspect.
Indeed, our use case is to have a protobuf mutator, similar to what is done in libprotobuf-mutator-example.
Your understanding is accurate, that's exactly what we're going for.
Just to clarify, we don't think the issue is specific to protobuf, but rather that it's triggered as a side effect of decoding from protobuf to plaintext. That's why, for the sake of simplicity, we didn't use protobuf in our PoC when reporting the issue.

if you have multiple post process functions active, of course they stack, and each receives the (potentially changed) input and (potentially changed) length.

Yep, of course. Our confusion stems from the fact that we receive the changed length but not the changed input (though we don't expect to have the changed input since we only use a single post process function).

@vanhauser-thc
Member

OK great, now I have the feeling I understand your setup.

"normal" protobuf fuzzer implementations are not based on any valid input or previous input, but generate everything out of air every single fuzz attempt. hence it is perfectly fine that the queue entries are real data and not protobuf because nothing is based on them.

And also, in your setup the queue entries should be the real data the target processes, otherwise you would be unable to replay the data outside of afl-fuzz (you cannot use afl-showmap, for example) and would always need a converter first.

So how can you still do what you want to do? Take a look here: https://github.com/AFLplusplus/Grammar-Mutator because it works similarly.
What you can do is use the afl_custom_queue_new_entry + afl_custom_queue_get hooks. The first is called every time a new queue entry is written to disk; the latter is called when an entry is selected for mutation.
When the first hook is called, you create an extra file in an extra directory which holds the protobuf encoding.
When the latter hook is called, you load your protobuf encoding into your data structure and work with that for all mutations until afl_custom_queue_get is called again, which means a new queue entry was selected.

This way the queue has real useful data, but you also have the protobuf encodings for your mutations, plus the overhead is minimal.

WDYT?
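A minimal sketch of those two hooks, assuming the hook signatures from the custom mutator documentation at the time (the shadow-file naming and the write_protobuf/load_protobuf helpers are hypothetical; check the current headers, since these interfaces have changed across versions):

/* Sketch only: keep a protobuf "shadow" file next to each queue entry. */
#include <stdint.h>
#include <stdio.h>

/* Called every time afl-fuzz writes a new queue entry to disk: store the
   protobuf structure that produced it under <entry>.protobuf. */
void afl_custom_queue_new_entry(void *data, const uint8_t *filename_new_queue,
                                const uint8_t *filename_orig_queue) {

  (void)data;
  (void)filename_orig_queue;

  char shadow[4096];
  snprintf(shadow, sizeof(shadow), "%s.protobuf",
           (const char *)filename_new_queue);

  /* write_protobuf(data, shadow);  <- hypothetical helper: serialize the
     protobuf structure that generated this entry */

}

/* Called when a queue entry is selected for fuzzing: load the matching
   protobuf encoding and mutate that structure until the next selection. */
uint8_t afl_custom_queue_get(void *data, const uint8_t *filename) {

  (void)data;

  char shadow[4096];
  snprintf(shadow, sizeof(shadow), "%s.protobuf", (const char *)filename);

  /* load_protobuf(data, shadow);  <- hypothetical helper: fall back to
     parsing the plaintext entry if no shadow file exists yet (e.g. for the
     initial corpus) */

  return 1; /* 1 = fuzz this entry, 0 = skip it */

}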

@agantet
Author

agantet commented May 6, 2022

Thank you for your new suggestions; @JRomainG and I hadn't thought of that approach.

"normal" protobuf fuzzer implementations are not based on any valid input or previous input, but generate everything out of air every single fuzz attempt

Perhaps this is a bit out of scope for this issue, but how would the mutator leverage the coverage guidance of AFL++ in that case?

in your setup the queue entries should be the real data the target processes, otherwise you would be unable to replay the data outside of afl-fuzz

Indeed, we currently solve this by having an external tool which converts our protobuf data to plaintext (like what is done in the post process function of our mutator), so we can use this output outside of afl-fuzz.

What you can do is use the afl_custom_queue_new_entry + afl_custom_queue_get hooks

This seems like an interesting idea, we'll try it out.
The only potential concern we have is that we need to implement a robust way of converting a plaintext input into a protobuf-encoded input, which might not be trivial depending on the format of the plaintext. We'll see if we manage to do this efficiently.

@vanhauser-thc
Member

"normal" protobuf fuzzer implementations are not based on any valid input or previous input, but generate everything out of air every single fuzz attempt

Perhaps this is a bit out of scope for this issue, but how would the mutator leverage the coverage guidance of AFL++ in that case?

It does not, which is my criticism of this approach in general.
That is why, when I mentored our Grammar Mutator project at GSoC, the mutated tree information is saved and reused (and that is why I could point you to this example).

What you can do is use the afl_custom_queue_new_entry + afl_custom_queue_get hooks

This seems like an interesting idea, we'll try it out. The only potential concern we have is that we need to implement a robust way of converting a plaintext input into a protobuf-encoded input, which might not be trivial depending on the format of the plaintext. We'll see if we manage to do this efficiently.

different solution, different problems :)

@agantet
Author

agantet commented May 6, 2022

In parallel to testing the new approach you suggested, we tried to understand the reason why we had an issue in the first place. We believe it is due to a mixup between the size of the data written to the queue and the size of the input used to fuzz the program in write_to_testcase.

The write_to_testcase function is used, for example, by common_fuzz_stuff like so:

u8 __attribute__((hot))
common_fuzz_stuff(afl_state_t *afl, u8 *out_buf, u32 len) {
  // [...]
  len = write_to_testcase(afl, out_buf, len, 0);
  // [...]
}

The value of out_buf is unchanged by the call, but len is overridden for the rest of the function. It is later used in a call to the save_if_interesting function:

afl->queued_discovered += save_if_interesting(afl, out_buf, len, fault);

The output of write_to_testcase thus impacts the length of the data written to the queue, but not the data itself. In the case where no custom mutator is used, this shouldn't be an issue, as write_to_testcase only does bound checking:

if (unlikely(len < afl->min_length && !fix)) {
  len = afl->min_length;
} else if (unlikely(len > afl->max_length)) {
  len = afl->max_length;
}
/* boring uncustom. */
afl_fsrv_write_to_testcase(&afl->fsrv, mem, len);

The data written to the queue may only be clamped to the min and max bounds. And, in this case, the data in the queue is the same as the input given to the target.

However, when a custom mutator is used, write_to_testcase behaves differently:

ssize_t new_size = len;
u8 *    new_mem = mem;
u8 *    new_buf = NULL;

LIST_FOREACH(&afl->custom_mutator_list, struct custom_mutator, {
  if (el->afl_custom_post_process) {
    new_size = el->afl_custom_post_process(el->data, new_mem, new_size, &new_buf);
    if (unlikely(!new_buf && new_size <= 0)) {
      FATAL("Custom_post_process failed (ret: %lu)", (long unsigned)new_size);
    }
    new_mem = new_buf;
  }
});

if (unlikely(new_size < afl->min_length && !fix)) {
  new_size = afl->min_length;
} else if (unlikely(new_size > afl->max_length)) {
  new_size = afl->max_length;
}

/* everything as planned. use the potentially new data. */
afl_fsrv_write_to_testcase(&afl->fsrv, new_mem, new_size);
len = new_size;

In this case, new_mem and mem may have different values (and lengths). However, the value written to the queue in common_fuzz_stuff is still mem, not new_mem, so len shouldn't be overridden by new_size.

If our understanding is correct, there is no need to do bound checking on len in this scenario (only on new_size), since min_length and max_length are supposed to limit the size of the generated fuzzing input, not the size of the data stored in the queue.

When using protobuf, the protobuf-encoded data written in the queue may be significantly larger than the plaintext value used as input, so limiting its length would probably end up truncating the data too much.

Based on this, we believe the direct fix to this issue is simply to remove the len = new_size line in write_to_testcase:

diff --git a/src/afl-fuzz-run.c b/src/afl-fuzz-run.c
index ffba3475..6da87c76 100644
--- a/src/afl-fuzz-run.c
+++ b/src/afl-fuzz-run.c
@@ -132,7 +132,6 @@ write_to_testcase(afl_state_t *afl, void *mem, u32 len, u32 fix) {
 
     /* everything as planned. use the potentially new data. */
     afl_fsrv_write_to_testcase(&afl->fsrv, new_mem, new_size);
-    len = new_size;
 
   } else {

Would that make sense? If so, we can open a PR with this change.

@vanhauser-thc
Member

funny, this is a bug I just fixed, because another user had a similar problem. But the fix is very different: the line you want to delete is important. What was wrong is that the mem parameter to write_to_testcase must be changeable, so u8**. The fix is in dev.
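As a standalone illustration of the pattern behind "the mem parameter must be changeable" (a hedged sketch of the idea with hypothetical names, not the actual dev commit): when the caller passes a pointer to its buffer pointer, the callee can swap in the post-processed buffer and return the matching length, so buffer and length can no longer get out of sync.

/* Illustration only: hypothetical names, not AFL++ code. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pretend post processing strips a 4-byte header, shrinking the buffer. */
static size_t post_process_and_write(uint8_t **mem, size_t len) {

  if (len > 4) {

    size_t   new_len = len - 4;
    uint8_t *new_buf = malloc(new_len);

    memcpy(new_buf, *mem + 4, new_len);
    free(*mem);

    *mem = new_buf; /* the caller now sees the post-processed buffer ... */
    return new_len; /* ... together with its matching length */

  }

  return len;

}

int main(void) {

  size_t   len = 16;
  uint8_t *buf = calloc(1, len);

  len = post_process_and_write(&buf, len);
  printf("post-processed length: %zu\n", len);

  free(buf);
  return 0;

}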

@JRomainG
Contributor

JRomainG commented May 9, 2022

Thanks for the update.
We just tested the latest fix in the "dev" branch (commit 7c8246f), assuming the fix you mentioned was the one in commits 51942b6 and 7c8246f.

Here are the logs during calibration we get when running our PoC with this new version:

[D] DEBUG: calibration stage 1/8
[mutator] Called afl_custom_post_process with buffer of size 8
  0000  00 00 00 04 74 65 73 74                          ....test  <-- encoded value
[mutator] Generated post-process data of size 4
  0000  74 65 73 74                                      test      <-- decoded value
[target] Reading file at /dev/shm/.cur_input
[target] Read 4 bytes from file
  0000  74 65 73 74                                      test

[D] DEBUG: calibration stage 2/8
[mutator] Called afl_custom_post_process with buffer of size 8
  0000  74 65 73 74 05 00 00 00                          test....  <-- decoded value with garbage at the end
[mutator] Buffer size doesn't match encoded size: expected 1952805752 but got 8
[mutator] According to the encoding, the buffer looks like:
  0000  74 65 73 74 05 00 00 00 00 00 00 00 00 00 00 00  test............
[...]

It seems like this "inverted" the problem: previously, the protobuf buffer was passed to post_process with the plaintext buffer's size, whereas now the plaintext buffer is passed with the protobuf buffer's size.
Putting the size issue aside, is it desired behavior to sometimes pass protobuf data and sometimes plaintext data to post_process?

@vanhauser-thc
Member

looks good to me, and you ran into the issue I mentioned before: you are using post process for something you should not :) You can get around it by detecting whether the data is protobuf: if it is, decode it; if it is not, return it untouched.

@agantet
Author

agantet commented May 17, 2022

OK, thanks for clarifying. It was not straightforward to work this out from the official documentation on custom post_process implementations, but this approach should still work.
Hoping that future custom mutator implementations won't encounter the same issue as ours :)

@vanhauser-thc
Member

yeah, documentation could be better on this, I agree ...

@AFLplusplus AFLplusplus locked and limited conversation to collaborators May 17, 2022
@vanhauser-thc vanhauser-thc converted this issue into discussion #1416 May 17, 2022
