This issue was moved to a discussion.
Wrong input size with custom mutator implementing custom_post_process
#1397
Comments
d5b9cd4 not only added a feature, it also provided a fix for postprocess. If a post process module changed the size of the mutated data, this was not picked up, and hence a queue input with the wrong length was written. The issue in your case is actually this:
which is not true :) it is only called once per fuzz. What is really happening is that during calibration an input is taken and replayed X times, to see how variable the coverage and timing results are. There is nothing wrong with afl-fuzz IMHO, it is rather your setup that does not fit the structure we provide. I see the following options:
|
Thanks for the quick feedback and explanations.
1. move all code from post process to the mutate function.
Our understanding is that this would only work until a test case generated by the mutator is reused as an input to the mutator.
If the test case is later reused as part of the corpus, then we would have:
This was implemented in the no_afl_custom_post_process branch of the repo containing the original PoC. An alternative would be to decode the protobuf data in the target and completely remove the
We could probably also go the other way and have the corpus in "plaintext", convert it to protobuf at the start of
2. have a signature in the mutate generated data, and only perform the transformation in postprocess if that signature is there
Still using the above example, the issue here would be that, in the call to
Using the logs from the repo mentioned in the original issue, the call to
The output data is then passed to the target, and the first stage of calibration then starts.
As you can see, the buffer contains encoded data, so it does have to be decoded, but the size of the buffer is wrong. The reasoning behind the fix in d5b9cd4 to fix the wrong length queue input does make sense. |
a buffer+len that returns from afl_custom_post_process has to be what is written into the queue if it reaches new coverage. If you have multiple post process functions active, of course they stack, and each receives the (potentially changed) input (and potentially changed length).
In your initial post you didn't say anything about protobuf. If I understand correctly, your queue items are protobuf serialized inputs that you then mutate further and keep as (or rewrite again into) protobuf, and then transform to the real input data to send to the target. Is that correct? I will wait for your answer/explanation here, because it makes a huge difference whether this is the case or not.
yeah doing that in the target is wrong :) |
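The stacking behaviour described in the comment above can be sketched with a toy model. This is not AFL++ code: the `stage_fn` type, the two stages, and `run_chain` are hypothetical names made up for illustration. The only point is that each stage receives the buffer and the length produced by the previous one, so both values have to travel together.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stage signature, loosely modelled on afl_custom_post_process. */
typedef size_t (*stage_fn)(unsigned char *buf, size_t len,
                           unsigned char **out_buf);

static unsigned char buf_a[64];
static unsigned char buf_b[64];

/* Toy stage A: drops the first byte, so the data shrinks by one. */
size_t drop_first_byte(unsigned char *buf, size_t len,
                       unsigned char **out_buf) {
  size_t n = len > 0 ? len - 1 : 0;
  if (n > sizeof(buf_a)) n = sizeof(buf_a);
  if (n > 0) memcpy(buf_a, buf + 1, n);
  *out_buf = buf_a;
  return n;
}

/* Toy stage B: appends a newline, so the data grows by one. */
size_t append_newline(unsigned char *buf, size_t len,
                      unsigned char **out_buf) {
  size_t n = len < sizeof(buf_b) - 1 ? len : sizeof(buf_b) - 1;
  memcpy(buf_b, buf, n);
  buf_b[n] = '\n';
  *out_buf = buf_b;
  return n + 1;
}

/* Run the stages in order; each one sees the previous stage's
   buffer AND the previous stage's length. */
size_t run_chain(const stage_fn *stages, size_t count, unsigned char *mem,
                 size_t len, unsigned char **final_buf) {
  assert(final_buf != NULL);
  unsigned char *cur = mem;
  size_t cur_len = len;
  for (size_t i = 0; i < count; i++) {
    unsigned char *next = NULL;
    cur_len = stages[i](cur, cur_len, &next);
    cur = next;
  }
  *final_buf = cur;
  return cur_len;
}
```

With the two toy stages chained in that order, a 3-byte input comes out as a different 3-byte buffer, which is why a caller that keeps the original pointer but adopts the new length would be looking at the wrong data.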
Sorry that the initial post wasn't clear enough on that aspect.
Yep, of course. Our confusion stems from the fact that we receive the changed length but not the changed input (though we don't expect to have the changed input since we only use a single post process function). |
OK great, now I have the feeling I understand your setup. "Normal" protobuf fuzzer implementations are not based on any valid input or previous input, but generate everything out of thin air on every single fuzz attempt. Hence it is perfectly fine that the queue entries are real data and not protobuf, because nothing is based on them. Also, in your setup the queue entries should be the real data the target processes, otherwise you would be unable to replay the data outside of afl-fuzz (you cannot use afl-showmap for example), and would always need a converter first. So how to still do what you want to do? Take a look here: https://github.com/AFLplusplus/Grammar-Mutator because it works similarly. This way the queue has real useful data, but you also have the protobuf encodings for your mutations, plus the overhead is minimal. WDYT? |
Thank you for your new suggestions, @JRomainG and I hadn't thought of that approach.
Perhaps this is a bit out of scope of this issue, but how would the mutator leverage the coverage-guiding feature of AFL++ in that case?
Indeed, currently we solved this by having an external tool which converts our protobuf data to plaintext (like what is done in the post process function of our mutator), so we can use this output outside of afl-fuzz.
This seems like an interesting idea, we'll try it out. |
It does not, which is my criticism of this approach in general.
different solution, different problems :) |
In parallel to testing the new approach you suggested, we tried to understand the reason why we had an issue in the first place. We believe it is due to a mixup between the size of the data written to the queue and the size of the input used to fuzz the program in
The
u8 __attribute__((hot))
common_fuzz_stuff(afl_state_t *afl, u8 *out_buf, u32 len) {
  // [...]
  len = write_to_testcase(afl, out_buf, len, 0);
  // [...]
}
The value of
afl->queued_discovered += save_if_interesting(afl, out_buf, len, fault);
The output of
if (unlikely(len < afl->min_length && !fix)) {
  len = afl->min_length;
} else if (unlikely(len > afl->max_length)) {
  len = afl->max_length;
}
/* boring uncustom. */
afl_fsrv_write_to_testcase(&afl->fsrv, mem, len);
The data written to the queue may only be truncated to fit the min and max bounds. And, in this case, the data in the queue is the same as the input of the target. However, when a custom mutator is used,
ssize_t new_size = len;
u8 * new_mem = mem;
u8 * new_buf = NULL;
LIST_FOREACH(&afl->custom_mutator_list, struct custom_mutator, {
  if (el->afl_custom_post_process) {
    new_size = el->afl_custom_post_process(el->data, new_mem, new_size, &new_buf);
    if (unlikely(!new_buf && new_size <= 0)) {
      FATAL("Custom_post_process failed (ret: %lu)", (long unsigned)new_size);
    }
    new_mem = new_buf;
  }
});
if (unlikely(new_size < afl->min_length && !fix)) {
  new_size = afl->min_length;
} else if (unlikely(new_size > afl->max_length)) {
  new_size = afl->max_length;
}
/* everything as planned. use the potentially new data. */
afl_fsrv_write_to_testcase(&afl->fsrv, new_mem, new_size);
len = new_size;
In this case,
If our understanding is correct, there is no need to do bound checking on
When using protobuf, the protobuf-encoded data written in the queue may be significantly larger than the plaintext value used as input, so limiting its length would probably end up truncating the data too much. Based on this, we believe the direct fix to this issue is simply to remove the
diff --git a/src/afl-fuzz-run.c b/src/afl-fuzz-run.c
index ffba3475..6da87c76 100644
--- a/src/afl-fuzz-run.c
+++ b/src/afl-fuzz-run.c
@@ -132,7 +132,6 @@ write_to_testcase(afl_state_t *afl, void *mem, u32 len, u32 fix) {
     /* everything as planned. use the potentially new data. */
     afl_fsrv_write_to_testcase(&afl->fsrv, new_mem, new_size);
-    len = new_size;
   } else {
Would that make sense? If so, we can open a PR with this change. |
funny, this is a bug I just fixed, because another user had a similar problem. but the fix is very different. the line you want to delete is important. what was wrong is that the mem parameter to write_to_testcase must be changeable, so u8**. fix is in dev. |
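To see why the maintainer describes the fix as making the mem parameter changeable (u8**) rather than dropping the length update, here is a toy model. It is a sketch under stated assumptions, not the actual AFL++ change: `toy_post_process`, `write_len_only`, and `write_len_and_mem` are hypothetical names, and the "post process" simply strips a made-up 4-byte header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static unsigned char processed[64];

/* Hypothetical post process: strips a 4-byte header, shrinking the data. */
size_t toy_post_process(unsigned char *buf, size_t len,
                        unsigned char **out_buf) {
  assert(out_buf != NULL);
  if (len < 4) {
    *out_buf = buf;
    return len;
  }
  size_t n = len - 4;
  if (n > sizeof(processed)) n = sizeof(processed);
  memcpy(processed, buf + 4, n);
  *out_buf = processed;
  return n;
}

/* Problematic shape: the caller learns the new length, but its pointer
   still refers to the old, unprocessed buffer. */
size_t write_len_only(unsigned char *mem, size_t len) {
  unsigned char *new_mem = NULL;
  size_t new_len = toy_post_process(mem, len, &new_mem);
  (void)new_mem; /* the target would receive new_mem here */
  return new_len;
}

/* Shape described in the comment above: pointer and length are updated
   together, so the caller's (buffer, length) pair stays consistent. */
size_t write_len_and_mem(unsigned char **mem, size_t len) {
  unsigned char *new_mem = NULL;
  size_t new_len = toy_post_process(*mem, len, &new_mem);
  *mem = new_mem; /* caller now sees the processed data */
  return new_len;
}
```

In the first variant, a caller that later saves `mem` with the returned length would store a truncated copy of the unprocessed data, which matches the symptom reported earlier in the thread. In the second variant the saved entry is the processed data at its real length.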
Thanks for the update. Here are the logs during calibration we get when running our PoC with this new version:
It seems like this "inverted" the problem: previously, the protobuf buffer was passed to |
looks good to me and you run into the issue I told you before: you are using post process for something you should not :) you can get around it by detecting if it is protobuf and if so decode, and if it is not, return untouched. |
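The workaround suggested above (decode only if the data is still encoded, otherwise return it untouched) could look roughly like the following sketch. The function signature follows the AFL++ custom mutator API as I understand it, so check it against the headers of your AFL++ version. Everything else is an assumption for illustration: the `PB01` marker, the idea that the mutator prepends it to encoded data, and `decode_payload`, which merely copies bytes where real code would run the protobuf-to-plaintext conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker the mutator is assumed to prepend to encoded data. */
#define MAGIC "PB01"
#define MAGIC_LEN 4

static unsigned char decoded[4096];

/* Hypothetical stand-in for the real protobuf-to-plaintext conversion:
   it only copies the payload that follows the marker. */
size_t decode_payload(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap) {
  size_t n = in_len < out_cap ? in_len : out_cap;
  memcpy(out, in, n);
  return n;
}

size_t afl_custom_post_process(void *data, unsigned char *buf,
                               size_t buf_size, unsigned char **out_buf) {
  (void)data;
  assert(out_buf != NULL);

  /* No marker: the data is already in target format (for example a queue
     entry being replayed during calibration), so hand it back unchanged. */
  if (buf_size < MAGIC_LEN || memcmp(buf, MAGIC, MAGIC_LEN) != 0) {
    *out_buf = buf;
    return buf_size;
  }

  /* Marker present: decode once and report the decoded length. */
  *out_buf = decoded;
  return decode_payload(buf + MAGIC_LEN, buf_size - MAGIC_LEN, decoded,
                        sizeof(decoded));
}
```

The property that matters is that running the function on its own output changes nothing, so it no longer matters whether afl-fuzz hands it freshly mutated data or an already processed queue entry.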
OK, thanks for clarifying. The need to do so was not obvious from the official documentation on implementing custom post_process, but it would still work. |
yeah, documentation could be better on this, I agree ... |
Bug description
When using AFL++ with a custom mutator implementing the custom_post_process function, the wrong value for buf_size is sometimes passed.
We observed that custom_post_process is sometimes called twice whereas custom_fuzz is called only once. In the second call to custom_post_process, the value returned by the first call is used in the buf_size argument.
However, this can cause an issue if the custom_post_process changes the size of the buffer (e.g. when working with a protobuf mutator).
To Reproduce
This can be reproduced using the PoC at https://github.com/JRomainG/AFLplusplusCustomMutatorPoC.
Screen output
Output can be found at the end of the README of https://github.com/JRomainG/AFLplusplusCustomMutatorPoC
Idea of fix
By investigating the AFL++ code, we found a fix which works for our case:
This seems to have been introduced by d5b9cd4. Is there perhaps a misunderstanding on our part on how to use this feature?