New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fuzz: improve coverage #797
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @catenacyber,
I'll just do a braindump for a quicker reply:
- I get and like the general idea 👍
- 🔴 clang-format CI doesn't like the formatting. Script
apply-clang-format.sh
can help. - 🔴 This is now creating four parsers instead of one but
still only freeing a single one. It likely needs three more calls tobut it is freeing five times. Let's no free insideXML_parserFree
ParseOneInput
but inLLVMFuzzerTestOneInput
and do it matching four times, please. - It is unclear to me how this approach has more chances of reaching
doIgnoreSection
than status quo. Has that really changed or is it luck in fuzzing entropy? - The parser variable names are not ideal.
- Pull request Fuzzer #617 is related, needs work, haven't gotten to taking it over, yet.
- Issue Suggest fuzzer for XML_Parse() and XML_GetCurrentLineNumber(). #660 is related, unclear.
- (We have similar code in
fuzz/xml_parsebuffer_fuzzer.c
that could benefinit from the same approach, but let's only duplicate things once we have one side right, to not waste your time.) - (I'll need to re-check if CI even builds (not runs) the fuzzing code as of now. If not, that's something I should make the CI do.)
Best, Sebastian
Thanks for the quick feedback Sebastian
Cool :-)
Trying
Ok for moving
I do not know this
Renaming
This looks a good idea to test API completeness. You can check https://github.com/catenacyber/ngolo-fuzzing that does a similar thing against golang.
The only thing I see there is calling
Ok, let me know if I need to change there as well...
Should I add CIFuzz ? cf https://google.github.io/oss-fuzz/getting-started/continuous-integration/ as easy as adding a cifuzz.yml file in .github repo like https://github.com/OISF/libhtp/pull/374/files (just changing to use |
I pushed additional commits to fix each comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- clang-format CI doesn't like the formatting. Script
apply-clang-format.sh
can help.Trying
git clang-format HEAD~
@catenacyber that's an interesting pointer, thanks!
- This is now creating four parsers instead of one but
still only freeing a single one. It likely needs three more calls tobut it is freeing five times. Let's no free insideXML_parserFree
ParseOneInput
but inLLVMFuzzerTestOneInput
and do it matching four times, please.Ok for moving
XML_parserFree
calls toLLVMFuzzerTestOneInput
Thanks!
- The parser variable names are not ideal.
Renaming
pparent
toentParser
as used in tests code, andpparent2
toext_parser
Any other suggestions for naming ?
The the name is still quite inconsistent, e.g. one uses snake case and the other camel case. I'd be good with weither of these two…
parentParser
,namespaceParser
,externalEntityParser
,externalDtdParser
p1
,p2
,p3
,p4
…or some consistent variation of the former. Would that work for you?
- Pull request Fuzzer #617 is related, needs work, haven't gotten to taking it over, yet.
This looks a good idea to test API completeness. You can check https://github.com/catenacyber/ngolo-fuzzing that does a similar thing against golang. And also good for testing the callbacks... For
fail_allocations
, you can try https://github.com/catenacyber/nallocfuzz ;-) But I would say it is beyond the scope of this PR. Shame it is stale...
I have plans to finish it myself but the build system part needs a re-write and there are more pressing priorities before that.
Thanks for the two links.
- Issue Suggest fuzzer for XML_Parse() and XML_GetCurrentLineNumber(). #660 is related, unclear.
The only thing I see there is calling
XML_GetCurrentLineNumber
andXML_ErrorString
I do not think there are many bugs to find in these...
XML_GetCurrentLineNumber
does more than one would expect: issue #317 is an example of a security bug in it from the past. XML_ErrorString
is trivial indeed.
- (I'll need to re-check if CI even builds (not runs) the fuzzing code as of now. If not, that's something I should make the CI do.)
Should I add CIFuzz ? cf https://google.github.io/oss-fuzz/getting-started/continuous-integration/ as easy as adding a cifuzz.yml file in .github repo like https://github.com/OISF/libhtp/pull/374/files (just changing to use
expat
oss-fuzz project)
No, CIFuzz is quite broken, my decision against use of CIFuzz for Expat is documented at #677 (comment) . CIFuzz does not take this seriously, as can be seen at google/oss-fuzz#10504 . That's the "printable version" of how I feel about it.
Good to learn through https://github.com/libexpat/libexpat/actions/runs/7458547770/job/20306703388?pr=797#step:6:386 that the CI already does cover building that code.
I pushed additional commits to fix each comment. Let me know if I should squash them all together
Squashing would be welcome here, yes please 🙏
expat/fuzz/xml_parse_fuzzer.c
Outdated
XML_Parse(p, (const XML_Char *)data, size, 1); | ||
if (XML_Parse(p, (const XML_Char *)data, size, 1) == XML_STATUS_ERROR) { | ||
XML_ErrorString(XML_GetErrorCode(p)); | ||
XML_GetCurrentLineNumber(parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parser
would need to be p
here to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: Also, we should be able to call XML_GetCurrentLineNumber(p);
in the non-error case also. Let's not limit it to the error case then, for more coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, fixing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least, it showed CI tests the fuzzers build ;-)
expat/fuzz/xml_parse_fuzzer.c
Outdated
XML_ParserFree(p); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have the empty line before that special (parent parser) free, not after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Looks good
Leaving to you then, and not in this PR
Oh...
So you look like you know CIFuzz better than me...
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@catenacyber thanks for the adjustments!
We can merge as-is (assuming the CI will be all-green soon) OR first get xml_parsebuffer_fuzzer.c
closer together, as you like. I guess the ideall diff between the two would be nothing but one calling XML_ParseBuffer
and the other XML_ParseBuffer
?
For the current diff:
# diff -u fuzz/*.c
--- fuzz/xml_parsebuffer_fuzzer.c 2022-02-21 15:46:39.960516443 +0100
+++ fuzz/xml_parse_fuzzer.c 2022-02-21 15:46:39.960516443 +0100
@@ -16,7 +16,6 @@
#include <assert.h>
#include <stdint.h>
-#include <string.h>
#include "expat.h"
#include "siphash.h"
@@ -50,22 +49,16 @@
int
LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
- if (size == 0)
- return 0;
-
XML_Parser p = XML_ParserCreate(xstr(ENCODING_FOR_FUZZING));
assert(p);
- XML_SetElementHandler(p, start, end);
// Set the hash salt using siphash to generate a deterministic hash.
struct sipkey *key = sip_keyof(hash_key);
XML_SetHashSalt(p, (unsigned long)siphash24(data, size, key));
- void *buf = XML_GetBuffer(p, size);
- assert(buf);
-
- memcpy(buf, data, size);
- XML_ParseBuffer(p, size, size == 0);
+ XML_SetElementHandler(p, start, end);
+ XML_Parse(p, (const XML_Char *)data, size, 0);
+ XML_Parse(p, (const XML_Char *)data, size, 1);
XML_ParserFree(p);
return 0;
}
What do you think?
Pushed the changes to the parse buffer version... |
expat/fuzz/xml_parsebuffer_fuzzer.c
Outdated
XML_ParseBuffer(p, size, 0); | ||
if (XML_ParseBuffer(p, size, 1) == XML_STATUS_ERROR) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In between these two calls to XML_ParseBuffer
we would need to add something like:
buf = XML_GetBuffer(p, size);
assert(buf);
memcpy(buf, data, size);
Because the second call to XML_ParseBuffer
starts off after how far the first call got.
Also: Can we move the XML_SetElementHandler
call right before the first XML_ParseBuffer
to reduce the cross-file diff?
PS: For the cross-file diff, I did:
cd "$(mktemp -d)"
git clone --depth 1 --branch fuzzcov https://github.com/catenacyber/libexpat
diff -u libexpat/expat/fuzz/*.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and pushed :-)
@catenacyber thanks! 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and pushed :-)
@catenacyber thank you! 👍
I'm expecting CI to soon be all-green. In that case I will merge.
Using other parsers comes from https://introspector.oss-fuzz.com/project-profile?project=expat
Another way to achieve so would be to create multiple targets based on the parser
No bugs found in a few minutes from oss-fuzz public corpus, but coverage grows and reaches functions like
doIgnoreSection