-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add fuzzing #1194
add fuzzing #1194
Conversation
there is also an exhaustive test for making sure 16 and 32 bit integers roundtrip.
often, a minute of fuzzing is sufficient to uncover problems so it is beneficial to run this as a pre merge job. longer fuzzing sessions can be run manually or through other ci jobs.
This is awesome! I was actually not aware of libFuzzer. Thanks for all the work you put into this! I'll merge in the latest code and get this merged with main. |
Thanks, happy to help with this awesome library! The quickfuzz job fails because it finds a problem - there seems to be one more problem with the generic json parsing. Do you want me to write a reproducer? The troublesome input is available as an artifact. |
@pauldreik, I'm trying to make sense of the artifact: Is this the input string that caused the error? From the artifact:
|
@pauldreik, I'm struggling a bit with replicating the issue with |
yes, that is the data.
ran the fuzzer locally:
|
And on a different machine where I have a working clang, I did this:
to get a smaller input, I ran this: and that also reproduces, but with a smaller input (using the file "cleaned_crash" created by the previous step)
Here is the minimized input:
|
this replicates the issue in a test: "invalid json_t read3"_test = [] {
glz::json_t json{};
auto blah = std::vector<char>{0x22, 0x5c, 0x75,char(0xff), 0x22, 0x00};
expect(glz::read_json(json, blah));
}; I get:
|
Thanks, I'll look into this. |
@pauldreik, I found the issue (another missing |
This adds fuzzing for selected parts of the json functionality.
Motivation
Security and untrusted input
Json parsing is often used on external and/or untrusted input. It is therefore important to make sure there are bugs such as out of bounds reads/writes, signed integer overflow or other kinds of UB that can be triggered.
For this purpose, fuzzing is an excellent tool.
While developing this, a number of bugs have been revealed and promptly fixed.
Correctness
This PR also adds roundtrip fuzzers, making sure data can be serialized and deserialized without change.
While developing this several problems were found:
What this PR adds
Json parser fuzzers
These are intended to cover the supposedly most common use cases and were
based on the examples in the README.
Roundtrip fuzzers
These are for checking conversions do roundtrip.
Exhaustive tests
For 32 bit types it is feasible to test all possible values. Such a test for conversion of integers has been added.
In a future extension, it would be good to also do this for 32 bit floating point.
CI jobs
Short-fuzz
A CI job is added that runs all the fuzzers for a short while. This is often sufficient to cover shallow bugs. Most of the problems uncovered while working with this took less than ten seconds to find. The job does not store the corpus between runs, that could be improved later if desired (using github action cache).
The output is only saved in case a fuzzer fails.
Exhaustive
This CI job compiles without sanitizers and with full optimization, so it goes quick enough to run in CI. It uses all cores.
Preventing code rot
The fuzzers have been arranged so they compile also on compilers not supporting libFuzzer. For those, a separate main function is provided. The fuzzers build by default, meaning api changes in glaze that break the fuzzers will not go undetected.
What comes next
This could be improved in a number of ways, it is intended to be a small start in the spirit of having something usable now rather than perfect never. But here are possible improvements: