[Backport][2.11] Backport patches in test/fuzz #8880
ligurio commented Jul 14, 2023 (edited)
- grammar SQL fuzzer
- grammar Lua fuzzer
- fixes-shmixes for fuzzing tests
- fix for buffer overflow in tnt_strptime #8502
Force-pushed from 0e23b9d to a51edf3
Force-pushed from 8580c82 to 5bd38e3
Force-pushed from 5bd38e3 to c5c1c42
@ligurio, thanks for the patchset! I would like to note that cherry-picked patches must contain the original hash in the commit message (i.e. use |
Added options for fuzzing and for getting more information on debugging. NO_CHANGELOG=<fuzzing options> NO_DOC=<fuzzing options> NO_TEST=<fuzzing options> (cherry picked from commit 69f21e2)
Added Google's 'libprotobuf-mutator' and 'protobuf' libraries for developing grammar-based LuaJIT and SQL fuzzers based on LibFuzzer. The protobuf module must be built from source because libprotobuf-mutator uses the system-installed version of protobuf by default, and this version can be too old.

Part of tarantool#4823

NO_CHANGELOG=<dependencies>
NO_DOC=<dependencies>
NO_TEST=<dependencies>

(cherry picked from commit b11072a)
Patch adds a LuaJIT fuzzer based on libprotobuf-mutator and LibFuzzer. The grammar is described via messages in protobuf format, and a serializer is applied to convert the .proto format to a string. To display the generated code on the screen during fuzzing, set the environment variable 'LPM_DUMP_NATIVE_INPUT'. To display error messages from Lua functions, set the environment variable 'LUA_FUZZER_VERBOSE'.

Note: UndefinedBehaviourSanitizer is unsupported by LuaJIT (see tarantool#8473), so the fuzzing test is disabled when the CMake option ENABLE_UB_SANITIZER is passed.

Closes tarantool#4823

NO_DOC=<fuzzing testing of LuaJIT>
NO_TEST=<fuzzing testing of LuaJIT>

(cherry picked from commit a287c85)
Fixes tarantool#8502 Needed for tarantool#8490 NO_DOC=bugfix NO_TEST=covered by fuzzing test (cherry picked from commit 783a704)
Function `datetime_strptime` decodes a string with a datetime according to the specified format. It accepts a datetime struct, a buffer with the datetime and a format string as arguments. The fuzzing test used the static string "iso8601" as a format, which prevented the test from covering the functions that datetime_strptime uses under the hood. Fuzz introspector [1] shows that the code coverage achieved by the test is quite low.

The patch updates the test to make it more effective: the buffer with the datetime and the format string are generated using FDP (Fuzzing Data Provider). The test file extension was changed to .cc, because FuzzingDataProvider is used and it has to be built by a C++ compiler.

Function `tnt_strptime` used an assert that was triggered by fuzzing tests, so the assert was replaced with if..then.

1. https://storage.googleapis.com/oss-fuzz-introspector/tarantool/

Fixes tarantool#8490

NO_CHANGELOG=fuzzing test
NO_DOC=fuzzing test
NO_TEST=fuzzing test

(cherry picked from commit a1bd6e0)
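To illustrate why generating both the format and the buffer from fuzz input widens coverage, here is a minimal sketch of splitting one byte array into two strings. It is a simplified, hand-rolled stand-in and not the algorithm of LLVM's FuzzedDataProvider; the function name `split_fuzz_input` and the "first byte selects the format length" rule are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper (not part of the patch): derive a format string and
// a datetime buffer from a single fuzz input.
// The first byte selects how many of the remaining bytes become the format;
// whatever is left becomes the buffer to parse.
std::pair<std::string, std::string>
split_fuzz_input(const uint8_t *data, size_t size)
{
	if (size == 0)
		return {std::string(), std::string()};

	const size_t rest = size - 1;
	// Clamp the requested format length to the bytes actually available.
	const size_t fmt_len = std::min<size_t>(data[0], rest);
	const char *payload = reinterpret_cast<const char *>(data + 1);

	std::string format(payload, fmt_len);
	std::string buffer(payload + fmt_len, rest - fmt_len);
	return {format, buffer};
}
```

With a split like this, the fuzzer can mutate the format and the parsed text independently, instead of always exercising the single "iso8601" path.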
Follows-up tarantool#4823 NO_CHANGELOG=internal NO_DOC=internal NO_TEST=internal (cherry picked from commit 95d62cf)
Cases in two switches had no breaks, so they were falling through. Breaks were added to solve the problem. Code generated by the LuaJIT fuzzer became more varied. NO_CHANGELOG=internal NO_DOC=fuzzer fix (cherry picked from commit 4430cac)
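The effect of the missing breaks can be shown with a small sketch. The enum and function names below are hypothetical and are not taken from the fuzzer sources; the point is only that without `break` every case ends up producing the result of the last one, which reduces the variety of generated code.

```cpp
#include <string>

// Hypothetical operator kinds, for illustration only.
enum class BinaryOp { Add, Sub, Mul };

// Buggy variant: no `break`, so control falls through to the last case
// and every operator is serialized as "*".
std::string op_to_string_fallthrough(BinaryOp op)
{
	std::string result;
	switch (op) {
	case BinaryOp::Add:
		result = "+";
	case BinaryOp::Sub:
		result = "-";
	case BinaryOp::Mul:
		result = "*";
	}
	return result;
}

// Fixed variant: each case ends with `break`, so each operator keeps
// its own serialization.
std::string op_to_string_fixed(BinaryOp op)
{
	std::string result;
	switch (op) {
	case BinaryOp::Add:
		result = "+";
		break;
	case BinaryOp::Sub:
		result = "-";
		break;
	case BinaryOp::Mul:
		result = "*";
		break;
	}
	return result;
}
```

Compilers can flag the buggy variant when implicit fallthrough warnings are enabled, for example with `-Wimplicit-fallthrough` in GCC and Clang.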
LuaJIT fuzzer used to stop due to timeout caused by infinite cycles and recursions. Counters were introduced for every cycle and function to address LuaJIT fuzzer timeouts. The idea is to add unique counters for every cycle and function to ensure finite code execution, if it wasn't already. For while, repeat, for cycles, local and global named, anonymous functions, counters will be initialized before the code generated from protobuf, and checked in the first body statement. An entry point for the serializer was created to count cycles and functions for counter initialization.

The idea was taken from a paper "Program Reconditioning: Avoiding Undefined Behaviour When Finding and Reducing Compiler Bugs" [1].

Here is an example of a change in serialized code made by this commit.

Before:

```lua
while (true) do
    foo = 'bar'
end
function bar()
    bar()
end
```

After:

```lua
counter_0 = 0
counter_1 = 0
while (true) do
    if counter_0 > 5 then
        break
    end
    counter_0 = counter_0 + 1
    foo = 'bar'
end
function bar()
    if counter_1 > 5 then
        return
    end
    counter_1 = counter_1 + 1
    bar()
end
```

Protobuf structures that reproduce the timeout problem were added to the LuaJIT fuzzer corpus.

[1] https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2023/PLDI.pdf

NO_CHANGELOG=internal
NO_DOC=fuzzer fix

(cherry picked from commit 4d004bb)
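The two-step scheme described in this commit message (initialize all counters first, then check a counter in the first statement of each body) can be sketched on the serializer side as follows. This is a simplified illustration under stated assumptions: `WhileLoop`, `counter_name`, `serialize_loops` and `kMaxCounterValue` are hypothetical names, a plain struct stands in for the protobuf messages, and only `while` loops are handled. The limit of 5 is taken from the example in the commit message.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a protobuf message describing a loop.
struct WhileLoop {
	std::string body;
};

// Limit used in the commit message example.
constexpr int kMaxCounterValue = 5;

std::string counter_name(std::size_t index)
{
	return "counter_" + std::to_string(index);
}

// Emit Lua code in which every loop is guarded by its own counter.
std::string serialize_loops(const std::vector<WhileLoop> &loops)
{
	std::string out;
	// Step 1: initialize one counter per loop before any generated code.
	for (std::size_t i = 0; i < loops.size(); ++i)
		out += counter_name(i) + " = 0\n";
	// Step 2: put the counter check first in every loop body.
	for (std::size_t i = 0; i < loops.size(); ++i) {
		const std::string counter = counter_name(i);
		out += "while (true) do\n";
		out += "if " + counter + " > " +
		       std::to_string(kMaxCounterValue) + " then break end\n";
		out += counter + " = " + counter + " + 1\n";
		out += loops[i].body + "\n";
		out += "end\n";
	}
	return out;
}
```

Calling `serialize_loops({{"foo = 'bar'"}})` yields a loop guarded by `counter_0`. Functions would be handled the same way, with `return` in place of `break`.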
This refactoring will:

1. Move macros from a header to the source file. Macros should be used in a header only together with undef to avoid redefinitions, and an undef directive is not useful here since we want to use these macros in the source file.
2. Remove `using namespace lua_grammar` from the header. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rs-using-directive
3. Move the serializer entry point and constant parameters into the luajit_fuzzer namespace. It's a common practice in C++ to avoid name collisions.
4. Move serializer functions into an anonymous namespace. These functions are not a part of the interface, so they should have static linkage. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rs-unnamed2
5. Fix the ConvertToStringDefault function. It was logically wrong, so it would generate an identifier `123` from `*123`.

NO_CHANGELOG=internal
NO_DOC=fuzzer fix

(cherry picked from commit 56488e1)
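Items 3 to 5 can be pictured with a short sketch: a public entry point in the `luajit_fuzzer` namespace, a helper with internal linkage in an anonymous namespace, and an identifier cleanup that does not return `123` for `*123`. This is not the actual ConvertToStringDefault implementation, whose fixed behavior is not shown in the commit message. The names `sanitize_identifier`, `is_identifier_char` and `kDefaultIdentifier`, and the fallback rule itself, are assumptions for illustration.

```cpp
#include <cctype>
#include <string>

namespace luajit_fuzzer {
namespace {

// Assumed fallback name; the real default is not given in the source.
const std::string kDefaultIdentifier = "Name";

// Internal helper: not visible outside this translation unit.
bool is_identifier_char(char c)
{
	return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

} /* namespace */

// Hypothetical entry point: strip characters that cannot appear in a Lua
// identifier, and fall back to a default when the result is empty or
// starts with a digit (as "123" derived from "*123" would).
std::string sanitize_identifier(const std::string &input)
{
	std::string result;
	for (char c : input) {
		if (is_identifier_char(c))
			result += c;
	}
	if (result.empty() ||
	    std::isdigit(static_cast<unsigned char>(result[0])))
		return kDefaultIdentifier;
	return result;
}

} /* namespace luajit_fuzzer */
```

Callers outside the file would use `luajit_fuzzer::sanitize_identifier("*123")`, while `is_identifier_char` stays private to the source file. Lua keywords are not handled in this sketch.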
Force-pushed from c5c1c42 to 3d4f818
@ligurio, thanks for the fixes! References for the three top-most commits are invalid:
test/fuzz: add breaks to switch-case
test/fuzz: fix luaJIT fuzzer timeout
test/fuzz: refactor LuaJIT fuzzer
You used commit hashes from the PR, but the patches were cherry-picked, so the hashes differ. LGTM otherwise.
UPD: this is another GitHub problem: the master branch of ligurio/tarantool had not been updated for a long time, so the lookup of new refs failed for the aforementioned commits. Everything became fine once @ligurio updated his fork.