Multiple changes; python 2 support, dictionary, mutator refactor #26
-
Updates for running on Python 2.
gerph committed Jan 6, 2020 The tool is now able to run on Python 2 with a suitable LRU cache. Python 2 isn't useful to many people any more, but when working with libraries that are Python 2-only, it's necessary to make the tools work with it.
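A minimal sketch of how an LRU cache can be made available on both Python versions. The `functools32` backport named here is an assumption for illustration; the commit may rely on a different package, and `expensive` is a hypothetical stand-in function.

```python
try:
    # Python 3.2+ ships lru_cache in the standard library.
    from functools import lru_cache
except ImportError:
    # Python 2 fallback. 'functools32' is an assumed backport package;
    # the PR may use a different one.
    from functools32 import lru_cache


@lru_cache(maxsize=128)
def expensive(n):
    """Hypothetical stand-in for a computation worth caching."""
    return n * n
```

The rest of the code can then use `lru_cache` without caring which interpreter it is running on.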
-
Ensure that the 'NEW' state reports the coverage.
gerph committed Jan 6, 2020 Because the 'NEW' state was being logged before the coverage count was updated, the log line would not include the new coverage count. This change ensures that the log line reports the new coverage count.
-
Rework the mutators into separate classes.
gerph committed Jan 6, 2020 The mutators were supplied inline with the corpus mutation loop. This made the loop tedious to extend, and made it difficult to filter out mutations which are not interesting. The code has been reworked here so that...

* Each mutator is its own class.
* Each class can provide information about what it does, such as its name and the types of mutations it performs.
* Each class is registered into a list of classes that are available.
* The Corpus instantiates these classes when it is initialised, and could (but does not at this time) filter the list as necessary. The name isn't even used yet.
* Mutators can return None to say that they're not appropriate.

This means that adding a new mutator is a matter of creating a new class, in the same style as the existing ones, and giving information on what the mutator does.

Mutators could be based on one another - so for example the 'swap' mutator could be reworked to exchange variable lengths of values, rather than only bytes, and then subclassed to produce short, long and longlong variants. This has not been done here.

Previously, the code attempted to retry applying mutators if they were not deemed appropriate; this was ineffective because it merely tried to decrement the iteration count, which did not affect the iterations at all. It looks like the code was originally written with C-style for loops in mind, where the variable controls the termination, whilst in Python the range controls the iteration of the loop. This has been replaced by the mutator returning None to signal that it is inappropriate, and a loop in the caller that repeats the selection of a new mutator.
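The structure described in this commit could look roughly like the sketch below. All class names, attribute names and the `MUTATOR_CLASSES` list are hypothetical and chosen for illustration; they are not taken from the PR's code. The retry loop is unbounded here, as it was at this point in the series, and only terminates because `InsertByte` is unconditional.

```python
import random


class Mutator(object):
    """Base class: subclasses describe themselves and implement mutate()."""
    name = None
    types = set()

    def mutate(self, data):
        """Return a new bytearray, or None if this mutator is inappropriate."""
        raise NotImplementedError


class InsertByte(Mutator):
    name = 'Insert a random byte'
    types = set(['byte', 'insert'])

    def mutate(self, data):
        # Unconditional: always able to produce a result.
        pos = random.randrange(len(data) + 1)
        return data[:pos] + bytearray([random.randrange(256)]) + data[pos:]


class RemoveByte(Mutator):
    name = 'Remove a byte'
    types = set(['byte', 'remove'])

    def mutate(self, data):
        if len(data) < 2:
            return None  # not appropriate for very short inputs
        pos = random.randrange(len(data))
        return data[:pos] + data[pos + 1:]


# Registry of available mutator classes (hypothetical name).
MUTATOR_CLASSES = [InsertByte, RemoveByte]


class Corpus(object):
    def __init__(self):
        # Instantiate every registered class; a filter could be applied here.
        self.mutators = [cls() for cls in MUTATOR_CLASSES]

    def mutate(self, data):
        # Keep selecting until a mutator accepts the input.
        while True:
            result = random.choice(self.mutators).mutate(data)
            if result is not None:
                return result
```

Adding a mutator in this style means writing one more subclass and appending it to the registry; the corpus loop itself does not change.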
-
Add help message describing the mutators available and filtering.
gerph committed Jan 7, 2020 Mutators can now be listed as part of a help command, and may then be filtered by the user supplying a filter specification to disable or enable only certain mutators.
-
Replaced infinite mutation loop with bounded loop.
gerph committed Jan 7, 2020 Previously the retries were always likely to terminate when a mutator said that it was unable to be used, because we had a number of mutators that were unconditional. However, now that the mutators are able to be filtered, it is possible to select a set of mutators which may always claim they are inappropriate. In such a case, we would loop forever. This change bounds the retries on looking for a mutator to 20 attempts - an arbitrary number I picked from the air as seeming reasonable.
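A sketch of the bounded retry, assuming mutators are plain callables that return None when they are inappropriate. The function name `pick_and_mutate` and the choice to return None after the final attempt are assumptions; the PR does not say here what happens when all 20 attempts fail.

```python
import random

MAX_ATTEMPTS = 20  # the arbitrary bound mentioned in the commit message


def pick_and_mutate(mutators, data, max_attempts=MAX_ATTEMPTS):
    """Try randomly chosen mutators until one accepts the input.

    Returns the mutated data, or None if every attempt was declined
    (returning None in that case is an assumption of this sketch).
    """
    # The range object controls the number of iterations; changing the
    # loop variable inside the body would not add or remove attempts.
    for _ in range(max_attempts):
        result = random.choice(mutators)(data)
        if result is not None:
            return result
    return None
```

With a filtered set in which every mutator declines, the call now returns after 20 attempts instead of spinning forever.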
-
Add support for dictionary files
This commit adds support for dictionaries (https://llvm.org/docs/LibFuzzer.html#dictionaries), to help fuzzers increase their coverage faster. There seems to be a bug in the _copy function: the word is correctly inserted, but the padding after it is wrong, and I couldn't understand why. To be honest, I didn't spend much time on it, since I'd like to have feedback on this PR before investing more debugging time. The implementation is pretty crude: it silently ignores invalid lines in the dictionary file, and is likely using words in the corpus a bit too often.
-
Add support for escaped strings to the dictionary.
gerph committed Jan 7, 2020 The dictionary reader can now handle escaped strings in its values, as given in the AFL examples.
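For illustration, a minimal Python 3 sketch of parsing one dictionary line in the libFuzzer/AFL style, where a value is a quoted string that may contain `\\`, `\"` and `\xAB` escapes. The function name `parse_line` and the lenient handling of unknown escapes are assumptions; this is not the reader from the PR.

```python
import re

# Optional 'name=' prefix followed by a double-quoted value.
_LINE = re.compile(r'^\s*(?:[A-Za-z0-9_.@-]+\s*=\s*)?"(.*)"\s*$')
# A backslash followed by either xHH or any single character.
_ESCAPE = re.compile(r'\\(x[0-9A-Fa-f]{2}|.)', re.DOTALL)


def _unescape(match):
    esc = match.group(1)
    if len(esc) == 3 and esc[0] == 'x':
        return chr(int(esc[1:], 16))  # \xAB -> the byte value 0xAB
    return esc  # \\ -> backslash, \" -> quote, others kept leniently


def parse_line(line):
    """Return the token as a bytearray, or None for blank, comment
    or malformed lines (which are silently ignored)."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    match = _LINE.match(line)
    if match is None:
        return None
    text = _ESCAPE.sub(_unescape, match.group(1))
    # latin-1 maps code points 0-255 directly onto byte values.
    return bytearray(text.encode('latin-1'))
```

For example, the line `kw1="blah"` yields the four bytes of `blah`, while a comment line starting with `#` yields None.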
-
Add dictionary support for a binary load of directories of files.
gerph committed Jan 7, 2020 Raw binary files are supported by the AFL dictionaries, and matter most for the cases where we're dealing with binary chunks that would otherwise be tedious to insert into the token file.
-
Fix for dictionary mutator failing to return correct values.
gerph committed Jan 7, 2020 The dictionary mutator wasn't actually returning the correct values, so was always claiming to fail.
-
Add the operation type to the mutator types; add a dictionary append.
gerph committed Jan 7, 2020 Knowing which types of mutators are in use helps to favour certain kinds of operations. In particular, if you start with a small dictionary of words and wish to generate longer strings, using just an option that appends from the dictionary is useful. The dictionary append operation has been added, which allows the values from the dictionary to be strung together in increasingly longer sequences. It doesn't offer the option of appending multiple values from the dictionary, so the operation may end up in a local minimum, but such mutators could be added in the future.
-
Replace the timeout process kill with a terminate.
gerph committed Jan 7, 2020 The kill operation is never a good choice for stopping a subprocess - it does not give the subprocess any chance to clean up. It's more usual to try a terminate and later kill if the process did not stop. More importantly to me, the kill method isn't present in the Python 2 multiprocessing module.
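The "terminate first, kill later" pattern mentioned above can be sketched as follows. The helper name `stop_process` and the grace period are assumptions; the commit itself only replaces the kill with a terminate. The helper relies on `terminate()`, `join()` and `is_alive()`, which `multiprocessing.Process` provides on both Python 2 and 3, and only uses `kill()` when it exists (it was added in Python 3.7).

```python
def stop_process(proc, grace=1.0):
    """Ask a process to stop, escalating to kill() only if available.

    'proc' is expected to behave like multiprocessing.Process.
    Returns True if the process is no longer alive afterwards.
    """
    proc.terminate()       # polite request; exists on Python 2 and 3
    proc.join(grace)       # give it a moment to exit
    if proc.is_alive():
        kill = getattr(proc, 'kill', None)  # absent before Python 3.7
        if kill is not None:
            kill()
            proc.join(grace)
    return not proc.is_alive()
```

A caller would start a `multiprocessing.Process`, wait for its timeout, and then call `stop_process(p)` rather than invoking `p.kill()` directly.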
-
Fix for corpus crashing if no mutator_filters supplied.
gerph committed Jan 11, 2020 The mutator filters were being treated as a string, even when no filters had been supplied and the defaults were being requested.
-
Update the requirements in setup.py and requirements.txt.
gerph committed Jan 11, 2020 The setup.py and requirements.txt had been stripped to make them work on both Python 2 and 3, without regard for keeping explicit statements of what it works with. This has been updated to specify the environment checks, so that we only install the requirements if they're needed on a given Python version. Similarly, the python_requires has been updated to say "I'll take any 2.x version over 2.7, OR any 3.x version over 3.5.3" which better matches with the original requirements.
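As a rough illustration of the two mechanisms mentioned here, the fragment below shows a PEP 508 environment marker and a PEP 440 version specifier expressing "2.7 or later, or 3.5.3 or later". The dependency name `functools32`, the variable names, and the reading of "over" as "at or above" are assumptions; the actual contents of the PR's setup.py are not shown on this page.

```python
# Installed only on Python 2, thanks to the environment marker after ';'.
# 'functools32' is an assumed dependency name used for illustration.
INSTALL_REQUIRES = [
    'functools32; python_version < "3"',
]

# PEP 440 specifiers have no OR operator, so the 2.7-or-3.5.3+ rule is
# written as a lower bound plus exclusions of the unsupported 3.x releases.
PYTHON_REQUIRES = ', '.join([
    '>=2.7',
    '!=3.0.*', '!=3.1.*', '!=3.2.*', '!=3.3.*', '!=3.4.*',
    '!=3.5.0', '!=3.5.1', '!=3.5.2',
])

# These would then be passed to setuptools, for example:
#   setup(..., install_requires=INSTALL_REQUIRES,
#         python_requires=PYTHON_REQUIRES)
```

The same marker syntax works on a line in requirements.txt, so both files can state the condition explicitly.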
-
Fix for the dictionary text insert leaving NUL characters in string.
gerph committed Jan 11, 2020 The dictionary text insert wasn't working properly, despite looking just fine. I think the reason is that the parameters on 'copy' are transposed: the parameter called 'src' is actually the destination (and vice versa). This caused much confusion when trying to work out what was happening there. This change fixes the behaviour of the dictionary insert, but defers fixing the parameter names until it can be confirmed what was meant and that the strings are being manipulated correctly. The change also corrects a mistaken format character in the mutator __repr__ method.