Multiple changes; python 2 support, dictionary, mutator refactor #26

Merged
merged 15 commits into from Jan 19, 2020
Commits on Jan 7, 2020
  1. Updates for running on Python 2.

    gerph committed Jan 6, 2020
    The tool is now able to run on Python 2 with a suitable LRU Cache.
    Obviously Python 2 isn't as useful to many people, but when working
    with libraries that are Python 2-only, it's necessary to make the
    tools work with it.
  2. Ensure that the 'NEW' state reports the coverage.

    gerph committed Jan 6, 2020
    Because the 'NEW' state was being logged before the coverage count was
    updated, the log line would not include the new coverage count. This
    change ensures that the log line reports the new coverage count.
  3. Rework the mutators into separate classes.

    gerph committed Jan 6, 2020
    The mutators were supplied inline with the corpus mutation loop. This
    makes it tedious to extend, and difficult to filter out mutations
    which are not interesting.
    
    The code has been reworked here so that...
    
    * Each mutator is its own class.
    * Each class can provide information about what it does, such as its
      name and the types of mutations it performs.
    * Each class is registered into a list of classes that are available.
    * The Corpus instantiates these classes when it is initialised, and
      could (but does not at this time) filter the list as necessary.
      The name isn't even used yet.
    * Mutators can return None to say that they're not appropriate.
    
    This means that adding a new mutator is a matter of creating a new
    class, in the same style as the existing ones, and giving information
    on what the mutator does. Mutators could be based on one another -
    so for example the 'swap' mutator could be reworked to exchange
    variable lengths of values, rather than only bytes, and then subclassed
    to produce short, long and longlong variants. This has not been done
    here.
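
As an illustration of the structure listed above, here is a minimal sketch of self-describing mutator classes collected in a registry. The names `Mutator`, `MUTATOR_CLASSES`, `register`, `RemoveByte` and `DuplicateByte` are hypothetical and are not taken from the PR; only the ideas (one class per mutator, a name and a set of types, a list of available classes, `None` meaning "not appropriate") come from the commit message.

```python
import random


class Mutator(object):
    """Base class; subclasses describe themselves and implement mutate()."""
    name = None
    types = set()

    def mutate(self, buf):
        """Return a mutated copy of buf, or None if not applicable."""
        raise NotImplementedError


# Hypothetical registry: every mutator class that should be available.
MUTATOR_CLASSES = []


def register(cls):
    """Class decorator that adds a mutator class to the registry."""
    MUTATOR_CLASSES.append(cls)
    return cls


@register
class RemoveByte(Mutator):
    name = 'Remove a single byte'
    types = {'byte', 'remove'}

    def mutate(self, buf):
        if len(buf) < 2:
            return None  # nothing sensible to remove
        pos = random.randrange(len(buf))
        return buf[:pos] + buf[pos + 1:]


@register
class DuplicateByte(Mutator):
    name = 'Duplicate a single byte'
    types = {'byte', 'duplicate'}

    def mutate(self, buf):
        if not buf:
            return None  # an empty buffer has no byte to duplicate
        pos = random.randrange(len(buf))
        return buf[:pos + 1] + buf[pos:]


# The corpus would instantiate the registered classes once, at start-up.
mutators = [cls() for cls in MUTATOR_CLASSES]
```

With this shape, adding a mutator means writing one more decorated class; nothing in the mutation loop needs to change.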
    
    Previously, the code attempted to retry applying mutators if they
    were not deemed appropriate; this was ineffective because they merely
    tried to decrement the iteration count, which did not affect the
    iterations at all - it looks like the code was originally using
    C-style for loops where the variable controls the termination, whilst
    in Python the range controls the iteration of this loop. This has
    been replaced by the mutator returning None to signal that it is
    inappropriate, and a loop in the caller repeats the selection of
    a new mutator.
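
The loop behaviour described above can be reproduced in isolation. This sketch is not the original code; the function names are made up for the demonstration. It shows that decrementing the loop variable inside a Python `for` loop does not add a pass, whereas a counter-controlled `while` loop (the C-style shape) does repeat.

```python
def passes_with_decrement(n):
    """Count loop passes when the body tries to 'retry' by decrementing i."""
    passes = 0
    for i in range(n):
        passes += 1
        i -= 1  # no effect: range() supplies a fresh value on the next pass
    return passes


def passes_with_while(n, retry_on):
    """Counter-controlled loop, where not advancing i really does retry."""
    passes = 0
    i = 0
    retried = set()
    while i < n:
        passes += 1
        if i in retry_on and i not in retried:
            retried.add(i)
            continue  # repeat this index once; i is not advanced
        i += 1
    return passes
```

Calling `passes_with_decrement(5)` still performs 5 passes, while `passes_with_while(5, {2})` performs 6 because index 2 is repeated once.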
  4. Add help message describing the mutators available and filtering.

    gerph committed Jan 7, 2020
    Mutators can now be listed as part of a help command, and may then
    be filtered by the user supplying a filter specification to disable
    or enable only certain mutators.
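
The commit message does not say what the filter specification looks like, so the following is only a sketch under an invented syntax: a comma-separated list of type names, where a leading `!` disables mutators of that type and a bare name enables only mutators of that type. The function name `select_mutators` and the `(name, types)` pairs are likewise hypothetical.

```python
def select_mutators(available, spec=None):
    """Filter (name, types) pairs using a hypothetical spec string.

    spec is a comma-separated list of type names; a leading '!' excludes
    mutators of that type. None or '' keeps every mutator.
    """
    if not spec:
        return list(available)
    include = set()
    exclude = set()
    for term in spec.split(','):
        term = term.strip()
        if not term:
            continue
        if term.startswith('!'):
            exclude.add(term[1:])
        else:
            include.add(term)
    selected = []
    for name, types in available:
        if types & exclude:
            continue  # explicitly disabled
        if include and not (types & include):
            continue  # an 'enable only' list was given and this is not in it
        selected.append((name, types))
    return selected
```

Treating a missing specification the same as an empty one keeps the default case (all mutators enabled) on the same code path as the filtered case.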
  5. Replaced infinite mutation loop with bounded loop.

    gerph committed Jan 7, 2020
    Previously the retry loop was always likely to terminate, even when
    a mutator said that it was unable to be used, because a number of
    the mutators were unconditional. However, now that the mutators can
    be filtered, it is possible to select a set of mutators which may
    always claim they are inappropriate. In such a case, we would loop
    forever.
    
    This change bounds the retries on looking for a mutator to 20
    attempts - an arbitrary number I picked from the air as seeming
    reasonable.
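
A bounded selection loop of the kind described above might look like the sketch below. The bound of 20 is from the commit message; the function name `mutate_once`, the use of plain callables as mutators, and returning `None` when every attempt is declined are assumptions made for the example, since the commit message does not say what happens once the bound is reached.

```python
import random

MAX_ATTEMPTS = 20  # the arbitrary bound mentioned in the commit message


def mutate_once(mutators, buf, rng=random):
    """Pick mutators at random until one applies, giving up after a bound.

    Each mutator is assumed to be a callable returning new data or None.
    """
    for _ in range(MAX_ATTEMPTS):
        mutator = rng.choice(mutators)
        result = mutator(buf)
        if result is not None:
            return result
    return None  # every attempt was declined; the caller must handle this
```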
  6. Add support for dictionary files

    jvoisin authored and gerph committed Dec 16, 2019
    This commit adds support for dictionaries
    (https://llvm.org/docs/LibFuzzer.html#dictionaries), to help fuzzers increase
    their coverage faster.
    
    It seems that there is a bug in the _copy function: the word is
    correctly inserted, but the padding after it is wrong, and I
    couldn't understand why. To be honest, I didn't spend much time
    on it, since I'd like to have feedback on this PR before investing
    more debugging time.
    
    The implementation is pretty crude: it silently ignores
    invalid lines in the dictionary file, and is likely using
    words in the corpus a bit too often.
  7. Fix dictionary typo.

    gerph committed Jan 7, 2020
    A small typo in the word dictionary, fixed.
  8. Add support for escaped strings to the dictionary.

    gerph committed Jan 7, 2020
    The dictionary reader can now handle escaped strings in its
    values, as given in the AFL examples.
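
For readers unfamiliar with the format, AFL-style dictionary lines look like `name="value"` (the name is optional), with `\xNN`, `\\` and `\"` escapes inside the quotes and `#` starting a comment line. The parser below is a sketch of that format and is not the reader from this PR; the function name `parse_dictionary_line` and both regular expressions are assumptions.

```python
import re

# An escape is a backslash followed by xNN, another backslash, or a quote.
_ESCAPE = re.compile(br'\\(x[0-9a-fA-F]{2}|\\|")')
# Optional 'name=' (or 'name@level=') prefix, then a double-quoted value.
_LINE = re.compile(br'^(?:[A-Za-z0-9_.\-]+(?:@\d+)?\s*=\s*)?"(.*)"$')


def _unescape(match):
    token = match.group(1)
    if token[:1] == b'x':
        return bytes(bytearray([int(token[1:], 16)]))
    return token  # an escaped backslash or double quote


def parse_dictionary_line(line):
    """Return the token bytes for one dictionary line, or None to skip it."""
    stripped = line.strip()
    if not stripped or stripped.startswith(b'#'):
        return None  # blank line or comment
    match = _LINE.match(stripped)
    if match is None:
        return None  # invalid lines are silently ignored
    return _ESCAPE.sub(_unescape, match.group(1))
```

Working on bytes throughout avoids any text-encoding step, which matters because `\xNN` escapes may produce values that are not valid text.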
  9. Add dictionary support for a binary load of directories of files.

    gerph committed Jan 7, 2020
    Raw binary files are supported by the AFL dictionaries, and matter
    most for the cases where we're dealing with binary chunks that would
    otherwise be tedious to insert into the token file.
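
Loading a directory of raw files as tokens can be sketched as follows. The function name `load_binary_tokens` is hypothetical, and skipping subdirectories and empty files is a choice made for the example, not something the commit message specifies.

```python
import os


def load_binary_tokens(directory):
    """Read every regular file in directory as one raw dictionary token."""
    tokens = []
    for filename in sorted(os.listdir(directory)):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue  # ignore subdirectories and other non-files
        with open(path, 'rb') as handle:
            data = handle.read()
        if data:
            tokens.append(data)
    return tokens
```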
  10. Fix for dictionary mutator failing to return correct values.

    gerph committed Jan 7, 2020
    The dictionary mutator wasn't actually returning the correct values,
    so was always claiming to fail.
  11. Add the operation type to the mutator types; add a dictionary append.

    gerph committed Jan 7, 2020
    Knowing which types of mutators are in use helps to favour certain
    kinds of operations. In particular, if you start with a small dictionary
    of words and wish to generate longer strings, it is useful to select
    only an operation that appends words from the dictionary.
    
    The dictionary append operation has been added, which allows the
    values from the dictionary to be strung together in increasingly
    longer sequences. It doesn't offer the option of appending multiple
    values from the dictionary, so the operation may end up in a local
    minimum, but such mutators could be added in the future.
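
A single-word append of the kind described above could be sketched like this. The function name `dictionary_append` and its signature are assumptions; the real mutator is a class in the PR and may differ.

```python
import random


def dictionary_append(buf, words, rng=random):
    """Append one randomly chosen dictionary word to buf.

    Returns a new bytearray, or None when there is no dictionary to
    draw from (signalling that the mutator is not appropriate).
    """
    if not words:
        return None
    return bytearray(buf) + bytearray(rng.choice(words))
```

Applied repeatedly to its own output, this strings dictionary words together one at a time, which is how the sequences grow longer over successive mutations.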
  12. Replace the timeout process kill with a terminate.

    gerph committed Jan 7, 2020
    The kill operation is never a good choice for stopping a subprocess - it
    does not give the subprocess any chance to clean up. It's more usual to
    try a terminate and later kill if the process did not stop. More
    importantly to me, the kill method isn't present in the python 2
    multiprocessing module.
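
The "terminate first, kill later" pattern mentioned above can be sketched as below. `stop_process` and its `grace` parameter are hypothetical names; `process` is assumed to be a `multiprocessing.Process`, whose `terminate()` exists on both Python 2 and 3, while `kill()` was only added in Python 3.7, hence the `hasattr` check.

```python
def stop_process(process, grace=1.0):
    """Ask a multiprocessing.Process to stop; escalate only if it ignores us.

    Returns True if the process is no longer alive afterwards.
    """
    process.terminate()      # SIGTERM on POSIX
    process.join(grace)      # give it a moment to exit
    if process.is_alive() and hasattr(process, 'kill'):
        process.kill()       # Process.kill() exists on Python 3.7+
        process.join()
    return not process.is_alive()
```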
Commits on Jan 11, 2020
  1. Fix for corpus crashing if no mutator_filters supplied.

    gerph committed Jan 11, 2020
    The mutator filters were always being treated as a string, even
    when no filter was supplied and the defaults were requested.
Commits on Jan 12, 2020
  1. Update the requirements in setup.py and requirements.txt.

    gerph committed Jan 11, 2020
    The setup.py and requirements.txt had been stripped to make them
    work on both python2 and 3, without regard for keeping explicit
    statements of what it works with. This has been updated to specify
    the environment checks, so that we only install the requirements
    if they're needed on a given python version.
    
    Similarly, the python_requires has been updated to say "I'll take
    any 2.x version over 2.7, OR any 3.x version over 3.5.3" which
    better matches with the original requirements.
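
For illustration, per-version requirements and a "2.7+ OR 3.5.3+" constraint are commonly expressed as in the fragment below. The package names are placeholders rather than the project's actual dependency list, and reading "over" as "at or above" is an assumption. Version specifiers separated by commas are ANDed together, so the "OR" has to be written as an exclusion of the unwanted 3.x releases.

```python
# Hypothetical dependency names, for illustration only.
INSTALL_REQUIRES = [
    'psutil',
    # Environment marker: only installed when running under Python 2.
    'functools32; python_version < "3"',
]

# "At least 2.7, excluding the 3.x releases before 3.5.3".
PYTHON_REQUIRES = (
    '>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, '
    '!=3.5.0, !=3.5.1, !=3.5.2'
)

SETUP_KWARGS = {
    'install_requires': INSTALL_REQUIRES,
    'python_requires': PYTHON_REQUIRES,
}
# In setup.py these would be passed on as: setuptools.setup(..., **SETUP_KWARGS)
```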
  2. Fix for the dictionary text insert leaving NUL characters in string.

    gerph committed Jan 11, 2020
    The dictionary text insert wasn't working properly, despite looking
    just fine. I think the reason is that the parameters on 'copy' are
    transposed. The parameter called 'src' is actually the destination
    (and vice versa). This caused much confusion when trying to work out
    what was happening there.
    
    This change fixes the behaviour of the dictionary insert, but defers
    fixing the parameter names until it can be confirmed what was meant
    and that the strings are being manipulated correctly.
    
    The change also corrects a mistaken format character in the mutator
    __repr__ method.