Multiple changes; python 2 support, dictionary, mutator refactor #26

Merged
merged 15 commits into fuzzitdev:master from gerph:cjf-with-python-2 on Jan 19, 2020

Conversation

@gerph
Contributor

gerph commented Jan 8, 2020

This change incorporates a number of changes which I made to test a project which is stuck on Python 2 - the project uses a Python 2 focused library which has not yet been upgraded to Python 3. The changes here add python 2 support, refactor and add in features that I needed to test the tool I was working on. The individual commits are more descriptive and step through the changes, but the following summarises what was done and why:

  • Support for Python 2. This has been somewhat rough and ready, in just dropping the python 3 version requirements and adding in support where it was needed for python 2-specific things:

    • lru_cache isn't present, so I've used a functools32 python module to give its functionality.
    • bytes and str are the same type in python 2, and iterating over bytes yields single-character strings rather than integer values, so the code has been reworked where necessary to work with bytearrays.
    • Python 3 support has been retained, and where possible I've checked that this still works in exactly the same way.
  • The 'NEW' report is now emitted after the coverage value has been updated, so that you see the new coverage value immediately, not on the next NEW or PULSE report.

  • Because I want to be able to add new mutators to be able to test a string-based expression evaluator, I didn't want to have to deal with the inline conditional list of mutators that are fixed in the corpus module.

    • I moved them all out into classes so that they can be implemented in isolation. These Mutator classes are instantiated by the corpus (maybe they should move out, but for now they're in that module still), and can be invoked to mutate a given resource.
    • The mutator objects can return None to indicate that they are not appropriate for the mutation. This addresses a bug in the original where it would attempt to retry by decrementing the iteration counter, which was ineffective as the counter is just a value from the range generator function.
    • Mutators hold their names and a set of 'type' values. The names are used for presentation, but the types can be used to filter the mutators that are used. This is important to me because I'm not mutating binary inputs - it's not useful for me to exercise the non-ascii values in the input buffer because I'm dealing in strings. Being able to filter the mutators to omit any of the byte-oriented strings is useful. If I were dealing with a bytecode parser, the full 8-bit manipulation might be more useful, but the text based system would not be, so I could turn off the ascii or text based mutators and just focus on the others.
    • Mutators can be filtered at the command line using these type values; for example byte will select just the byte mutators, whilst byte -ascii would select only the byte mutators that aren't also ascii mutators.
    • I had intended to add a short example set of mutations as a command line option, so that the user could see what sort of values would be produced for a given configuration of mutators and input directories or dictionary.
  • Integrated @jvoisin dictionary support, and enhanced. Because I'm testing an expression evaluator, having a collection of strings which are operands and operators in a dictionary and being able to get the fuzzer to use those as sequences to introduce seemed like an excellent idea. And I saw that jvoisin had already done it, and didn't see any point in reinventing it.

    • Their change is merged in, and a new mutator for inserting the words from the dictionary created based on the condition code that they had added.
    • Created a new mutator which merely appends a word from the dictionary - this allows me to focus on just constructing longer and longer expressions to see if they work.
    • Updated the dictionary parser slightly, with support for the directory-based files of dictionary values, as well as the file definitions, and added reference to the dictionary definitions to the header.

Tested with a bunch of my code and found a few bugs with it - thank you.

I wanted to add some tests for the new bits, to make sure that things were working, but I was more keen to continue my testing and pass this upstream in case it's useful to you. If not, then I have managed to solve a number of my own problems and have a really useful tool. I'm not sure that the python 2 changes are as useful to others, but suspect that the refactored mutators are more so. Dictionary support is all a credit to the original author as it made the testing I was doing MUCH easier.

gerph and others added 12 commits Jan 6, 2020
The tool is now able to run on Python 2 with a suitable LRU Cache.
Obviously Python 2 isn't as useful to many people, but when working
with libraries that are Python 2-only, it's necessary to make the
tools work with it.
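
A minimal sketch of how the LRU cache fallback could look, assuming the `functools32` backport named in the description is installed on Python 2. The cached function `line_weight` is purely hypothetical and only there to show the decorator working.

```python
try:
    # Python 3.2+ ships lru_cache in the standard library.
    from functools import lru_cache
except ImportError:
    # Python 2: fall back to the functools32 backport (must be installed).
    from functools32 import lru_cache


@lru_cache(maxsize=None)
def line_weight(n):
    """Hypothetical expensive function whose results are worth caching."""
    return sum(i * i for i in range(n))
```

The rest of the code can then use `lru_cache` without caring which interpreter it runs on.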
Because the 'NEW' state was being logged before the coverage count was
updated, the log line would not include the new coverage count. This
change ensures that the log line reports the new coverage count.
The mutators were supplied inline with the corpus mutation loop. This
makes it tedious to extend, and difficult to filter out mutations
which are not interesting.

The code has been reworked here so that...

* Each mutator is its own class.
* Each class can provide information about what it does, such as its
  name and the types of mutations it performs.
* Each class is registered into a list of classes that are available.
* The Corpus instantiates these classes when it is initialised, and
  could (but does not at this time) filter the list as necessary.
  The name isn't even used yet.
* Mutators can return None to say that they're not appropriate.

This means that adding a new mutator is a matter of creating a new
class, in the same style as the existing ones, and giving information
on what the mutator does. Mutators could be based on one another -
so for example the 'swap' mutator could be reworked to exchange
variable lengths of values, rather than only bytes, and then subclassed
to produce short, long and longlong variants. This has not been done
here.

Previously, the code attempted to retry applying mutators if they
were not deemed appropriate; this was ineffective because they merely
tried to decrement the iteration count, which did not affect the
iterations at all - it looks like the code was originally using
C-style for loops where the variable controls the termination, whilst
in Python the range controls the iteration of this loop. This has
been replaced by the mutator returning None to signal that it is
inappropriate, and a loop in the caller repeats the selection of
a new mutator.
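
The class-per-mutator layout described in this commit message might look roughly like the sketch below. All names here (`MUTATORS`, `register`, `RemoveByte`, `AppendSpace`) are hypothetical and chosen for illustration; they are not taken from the actual corpus module.

```python
import random

MUTATORS = []  # registry of available mutator classes (hypothetical name)


def register(cls):
    """Class decorator that adds a mutator class to the registry."""
    MUTATORS.append(cls)
    return cls


class Mutator(object):
    name = "base"
    types = frozenset()

    def mutate(self, res):
        """Return the mutated bytearray, or None if not applicable."""
        raise NotImplementedError


@register
class RemoveByte(Mutator):
    name = "Remove a byte"
    types = frozenset(["byte", "remove"])

    def mutate(self, res):
        if len(res) < 2:
            return None  # nothing sensible to remove
        del res[random.randrange(len(res))]
        return res


@register
class AppendSpace(Mutator):
    name = "Append a space"
    types = frozenset(["byte", "ascii", "insert"])

    def mutate(self, res):
        res.append(0x20)
        return res


# The corpus would instantiate every registered class once.
mutators = [cls() for cls in MUTATORS]
```

Adding a mutator is then a matter of writing one more decorated class; nothing in the selection loop needs to change.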
Mutators can now be listed as part of a help command, and may then
be filtered by the user supplying a filter specification to disable
or enable only certain mutators.
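
As an illustration of type-based filtering, here is a small sketch. It assumes a whitespace-separated specification in which a plain word keeps mutators of that type and a word prefixed with `-` drops them, matching the `byte` and `byte -ascii` examples in the description. The function name and the exact matching rules are assumptions, not the tool's actual implementation.

```python
def filter_mutators(mutators, spec):
    """Select mutators by type.

    `mutators` is a list of (name, types) pairs and `spec` is a string such
    as "byte -ascii". Plain words keep mutators having that type; words
    prefixed with '-' drop mutators having that type.
    """
    include = set()
    exclude = set()
    for word in spec.split():
        if word.startswith("-"):
            exclude.add(word[1:])
        else:
            include.add(word)

    selected = []
    for name, types in mutators:
        if include and not (include & types):
            continue  # has none of the requested types
        if exclude & types:
            continue  # has an excluded type
        selected.append(name)
    return selected


available = [
    ("remove byte", {"byte"}),
    ("insert digit", {"byte", "ascii"}),
    ("swap longs", {"long"}),
]
print(filter_mutators(available, "byte"))         # ['remove byte', 'insert digit']
print(filter_mutators(available, "byte -ascii"))  # ['remove byte']
```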
Previously there was always a likelihood that we would terminate our
retries if the mutator said that it was unable to be used, because
we had a number of mutators that were unconditional. However, now that
the mutators are able to be filtered, it is possible to select a
set of mutators which may always claim they are inappropriate. In
such a case, we would loop forever.

This change bounds the retries on looking for a mutator to 20
attempts - an arbitrary number I picked from the air as seeming
reasonable.
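
The bounded retry can be sketched as two nested loops. This is an illustration under assumptions: mutators are modelled as plain callables that return a bytearray or None, and `apply_mutations`, `needs_content` and `append_a` are hypothetical names.

```python
import random

MAX_ATTEMPTS = 20  # the arbitrary bound chosen in the commit message


def apply_mutations(buf, mutators, count, rng=random):
    """Apply `count` mutations to `buf`, retrying when a mutator declines.

    Each mutator takes a bytearray and returns the mutated bytearray, or
    None if it is not appropriate for this input.
    """
    applied = 0
    for _ in range(count):                    # outer loop: mutations wanted
        for _attempt in range(MAX_ATTEMPTS):  # inner loop: bounded retries
            mutator = rng.choice(mutators)
            result = mutator(bytearray(buf))
            if result is not None:
                buf = result
                applied += 1
                break
        # If every attempt declined, move on rather than spin forever.
    return buf, applied


def needs_content(buf):
    """Declines on empty input, like a mutator that works on existing bytes."""
    if not buf:
        return None
    buf.reverse()
    return buf


def append_a(buf):
    buf.append(ord("a"))
    return buf
```

With only `needs_content` available and an empty starting buffer, the call returns after 20 attempts per requested mutation instead of looping forever.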
This commit adds support for dictionaries
(https://llvm.org/docs/LibFuzzer.html#dictionaries), to help fuzzers increase
their coverage faster.

It seems that there is a bug in the _copy function, because the word is
correctly inserted, but it seems that the padding after it is wrong,
and I couldn't understand why. Although to be honest,
I didn't spend much time on it, since I'd like to have feedback
on this PR before investing more debug time.

The implementation is pretty crude: it silently ignores
invalid lines in the dictionary file, and is likely using
words in the corpus a bit too often.
A small typo in the word dictionary, fixed.
The dictionary reader can now handle escaped strings in its
values, as given in the AFL examples.
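
For illustration, a sketch of reading one dictionary line with escapes. It assumes the libFuzzer/AFL-style text format (an optional `name=` prefix, a double-quoted value, `#` comments, and `\xNN`, `\\` and `\"` escapes) and that values are otherwise plain ASCII. The function names are hypothetical, and a malformed `\x` escape simply raises ValueError here.

```python
import re

# Optional name, then a double-quoted value, e.g. kw1="blah" or "\xF7".
_ENTRY = re.compile(r'^(?:[A-Za-z0-9_]+\s*=\s*)?"(.*)"$')


def unescape(value):
    """Decode hex, backslash and quote escapes into a bytearray."""
    out = bytearray()
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and value[i + 1:i + 2] in ("\\", '"'):
            out.append(ord(value[i + 1]))            # escaped backslash or quote
            i += 2
        elif ch == "\\" and value[i + 1:i + 2] == "x":
            out.append(int(value[i + 2:i + 4], 16))  # two hex digits
            i += 4
        else:
            out.append(ord(ch))
            i += 1
    return out


def parse_line(line):
    """Return the token on one dictionary line, or None if there is none."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    match = _ENTRY.match(line)
    if match is None:
        return None  # not a recognisable entry
    return unescape(match.group(1))
```

Returning a bytearray keeps the tokens in the same form the mutators operate on.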
Raw binary files are supported by the AFL dictionaries, and matter
most for the cases where we're dealing with binary chunks that would
otherwise be tedious to insert into the token file.
The dictionary mutator wasn't actually returning the correct values,
so was always claiming to fail.
Knowing which type of mutators are in use helps to favour certain
kinds of operations. In particular, if you start with a small dictionary
of words and wish to generate longer strings, using just an option
that appends to the dictionary is useful.

The dictionary append operation has been added which allows the
values from the dictionary to be strung together in increasingly
longer sequences. It doesn't offer the option of appending multiple
values from the dictionary so the operation may end up in a local
minimum, but such mutators could be added in the future.
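
A dictionary append mutator along these lines could be as small as the sketch below. The class name, the type labels and the constructor signature are assumptions made for illustration.

```python
import random


class DictionaryAppend(object):
    name = "Append a dictionary word"
    types = frozenset(["dict", "append"])

    def __init__(self, words, rng=random):
        self._words = [bytearray(w) for w in words]
        self._rng = rng

    def mutate(self, res):
        if not self._words:
            return None  # no dictionary: let the caller pick another mutator
        res.extend(self._rng.choice(self._words))
        return res
```

Selecting only this mutator makes each round of mutation grow the input by exactly one word, which suits building longer and longer expressions.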
The kill operation is never a good choice for stopping a subprocess - it
does not give the subprocess any chance to clean up. It's more usual to
try a terminate and later kill if the process did not stop. More
importantly to me, the kill method isn't present in the python 2
multiprocessing module.
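
The terminate-then-kill pattern can be sketched as below, assuming `proc` is a `multiprocessing.Process`-like object. `Process.kill()` only exists on Python 3.7 and later, so the sketch checks for it; the helper name and the grace period are hypothetical.

```python
def stop_process(proc, grace=1.0):
    """Ask `proc` to stop, escalating only if it ignores the request."""
    proc.terminate()   # polite request (SIGTERM on POSIX)
    proc.join(grace)   # give it a moment to clean up
    if proc.is_alive() and hasattr(proc, "kill"):
        proc.kill()    # forceful stop, where the method is available
        proc.join()
    return not proc.is_alive()
```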
@@ -134,9 +156,9 @@ def start(self):
 break

 buf = self._corpus.generate_input()
-parent_conn.send_bytes(buf)
+parent_conn.send_bytes(bytes(buf))

@yevgenypats

yevgenypats Jan 8, 2020

Contributor

Isn't it bytes already? I think this might slow down the fuzzer if it's an unnecessary copy.

@gerph

gerph Jan 8, 2020

Author Contributor

There's a significant difference in how Python 2 and Python 3 handle 'bytes'.
In python 3, bytes is a sequence of values which, when iterated over or indexed, are just integers.
In python 2, bytes is a str which, when iterated over or indexed, yields single-character str objects.

This difference means that working in the mutators with bytes is impractical - the code would need to handle both the cases of the elements of the iteration being integers and being str objects. That would be... tiresome and difficult to work with... but, if instead the elements are handled as bytearrays, this situation is simplified.

In both python 2 and python 3, bytearray is able to be initialised just like a bytes object, and iterates as a set of integers. It can be converted to a bytes object easily without encoding issues, and manipulated just like we were handling the bytes object. Because we're actually working with a list of bytes which we want to mutate, a bytearray is actually the ideal container to use. The python 3 documentation describes it as:

The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has...

(https://docs.python.org/3/library/functions.html#func-bytearray)

I originally moved to bytearray because 'it just worked', but as I looked at what it was, it became more obvious that it's actually the right tool for the job. There is, obviously, the fact that the conversion may incur a little cost, but compared to the other operations and the transfers between processes that are performed, I suspect that it's entirely negligible.
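
A short sketch of the difference, runnable on either interpreter. The variable names are illustrative only.

```python
data = b"AB"
buf = bytearray(data)

# bytearray yields integers on both Python 2 and Python 3.
print(list(buf))  # [65, 66]

# Indexing bytes differs: a one-character str on Python 2, an int on Python 3.
print(type(data[0]).__name__)

# Mutate in place, then convert back for APIs that want bytes.
buf[0] ^= 0x20  # flip the case bit: 'A' becomes 'a'
payload = bytes(buf)
```

The final `bytes(buf)` is the same conversion made before `send_bytes` in the diff above.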

@yevgenypats
Contributor

yevgenypats commented Jan 8, 2020

Thank you for this huge contribution! I'll try to review this shortly.

# Select a mutator from those we can apply
# We'll try up to 20 times, but if we don't find a
# suitable mutator after that, we'll just give up.
for n in range(20):

@yevgenypats

yevgenypats Jan 11, 2020

Contributor

why change to 20 instead of the nm? nm = self._rand_exp() was a magic number used both in go-fuzz and afl/libfuzzer and it worked well. I tried other numbers and it usually affected the speed of the fuzzer significantly

@gerph

gerph Jan 11, 2020

Author Contributor

You're confusing the two loops, I think.

The outer loop (line 495 in the new changes, just above that change) selects the number of mutations to run. Before this change, it was actually the 'number of mutators we'll try to run', because if the mutator said 'sorry, I'm not appropriate' (by decrementing the iteration count) it wouldn't actually affect the number of mutations run.

In this new version, when the mutator says it's not appropriate (by returning None at line 504), we want to try the mutation sequence again. This is what the inner loop of 20 attempts does. 20 was just picked from the air as seeming like a good number.

This means that the number of mutators that we plan to add, as determined by nm, will be the number we generate (unless we are exceptionally unlucky and repeatedly select inappropriate mutations within the inner loop - which can happen if you've filtered the mutations that are used to those that are more likely to complain).

This is a change in behaviour from the original - instead of just ignoring mutations that are unavailable, the code now tries to meet the number that were requested. I chose to follow what appeared to be the intent of the code (that the number of mutators we plan to apply will be the number that we aim for, and that we retry rather than drop those attempts when they fail).
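
The reason decrementing the iteration count had no effect can be shown in a few lines. This is a standalone illustration, not the original code.

```python
def count_iterations(n):
    """Try (and fail) to extend a range loop by decrementing its variable."""
    iterations = 0
    for i in range(n):
        iterations += 1
        i -= 1  # only rebinds the local name; range() supplies the next value
    return iterations


print(count_iterations(5))  # 5
```

In a C-style `for` loop the same decrement would repeat the step; in Python the next value always comes from the range, which is why an explicit retry loop is needed.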

@gerph

gerph Jan 11, 2020

Author Contributor

In case it's not clear, the inner loop is bounded so that we don't get stuck continually retrying. I originally was lazy and used while True and then when testing the filtering, generated a filter that left only a mutator that worked on a non-empty input, but the string always started out as empty when no dictionary or corpus was supplied, so it ran forever.

@yevgenypats
Contributor

yevgenypats commented Jan 11, 2020

I'm still reviewing, but all in all it looks good. Because this is a pretty big change, will you be able to run the fuzzer (at least manually) on the examples in the repo and provide some benchmarks to show that after this change the fuzzer still finds the bugs and the speed stays more or less the same?

@yevgenypats
Contributor

yevgenypats commented Jan 11, 2020

@jvoisin do you have any comments on PR? feel free to take a look as well.

@gerph
Contributor Author

gerph commented Jan 11, 2020

I was intending on writing some tests to do that, but there wasn't a framework in place to check the behaviour at the time I started, and I wanted to get something fed back as I thought it useful. Then I rebased and a couple of tests turned up... so yeah, I think I can add in some tests and run the examples.

I'm not sure that the performance matters hugely on a fuzz test, but so long as this hasn't introduced an order of magnitude worse a set of results, I'll consider it a win - you might feel differently though, I understand :-)

[aside, it'd be really cool if there were CI to trigger the tests, but my experience is with non-github CI, so I won't be able to do that]

@yevgenypats
Contributor

yevgenypats commented Jan 11, 2020

The mutator filters were being treated as a string, even when they
were using the request for the defaults.
@gerph
Contributor Author

gerph commented Jan 11, 2020

Quick sampling tests for performance using aifc:

Prior to my change (Python 3 only):

charles@laputa ~/external/pythonfuzz/examples/aifc (master)> time python3 fuzz.py --runs 100000
#0 READ units: 0
#1 NEW     cov: 0 corp: 1 exec/s: 18 rss: 35.0859375 MB
#2 NEW     cov: 62 corp: 2 exec/s: 376 rss: 35.10546875 MB
#44 NEW     cov: 88 corp: 3 exec/s: 644 rss: 35.203125 MB
#59 NEW     cov: 93 corp: 4 exec/s: 692 rss: 35.234375 MB
/usr/local/lib/python3.7/site-packages/pythonfuzz-1.0.3-py3.7.egg/pythonfuzz/corpus.py:197: RuntimeWarning: overflow encountered in ushort_scalars
/usr/local/lib/python3.7/site-packages/pythonfuzz-1.0.3-py3.7.egg/pythonfuzz/corpus.py:227: RuntimeWarning: overflow encountered in ulong_scalars
/usr/local/lib/python3.7/site-packages/pythonfuzz-1.0.3-py3.7.egg/pythonfuzz/corpus.py:212: RuntimeWarning: overflow encountered in uint_scalars
/usr/local/lib/python3.7/site-packages/pythonfuzz-1.0.3-py3.7.egg/pythonfuzz/corpus.py:188: RuntimeWarning: overflow encountered in ubyte_scalars
/usr/local/lib/python3.7/site-packages/pythonfuzz-1.0.3-py3.7.egg/pythonfuzz/corpus.py:186: RuntimeWarning: overflow encountered in ubyte_scalars
#3342 PULSE     cov: 104 corp: 5 exec/s: 656 rss: 35.375 MB
#6802 PULSE     cov: 104 corp: 5 exec/s: 691 rss: 35.38671875 MB
#10242 PULSE     cov: 104 corp: 5 exec/s: 687 rss: 35.38671875 MB
#13569 PULSE     cov: 104 corp: 5 exec/s: 665 rss: 35.38671875 MB
#16951 PULSE     cov: 104 corp: 5 exec/s: 676 rss: 35.390625 MB
#20380 PULSE     cov: 104 corp: 5 exec/s: 685 rss: 35.39453125 MB
#23695 PULSE     cov: 104 corp: 5 exec/s: 662 rss: 35.3984375 MB
#27113 PULSE     cov: 104 corp: 5 exec/s: 683 rss: 35.40234375 MB
#30415 PULSE     cov: 104 corp: 5 exec/s: 660 rss: 35.40234375 MB
#33375 PULSE     cov: 104 corp: 5 exec/s: 591 rss: 35.40234375 MB
#36765 PULSE     cov: 104 corp: 5 exec/s: 677 rss: 35.40234375 MB
#39833 PULSE     cov: 104 corp: 5 exec/s: 613 rss: 35.40625 MB
#43002 PULSE     cov: 104 corp: 5 exec/s: 633 rss: 35.40625 MB
#46271 PULSE     cov: 104 corp: 5 exec/s: 653 rss: 35.40625 MB
#49568 PULSE     cov: 104 corp: 5 exec/s: 659 rss: 35.40625 MB
#52921 PULSE     cov: 104 corp: 5 exec/s: 670 rss: 35.40625 MB
#56227 PULSE     cov: 104 corp: 5 exec/s: 661 rss: 35.40625 MB
#59592 PULSE     cov: 104 corp: 5 exec/s: 672 rss: 35.40625 MB
#62942 PULSE     cov: 104 corp: 5 exec/s: 669 rss: 35.41015625 MB
#66265 PULSE     cov: 104 corp: 5 exec/s: 664 rss: 35.41015625 MB
#69659 PULSE     cov: 104 corp: 5 exec/s: 678 rss: 35.41015625 MB
#73002 PULSE     cov: 104 corp: 5 exec/s: 668 rss: 35.41015625 MB
#76415 PULSE     cov: 104 corp: 5 exec/s: 682 rss: 35.41015625 MB
#79825 PULSE     cov: 104 corp: 5 exec/s: 681 rss: 35.41015625 MB
#83231 PULSE     cov: 104 corp: 5 exec/s: 680 rss: 35.41015625 MB
#86670 PULSE     cov: 104 corp: 5 exec/s: 687 rss: 35.41015625 MB
#90105 PULSE     cov: 104 corp: 5 exec/s: 686 rss: 35.41015625 MB
#93518 PULSE     cov: 104 corp: 5 exec/s: 682 rss: 35.41015625 MB
#96951 PULSE     cov: 104 corp: 5 exec/s: 686 rss: 35.41015625 MB
did 100000 runs, stopping now.
      149.98 real       131.57 user        18.49 sys

After this change, using Python 3:

charles@laputa ~/external/pythonfuzz/examples/aifc (cjf-with-python-2↑6)> time python3 fuzz.py --runs 100000
#0 READ units: 0
#1 NEW     cov: 62 corp: 2 exec/s: 9 rss: 35.03515625 MB
#2 NEW     cov: 88 corp: 3 exec/s: 342 rss: 35.046875 MB
#4 NEW     cov: 93 corp: 4 exec/s: 484 rss: 35.07421875 MB
#7 NEW     cov: 104 corp: 5 exec/s: 325 rss: 35.09765625 MB
#2746 PULSE     cov: 104 corp: 5 exec/s: 547 rss: 35.26953125 MB
#5908 PULSE     cov: 104 corp: 5 exec/s: 632 rss: 35.29296875 MB
#8945 PULSE     cov: 104 corp: 5 exec/s: 607 rss: 35.296875 MB
#12264 PULSE     cov: 104 corp: 5 exec/s: 663 rss: 35.296875 MB
#15455 PULSE     cov: 104 corp: 5 exec/s: 638 rss: 35.296875 MB
#18760 PULSE     cov: 104 corp: 5 exec/s: 660 rss: 35.296875 MB
#22018 PULSE     cov: 104 corp: 5 exec/s: 651 rss: 35.30078125 MB
#25281 PULSE     cov: 104 corp: 5 exec/s: 652 rss: 35.3046875 MB
#28491 PULSE     cov: 104 corp: 5 exec/s: 641 rss: 35.3046875 MB
#31754 PULSE     cov: 104 corp: 5 exec/s: 652 rss: 35.3046875 MB
#35051 PULSE     cov: 104 corp: 5 exec/s: 659 rss: 35.3046875 MB
#38320 PULSE     cov: 104 corp: 5 exec/s: 653 rss: 35.3046875 MB
#41626 PULSE     cov: 104 corp: 5 exec/s: 661 rss: 35.3046875 MB
#44910 PULSE     cov: 104 corp: 5 exec/s: 656 rss: 35.3046875 MB
#48209 PULSE     cov: 104 corp: 5 exec/s: 659 rss: 35.3046875 MB
#51498 PULSE     cov: 104 corp: 5 exec/s: 657 rss: 35.3046875 MB
#54779 PULSE     cov: 104 corp: 5 exec/s: 656 rss: 35.3046875 MB
#58101 PULSE     cov: 104 corp: 5 exec/s: 664 rss: 35.3046875 MB
#61410 PULSE     cov: 104 corp: 5 exec/s: 661 rss: 35.3046875 MB
#64629 PULSE     cov: 104 corp: 5 exec/s: 643 rss: 35.3046875 MB
#67916 PULSE     cov: 104 corp: 5 exec/s: 657 rss: 35.3046875 MB
#71218 PULSE     cov: 104 corp: 5 exec/s: 660 rss: 35.3046875 MB
#74507 PULSE     cov: 104 corp: 5 exec/s: 657 rss: 35.3046875 MB
#77802 PULSE     cov: 104 corp: 5 exec/s: 658 rss: 35.3046875 MB
#80990 PULSE     cov: 104 corp: 5 exec/s: 637 rss: 35.3046875 MB
#83910 PULSE     cov: 104 corp: 5 exec/s: 583 rss: 35.3046875 MB
#86783 PULSE     cov: 104 corp: 5 exec/s: 574 rss: 35.3046875 MB
#90064 PULSE     cov: 104 corp: 5 exec/s: 656 rss: 35.3046875 MB
#93333 PULSE     cov: 104 corp: 5 exec/s: 653 rss: 35.3046875 MB
#96676 PULSE     cov: 104 corp: 5 exec/s: 668 rss: 33.29296875 MB
did 100000 runs, stopping now.
      155.65 real       135.75 user        19.26 sys

After this change, using Python 2:

charles@laputa ~/external/pythonfuzz/examples/aifc (cjf-with-python-2↑6)> time python2 fuzz.py --runs 100000
#0 READ units: 0
#1 NEW     cov: 35 corp: 2 exec/s: 20 rss: 33 MB
#2 NEW     cov: 48 corp: 3 exec/s: 462 rss: 33 MB
/usr/local/lib/python2.7/site-packages/pythonfuzz-1.0.3-py2.7.egg/pythonfuzz/corpus.py:221: RuntimeWarning: overflow encountered in ushort_scalars
/usr/local/lib/python2.7/site-packages/pythonfuzz-1.0.3-py2.7.egg/pythonfuzz/corpus.py:265: RuntimeWarning: overflow encountered in ulong_scalars
/usr/local/lib/python2.7/site-packages/pythonfuzz-1.0.3-py2.7.egg/pythonfuzz/corpus.py:206: RuntimeWarning: overflow encountered in ubyte_scalars
/usr/local/lib/python2.7/site-packages/pythonfuzz-1.0.3-py2.7.egg/pythonfuzz/corpus.py:204: RuntimeWarning: overflow encountered in ubyte_scalars
/usr/local/lib/python2.7/site-packages/pythonfuzz-1.0.3-py2.7.egg/pythonfuzz/corpus.py:243: RuntimeWarning: overflow encountered in uint_scalars
#20 NEW     cov: 51 corp: 4 exec/s: 721 rss: 33 MB
#6007 PULSE     cov: 51 corp: 4 exec/s: 1197 rss: 34 MB
#12188 PULSE     cov: 51 corp: 4 exec/s: 1235 rss: 34 MB
#18328 PULSE     cov: 51 corp: 4 exec/s: 1227 rss: 34 MB
#24209 PULSE     cov: 51 corp: 4 exec/s: 1175 rss: 34 MB
#30620 PULSE     cov: 51 corp: 4 exec/s: 1281 rss: 34 MB
#36833 PULSE     cov: 51 corp: 4 exec/s: 1242 rss: 34 MB
#43187 PULSE     cov: 51 corp: 4 exec/s: 1270 rss: 34 MB
#49674 PULSE     cov: 51 corp: 4 exec/s: 1297 rss: 34 MB
#56324 PULSE     cov: 51 corp: 4 exec/s: 1329 rss: 34 MB
#62846 PULSE     cov: 51 corp: 4 exec/s: 1304 rss: 34 MB
#69407 PULSE     cov: 51 corp: 4 exec/s: 1311 rss: 34 MB
#75869 PULSE     cov: 51 corp: 4 exec/s: 1292 rss: 34 MB
#82343 PULSE     cov: 51 corp: 4 exec/s: 1294 rss: 34 MB
#88272 PULSE     cov: 51 corp: 4 exec/s: 1185 rss: 34 MB
#94484 PULSE     cov: 51 corp: 4 exec/s: 1242 rss: 34 MB
did 100000 runs, stopping now.
       79.94 real        66.12 user        11.74 sys

So Python 2 is ~2x the speed of python 3 within these tests. That's interesting.

The change of a few seconds for Python3 is entirely acceptable to me, and actually quite surprising - I was expecting a greater slowdown.

Test system for completeness:

charles@laputa ~/external/pythonfuzz/examples/aifc (master)> system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro11,2
      Processor Name: Intel Core i7
      Processor Speed: 2.6 GHz
      Number of Processors: 1
      Total Number of Cores: 4
      L2 Cache (per Core): 256 KB
      L3 Cache: 6 MB
      Hyper-Threading Technology: Enabled
      Memory: 8 GB
      Boot ROM Version: 156.0.0.0.0
      SMC Version (system): 2.18f15
      Serial Number (system): OMITTED
      Hardware UUID: OMITTED

charles@laputa ~/external/pythonfuzz/examples/aifc (master)> python3 -V
Python 3.7.5
charles@laputa ~/external/pythonfuzz/examples/aifc (master)> python2 -V
Python 2.7.17

This also makes me think I could do with a little wrapper that automates this form of performance test so that I don't have to faff around with it. I might do something with that too.
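
One possible starting point for such a wrapper is a parser for the status lines shown in the logs above. The regular expression and the `summarise` helper are assumptions based only on the line format visible in those logs.

```python
import re

_STATUS = re.compile(
    r"^#(?P<run>\d+)\s+(?P<kind>NEW|PULSE)\s+"
    r"cov:\s*(?P<cov>\d+)\s+corp:\s*(?P<corp>\d+)\s+"
    r"exec/s:\s*(?P<rate>\d+)"
)


def summarise(log_text):
    """Return (final coverage, mean exec/s over PULSE lines) for one run."""
    coverage = None
    rates = []
    for line in log_text.splitlines():
        match = _STATUS.match(line.strip())
        if match is None:
            continue  # warnings, READ lines, timing output
        coverage = int(match.group("cov"))
        if match.group("kind") == "PULSE":
            rates.append(int(match.group("rate")))
    mean_rate = sum(rates) / float(len(rates)) if rates else None
    return coverage, mean_rate


sample = """\
#0 READ units: 0
#7 NEW     cov: 104 corp: 5 exec/s: 325 rss: 35.09765625 MB
#2746 PULSE     cov: 104 corp: 5 exec/s: 547 rss: 35.26953125 MB
#5908 PULSE     cov: 104 corp: 5 exec/s: 632 rss: 35.29296875 MB
"""
print(summarise(sample))  # (104, 589.5)
```

Feeding it the captured output of each run gives a coverage figure and an average rate that can be compared across branches and interpreters.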

@gerph
Contributor Author

gerph commented Jan 11, 2020

I've pushed a group of changes to clean up the existing tests, added a makefile to let me test it with Python 2 and 3 in use, and then fixed a small bug that the existing tests showed up. I'd like to introduce some more tests for the bits that are there and try running through all of the examples. I've only run the aifc one so far as part of the above test, so I'd like to know that it's safe on the others. I'll come back to that later today.

The newly added commits...

  • Fix the mutator_filters processing causing a crash if you didn't supply them - I guess I never tested without the filters. Oops.
  • Updates the gitignore for python2 and tidies it a little.
  • Fixed the requirements to be sensible in setup.py and requirements.txt, as I'd basically neutered them when I made it work with Python 2, which is not very nice. Now the installation should be more sensible for both environments.

[edited to remove the Makefile and tests, which have been moved to a separate PR]

@gerph
Contributor Author

gerph commented Jan 11, 2020

Just starting to write some tests for the mutators... I'm pretty certain that the copy function doesn't do what users of it expect, or it's always being used wrongly. At the very least that's the confusion that @jvoisin had with dictionary, and MutatorRemoveRange definitely wasn't doing what it expected.

I would like to split the unit tests into a separate PR, because it's quite likely that in adding the tests and fixing up the behaviour I don't expect, I'm going to do something wrong, independent of the other changes here. Would this be ok?

@yevgenypats
Contributor

yevgenypats commented Jan 12, 2020

@jvoisin
Contributor

jvoisin commented Jan 12, 2020

Since this PR is already quite large (+900 −243), it would be super-nice if you could split it a bit, sure :)

@gerph
Contributor Author

gerph commented Jan 12, 2020

Ok, I'll split up what I can; I think some of these bits can be moved to entirely distinct PRs if they don't depend on what went before.

@gerph gerph force-pushed the gerph:cjf-with-python-2 branch from 3323815 to 9331ee6 Jan 12, 2020
The setup.py and requirements.txt had been stripped to make them
work on both python2 and 3, without regard for keeping explicit
statements of what it works with. This has been updated to specify
the environment checks, so that we only install the requirements
if they're needed on a given python version.

Similarly, the python_requires has been updated to say "I'll take
any 2.x version over 2.7, OR any 3.x version over 3.5.3" which
better matches with the original requirements.
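
One way to express requirements like these is sketched below as keyword arguments for `setuptools.setup()`. This is an assumption about the general shape, not the contents of the actual setup.py: version specifiers are ANDed together, so the gap between 2.7 and 3.5.3 is written as a list of exclusions, and the functools32 dependency uses an environment marker.

```python
# Keyword arguments that could be passed as setuptools.setup(**SETUP_KWARGS).
SETUP_KWARGS = {
    # 2.7 and later, minus every 3.x release before 3.5.3.
    "python_requires": (
        ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, "
        "!=3.5.0, !=3.5.1, !=3.5.2"
    ),
    "install_requires": [
        # Environment marker: only installed where lru_cache is missing.
        'functools32; python_version < "3"',
    ],
}
```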
@gerph gerph force-pushed the gerph:cjf-with-python-2 branch from 9331ee6 to 89f26cf Jan 12, 2020
The dictionary text insert wasn't working properly, despite looking
just fine. I think the reason is that the parameters on 'copy' are
transposed. The parameter called 'src' is actually the destination,
(and vice-versa). This caused much confusion when trying to work out
what was happening there.

This change fixes the behaviour of the dictionary insert, but defers
fixing the parameter names until it can be confirmed what was meant
and that the strings are being manipulated correctly.

The change also corrects a mistaken format character in the mutator
__repr__ method.
@gerph gerph force-pushed the gerph:cjf-with-python-2 branch from 89f26cf to 6daa2b5 Jan 12, 2020
@gerph
Contributor Author

gerph commented Jan 12, 2020

Ok that's reduced to +534 −209, with separate PRs either raised, or pending on this one being committed, changed or whatever. I think that's the minimum I can go in features without breaking into the initial set of changes which involved moving things around more significantly.

@gerph
Contributor Author

gerph commented Jan 12, 2020

PR #27, #28, #29 have been split off from this PR, because on checking them I found that they're entirely independent, and can be committed without reference to the changes in here.

PR #30 and #31 are dependent on this and/or the mentioned PRs to come in, but allowing for that are independent and do separate things. Those two are marked as draft PRs, as they merge to master with the inclusion of all the changes in the PRs they depend on - once those dependencies have been committed, their diffs reduce to just the commits that they add.

With that, I shall do nothing further on this PR unless changes are requested. I realise this has been a bit of a flurry of new things and then reworking the PR, which is disruptive. I hope that the result here, and in the other changes is acceptable, but if not, I won't be offended :-D

@yevgenypats
Contributor

yevgenypats commented Jan 18, 2020

I'm closing this PR as it was split to different PRs.

@gerph
Contributor Author

gerph commented Jan 18, 2020

It wasn't split into other PRs; the parts of it were split off which could be, leaving the parts that could not easily be split away. I cannot see a button to reopen the PR, so I can re-push this as a different PR if you'd prefer.

@yevgenypats
Contributor

yevgenypats commented Jan 18, 2020

@gerph gerph mentioned this pull request Jan 18, 2020
@yevgenypats yevgenypats reopened this Jan 18, 2020
@yevgenypats yevgenypats merged commit 15fca35 into fuzzitdev:master Jan 19, 2020