Multiple changes; python 2 support, dictionary, mutator refactor #26

Merged
merged 15 commits into from Jan 19, 2020

Replace the timeout process kill with a terminate.

The kill operation is never a good choice for stopping a subprocess - it
does not give the subprocess any chance to clean up. It's more usual to
try a terminate and later kill if the process did not stop. More
importantly to me, the kill method isn't present in the python 2
multiprocessing module.
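
The terminate-then-kill escalation described above could look roughly like the sketch below. The helper name `stop_process` and the `grace_period` parameter are hypothetical and not part of this commit; the only library facts relied on are that `multiprocessing.Process` provides `terminate()`, `join()` and `is_alive()`, and that `Process.kill()` was only added in Python 3.7.

```python
def stop_process(proc, grace_period=1.0):
    """Ask proc to stop with terminate(); escalate to kill() only if needed.

    proc is expected to behave like multiprocessing.Process.
    Returns "terminated", "killed", or "still-alive".
    """
    proc.terminate()                    # polite request (SIGTERM on POSIX)
    proc.join(grace_period)             # give the child time to exit
    if not proc.is_alive():
        return "terminated"

    # Process.kill() exists only on Python 3.7+, so look it up defensively.
    kill = getattr(proc, "kill", None)
    if kill is None:
        return "still-alive"            # nothing stronger available (e.g. Python 2)

    kill()                              # forceful stop (SIGKILL on POSIX)
    proc.join(grace_period)
    return "killed" if not proc.is_alive() else "still-alive"
```

A caller would use it in place of a bare `kill()`, for example `stop_process(self._p)` after the poll times out. The commit itself only swaps `kill()` for `terminate()`; the escalation step is an illustration of the "more usual" pattern the message mentions.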
gerph committed Jan 7, 2020
commit ecebfffca7d36f8ca36e0fa615c756e1c1365083
@@ -158,7 +158,7 @@ def start(self):
buf = self._corpus.generate_input()
parent_conn.send_bytes(bytes(buf))

yevgenypats Jan 8, 2020

Contributor

Isn't it bytes already? I think this might slow down the fuzzer if it's an unnecessary copy.

gerph Jan 8, 2020

Author Contributor

There's a significant difference in how Python 2 and Python 3 handle 'bytes'.
In Python 3, bytes is an immutable sequence of values which, when iterated over or indexed, are just integers.
In Python 2, bytes is an alias for str, which, when iterated over or indexed, yields single-character str objects.

This difference means that working in the mutators with bytes is impractical - the code would need to handle both the cases of the elements of the iteration being integers and being str objects. That would be... tiresome and difficult to work with... but, if instead the elements are handled as bytearrays, this situation is simplified.

In both Python 2 and Python 3, a bytearray can be initialised just like a bytes object, and iterates as a sequence of integers. It can be converted to a bytes object easily without encoding issues, and manipulated just as if we were handling the bytes object. Because we're actually working with a list of bytes which we want to mutate, a bytearray is the ideal container to use. The Python 3 documentation describes it as:

The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has...

(https://docs.python.org/3/library/functions.html#func-bytearray)
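
As a small illustration of the point above (not code from this pull request), the sketch below shows a bytearray behaving the same way under both interpreters. The sample value `b"AB"` and the variable names are made up for the example.

```python
data = b"AB"

# Indexing bytes differs by interpreter: data[0] is 65 on Python 3
# but the one-character str 'A' on Python 2.

ba = bytearray(data)

# A bytearray yields integers on both Python 2 and Python 3.
elements = list(ba)

# It is mutable, so a mutator can edit it in place.
ba[0] ^= 0x20        # flip one bit: 'A' (0x41) becomes 'a' (0x61)
ba.append(0x43)      # append 'C'

# Converting back to bytes needs no encoding step.
result = bytes(ba)
```

After this runs, `elements` is `[65, 66]` and `result` is `b"aBC"`, so the mutator code only ever has to deal with integers.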

I originally moved to bytearray because 'it just worked', but as I looked at what it was, it became more obvious that it's actually the right tool for the job. The conversion may, of course, incur a small cost, but compared to the other operations and the transfers between processes that are performed, I suspect that it's entirely negligible.
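
One way to check that suspicion would be to time the copy directly. This is only a rough sketch: the 4096-byte buffer size and the iteration count are arbitrary assumptions, since the thread does not say how large the fuzzer's inputs typically are, and no measured figures are claimed here.

```python
import timeit

# Hypothetical input size; real corpus entries may be smaller or larger.
buf = bytearray(4096)

iterations = 10000

# Time the bytearray -> bytes copy that happens before send_bytes().
seconds = timeit.timeit(lambda: bytes(buf), number=iterations)
per_copy_us = seconds / iterations * 1e6

print("approx. {:.3f} microseconds per copy".format(per_copy_us))
```

Comparing that figure against the time of one full fuzzing iteration would show whether the extra copy matters in practice.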

 if not parent_conn.poll(self._timeout):
-    self._p.kill()
+    self._p.terminate()
     logging.info("=================================================================")
     logging.info("timeout reached. testcase took: {}".format(self._timeout))
     self.write_sample(buf, prefix='timeout-')