Multiple changes; python 2 support, dictionary, mutator refactor #26

Merged
merged 15 commits into from Jan 19, 2020
Changes from 1 commit
Updates for running on Python 2.

The tool is now able to run on Python 2 with a suitable LRU Cache.
Obviously Python 2 isn't as useful to many people, but when working
with libraries that are Python 2-only, it's necessary to make the
tools work with it.
gerph committed Jan 6, 2020
commit 9aa183f81671bdbb18a1b3232448c851c5018846
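
The LRU cache fallback described in the commit message can be sketched roughly as follows. This is a minimal illustration rather than the project's exact code: it assumes the `functools32` backport is installed on Python 2, and `cached_abspath` is a hypothetical function used only to show the decorator in use.

```python
import functools
import os

try:
    # Python 3.2+ ships lru_cache in the standard library.
    lru_cache = functools.lru_cache
except AttributeError:
    # Assumption: on Python 2 the functools32 backport provides it.
    from functools32 import lru_cache


@lru_cache(maxsize=None)
def cached_abspath(path):
    """Hypothetical example: cache the absolute form of `path`."""
    return os.path.abspath(path)
```

Catching `AttributeError` specifically, rather than using a bare `except`, keeps unrelated failures visible.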
@@ -199,6 +199,7 @@ def mutate(self, buf):
v = struct.pack('>H', v)
else:
v = struct.pack('<H', v)
+v = bytearray(v)
res[pos] = numpy.uint8(res[pos] + v[0])
res[pos+1] = numpy.uint8(res[pos] + v[1])
elif x == 9:
@@ -214,6 +215,7 @@ def mutate(self, buf):
v = struct.pack('>I', v)
else:
v = struct.pack('<I', v)
+v = bytearray(v)
for k in range(4):
res[pos+k] = numpy.uint8(res[pos+k] + v[k])
elif x == 10:
@@ -229,6 +231,7 @@ def mutate(self, buf):
v = struct.pack('>Q', v)
else:
v = struct.pack('<Q', v)
+v = bytearray(v)
for k in range(8):
res[pos+k] = numpy.uint8(res[pos+k] + v[k])
elif x == 11:
@@ -249,6 +252,7 @@ def mutate(self, buf):
v = struct.pack('>H', v)
else:
v = struct.pack('<H', v)
+v = bytearray(v)
res[pos] = numpy.uint8(v[0])
res[pos+1] = numpy.uint8(v[1])
elif x == 13:
@@ -262,6 +266,7 @@ def mutate(self, buf):
v = struct.pack('>I', v)
else:
v = struct.pack('<I', v)
+v = bytearray(v)
for k in range(4):
res[pos+k] = numpy.uint8(v[k])
elif x == 14:
@@ -16,6 +16,13 @@

SAMPLING_WINDOW = 5 # IN SECONDS

+try:
+lru_cache = functools.lru_cache
+except:
+import functools32
+lru_cache = functools32.lru_cache


if coverage.version.version_info <= (5, ):
# Since we're using an old version of coverage.py,
# we're monkey patching it a bit to improve the performances.
@@ -24,7 +31,7 @@
# function triggers a lot of syscalls.
# See the benchmarks here:
# - https://github.com/fuzzitdev/pythonfuzz/issues/9
-@functools.lru_cache(None)
+@lru_cache(None)
def abs_file_cache(path):
"""Return the absolute normalized form of `path`."""
try:
@@ -59,6 +66,7 @@ def write(self, x):
try:
target(buf)
except Exception as e:
+print("Exception: %r\n" % (e,))
logging.exception(e)
child_conn.send(e)
break
@@ -114,11 +122,14 @@ def write_sample(self, buf, prefix='crash-'):
crash_path = self._exact_artifact_path
else:
crash_path = prefix + m.hexdigest()
+logging.info('sample written to {}'.format(crash_path))
+if len(buf) < 200:
+try:
+logging.info('sample = {}'.format(buf.hex()))
+except AttributeError:
+logging.info('sample = {!r}'.format(buf))
with open(crash_path, 'wb') as f:
f.write(buf)
-logging.info('sample was written to {}'.format(crash_path))
-if len(buf) < 200:
-logging.info('sample = {}'.format(buf.hex()))

def start(self):
logging.info("#0 READ units: {}".format(self._corpus.length))
@@ -134,7 +145,7 @@ def start(self):
break

buf = self._corpus.generate_input()
-parent_conn.send_bytes(buf)
+parent_conn.send_bytes(bytes(buf))

yevgenypats Jan 8, 2020

Contributor

Isn't it bytes already? I think this might slow down the fuzzer if it's an unnecessary copy.

gerph Jan 8, 2020

Author Contributor

There's a significant difference in how Python 2 and Python 3 handle 'bytes'.
In Python 3, bytes is an immutable sequence of values which, when iterated over or indexed, yields integers.
In Python 2, bytes is an alias for str, which, when iterated over or indexed, yields single-character str objects.
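
As a rough illustration of that difference (a sketch only; `element_types` is a hypothetical helper, not part of the project), the element type seen during iteration can be checked like this:

```python
import sys


def element_types(data):
    """Return the set of type names seen when iterating over `data`."""
    return {type(item).__name__ for item in data}


raw = b"\x01\x02\x03"

# On Python 3 this is {'int'}; on Python 2 the same call gives {'str'}.
bytes_types = element_types(raw)

# bytearray yields integers on both major versions.
bytearray_types = element_types(bytearray(raw))
```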

This difference makes working with bytes in the mutators impractical: the code would need to handle elements that are integers in one case and str objects in the other. That would be tiresome and difficult to work with. If the data is handled as a bytearray instead, the situation is much simpler.

In both Python 2 and Python 3, a bytearray can be initialised just like a bytes object, and it iterates as a sequence of integers. It can be converted to a bytes object easily without encoding issues, and manipulated just as we were handling the bytes object. Because we're actually working with a sequence of bytes which we want to mutate, a bytearray is the ideal container to use. The Python 3 documentation describes it as:

The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has...

(https://docs.python.org/3/library/functions.html#func-bytearray)

I originally moved to bytearray because 'it just worked', but as I looked at what it was, it became clearer that it's actually the right tool for the job. The conversion may well incur a small cost, but compared to the other operations and the transfers between processes that are performed, I suspect that it's entirely negligible.
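
To make the bytearray approach concrete, here is a minimal sketch of a mutation step in the spirit of the diff above. It is not the project's mutator: `add_uint16` is a hypothetical name, and it wraps with `& 0xFF` where the project uses `numpy.uint8`, so that the example needs only the standard library.

```python
import struct


def add_uint16(buf, pos, value, big_endian=False):
    """Add a packed 16-bit value byte by byte into `buf` at `pos`.

    Each byte wraps modulo 256 independently (no carry between bytes).
    """
    res = bytearray(buf)  # mutable copy; elements are ints on Python 2 and 3
    fmt = '>H' if big_endian else '<H'
    packed = bytearray(struct.pack(fmt, value & 0xFFFF))
    for k in range(2):
        res[pos + k] = (res[pos + k] + packed[k]) & 0xFF
    return bytes(res)  # convert back for sending between processes
```

Because both `res` and `packed` are bytearrays, the arithmetic is on plain integers regardless of the Python version, which is the simplification the comment describes.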

if not parent_conn.poll(self._timeout):
self._p.kill()
logging.info("=================================================================")
@@ -15,14 +15,13 @@
install_requires=[
'coverage==4.5.4',
'psutil==5.6.3',
-'numpy==1.17.3',
+'numpy<1.17'
],
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
"Topic :: Software Development :: Testing"
],
-python_requires='>=3.5.3',
packages=setuptools.find_packages('.', exclude=("examples",))
)