# Atheris: Structure-aware fuzzing with CustomMutator

### Random mutation rarely produces valid structured data

In [None]:
import atheris
import zlib
import sys


@atheris.instrument_func
def TestOneInput(data):
  try:
    decompressed = zlib.decompress(data)
  except zlib.error:
    return

  if len(decompressed) < 2:
    return

  try:
    if decompressed.decode() == 'FU':
      raise RuntimeError('Boom')
  except UnicodeDecodeError:
    pass
  
def main():
  atheris.Setup(sys.argv, TestOneInput)
  atheris.Fuzz()

if __name__ == '__main__':
  main()

Note: an introduction to `zlib` can be found in [this notebook](./support_material/07_zlib_demo.ipynb).

To reach the `RuntimeError` crash, the fuzzer needs to be able to produce inputs that are valid compressed data and satisfy the checks after decompression. 

It is very unlikely that Atheris will produce such inputs through byte-level mutation alone: mutating compressed data almost always yields an invalid stream that is rejected during decompression, so the checks behind it are never reached.
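To make this concrete, here is a minimal sketch (standard library only, not part of the Atheris examples) that changes one random byte of a compressed stream per trial and counts how many mutated streams still decompress. The helper name `count_valid_after_mutation` and the single-byte XOR mutation are assumptions chosen for illustration; libFuzzer's real mutators are more varied, but they hit the same header and checksum validation.

```python
import random
import zlib


def count_valid_after_mutation(payload: bytes, trials: int = 1000, seed: int = 0) -> int:
    """Mutate one random byte of the compressed stream per trial and count
    how many mutated streams still decompress without error."""
    rng = random.Random(seed)
    compressed = zlib.compress(payload)
    valid = 0
    for _ in range(trials):
        mutated = bytearray(compressed)
        pos = rng.randrange(len(mutated))
        # XOR with a non-zero value so the byte is guaranteed to change.
        mutated[pos] ^= rng.randrange(1, 256)
        try:
            zlib.decompress(bytes(mutated))
        except zlib.error:
            continue  # header, deflate data, or Adler-32 checksum was broken
        valid += 1
    return valid


if __name__ == "__main__":
    trials = 1000
    survivors = count_valid_after_mutation(b"FU", trials)
    print(f"{survivors} of {trials} single-byte mutations still decompress")
```

Only a small fraction of mutations should survive, and the survivors typically touch header flags or padding bits, so the decompressed payload stays the same and no new code is covered.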

### Custom mutator function: CustomMutator

To overcome this issue, you can define a custom mutator function (equivalent to `LLVMFuzzerCustomMutator`). 

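The example below decompresses the input, mutates the decompressed payload, and compresses the result again, so every input it produces is valid compressed data.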

To enable Atheris to make use of it, pass the custom mutator function to the invocation of `atheris.Setup`.

In [None]:
# learn/support_material/07_custom_mutator.py
import atheris

with atheris.instrument_imports():
    import sys
    import zlib


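def CustomMutator(data, max_size, seed):
    # max_size and seed are supplied by libFuzzer; this simple example ignores them.
    try:
        decompressed = zlib.decompress(data)
    except zlib.error:
        # The input is not valid zlib data yet: start from a known payload.
        decompressed = b"Hi"
    else:
        # Mutate the payload, not the compressed bytes, using libFuzzer's
        # built-in mutators; the second argument caps the output size.
        decompressed = atheris.Mutate(decompressed, len(decompressed))
    # Recompress so the fuzz target always receives a valid zlib stream.
    return zlib.compress(decompressed)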


@atheris.instrument_func  # Instrument the TestOneInput function itself
def TestOneInput(data):
    """The entry point for our fuzzer.

    This is a callback that will be repeatedly invoked with different arguments
    after Fuzz() is called.
    We translate the arbitrary byte string into a format our function being fuzzed
    can understand, then call it.

    Args:
      data: Bytestring coming from the fuzzing engine.
    """

    try:
        decompressed = zlib.decompress(data)
    except zlib.error:
        return

    if len(decompressed) < 2:
        return

    try:
        if decompressed.decode() == "FU":
            raise RuntimeError("Boom")
    except UnicodeDecodeError:
        pass


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--no_mutator":
        atheris.Setup(sys.argv, TestOneInput)
    else:
        atheris.Setup(sys.argv, TestOneInput, custom_mutator=CustomMutator)
    atheris.Fuzz()
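The decompress, mutate, recompress pattern can be tried without running the fuzzer. The sketch below is an illustration only: `toy_mutate` is a hypothetical stand-in for `atheris.Mutate` (it just overwrites one random byte), and `structure_aware_mutate` and `run_chain` are names invented here. It shows that every output in a chain of mutations remains a valid zlib stream.

```python
import random
import zlib


def toy_mutate(payload: bytes, rng: random.Random) -> bytes:
    """Hypothetical stand-in for atheris.Mutate: overwrite one random byte."""
    if not payload:
        return bytes([rng.randrange(256)])
    buf = bytearray(payload)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def structure_aware_mutate(data: bytes, rng: random.Random) -> bytes:
    """Same shape as CustomMutator: mutate the payload, then recompress."""
    try:
        payload = zlib.decompress(data)
    except zlib.error:
        payload = b"Hi"  # invalid input: fall back to a known payload
    else:
        payload = toy_mutate(payload, rng)
    return zlib.compress(payload)


def run_chain(steps: int = 500, seed: int = 0) -> int:
    """Apply the mutator repeatedly and count outputs that decompress."""
    rng = random.Random(seed)
    data = b""  # like the fuzzer, start from an empty (invalid) input
    valid = 0
    for _ in range(steps):
        data = structure_aware_mutate(data, rng)
        zlib.decompress(data)  # would raise zlib.error on an invalid stream
        valid += 1
    return valid


if __name__ == "__main__":
    print(run_chain(), "mutated inputs, all valid zlib streams")
```

Because validity is restored after every mutation, all of the fuzzer's effort goes into exploring the checks that run after decompression.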


In [None]:
# Run the fuzzer without the custom mutator
python3 ../example_fuzzers/custom_mutator_example.py --no_mutator

In [None]:
# CMD Feedback
INFO: Using built-in libfuzzer
WARNING: Failed to find function "__sanitizer_acquire_crash_state".
WARNING: Failed to find function "__sanitizer_print_stack_trace".
WARNING: Failed to find function "__sanitizer_set_death_callback".
INFO: libFuzzer ignores flags that start with '--'
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2642083398
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 40Mb
#4194304        pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 1398101 rss: 40Mb
#8388608        pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 1198372 rss: 40Mb
#16777216       pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 1118481 rss: 40Mb
#33554432       pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 1118481 rss: 40Mb

In [None]:
# Run the fuzzer with the custom mutator; it reaches the crash much faster
python3 ../example_fuzzers/custom_mutator_example.py

In [None]:
# CMD Feedback
INFO: Using built-in libfuzzer
WARNING: Failed to find function "__sanitizer_acquire_crash_state".
WARNING: Failed to find function "__sanitizer_print_stack_trace".
WARNING: Failed to find function "__sanitizer_set_death_callback".
INFO: found LLVMFuzzerCustomMutator (0x7f3f448c2920). Disabling -len_control by default.
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 4015017639
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 41Mb
#3      NEW    cov: 4 ft: 4 corp: 2/11b lim: 4096 exec/s: 0 rss: 41Mb L: 10/10 MS: 1 Custom-
#4      NEW    cov: 5 ft: 5 corp: 3/20b lim: 4096 exec/s: 0 rss: 41Mb L: 9/10 MS: 2 EraseBytes-Custom-
#6      NEW    cov: 6 ft: 6 corp: 4/30b lim: 4096 exec/s: 0 rss: 41Mb L: 10/10 MS: 3 Custom-ChangeBinInt-Custom-

 === Uncaught Python exception: ===
RuntimeError: Boom
Traceback (most recent call last):
  File "/home/atheris/learn/../example_fuzzers/custom_mutator_example.py", line 62, in TestOneInput
    raise RuntimeError('Boom')
RuntimeError: Boom

==18== ERROR: libFuzzer: fuzz target exited
SUMMARY: libFuzzer: fuzz target exited
MS: 8 ShuffleBytes-Custom-ChangeBit-Custom-CopyPart-Custom-CMP-Custom- DE: "FU"-; base unit: 257aeb4480115fd941e6bdf0fe312c707e7c1265
0x78,0x9c,0x73,0xb,0x5,0x0,0x0,0xe3,0x0,0x9c,
x\234s\013\005\000\000\343\000\234
artifact_prefix='./'; Test unit written to ./crash-74c6edf3c13bff4e28eee233181ef6107700f0d9
Base64: eJxzCwUAAOMAnA==

In [None]:
# Display the contents of the crash file in hexadecimal and ASCII
hexdump -C crash-74c6edf3c13bff4e28eee233181ef6107700f0d9
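The crashing input can also be checked in Python without the artifact file, because the libFuzzer report prints it as Base64. This small sketch decodes that string and decompresses it to confirm that the payload is the `FU` string the fuzz target tests for.

```python
import base64
import zlib

# Base64 string copied from the libFuzzer crash report.
crash_b64 = "eJxzCwUAAOMAnA=="

crash_bytes = base64.b64decode(crash_b64)
print(crash_bytes.hex())  # 789c730b050000e3009c, the same bytes as in the report

payload = zlib.decompress(crash_bytes)
print(payload)  # b'FU'
```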

### CustomCrossOver

Atheris also supports a custom crossover function, for example one that concatenates two inputs. Pass it to `atheris.Setup`.

See its usage in [custom_crossover_fuzz_test.py](../src/custom_crossover_fuzz_test.py).
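As a rough sketch of what such a function might look like for the zlib example, the code below concatenates the decompressed payloads of two inputs and compresses the result. It assumes the callback receives `(data1, data2, max_out_size, seed)`, mirroring libFuzzer's `LLVMFuzzerCustomCrossOver`, and that `atheris.Setup` accepts it through a `custom_crossover` keyword; check both against the linked test file before relying on them. The helper `_payload` is a name invented here.

```python
import zlib


def _payload(data: bytes) -> bytes:
    """Return the decompressed payload, or b"" if data is not valid zlib."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return b""


def CustomCrossOver(data1: bytes, data2: bytes, max_out_size: int, seed: int) -> bytes:
    """Combine two inputs by concatenating their decompressed payloads.

    seed is unused in this deterministic sketch.
    """
    merged = zlib.compress(_payload(data1) + _payload(data2))
    if len(merged) > max_out_size:
        return data1  # too large: fall back to the first parent unchanged
    return merged


# Assumed registration (keyword name not confirmed by this notebook):
# atheris.Setup(sys.argv, TestOneInput, custom_crossover=CustomCrossOver)
```

Crossing over at the payload level, like mutating at the payload level, keeps the combined input a valid compressed stream.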