Legoclones/PickleFuzzer


PickleFuzzer

A grammar-based whitebox fuzzer that performs differential fuzzing between the three native implementations of Python pickles: pickle (Python), _pickle (C), and pickletools (Python disassembler). It patches and recompiles Python 3.13.0 to enable greater introspection into the pickle runtime environment, and may require modification to work with other Python versions.

To read more about the design decisions behind PickleFuzzer and its effectiveness, see our published article here.

To see a quick summary of what PickleFuzzer found after a week, see our results here.
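
As a rough illustration of what differential fuzzing between the three implementations means, the sketch below feeds one payload to each of them using a stock CPython and records which ones raise. It is not PickleFuzzer's code: the function names `compare` and `is_discrepancy` are hypothetical, and it only compares exceptions, whereas the patched interpreter also exposes stacks, memos, and metastacks. Unpickling untrusted bytes can execute arbitrary code, so anything like this should only run in a sandbox such as the provided container.

```python
import io
import pickle       # pickle._Unpickler is the pure-Python implementation
import pickletools  # disassembler, never executes the payload
import _pickle      # C accelerator module (present in standard CPython builds)


def _outcome(fn, data):
    """Run fn(data); return None on success or [ExceptionType, message] on failure."""
    try:
        fn(data)
        return None
    except Exception as exc:  # broad on purpose: any difference is interesting
        return [type(exc).__name__, str(exc)]


def compare(data: bytes):
    """Return outcomes in the order [pickle (Python), _pickle (C), pickletools]."""
    return [
        _outcome(lambda d: pickle._Unpickler(io.BytesIO(d)).load(), data),
        _outcome(_pickle.loads, data),
        _outcome(lambda d: pickletools.dis(d, out=io.StringIO()), data),
    ]


def is_discrepancy(outcomes) -> bool:
    """Hypothetical rule: flag payloads where the exception types differ."""
    kinds = {o[0] if o else None for o in outcomes}
    return len(kinds) > 1


if __name__ == "__main__":
    payload = pickle.dumps([1, 2, 3])  # a well-formed, harmless pickle
    outcomes = compare(payload)
    print(outcomes, is_discrepancy(outcomes))
```

Comparing only the exception type is a deliberately coarse rule; two implementations that both succeed but build different objects would need the deeper introspection that the patched interpreter provides.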

Quick Start

Build and run the container using:

docker compose up -d --build

This spawns 6 workers with a default memory limit of 12GB. The workers write discrepancy payloads to the ./discrepancies directory (mounted into the container). Additional configuration options can be set using environment variables, as described below.

Triage

Once the ./discrepancies/ directory contains discrepancy payloads, you can use triage.sh to parse them, remove duplicates, and print the most important information. Run this script only inside the Docker container built above, because it relies on the custom, patched version of Python to return the necessary information. You can run it like so:

docker build . -t picklefuzzer
docker run --rm -it --entrypoint=bash -v `pwd`/:/fuzz/ -w /fuzz picklefuzzer ./triage.sh

Unique exception discrepancies look like this:

Info: [None, ['UnpicklingError', 'pickle data was truncated', 193], None]
    Name: 642f02fbddb9deb08241580b12f18e8b
    Database: payloads_0015ccf15692.db
    Timestamp: 1767682967
    Count: 7248

The Info line holds one entry per implementation, in the order [PythonPickle, CPickle, PickleTools]. Each entry is either None (no exception was raised) or a list of ['ExceptionType', 'exception string', LineNumber]. The remaining fields give the payload name and database filename of the first instance of this unique exception discrepancy for further triage, along with the number of times the discrepancy occurred.
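
If you want to post-process the triage output, the Info line can be read back as a Python literal. The following is a minimal sketch under that assumption; `parse_info` and the dictionary keys are hypothetical names, not part of triage.sh.

```python
import ast

# Order of the entries on the Info line, as documented above.
IMPLEMENTATIONS = ("PythonPickle", "CPickle", "PickleTools")


def parse_info(line: str) -> dict:
    """Map each implementation to None or its exception details."""
    _, _, literal = line.partition("Info:")
    entries = ast.literal_eval(literal.strip())  # safe: literals only, no code execution
    if len(entries) != len(IMPLEMENTATIONS):
        raise ValueError(f"expected {len(IMPLEMENTATIONS)} entries, got {len(entries)}")
    result = {}
    for name, entry in zip(IMPLEMENTATIONS, entries):
        if entry is None:
            result[name] = None  # this implementation raised nothing
        else:
            exc_type, message, line_no = entry
            result[name] = {"type": exc_type, "message": message, "line": line_no}
    return result


if __name__ == "__main__":
    sample = "Info: [None, ['UnpicklingError', 'pickle data was truncated', 193], None]"
    for impl, details in parse_info(sample).items():
        print(f"{impl:12} {details}")
```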

Inspecting Individual Files

To see more information about an individual payload, run src/triage.py with the database name and the payload name as additional arguments. As an example:

$ docker run --rm -it --entrypoint=python3 -v `pwd`/:/fuzz/ -w /fuzz picklefuzzer src/triage.py payloads_0015ccf15692.db 642f02fbddb9deb08241580b12f18e8b
Using database: ./discrepancies/payloads_0015ccf15692.db
Payload    : b"C\xc3\xb65\xb6{Y\x16\xf6\x12y\xa2G\xdd\x878r\xa9\x1d\xfa\x01K\x89\xe6\x8d\xfa\x93\x0e\xe1\xd1\xbe\x13/\r\x9a\xb1\xe1\x1b\xf2[b;\xfe\x8c\x9e\xac\xb5\xd9\x0e\xd4\xb4\x92\xd2\x9c\xcc\xb4\x0e\x7f\x9f\xf1\xa9\xa3\x96\x16\x85?\xa2-V\xfcv\xca\xa0h\x8d\xb1\x82\xe0L\x12\xf8Z\r\xce\xac\x85`\x9a\x9dD\xd72\x8c\x07\xe5\x06\x95\xba\x83`]a\xe5\xa9D\xda\xb7sq\xd2\xb8mk\xd4A\x8bID\xaaO<\xde\xb0\x98\xcer\x8e\xf9\xd9e%\x06\xe5\xc4`}\x02\xbe\x1c\xd8n\xe8\x01\xb0L\x87\xa9\xb3\xe2g\x01\xff\x86\xb6\xcc:\x80\xdbta\x9e1:ryf\xb1W\xd2\x96g\xcd\x8b\xf1c\xba\\\xf9\xdc\x94E$0\x9ey\xb8\xd2\xdfzg\x8e\x14o\xbb\x96\xc0\xfc\x95\n\xbc\xe7\xd0\xe5\x1a\xf5\x12\x8a\xbe6\x8f\xbc&\x02\xc0\x90\xa0'\xcc\xdeL\x96\xfb\xc3RH\xe4W\x10\xb4\xc0\xa0\xf2[\xae?a\x03\xe1j\xc3\x02\xde\x81\xe1e\xa0A\xfb\xa5b\x16\xe5_\xcc\x81\xa1\x8a\xf7\xe2\xab\xd1\xda\x91`T\xc1cxf`\xaf9\xc6\x1c\xd0S\xae#2\x90\x99Z*\x04\xa6$\xbc\x9d[\xf7\x94Ba\xf7\x14\xbe\x85$\xf2\xa2\xa4\xa0b\xa9w=\r\x9c!\xcb\x19[\xc6\xad\xf5\x19F]=|GJ\xb4\x16\xe0T\xfe0\xb6\xc3*\xb1\x12\x07\xb5\xab\x03\xa0\xec\xa2k>\xbc k\x13\xb8\xed\x81\x991\x9dP\xday-\xddt_r\xb1\xa8\xa4g\x9c\xc2\xc3\x7f\nI\xef+i\xe9\xfd\x1c\xb9\x00\xd2~\xb6\xa51Q\x9c\xb4\xd8W\xaf\xb3\x82;\xb5\nn%\x14\x0b."
Stacks     : [[b'\xb65\xb6{Y\x16\xf6\x12y\xa2G\xdd\x878r\xa9\x1d\xfa\x01K\x89\xe6\x8d\xfa\x93\x0e\xe1\xd1\xbe\x13/\r\x9a\xb1\xe1\x1b\xf2[b;\xfe\x8c\x9e\xac\xb5\xd9\x0e\xd4\xb4\x92\xd2\x9c\xcc\xb4\x0e\x7f\x9f\xf1\xa9\xa3\x96\x16\x85?\xa2-V\xfcv\xca\xa0h\x8d\xb1\x82\xe0L\x12\xf8Z\r\xce\xac\x85`\x9a\x9dD\xd72\x8c\x07\xe5\x06\x95\xba\x83`]a\xe5\xa9D\xda\xb7sq\xd2\xb8mk\xd4A\x8bID\xaaO<\xde\xb0\x98\xcer\x8e\xf9\xd9e%\x06\xe5\xc4`}\x02\xbe\x1c\xd8n\xe8\x01\xb0L\x87\xa9\xb3\xe2g\x01\xff\x86\xb6\xcc:\x80\xdbta\x9e1:ryf\xb1W\xd2\x96g\xcd\x8b\xf1c\xba\\\xf9\xdc\x94E$0\x9ey\xb8\xd2\xdfzg\x8e\x14o\xbb\x96\xc0\xfc', 1591629920383856020031790383805302478501162222885503766343539668234125664673379342039229908806298575082140611377205020097965303000568258020328682374905761946529806612495110122530263794924614241472225880772273166151637509882700757293609620229958713416868608109661342012774378406571330790410147523255817828937366733619678895525844440794971039810685567032845956722194446238887136202035503619905680001317468052894282046996512700221459816099521696194987231579958], None]
Memos      : [{}, None]
Metastacks : [[], None]
Exceptions : [None, ['UnpicklingError', 'pickle data was truncated', 193], None]
Encoding   : utf-8
Buffers    : [2]
===========================================
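
When a payload like the one above is hard to read as raw bytes, listing its opcodes with the standard library's `pickletools.genops` can help, since it walks the byte stream without executing it. This is a small sketch independent of src/triage.py; `list_opcodes` is a hypothetical helper, and the two payloads are hand-made examples rather than fuzzer output.

```python
import pickletools


def list_opcodes(data: bytes):
    """Return (position, opcode name, argument) tuples; append an ERROR row if parsing fails."""
    ops = []
    try:
        for opcode, arg, pos in pickletools.genops(data):
            ops.append((pos, opcode.name, arg))
    except Exception as exc:
        # genops raises when an opcode is unknown or its argument is cut short.
        ops.append((None, "ERROR", f"{type(exc).__name__}: {exc}"))
    return ops


if __name__ == "__main__":
    # 'C' is SHORT_BINBYTES: one length byte followed by that many data bytes.
    for row in list_opcodes(b"C\x03abc."):
        print(row)
    # The same opcode with a length byte larger than the remaining data.
    for row in list_opcodes(b"C\x05ab"):
        print(row)
```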

Configuration

To set configuration options for the fuzzer, use environment variables in the docker-compose.yml file.

| Name | Default | Description |
| --- | --- | --- |
| WORKERS | 6 | The number of simultaneous processes generating and processing fuzzing payloads. |
| DEBUG | False | If enabled, print the configuration at startup, plus information such as the payload, encoding, and storage area results for each fuzzing payload. |
| OPCODE_NUM_MIN | 1 | The minimum number of opcodes randomly generated for each fuzzing payload. |
| OPCODE_NUM_MAX | 20 | The maximum number of opcodes randomly generated for each fuzzing payload. |
| BUFFERS | 3 | The number of random elements to put in each out-of-band buffer provided to the Unpickler classes. |
| MAX_INT_DIGITS | 10 | For opcodes expecting a number specified using ASCII digits, this limits the total number of digits. As an example, the default value of 10 means the maximum number is 10**10 - 1. |
| MAX_DATA_LENGTH | 300 | For opcodes with unlimited variable-length data (like bytearrays), this sets the maximum length. |
| PUT_HARD_CAP | True | If the PUT and LONG_BINPUT opcodes have large arguments, this can cause OOM errors in older versions of Python. This setting limits those two opcodes specifically to prevent excessive memory consumption. |
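
To make the defaults concrete, here is a minimal sketch of how options like these could be read from the environment in Python. It is an assumption about the general pattern, not PickleFuzzer's actual loader: `load_config`, `max_ascii_int`, and the accepted boolean spellings are hypothetical.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "WORKERS": 6,
    "DEBUG": False,
    "OPCODE_NUM_MIN": 1,
    "OPCODE_NUM_MAX": 20,
    "BUFFERS": 3,
    "MAX_INT_DIGITS": 10,
    "MAX_DATA_LENGTH": 300,
    "PUT_HARD_CAP": True,
}


def load_config(env=None) -> dict:
    """Overlay environment variables on the defaults, converting to each default's type."""
    env = os.environ if env is None else env
    config = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            config[name] = default
        elif isinstance(default, bool):  # check bool first: bool is a subclass of int
            config[name] = raw.strip().lower() in ("1", "true", "yes")
        else:
            config[name] = int(raw)
    return config


def max_ascii_int(digits: int) -> int:
    """Largest non-negative integer that can be written with the given number of decimal digits."""
    return 10 ** digits - 1


if __name__ == "__main__":
    cfg = load_config()
    print(cfg)
    print(max_ascii_int(cfg["MAX_INT_DIGITS"]))
```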

Resource limits can also be set for the container through Docker's resource constraints. By default, the memory limit is 12GB, and it can be changed in docker-compose.yml. memswap_limit is also set to 12GB so that no swap memory is used; otherwise swap fills up very quickly.
