Cohort Output Format
This file briefly describes the format of mishegos's binary output.
The binary format is an implementation detail and should only be of interest
to those working on mishegos itself; users looking to analyze mishegos's results should run
mish2jsonl and operate on the JSONL-formatted results.
Motivation
Earlier versions of mishegos dumped their results directly to JSONL. This required us to do JSON serialization and internal allocations in the fuzzing lifecycle, incurring a performance hit.
Format
Mishegos's binary output is a sequence of "cohorts", each of which contains N outputs
where N is the number of workers.
Each cohort begins with a header:
- nworkers (u32): The number of workers present in this output cohort
- input (u64+str): A length-prefixed, pretty-printed hex string of the input handled by this cohort
After the header, each cohort contains nworkers output records. Each output contains:
- status (u32): A status code corresponding to the decode_status enum
- ndecoded (u16): The number of bytes of input decoded
- workerno (u32): The worker's identifying index
- worker_so (u64+str): A length-prefixed string containing the path to the worker's dynamic shared object
- len (u16): The string length of the decoded instruction, or 0 if none is present
- result (str): A string of len bytes containing the decoded instruction
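The field layout above can be sketched as a small Python reader. This is an illustration only, not the project's parser. It assumes that integers are little-endian, that fields are tightly packed with no padding, that "u64+str" means a u64 byte count followed by that many bytes with no NUL terminator, and that strings decode as UTF-8. None of these are stated in this document. The names Output, Cohort, read_output and read_cohort are hypothetical.

```python
import io
import struct
from dataclasses import dataclass
from typing import BinaryIO, List


@dataclass
class Output:
    status: int      # raw decode_status value, left uninterpreted here
    ndecoded: int
    workerno: int
    worker_so: str
    result: str


@dataclass
class Cohort:
    input: str
    outputs: List[Output]


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes, raising EOFError if the stream is truncated."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"wanted {n} bytes, got {len(data)}")
    return data


def read_int(stream: BinaryIO, code: str) -> int:
    """Read one integer; "<" selects little-endian (an assumption) and no padding."""
    fmt = "<" + code
    return struct.unpack(fmt, read_exact(stream, struct.calcsize(fmt)))[0]


def read_prefixed_str(stream: BinaryIO) -> str:
    """Read a u64 length, then that many bytes of string data."""
    length = read_int(stream, "Q")
    return read_exact(stream, length).decode("utf-8", errors="replace")


def read_output(stream: BinaryIO) -> Output:
    """Read one output record, in the field order listed above."""
    status = read_int(stream, "I")        # u32
    ndecoded = read_int(stream, "H")      # u16
    workerno = read_int(stream, "I")      # u32
    worker_so = read_prefixed_str(stream)
    length = read_int(stream, "H")        # u16, 0 if nothing was decoded
    result = read_exact(stream, length).decode("utf-8", errors="replace")
    return Output(status, ndecoded, workerno, worker_so, result)


def read_cohort(stream: BinaryIO) -> Cohort:
    """Read a cohort header followed by nworkers output records."""
    nworkers = read_int(stream, "I")      # u32
    input_hex = read_prefixed_str(stream)
    outputs = [read_output(stream) for _ in range(nworkers)]
    return Cohort(input_hex, outputs)
```

Called on a file opened in binary mode, read_cohort consumes exactly one cohort and leaves the stream positioned at the start of the next one. If any of the assumptions above is wrong for a given build, the lengths read from the stream will be nonsensical and read_exact will typically fail early with EOFError.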
Visualized:
|-------------------------|
| cohort 1: nworkers: 3 |
| output 1 |
| output 2 |
| output 3 |
|-------------------------|
| cohort 2: nworkers: 3 |
| output 1 |
| output 2 |
| output 3 |
|-------------------------|
| cohort ... |
| .... |
|_________________________|
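Because cohorts are laid out back to back, a reader can loop until the stream ends on a cohort boundary. The following is a minimal sketch of that loop. It assumes there is no file-level header or trailer around the cohorts, which this document does not state, and that the stream is seekable. Both iter_cohorts and its read_cohort parameter are hypothetical names; read_cohort stands for any function that consumes exactly one cohort from the stream.

```python
import io
from typing import BinaryIO, Callable, Iterator, TypeVar

T = TypeVar("T")


def iter_cohorts(stream: BinaryIO, read_cohort: Callable[[BinaryIO], T]) -> Iterator[T]:
    """Yield one parsed cohort at a time until the stream is exhausted."""
    while True:
        # Probe a single byte to tell a clean end of file from another cohort.
        probe = stream.read(1)
        if not probe:
            return
        # Step back so read_cohort sees the cohort from its first byte.
        stream.seek(-1, io.SEEK_CUR)
        yield read_cohort(stream)
```

Yielding cohorts lazily keeps memory use flat on large result files. A stream that ends partway through a cohort is not handled here; it is left to read_cohort to report that however it sees fit.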
Implementation
Mishegos's binary output is transformed into JSONL via a parser specified in Kaitai Struct's DSL.