Awesome binary parsing

A list of generic tools for parsing binary data structures, such as file formats, network protocols or bitstreams.

Parser generators, parsing libraries and frameworks

  • Kaitai Struct (DSL): declarative language used to describe various binary data structures, laid out in files or in memory
  • Nom (Rust): Rust parser combinator framework
  • Hammer (C): bit-oriented parsing library
  • Construct (Python): library for parsing and building of data structures (binary or textual). Define your data structures in a declarative manner
  • Spicy (DSL, C/C++, Zeek): a next-generation parser generator for network protocols and file formats
  • Hachoir (Python): view and edit a binary stream field by field. Long list of parsers for all kinds of formats
  • Caterpillar (Python): Python 3.12+ library to pack and unpack structured binary data
  • RecordFlux: toolset for the formal specification of messages and the generation of verifiable binary parsers and message generators (Ada-inspired).
  • DataScript Tools (DSL): DataScript is a formal language for modelling binary datatypes, bitstreams or file formats. PDF
  • Parsifal (OCaml): OCaml-based parsing engine. Paper: A pragmatic solution to the binary parsing problem. Olivier Levillain
  • Haka (Lua): open source security-oriented language for describing protocols and applying security policies on (live) captured traffic
  • BinData (Ruby): provides a declarative way to read and write structured binary data
  • Binary-parser (JavaScript): binary parser builder library which enables you to write efficient parsers in a simple & declarative way
  • Gloss (Clojure): turn complicated byte formats into Clojure data structures and Clojure data structures into compact byte representations
  • Preon (Java): Bit syntax for Java. A declarative data binding framework for dealing with binary encoded data
  • attoparsec and attoparsec-binary (Haskell): fast parser combinator library, aimed particularly at dealing efficiently with network protocols and complicated text/binary file formats
  • Marpa (C/C++, Perl, Go): libmarpa (C)
  • Scapy (Python): send, sniff, dissect and forge network packets. Usable interactively or as a library
  • libtins (C++): crafting, sending, sniffing and interpreting raw network packets
  • libcrafter (C++): high level library for C++ designed to create and decode network packets
  • scodec (Scala): Combinator library for working with binary data
  • Apache Daffodil (Scala/Java, XML Schema): an open-source implementation of DFDL (Data Format Description Language) capable of describing many industry and military standards, parsing them into an infoset (most commonly represented as either XML or JSON), and writing back to the native format.
  • binarylang (Nim, DSL): extensible Nim DSL for creating binary parsers/encoders in a symmetric fashion
  • binaryparse (Nim, DSL): In-language DSL for reading and writing binary data supporting all sorts of patterns. Generates an efficient stream based reader and writer for the runtime execution.
  • FlexT: a DSL and a tool for generating parsers in Delphi.
  • FormatFuzzer (C++): framework for high-efficiency, high-quality generation and parsing of binary inputs
  • Deku (Rust): bit-level, symmetric, serialization/deserialization implementations for structs and enums
  • restruct (Go): library for reading and writing binary data
  • Mr. Crowbar (Python): Django-esque model framework for reading and writing binary file formats. Includes a suite of command-line tools for visualising and digging through binary data.
  • jBinary (JavaScript): high-level API for working with binary data.
  • Wuffs: a memory-safe programming language (and a standard library written in that language) for Wrangling Untrusted File Formats Safely. Wrangling includes parsing, decoding and encoding.
  • EverParse: a framework for generating verified secure parsers and formatters from domain-specific format specification languages
  • binrw (Rust): binrw helps you write maintainable & easy-to-read declarative binary data readers and writers using ✨macro magic✨.
  • Dogma (DSL): human-friendly metalanguage for describing data formats in documentation using the familiar patterns of Backus-Naur Form.

Stand-alone software

Hex editors with grammars
  • Synalyze It!
  • Hexinator
  • 010 Editor
  • Kiewtai: plugin for the Hiew hex editor that makes the Kaitai parsers available
  • Hobbits: multi-platform GUI for bit-based analysis, processing, and visualization. Has a Kaitai plugin.
  • ImHex: A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
  • fq: jq for binary formats - tool, language and decoders for working with binary and text formats.

Wireshark is a network protocol analyzer that includes dissectors for over two thousand protocols.

  • TShark: command line version, can easily be called from shell scripts.
  • Wireshark Generic Dissector: add-on, allows dissection of a protocol based on a text description of the protocol elements
  • Wireshark Lua: dissectors can be written in Lua (Examples)
  • pyreshark: plugin providing a simple interface for writing Wireshark dissectors in Python
  • Sharktools (Python, Matlab): Tools for programmatic parsing of packet captures using Wireshark functionality

Other stand-alone software
  • GNU poke: The extensible editor for structured binary data
  • Netzob: open source tool for reverse engineering, traffic generation and fuzzing of communication protocols
  • Cat Karat Packet Builder: packet generation tool for building custom packets for firewall or target testing
  • radare2 (C, with bindings/pipe for almost all languages): Unix-like reverse engineering framework and commandline tools. See Parsing a fileformat with radare2 and Types.
  • Veles: open source tool for binary analysis

Research papers

  • LangSec Platform: Towards a Platform to Compare Binary Parser Generators. Olivier Levillain, Sébastien Naud, Aina Toky Rasoamanana (Video)
  • Interval Parsing Grammars for File Format Parsing. Jialun Zhang, Greg Morrisett, Gang Tan
  • Narcissus: Correct-By-Construction Derivation of Decoders and Encoders from Binary Formats. Benjamin Delaware, Sorawit Suriyakarn, Clément Pit-Claudel, Qianchuan Ye, Adam Chlipala
  • EverParse: Verified Secure Zero-Copy Parsers for Authenticated Message Formats. Tahina Ramananandro et al.
  • Nail: A Practical Tool for Parsing and Generating Data Formats. Julian Bangert and Nickolai Zeldovich
  • Generic packet descriptions: Verified parsing and pretty printing of low-level data. Marcell van Geest, Wouter Swierstra
  • GAPA: Generic Application-Level Protocol Analyzer and its Language. Nikita Borisov, David J. Brumley, Helen J. Wang, Chuanxiong Guo
  • PADS/ML: a functional data description language. Y. Mandelbaum, K. Fisher, D. Walker, M. F. Fernandez, and A. Gleyzer.
  • PacketTypes: Abstract Specification of Network Protocol Messages. P. J. McCann and S. Chandra
  • Zebu: A Language-Based Approach for Improving the Robustness of Network Application Protocol Implementations. Laurent Burgy et al.
  • Zebra: Improving the Performance of Message Parsers for Embedded Systems. Jigar Solanki et al.
  • z2z: Automatic Generation of Network Protocol Gateways. Yerom-David Bromberg, Laurent Reveillere, Julia L. Lawall, Gilles Muller
  • Yakker: Semantics and Algorithms for Data-dependent Grammars. Trevor Jim, Yitzhak Mandelbaum, David Walker
  • BinPAC: Superseded by BinPAC++, which is now known as Spicy
  • FlowSifter: High-Speed Application Protocol Parsing and Extraction for Deep Flow Inspection. Alex X. Liu, Chad R. Meiners, Eric Norige, and Eric Torng
  • TSN.1: Transfer Syntax Notation One (TSN.1). A formal notation for describing messages in binary protocols
  • NetPDL: Markup Language that aims to describe Protocols from OSI layer 2 to OSI layer 7
  • Tupni: Automatic Reverse Engineering of Input Formats. Weidong Cui et al.
  • Grammar-Based Specification and Parsing of Binary File Formats. William Underwood

Lists of interesting binary formats

This is obviously rather subjective and definitely not supposed to be a complete list:

Related topics

