JOSS manuscript #185

Draft: wants to merge 4 commits into base: master

1 change: 1 addition & 0 deletions README.md
@@ -2,6 +2,7 @@

[![build](https://github.com/keichi/binary-parser/workflows/build/badge.svg)](https://github.com/keichi/binary-parser/actions?query=workflow%3Abuild)
[![npm](https://img.shields.io/npm/v/binary-parser)](https://www.npmjs.com/package/binary-parser)
[![status](https://joss.theoj.org/papers/ec35c0e3ccc750a5cdab9771e5a6bf21/status.svg)](https://joss.theoj.org/papers/ec35c0e3ccc750a5cdab9771e5a6bf21)

Binary-parser is a parser builder for JavaScript that enables you to write
efficient binary parsers in a simple and declarative manner.
Binary file added paper/benchmark.pdf
Binary file not shown.
100 changes: 100 additions & 0 deletions paper/paper.bib
@@ -0,0 +1,100 @@
@misc{djiparsetxt,
author = {Christian Velez},
title = {Decrypts and parse DJI logs in node},
publisher = {GitHub},
journal = {GitHub repository},
year = {2020},
url = {https://github.com/chrisvm/node-djiparsetxt}
}

@misc{libsbp,
author = {{Swift Navigation}},
title = {Swift Binary Protocol client libraries},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
url = {https://github.com/swift-nav/libsbp}
}

@misc{nimrod,
author = {Starbeamrainbowlabs},
title = {Data downloader for the 1km NIMROD rainfall radar data},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
url = {https://github.com/sbrl/nimrod-data-downloader}
}

@misc{flexradio,
author = {Stephen Houser},
title = {NodeRed nodes for working with FlexRadio 6xxx series software defined radios},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
url = {https://github.com/stephenhouser/node-red-contrib-flexradio}
}

@misc{linky,
author = {Zehir},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
url = {https://github.com/Zehir/eesmart-d2l}
}

@misc{maxcul,
author = {Florian Beek},
title = {A pimatic Plugin to control MAX! Heating devices over a Busware CUL stick},
publisher = {GitHub},
journal = {GitHub repository},
year = {2020},
url = {https://github.com/fbeek/pimatic-maxcul}
}

@misc{kaitai,
author = {{Kaitai team}},
title = {Kaitai Struct: declarative language to generate binary data parsers},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
url = {https://github.com/kaitai-io/kaitai_struct}
}

@inproceedings{nail,
author={Bangert, Julian and Zeldovich, Nickolai},
booktitle={2014 IEEE Security and Privacy Workshops},
title={Nail: A Practical Interface Generator for Data Formats},
year={2014},
pages={158-166},
doi={10.1109/SPW.2014.31}
}

@inproceedings{nom,
author={Couprie, Geoffroy},
booktitle={2015 IEEE Security and Privacy Workshops},
title={Nom, A Byte oriented, streaming, Zero copy, Parser Combinators Library in Rust},
year={2015},
pages={142-148},
doi={10.1109/SPW.2015.31}
}

@inproceedings{parsifal,
author={Levillain, Olivier},
booktitle={2014 IEEE Security and Privacy Workshops},
title={Parsifal: A Pragmatic Solution to the Binary Parsing Problems},
year={2014},
pages={191-197},
doi={10.1109/SPW.2014.35}
}

@article{monadic,
title={Monadic parsing in Haskell},
volume={8},
doi={10.1017/S0956796898003050},
number={4},
journal={Journal of Functional Programming},
publisher={Cambridge University Press},
author={Hutton, Graham and Meijer, Erik},
year={1998},
pages={437-444}
}
102 changes: 102 additions & 0 deletions paper/paper.md
@@ -0,0 +1,102 @@
---
title: 'Binary-parser: A declarative and efficient parser generator for binary data'
tags:
- JavaScript
- TypeScript
- binary
- parser
authors:
- name: Keichi Takahashi
orcid: 0000-0002-1607-5694
affiliation: 1
affiliations:
- name: Nara Institute of Science and Technology
index: 1
date: 27 September 2021
bibliography: paper.bib
---

# Summary

This paper presents `binary-parser`, a JavaScript/TypeScript library that
allows users to write high-performance binary parsers and facilitates the
rapid prototyping of research software that works with binary files and
network protocols. `Binary-parser`'s declarative API is designed to make
expressing complex binary structures straightforward. In addition to this
productivity benefit, `binary-parser` uses meta-programming to dynamically
generate parser code, achieving parsing performance equivalent to that of a
hand-written parser. As of September 2021, `binary-parser` is used by over
700 GitHub repositories and 120 npm packages.

# Statement of need

Parsing binary data is a ubiquitous task in developing research software. Many
scientific instruments and software tools use proprietary file formats and
network protocols, while open-source libraries to work with them are often
unavailable or limited. In such situations, the programmer has no choice but
to write a binary parser. However, writing a binary parser by hand is
error-prone and tedious because the programmer faces challenges such as
understanding the specification of the binary format, correctly managing the
byte/bit offsets during parsing, and constructing complex data structures as
outputs.

`Binary-parser` significantly reduces the programmer's effort by automatically
generating efficient parser code from a declarative description of the binary
format supplied by the user. The generated parser code is converted to a
JavaScript function and executed for efficient parsing. To accommodate the
diverse needs of different users, `binary-parser` exposes various options that
provide flexibility and opportunities for customization.

A large number of software packages have been developed using `binary-parser`,
which demonstrates its usefulness and practicality. Some examples include
libraries and applications to work with rainfall radars [@nimrod],
software-defined radio [@flexradio], GNSS receivers [@libsbp], smart meters
[@linky], drones [@djiparsetxt], and thermostats [@maxcul].

# Design

`Binary-parser`'s design is characterized by the following three key features:

1. **Fast**: `Binary-parser` takes advantage of meta-programming to generate
JavaScript source code at runtime from the user's description of the
target binary format. The generated source code is then passed to the
`Function` constructor to dynamically create a function that performs
parsing. This design enables `binary-parser` to achieve parsing
performance comparable to that of a hand-written parser.
2. **Declarative**: As opposed to parser combinator libraries [@monadic; @nom],
`binary-parser` allows the user to express the target binary format in a
declarative manner, similar to a human-readable network protocol or file
format specification. The user can combine _primitive_ parsers (integers,
floating-point numbers, bit fields, strings and bytes) using _composite_
parsers (arrays, choices, nests and pointers) to express a wide variety of
binary formats.
3. **Flexible**: Unlike binary parser generators that use an external
domain-specific language (DSL) [@kaitai; @nail], `binary-parser` uses an internal
DSL implemented on top of JavaScript. This design allows the user to
specify most parsing options as return values of user-defined JavaScript
functions that are invoked at runtime. For example, the offset and length
of a field can be computed from another field that has been parsed already.
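
The following is a minimal sketch of the code-generation approach described in
the list above. It is not `binary-parser`'s actual implementation or API: the
`compile` function, the `READERS` table, and the shape of the field
descriptions (`name`, `type`, `of`, `length`) are hypothetical names introduced
for illustration. The sketch turns a declarative field list into JavaScript
source, passes that source to the `Function` constructor, and lets an array
length be computed by a user-defined function from a field parsed earlier.

```javascript
// Hypothetical mini-generator for illustration only; not binary-parser's code.
// Maps a primitive type name to the Buffer method that reads it and its size.
const READERS = {
  uint32le: { method: "readUInt32LE", size: 4 },
  int16le: { method: "readInt16LE", size: 2 },
};

function compile(fields) {
  const callbacks = []; // user functions, referenced by index in generated code
  let counter = 0; // gives every generated array loop unique variable names

  // Returns JavaScript source that parses `fields` into the object `target`.
  function emit(fieldList, target) {
    let src = "";
    for (const f of fieldList) {
      const key = JSON.stringify(f.name);
      if (f.type === "array") {
        const id = counter++;
        let lengthExpr;
        if (typeof f.length === "function") {
          // Length computed at parse time by a user-defined function.
          callbacks.push(f.length);
          lengthExpr = `callbacks[${callbacks.length - 1}](${target})`;
        } else {
          // Length taken from a previously parsed field of the same object.
          lengthExpr = `${target}[${JSON.stringify(f.length)}]`;
        }
        src += `const len${id} = ${lengthExpr};\n`;
        src += `const arr${id} = ${target}[${key}] = new Array(len${id});\n`;
        src += `for (let i${id} = 0; i${id} < len${id}; i${id}++) {\n`;
        src += `const item${id} = arr${id}[i${id}] = {};\n`;
        src += emit(f.of, `item${id}`);
        src += `}\n`;
      } else {
        const r = READERS[f.type];
        if (!r) throw new Error(`unknown type: ${f.type}`);
        src += `${target}[${key}] = buf.${r.method}(offset);\n`;
        src += `offset += ${r.size};\n`;
      }
    }
    return src;
  }

  const source =
    `let offset = 0;\nconst vars = {};\n${emit(fields, "vars")}return vars;`;
  // The generated source becomes a real function via the Function constructor.
  const generated = new Function("buf", "callbacks", source);
  return { source, parse: (buf) => generated(buf, callbacks) };
}

// Declarative description: a count followed by that many (x, y, z) triples.
const format = [
  { name: "count", type: "uint32le" },
  {
    name: "points",
    type: "array",
    length: (vars) => vars.count, // evaluated while parsing
    of: [
      { name: "x", type: "int16le" },
      { name: "y", type: "int16le" },
      { name: "z", type: "int16le" },
    ],
  },
];

const { source, parse } = compile(format);

// Build a small test buffer holding two coordinates.
const buf = Buffer.alloc(4 + 2 * 6);
buf.writeUInt32LE(2, 0);
[1, -2, 3, 4, 5, -6].forEach((v, i) => buf.writeInt16LE(v, 4 + 2 * i));

const result = parse(buf);
console.log(source);
console.log(JSON.stringify(result));
// {"count":2,"points":[{"x":1,"y":-2,"z":3},{"x":4,"y":5,"z":-6}]}
```

Printing `source` shows plain reads with fixed method names and sizes inside a
single loop, which is close to what one would otherwise write by hand.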

# Performance evaluation

To evaluate the parsing performance of `binary-parser`, we implemented the same
small parser with `binary-parser` (v2.0.1) and with three other major JavaScript
binary parser libraries: `binparse` (v1.2.1), `structron` (v0.4.3) and
`destruct.js` (v0.2.9). As a baseline, we also implemented the same parser by
hand using Node.js's Buffer API.
The binary data to be parsed was an array of 1,000 coordinates (each expressed
as three 16-bit integers) preceded by the number of coordinates (a 32-bit
integer). The benchmarks were executed on a MacBook Air (Apple M1 CPU, 2020).
The JavaScript runtime was Node.js (v16.9.1).
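
For reference, a hand-written baseline for this data layout could look roughly
like the sketch below, which uses only Node.js's Buffer API. This is not the
benchmark code itself. The text above does not state byte order or signedness,
so little-endian order, an unsigned count, and signed coordinates are
assumptions here, as are the names `encodeCoordinates`, `parseCoordinates`,
`x`, `y` and `z`.

```javascript
// Assumed layout (not confirmed by the paper): little-endian,
// unsigned 32-bit count, then `count` triples of signed 16-bit integers.
const HEADER_SIZE = 4;
const POINT_SIZE = 6;

// Serializes an array of {x, y, z} objects into the assumed layout.
function encodeCoordinates(points) {
  const out = Buffer.alloc(HEADER_SIZE + points.length * POINT_SIZE);
  out.writeUInt32LE(points.length, 0);
  points.forEach((p, i) => {
    const base = HEADER_SIZE + i * POINT_SIZE;
    out.writeInt16LE(p.x, base);
    out.writeInt16LE(p.y, base + 2);
    out.writeInt16LE(p.z, base + 4);
  });
  return out;
}

// Hand-written parser: the offset bookkeeping is done manually.
function parseCoordinates(input) {
  let offset = 0;
  const count = input.readUInt32LE(offset);
  offset += HEADER_SIZE;
  const points = new Array(count);
  for (let i = 0; i < count; i++) {
    points[i] = {
      x: input.readInt16LE(offset),
      y: input.readInt16LE(offset + 2),
      z: input.readInt16LE(offset + 4),
    };
    offset += POINT_SIZE;
  }
  return { count, points };
}

// 1,000 coordinates, matching the size of the benchmark input.
const samplePoints = Array.from({ length: 1000 }, (_, i) => ({
  x: i,
  y: -i,
  z: i % 7,
}));
const data = encodeCoordinates(samplePoints);
const parsed = parseCoordinates(data);

console.log(data.length); // 6004
console.log(parsed.count); // 1000
```

Every read in `parseCoordinates` depends on the running `offset` being correct,
which is the manual bookkeeping that a declarative description removes.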

![Performance comparison of binary-parser, binparse, structron, destruct.js and a hand-written parser.\label{fig:benchmark}](benchmark.pdf){ width=80% }

\autoref{fig:benchmark} shows the measurement results. `Binary-parser`
outperforms the alternative libraries by 7.5$\times$ to 180$\times$. The plot
also shows that `binary-parser` achieves performance equal to that of the
hand-written parser.

# Acknowledgments

This work was partly supported by JSPS KAKENHI Grant Number JP20K19808.

# References