A streaming tokenizer for NodeJS.
Parsing data coming off the wire in an event-driven environment can be a
difficult proposition, with naive implementations buffering all received data
in memory until a message has been received in its entirety. Not only is this
inefficient from a memory standpoint, but it may not be possible to determine
that a message has been fully received without attempting to parse it.
This requires a parser that can gracefully handle incomplete messages and
pick up where it left off. To make this task easier, strtok provides:
- Tokenizing primitives for common network datatypes (e.g. signed and
  unsigned integers in various endiannesses). These primitives can be used
  directly (i.e. without strtok.parse()) to pack/unpack types to/from
  Buffer objects. See the direct token usage for more details.
- A callback-driven approach well suited to an asynchronous environment (e.g. to allow the application to asynchronously ask another party for information about what the next type should be)
- An easily extensible type system for adding support for new, application-defined types to the core.
- Very good performance; the built-in MsgPack parser performs within 30% of the native C++ implementation.
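The first two points in the list above can be illustrated with a minimal sketch. This is not strtok's actual API: the names `UINT8`, `UINT32_BE`, `createParser`, and `feed` are hypothetical, and only the standard Node.js `Buffer` methods are assumed to exist. The sketch shows token primitives that pack and unpack values directly to and from a `Buffer`, and a callback-driven parser that keeps only the bytes of an incomplete token between chunks and resumes when more data arrives.

```javascript
// Hypothetical token primitives: each knows its byte length and how to
// read or write itself at an offset in a Buffer. Illustrative names only.
const UINT8 = {
  len: 1,
  get: (buf, off) => buf.readUInt8(off),
  put: (buf, off, v) => buf.writeUInt8(v, off),
};
const UINT32_BE = {
  len: 4,
  get: (buf, off) => buf.readUInt32BE(off),
  put: (buf, off, v) => buf.writeUInt32BE(v, off),
};

// Hypothetical incremental parser. `onToken(value)` is called with each
// decoded value (undefined on the very first call) and returns the next
// token type to read, or null when no more tokens are expected.
function createParser(onToken) {
  let pending = Buffer.alloc(0); // leftover bytes of an incomplete token
  let type = onToken(undefined);
  return function feed(chunk) {
    pending = Buffer.concat([pending, chunk]);
    // Decode as many complete tokens as the buffered bytes allow.
    while (type !== null && pending.length >= type.len) {
      const value = type.get(pending, 0);
      pending = pending.subarray(type.len);
      type = onToken(value);
    }
  };
}

// Direct use of the primitives: pack a one-byte tag and a 32-bit length.
const wire = Buffer.alloc(5);
UINT8.put(wire, 0, 7);
UINT32_BE.put(wire, 1, 258);

// Callback-driven parsing: the callback decides what type comes next.
const seen = [];
const feed = createParser((value) => {
  if (value !== undefined) seen.push(value);
  if (seen.length === 0) return UINT8;
  if (seen.length === 1) return UINT32_BE;
  return null;
});

feed(wire.subarray(0, 3)); // the tag plus half of the 32-bit integer
feed(wire.subarray(3));    // the remaining bytes complete the integer
console.log(seen);         // [ 7, 258 ]
```

The callback here answers synchronously to keep the sketch short. A design that lets the callback supply the next type later, for example after asking another party, would need the parser to pause until that answer arrives.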
A sample run of the built-in examples/msgpack/bench.js benchmark is shown
below. Note that the V8 JSON object is highly optimized, so performing close
to it in pure JavaScript is a demanding target.
    json
      pack:   14674 ms (100% of json)
      unpack: 50479 ms (100% of json)
    native
      pack:   15334 ms (104% of json)
      unpack: 32835 ms (65% of json)
    strtok
      pack:   15861 ms (108% of json)
      unpack: 46650 ms (92% of json)
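The "% of json" figures can be recomputed from the raw timings. The snippet below is a small sketch with a hypothetical helper, `percentOfJson`. It assumes each percentage is the timing divided by the json timing for the same operation, rounded to the nearest integer. Whether bench.js itself rounds or truncates is an assumption, although both give the figures shown for this run.

```javascript
// Timings in milliseconds, copied from the sample run.
const timings = {
  json:   { pack: 14674, unpack: 50479 },
  native: { pack: 15334, unpack: 32835 },
  strtok: { pack: 15861, unpack: 46650 },
};

// Hypothetical helper: a timing as a rounded percentage of the json baseline.
function percentOfJson(name, op) {
  return Math.round((timings[name][op] / timings.json[op]) * 100);
}

for (const name of Object.keys(timings)) {
  console.log(
    `${name} pack: ${percentOfJson(name, 'pack')}% ` +
    `unpack: ${percentOfJson(name, 'unpack')}%`
  );
}
```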