Peter Griess edited this page Aug 31, 2010 · 4 revisions


A streaming tokenizer for NodeJS.

Parsing data coming off the wire in an event-driven environment can be difficult. Naive implementations buffer all received data in memory until a message has been received in its entirety. Not only is this inefficient from a memory standpoint, but it may not be possible to determine that a message has been fully received without attempting to parse it. This requires a parser that can gracefully handle incomplete messages and pick up where it left off. To make this task easier, node-strtok provides:

  • Tokenizing primitives for common network datatypes (e.g. signed and unsigned integers in various endiannesses). These primitives can be used directly (i.e. without strtok.parse()) to pack/unpack types to/from Buffer objects. See the direct token usage for more details.
  • A callback-driven approach well suited to an asynchronous environment (e.g. to allow the application to asynchronously ask another party for information about what the next type should be).
  • An easily extensible type system for adding support for new, application-defined types to the core.
  • Very good performance; the built-in MsgPack parser performs within 30% of the native C++ implementation.
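
The first two points can be illustrated with a minimal, self-contained sketch. It does not use node-strtok itself: the token objects (`UINT16_BE`, `INT32_LE`) and the `createTokenizer` helper are hypothetical names defined here on top of Node's standard `Buffer` methods, and the callback is synchronous for brevity. The idea shown is that each token type knows its own length and how to pack/unpack itself, while the tokenizer keeps only the bytes of the token not yet completed and asks a callback which type to expect next.

```javascript
// Hypothetical token objects (illustrative only, not the node-strtok API).
// Each one knows its byte length and how to read/write itself in a Buffer.
const UINT16_BE = {
  len: 2,
  get: (buf, off) => buf.readUInt16BE(off),
  put: (buf, off, v) => buf.writeUInt16BE(v, off),
};
const INT32_LE = {
  len: 4,
  get: (buf, off) => buf.readInt32LE(off),
  put: (buf, off, v) => buf.writeInt32LE(v, off),
};

// Hypothetical incremental tokenizer. feed() accepts chunks of any size.
// onToken receives each decoded value and returns the next token type to
// expect, or null to stop.
function createTokenizer(firstType, onToken) {
  let pending = Buffer.alloc(0); // bytes of the not-yet-complete token
  let type = firstType;
  return function feed(chunk) {
    pending = Buffer.concat([pending, chunk]);
    let off = 0;
    // Decode as many complete tokens as the buffered data allows.
    while (type && pending.length - off >= type.len) {
      const value = type.get(pending, off);
      off += type.len;
      type = onToken(value);
    }
    // Keep only the unconsumed tail for the next call.
    pending = pending.subarray(off);
  };
}

// Direct token usage: pack two values into a Buffer.
const msg = Buffer.alloc(UINT16_BE.len + INT32_LE.len);
UINT16_BE.put(msg, 0, 0x1234);
INT32_LE.put(msg, UINT16_BE.len, -42);

// Streaming usage: deliver the message one byte at a time.
const values = [];
const feed = createTokenizer(UINT16_BE, (v) => {
  values.push(v);
  return values.length === 1 ? INT32_LE : null;
});
for (let i = 0; i < msg.length; i++) {
  feed(msg.subarray(i, i + 1));
}
console.log(values); // [ 4660, -42 ]
```

Because the tokenizer never holds more than the tail of one incomplete token, memory use stays bounded by the largest token rather than by the message size. An asynchronous variant would pass the next type to a continuation instead of returning it.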



A sample run of the built-in examples/msgpack/bench.js is shown below. Note that the V8 JSON object is highly optimized; performing close to it in pure JavaScript is no mean feat.

  pack:   14674 ms (100% of json)
  unpack: 50479 ms (100% of json)

  pack:   15334 ms (104% of json)
  unpack: 32835 ms (65% of json)

  pack:   15861 ms (108% of json)
  unpack: 46650 ms (92% of json)