Initial README.md.

1 parent 9b4d8d1 commit 5f6dd2826220d8c10ceae66553eba0b7d051b137 @pgriess committed Aug 18, 2010
Showing with 45 additions and 0 deletions.
  1. +1 −0 .gitignore
  2. +44 −0 README.md
1 .gitignore
@@ -0,0 +1 @@
+README.html
44 README.md
@@ -0,0 +1,44 @@
+A streaming tokenizer for [NodeJS](http://nodejs.org).
+
+Parsing data coming off the wire in an event-driven environment can be a
+difficult proposition, with naive implementations buffering all received data
+in memory until a message has been received in its entirety. Not only is this
+inefficient from a memory standpoint, but it may not be possible to determine
+that a message has been fully received without attempting to parse it.
+This requires a parser that can gracefully handle incomplete messages and
+pick up where it left off. To make this task easier, `node-strtok` provides:
+
+* Tokenizing primitives for common network datatypes (e.g. signed and
+ unsigned integers in various endian-nesses).
+* A callback-driven approach well suited to an asynchronous environment (e.g.
+ to allow the application to asynchronously ask another party for
+ information about what the next type should be).
+* An easily extensible type system for adding support for new,
+ application-defined types to the core.
+
+## Usage
+
+Below is an example of a parser for a simple protocol. Each message is
+prefixed with a big-endian unsigned 32-bit integer used as a length
+specifier, followed by a sequence of opaque bytes with length equal to the
+value read earlier.
+
+    var strtok = require('strtok');
+
+    var s = ... /* a net.Stream workalike */;
+
+    var numBytes = -1;
+
+    strtok.parse(s, function(v, cb) {
+        if (v === undefined) {
+            return strtok.UINT32_BE;
+        }
+
+        if (numBytes == -1) {
+            numBytes = v;
+            return new strtok.BufferType(v);
+        }
+
+        console.log('Read ' + v.toString('ascii'));
+        numBytes = -1;
+    });
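
To make the "pick up where it left off" idea from the README concrete, here is a minimal sketch of a resumable parser for the same length-prefixed protocol, written without `strtok`. It is not the library's implementation: `createFrameParser`, `onPayload`, and `onEnd` are hypothetical names introduced for illustration, and the code assumes a modern Node.js `Buffer` API (`Buffer.from`, `subarray`). The parser keeps only three small counters between chunks and hands payload bytes to the caller as they arrive, rather than buffering a whole message first.

```javascript
// Hypothetical helper (not part of node-strtok): a resumable parser for
// frames of the form [uint32 big-endian length][payload bytes].
function createFrameParser(onPayload, onEnd) {
    let headerBytes = 0; // how many of the 4 length bytes have been seen
    let length = 0;      // length value accumulated so far
    let remaining = 0;   // payload bytes still expected for this frame

    function finishFrame() {
        onEnd();
        headerBytes = 0;
        length = 0;
    }

    return function feed(chunk) {
        let offset = 0;
        while (offset < chunk.length) {
            if (headerBytes < 4) {
                // Accumulate the big-endian length one byte at a time, so a
                // header split across chunks is handled naturally.
                length = length * 256 + chunk[offset++];
                headerBytes++;
                if (headerBytes === 4) {
                    remaining = length;
                    if (remaining === 0) finishFrame(); // empty payload
                }
            } else {
                // Pass along whatever part of the payload this chunk holds.
                const n = Math.min(remaining, chunk.length - offset);
                onPayload(chunk.subarray(offset, offset + n));
                offset += n;
                remaining -= n;
                if (remaining === 0) finishFrame();
            }
        }
    };
}

// Usage: one 5-byte message delivered in three awkwardly split chunks.
const messages = [];
let parts = [];
const feed = createFrameParser(
    (piece) => parts.push(piece.toString('ascii')),
    () => { messages.push(parts.join('')); parts = []; }
);
feed(Buffer.from([0x00, 0x00]));             // first half of the length
feed(Buffer.from([0x00, 0x05, 0x68, 0x65])); // rest of the length, then "he"
feed(Buffer.from('llo', 'ascii'));           // remainder of the payload
console.log(messages);
```

The usage code joins the payload pieces only because it wants a printable string; a caller that can process bytes incrementally never needs to hold a full message in memory.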

4 comments on commit 5f6dd28

@ry

would love to see a msgpack benchmark

@pgriess
Owner

Meh. Numbers aren't great: node-msgpack is about 10x faster when de-serializing. Check out 2ed8a58.

node-msgpack can unpack 50k objects in 0.8s - 0.9s

node-strtok can unpack the same 50k objects in 9.8s - 10.1s

@pgriess
Owner

Getting rid of Buffer.slice() operations improved throughput 3x.

@pgriess
Owner

Both arrays and primitives are actually faster in JS than native MsgPack.

However, packing and unpacking {'abcdef' : 1}, we see the following:

native: 1379ms
js:     3592ms

The major difference appears to be in the setting of named properties on JavaScript objects. If we omit that step in the MsgPack JS parser, we instead see:

native: 1329ms
js:     2144ms