Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to stream data to msgpack to decode? #61

Closed
jordanbtucker opened this issue Feb 7, 2017 · 3 comments
Closed

Is it possible to stream data to msgpack to decode? #61

jordanbtucker opened this issue Feb 7, 2017 · 3 comments

Comments

@jordanbtucker
Copy link

jordanbtucker commented Feb 7, 2017

For example, if I want to build an express middleware that parses MessagePack bodies, it seems I have to read the entire body into memory before passing it to msgpack.decode, which seems inefficient for large bodies.

If msgpack.decode could handle a readble stream, then it could begin parsing the data before the all of the data has been received from the source, which would be more efficient than waiting for the entire stream to be read before parsing it.

Just to be clear, I'm talking about decoding a single MessagePack encoded object from a stream, not decoding a stream of MessagePack objects.

Example use case

function parseMsgpackBody(req, res, next) {

  // only parse when Content-Type: application/msgpack
  if(req.headers['content-type'] === 'application/msgpack') {

    // msgpack's awesome new API for reading directly from a stream
    msgpack.decode(req, function(err, data) {
      if(err) {
        res.sendStatus(400) // invalid msgpack object
      }

      // set the request's body property to the decoded msgpack object
      req.body = data
    }
  }

  // call the next middleware
  next()
}

const app = express()
app.use(parseMsgPackBody)
@kawanet
Copy link
Owner

kawanet commented Feb 7, 2017

I have to read the entire body into memory before passing it to msgpack.decode, which seems inefficient for large bodies.

Correct. Why don't you use createDecodeStream?
A sample code below is not tested however.

function parseMsgpackBody(req, res, next) {
  if(req.headers['content-type'] === 'application/msgpack') {
    var decoder = msgpack.createDecodeStream();
    req.pipe(decoder).once("data", function() {
      req.body = data;
      next();
    }).once("error", function(err) {
      res.sendStatus(400);
    });
    return;
  }

  // call the next middleware
  next()
}

@jordanbtucker
Copy link
Author

jordanbtucker commented Feb 7, 2017

It looks like that will only read the first MessagePack object from the body, which should work for conforming clients. I suppose if I wanted to restrict the API to reject bodies with more than one MessagePack object, I could detect if there is a second one and reject the request.

function parseMsgpackBody(req, res, next) {

  // only parse when Content-Type: application/msgpack
  if(req.headers['content-type'] === 'application/msgpack') {

    // stream decode the message rather than loading all data
    // into memory then decoding the buffer
    const decoder = msgpack.createDecodeStream()

    // track whether we've read one msgpack object
    let messageDecoded = false

    req.pipe(decoder).on('data', function(data) {

      // if we've already read a msgpack object, reject the request
      if(messageDecoded) {
        res.sendStatus(400)
      }

      // set the request's body property to the decoded msgpack object
      req.body = data
      messageDecoded = true
    }).on('error', function(err) {
      res.sendStatus(400) // invalid msgpack object
    }).on('end', function() {
      // when the entire stream has been read, call the next middleware
      next()
    })
  } else {
    next()
  }
}

const app = express()
app.use(parseMsgPackBody)

Thanks for the clarification! I knew about createDecodeStream, but I didn't think about using it to decode only one object and reject reading a second object.

@kawanet
Copy link
Owner

kawanet commented Feb 7, 2017

req.pipe(decoder).on('data', function(data) {
  // if we've already read a msgpack object, reject the request
  if(req.body) {
    err=400;
    res.sendStatus(400)
    return
  }
  // set the request's body property to the decoded msgpack object
  req.body = data
  messageDecoded = true
}).on('error', function(err) {
  err=400;
  res.sendStatus(400) // invalid msgpack object
}).on('end', function() {
  // when the entire stream has been read, call the next middleware
  if (!messageDecoded) res.sendStatus(400) // empty request
  if (!err) next()
})

@kawanet kawanet closed this as completed Feb 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants