req.rawBody is no longer available (regression #34) #897

Closed
bithavoc opened this Issue Nov 20, 2011 · 32 comments


req.rawBody is no longer available in 1.5.1, I think it was the new "formidable" integration in connect/bodyParser that removed the feature. Is there another way to get the raw content after bodyParser parses the body?

@ryanrolds is there another way to get rawBody?

senchalabs/connect@c3170ee

Owner

tj commented Nov 21, 2011

I've had a few requests to add this back, however with Connect now supporting multipart this is really unlikely, it's a huge waste of memory. What do you need it for?

Sorry, pulled my comment because I was only confirming what you already knew. I don't have an answer, but your use case will affect what you can do about it, which is likely why TJ is asking what you need it for.

That makes sense, but my agent is sending "application/json" and the server is using a custom JSON schema validator. I have two options: either send the request in another, non-parseable format like text/plain (the client would have to change its implementation, and that's just not right), or stringify and parse again. The point is that I don't want bodyParser to run on some routes because I'd like to read the body myself. Is there a way to create a custom express pipeline for certain routes? That would do the trick.

I think if we are gonna get rid of a feature like this we really need to offer an alternative, not to mention including it in the migration guide in http://expressjs.com/.

Owner

tj commented Nov 21, 2011

yeah things get a bit tricky when you want to exclude the middleware for some routes only. there are a few ways to approach the issue, for example if you wanted to ignore all /api/* pathnames you could do something like:

var parse = express.bodyParser();
app.use(function(req, res, next){
  if (0 == req.url.indexOf('/api')) return next();
  parse(req, res, next);
});
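The same idea can be sketched as a reusable wrapper. This is only an illustration: `skipFor` and `fakeParser` are hypothetical names that do not exist in Express or Connect, and `fakeParser` merely stands in for `express.bodyParser()` so the control flow can be run without a server.

```javascript
// Hypothetical helper: returns middleware that skips `middleware`
// for any URL starting with `prefix` and runs it otherwise.
function skipFor(prefix, middleware) {
  return function (req, res, next) {
    if (req.url.indexOf(prefix) === 0) return next();
    middleware(req, res, next);
  };
}

// Stand-in for express.bodyParser(), just to show the control flow.
function fakeParser(req, res, next) {
  req.body = 'parsed';
  next();
}

const guarded = skipFor('/api', fakeParser);

// Plain objects standing in for real requests.
const apiReq = { url: '/api/users' };
const pageReq = { url: '/signup' };
guarded(apiReq, {}, function () {});
guarded(pageReq, {}, function () {});

console.log(apiReq.body);  // undefined
console.log(pageReq.body); // parsed
```

In an app this would presumably be mounted as `app.use(skipFor('/api', express.bodyParser()))`, leaving the /api routes free to read the body themselves.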

I'm gonna try something like that, thanks TJ and Ryan.

bithavoc closed this Nov 21, 2011

Contributor

defunctzombie commented Nov 21, 2011

One way I found to get around the "exclude one" middleware issue is to have another app (with routes) and then put the app.use(other_app.router()); before any middleware I did not want run. My use case was to avoid hitting the session for certain routes but I could see this being used for this case as well.

@visionmedia On a side note, how much memory is used by rawBody? Is it really so much that it is a problem to have it? Won't it be cleaned up anyway once the request is done?

Owner

tj commented Nov 21, 2011

@shtylman well it depends on the request, if you upload several large images, a video etc, then multiply that by N users you'll be out of mem in no time

Contributor

defunctzombie commented Nov 21, 2011

In that case, don't you need to use the raw body? Or would it just be delivered in chunks (which would mean rawBody would contain the whole request when that was not needed)?

Owner

tj commented Nov 21, 2011

the data is just streamed through and into temp files

rjyo commented Nov 22, 2011

@visionmedia Thanks for sharing the middleware tip.

Actually I need the progress event of formidable, and the only way I found to do that in express 2.5.1 is using the tricks here. Hope there could be some opt-out for not using the formidable parser in connect.

Owner

tj commented Nov 22, 2011

i'll probably add some options to make it simpler but you can also disable the multipart support with delete express.bodyParser.parse['multipart/form-data']

rjyo commented Nov 22, 2011

Far more better! Thank you TJ!

Contributor

defunctzombie commented Dec 30, 2011

Was there any resolution as to how to get the raw body back? I use it for some authentication (the user needs to sign the bytes they send in the body) and so on the server I need to be able to look at the exact bytes they sent and do the same signature to validate them.

Owner

tj commented Dec 30, 2011

im still iffy about having rawBody, waste of memory if you're posting large json etc, it should really be a separate middleware buffer('json'), buffer('xml', 'json') or something

Contributor

defunctzombie commented Dec 30, 2011

@visionmedia I like the idea of it being separate middleware that can also buffer the data how it wants. How would I ensure proper middleware flow given that scenario tho? I see that the bodyParser waits for the 'end' event before proceeding to the next middleware. I would imagine that my middleware would also need to wait for said event to make sure I have the full buffer, however if it happens before the bodyParser middleware then the body parser will not be able to gather the data itself. Maybe I need my own version of the body parser middleware in that case.

Owner

tj commented Dec 30, 2011

yeah it would be awkward you would have to re-emit on next tick or something :s I suppose you could re-emit the entire body as one chunk for the parsers haha... that's pretty lame though

Contributor

defunctzombie commented Dec 30, 2011

Actually, the following middleware before bodyParser seems to work:

app.use(function(req, res, next) {
    req.rawBody = '';
    req.setEncoding('utf8');
    req.on('data', function(chunk) { req.rawBody += chunk });
    // hand off right away so bodyParser can attach its own listeners
    next();
});

This seems reasonable as all of the 'data' events must happen before any of the 'end' events and so the 'rawBody' will be populated as expected. Thanks for the tip.
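A small sketch of why this can work: several listeners on the same emitter each receive every chunk, so collecting the raw text does not take anything away from a parser that subscribes afterwards. A plain `EventEmitter` stands in for the request here (an assumption made so the example runs without a server), and the second set of listeners only plays the role of a body parser.

```javascript
const { EventEmitter } = require('events');

// Stand-in for an incoming request. A real req is a stream, but the
// listener semantics shown here are the plain EventEmitter ones.
const req = new EventEmitter();

// First "middleware": collect the raw text and move on immediately.
req.rawBody = '';
req.on('data', function (chunk) { req.rawBody += chunk; });

// Second "middleware" (playing the role of a body parser): it attaches
// its own listeners and parses once 'end' fires.
let parsed = null;
const parts = [];
req.on('data', function (chunk) { parts.push(chunk); });
req.on('end', function () { parsed = JSON.parse(parts.join('')); });

// Simulate the body arriving in two chunks.
req.emit('data', '{"a":');
req.emit('data', '1}');
req.emit('end');

console.log(req.rawBody); // {"a":1}
console.log(parsed.a);    // 1
```

The important part is that both sets of listeners are attached before any data is emitted. If the first middleware waited for 'end' before calling next(), the second one would subscribe too late.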

ethank commented Jan 12, 2012

Question on this: Is there any way to get rawBody back? I tried this middleware chunk up above and it doesn't work. I'm using rawBody to get an HMAC signature from a post request.

Owner

tj commented Jan 12, 2012

@ethank you should be able to stream the body "data" events through crypto to produce the HMAC as well

ethank commented Jan 12, 2012

@visionmedia thanks. Any good documentation as to how? not trying to be dense. I used to be able to just do hmac.update(body.rawData)

Owner

tj commented Jan 12, 2012

you can call update() several times
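A minimal sketch of what calling update() several times looks like with Node's built-in crypto module. The helper name `hmacOfChunks`, the secret and the sample chunks are made up for illustration; in a request handler the loop would be replaced by a 'data' listener that calls `hmac.update(chunk)`.

```javascript
const crypto = require('crypto');

// Hypothetical helper: feed the body to the HMAC piece by piece.
function hmacOfChunks(secret, chunks) {
  const hmac = crypto.createHmac('sha256', secret);
  chunks.forEach(function (chunk) {
    hmac.update(chunk); // update() may be called once per chunk
  });
  return hmac.digest('hex');
}

// Hashing the body in pieces gives the same digest as hashing it whole.
const streamed = hmacOfChunks('secret', ['{"a":', '1}']);
const whole = hmacOfChunks('secret', ['{"a":1}']);
console.log(streamed === whole); // true
```

Because the digest only depends on the concatenated bytes, the full body never has to be held in memory just to sign or verify it.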

tunix commented Jul 29, 2013

@ethank - have you managed to do it?

lexaux commented Oct 18, 2013

hi guys, yeah I know that's a necro-thread, but here's briefly how we managed it, just in case someone is searching for the same thing.

Our original requirement was to have HMAC verification for the request. When we simply added middleware that combines the chunks into a string and produces the HMAC once that's ready, and put that middleware in front of bodyParser(), the thing never worked, since bodyParser was waiting for 'data' events that had all already been consumed by the hmac middleware.

So we did quite a simple thing. In our hmac middleware, we just subscribed for the data and end events, and did the chunked digest update, and only put the digest to req object on req.end. The only difference was that we did not wait for this process to complete before calling next(), but rather just subscribed and called next one.

There's a logical problem with low trust here, since further down the middleware chain we don't know whether the hmac is ready yet. bodyParser fixes that, since it only passes execution on once all the data has been received (and thus all the methods have run). Here's a sample:

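    app.use(function (req, res, next) {
        var headerValue = req.header(secureAPIHeader);
        if (!headerValue) {
            next();
        } else {
            req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
            req.setEncoding('utf8');
            // Feed each chunk into the HMAC as it arrives.
            req.on('data', function (chunk) {
                req.hasher.update(chunk);
            });
            // Only finalize and compare once the whole body has been received.
            req.on('end', function () {
                var hash = req.hasher.digest('hex');
                if (hash != headerValue) {
                    res.json(403, {msg: 'Wrong signature'});
                } else {
                    req.isSecureAdmin = true;
                }
            });
            // Don't wait for 'end': subscribe, then let bodyParser attach its own listeners.
            next();
        }
    });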
Member

jonathanong commented Oct 18, 2013

we can add a verify option to urlencoded and json parsers. should only be a few lines, but the docs would be way longer. i would also only support synchronous verification for simplicity. example:

app.use(connect.json({
  verify: function (req, res, buffer) {
    var headerValue = req.header(secureAPIHeader)
    if (!headerValue)
      return

    var hmac = crypto.createHmac('sha256', config.secretAdminAPIKey).update(buffer).digest('hex')
    if (hmac !== headerValue) {
      var err = new Error('wrong hmac signature')
      err.status = 403
      throw err
    } else {
      req.isSecureAdmin = true
    }
  }
}))

is that something you guys are interested in?
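For readers who came here for the original question, a verify callback of this shape could also be used to keep the raw bytes around. This is a sketch, not part of the proposal itself: `keepRawBody` is a hypothetical name, and plain objects stand in for the request and response so the effect can be shown without a server.

```javascript
// Hypothetical verify callback with the (req, res, buffer) shape shown above.
function keepRawBody(req, res, buffer) {
  req.rawBody = buffer; // exact bytes received, before JSON parsing
}

// Stand-in objects to show the effect without a server.
const req = {};
keepRawBody(req, {}, Buffer.from('{"a":1}'));
console.log(req.rawBody.toString('utf8')); // {"a":1}
```

It would be passed to the parser as `verify: keepRawBody`. Note that this keeps the whole body in memory for the lifetime of the request, which is the cost tj objected to earlier in the thread, so it only makes sense for small payloads.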

lexaux commented Oct 18, 2013

Looks great for a usual use-case! I think community would have found a use for that.

Not sure how it fits our exact case, since we really need streaming. The active side sometimes pushes files via the 'file upload' way, so the multipart parser is used. We really don't want to cache an entire file in memory to calculate the MAC, so we will need a separate middleware.

Though again, I think it may be interesting for the 90% of cases where you need to act on the raw body but don't want to stop the JSON parser from functioning.

Thanks!

Member

jonathanong commented Oct 18, 2013

since we're removing the multipart middleware in connect 3, you'll have to verify that yourself, anyways.

lexaux commented Oct 18, 2013

oh my that was quick! Thanks.

Do you think there are any major issues in the way we approached it?

Member

jonathanong commented Oct 18, 2013

well if you're using node v0.10, it's easier to just pipe the request. i just really hate that control flow. not a big deal either way.

Member

jonathanong commented Oct 22, 2013

@lexaux merged the verify option. would be cool if you can test it out and add more tests if needed

lexaux commented Oct 25, 2013

Thanks Jonathan, I have checked that in local env. Seems to be working. Not sure what other tests could one add :)

Member

jonathanong commented Oct 25, 2013

yeah i don't know either. w00t! now hopefully you won't have to do anymore hacks like that!
