req.rawBody is no longer available (regression #34) #897

Closed
bithavoc opened this Issue · 32 comments

9 participants

@bithavoc

req.rawBody is no longer available in 1.5.1, I think it was the new "formidable" integration in connect/bodyParser that removed the feature. Is there another way to get the raw content after bodyParser parses the body?

@bithavoc

@ryanrolds is there another way to get rawBody?

senchalabs/connect@c3170ee

@tj
tj commented

I've had a few requests to add this back, however with Connect now supporting multipart this is really unlikely, it's a huge waste of memory. What do you need it for?

@ryanrolds

Sorry, pulled my comment because I was only confirming what you already knew. I don't have an answer, but your use case will affect what you can do about it, which is likely why TJ is asking what you need it for.

@bithavoc

That makes sense, but my agent is sending "application/json" and the server is using a custom JSON schema validator. I have two options: either send the request in another, non-parseable format like text/plain (the client has to change its implementation, and that's just not right) or stringify and parse again. Point is, I don't want bodyParser to work on some routes because I'd like to read the body myself. Is there a way to create a custom express pipeline for certain routes? That would do the trick.

I think if we are gonna get rid of a feature like this we really need to offer an alternative, not to mention including it in the migration guide in http://expressjs.com/.

@tj
tj commented

yeah things get a bit tricky when you want to exclude the middleware for some only, there are a few ways to approach the issues, for example if you wanted to ignore all /api/* pathnames you could do something like:

var parse = express.bodyParser();
app.use(function(req, res, next){
  if (0 == req.url.indexOf('/api')) return next();
  parse(req, res, next);
});
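
The same pattern can be wrapped in a small reusable helper. This is only a minimal sketch: `skipPrefix` is a hypothetical name (not part of Express or Connect), and `fakeParse` is a stand-in for `express.bodyParser()` so the example runs without Express installed.

```javascript
// Hypothetical helper: wrap a middleware so it is skipped for URLs under a prefix.
function skipPrefix(prefix, middleware) {
  return function (req, res, next) {
    // Requests whose URL starts with the prefix bypass the wrapped middleware.
    if (req.url.indexOf(prefix) === 0) return next();
    middleware(req, res, next);
  };
}

// Stand-in for express.bodyParser() (assumption, for demonstration only):
// it just marks the request as parsed.
function fakeParse(req, res, next) {
  req.parsed = true;
  next();
}

var guarded = skipPrefix('/api', fakeParse);

var apiReq = { url: '/api/users' };
var pageReq = { url: '/signup' };
guarded(apiReq, {}, function () {});
guarded(pageReq, {}, function () {});
// apiReq was left untouched; pageReq went through the parser.
```

In an app this would presumably be mounted as `app.use(skipPrefix('/api', express.bodyParser()))`. Note that a plain prefix check also matches paths such as `/apiary`, so a stricter test may be needed.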
@bithavoc

I'm gonna try something like that, thanks TJ and Ryan.

@bithavoc bithavoc closed this
@defunctzombie
Collaborator

One way I found to get around the "exclude one" middleware issue is to have another app (with routes) and then put the app.use(other_app.router()); before any middleware I did not want run. My use case was to avoid hitting the session for certain routes but I could see this being used for this case as well.

@visionmedia On a side note, how much memory is used by rawBody? Is it really so much that it is a problem to have it? Won't it be cleaned up anyway once the request is done?

@tj
tj commented

@shtylman well it depends on the request, if you upload several large images, a video etc, then multiply that by N users you'll be out of mem in no time

@defunctzombie
Collaborator

In that case, don't you need to use the raw body? Or would it just be delivered in chunks (which would mean rawBody would contain the whole request when that was not needed)?

@tj
tj commented

the data is just streamed through and into temp files

@rjyo

@visionmedia Thanks for sharing the middleware tip.

Actually I need the progress event of formidable, and the only way I found to do that in express 2.5.1 is using the tricks here. Hope there could be some opt-out for not using the formidable parser in connect.

@tj
tj commented

i'll probably add some options to make it simpler but you can also disable the multipart support with delete express.bodyParser.parse['multipart/form-data']

@rjyo

Much better! Thank you TJ!

@defunctzombie
Collaborator

Was there any resolution as to how to get the raw body back? I use it for some authentication (the user needs to sign the bytes they send in the body) and so on the server I need to be able to look at the exact bytes they sent and do the same signature to validate them.

@tj
tj commented

im still iffy about having rawBody, waste of memory if you're posting large json etc, it should really be a separate middleware buffer('json'), buffer('xml', 'json') or something

@defunctzombie
Collaborator

@visionmedia I like the idea of it being separate middleware that can also buffer the data how it wants. How would I ensure proper middleware flow given that scenario tho? I see that the bodyParser waits for the 'end' event before proceeding to the next middleware. I would imagine that my middleware would also need to wait for said event to make sure I have the full buffer, however if it happens before the bodyParser middleware then the body parser will not be able to gather the data itself. Maybe I need my own version of the body parser middleware in that case.

@tj
tj commented

yeah it would be awkward you would have to re-emit on next tick or something :s I suppose you could re-emit the entire body as one chunk for the parsers haha... that's pretty lame though

@defunctzombie
Collaborator

Actually, the following middleware before bodyParser seems to work:

app.use (function(req, res, next) {
    req.rawBody = '';
    req.setEncoding('utf8');
    req.on('data', function(chunk) { req.rawBody += chunk });
    // continue right away so bodyParser can attach its own listeners
    next();
});

This seems reasonable as all of the 'data' events must happen before any of the 'end' events and so the 'rawBody' will be populated as expected. Thanks for the tip.
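
To illustrate why this can work, here is a self-contained sketch that simulates a request with a bare `EventEmitter`. The names `captureRawBody` and `fakeJsonParser` are hypothetical, and the parser is only a stand-in for bodyParser. The key assumptions are that the capturing middleware calls `next()` without waiting for `'end'`, and that every `'data'` listener receives every chunk.

```javascript
var EventEmitter = require('events').EventEmitter;

// Capture the body as a string without consuming it: 'data' events are
// delivered to every registered listener, so a later parser still sees them.
function captureRawBody(req, res, next) {
  req.rawBody = '';
  req.setEncoding('utf8');
  req.on('data', function (chunk) { req.rawBody += chunk; });
  next(); // hand off immediately so the next middleware can attach listeners
}

// Stand-in for a body parser (assumption): buffers chunks, parses on 'end'.
function fakeJsonParser(req, res, next) {
  var buf = '';
  req.on('data', function (chunk) { buf += chunk; });
  req.on('end', function () {
    req.body = JSON.parse(buf);
    next();
  });
}

// Simulated request: an EventEmitter with a no-op setEncoding.
var req = new EventEmitter();
req.setEncoding = function () {};

captureRawBody(req, {}, function () {
  fakeJsonParser(req, {}, function () {});
});

// Deliver the body in two chunks, then signal the end of the request.
req.emit('data', '{"a":');
req.emit('data', '1}');
req.emit('end');
```

After the `'end'` event, both `req.rawBody` and `req.body` are populated. Because `setEncoding('utf8')` decodes the bytes into a string, a signature check over the exact bytes may be safer collecting `Buffer` chunks instead.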

@ethank

Question on this: Is there any way to get rawBody back? I tried this middleware chunk up above and it doesn't work. I'm using rawBody to get an HMAC signature from a post request.

@tj
tj commented

@ethank you should be able to stream the body "data" events through crypto to produce the HMAC as well

@ethank

@visionmedia thanks. Any good documentation as to how? not trying to be dense. I used to be able to just do hmac.update(body.rawData)

@tj
tj commented

you can call update() several times
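
As a minimal sketch of that point, the snippet below feeds an HMAC chunk by chunk and compares the result with hashing the whole body at once. The secret and the chunk contents are made-up values for illustration.

```javascript
var crypto = require('crypto');

var secret = 'example-secret';             // hypothetical key
var chunks = ['{"user":', '"alice"', '}']; // body as it might arrive in pieces

// Streaming style: call update() once per chunk, digest() at the end.
var streaming = crypto.createHmac('sha256', secret);
chunks.forEach(function (chunk) {
  streaming.update(chunk);
});
var streamedDigest = streaming.digest('hex');

// One-shot style: hash the fully buffered body.
var wholeDigest = crypto
  .createHmac('sha256', secret)
  .update(chunks.join(''))
  .digest('hex');

// Both digests are identical, so the full body never has to be kept in memory.
```

In a request handler, the `update()` call would go inside the `'data'` listener and `digest()` inside the `'end'` listener.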

@tunix

@ethank - have you managed to do it?

@lexaux

hi guys, yeah I know that's a necro-thread, but here's briefly how we managed it, just in case someone is searching for the same thing.

Our original requirement was to have hmac verification for the request. When we simply added middleware that combines chunks into a string and produces the hmac when that's ready, and put that middleware in front of bodyParser(), the thing never worked, since bodyParser was waiting for 'data' events, all of which had already been consumed by the hmac middleware.

So we did quite a simple thing. In our hmac middleware, we just subscribed to the data and end events, did the chunked digest update, and only put the digest on the req object in the 'end' handler. The only difference was that we did not wait for this process to complete before calling next(), but rather just subscribed and called next() right away.

There's a logical weak spot here: middleware further down the chain can't know whether the hmac is ready yet. bodyParser fixes that, since it only passes execution on once all the data has been received (and thus all our handlers have run). Here's a sample:

    var headerValue = req.header(secureAPIHeader);
    if (!headerValue) {
        next();
    } else {
        req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
        req.setEncoding('utf8');
        // update the digest chunk by chunk as the body streams in
        req.on('data', function (chunk) {
            req.hasher.update(chunk);
        });
        req.on('end', function() {
            var hash = req.hasher.digest('hex');
            if (hash != headerValue) {
                res.json(403, {msg: 'Wrong signature'});
            } else {
                req.isSecureAdmin = true;
            }
        });
        // don't wait for 'end'; let bodyParser subscribe to the same events
        next();
    }
@jonathanong

we can add a verify option to urlencoded and json parsers. should only be a few lines, but the docs would be way longer. i would also only support synchronous verification for simplicity. example:

app.use(connect.json({
  verify: function (req, res, buffer) {
    var headerValue = req.header(secureAPIHeader)
    if (!headerValue)
      return

    var hmac = crypto.createHmac('sha256', config.secretAdminAPIKey).update(buffer).digest('hex')
    if (hmac !== headerValue) {
      var err = new Error('wrong hmac signature')
      err.status = 403
      throw err
    } else {
      req.isSecureAdmin = true
    }
  }
}))

is that something you guys are interested in?
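
The verification logic itself can be exercised in isolation. The sketch below assumes the synchronous `verify(req, res, buffer)` shape proposed above. The factory name `makeHmacVerifier`, the `x-signature` header, and the secret are hypothetical, and it reads `req.headers` directly so that it runs without Connect or Express.

```javascript
var crypto = require('crypto');

// Hypothetical factory: builds a synchronous verify(req, res, buffer) callback.
function makeHmacVerifier(secret, headerName) {
  return function verify(req, res, buffer) {
    var headerValue = req.headers[headerName];
    if (!headerValue) return; // unsigned request: nothing to verify

    var expected = crypto
      .createHmac('sha256', secret)
      .update(buffer)
      .digest('hex');

    if (expected !== headerValue) {
      var err = new Error('wrong hmac signature');
      err.status = 403;
      throw err; // throwing rejects the request
    }
    req.isSecureAdmin = true;
  };
}

var verify = makeHmacVerifier('example-secret', 'x-signature');
var body = Buffer.from('{"a":1}');
var goodSig = crypto
  .createHmac('sha256', 'example-secret')
  .update(body)
  .digest('hex');

// A correctly signed request is marked as admin.
var signedReq = { headers: { 'x-signature': goodSig } };
verify(signedReq, {}, body);

// A forged signature makes verify throw an error carrying status 403.
var forgedReq = { headers: { 'x-signature': 'deadbeef' } };
var rejected = null;
try {
  verify(forgedReq, {}, body);
} catch (e) {
  rejected = e;
}
```

For production use, a constant-time comparison such as `crypto.timingSafeEqual` on equal-length buffers is generally preferable to `!==` when comparing signatures.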

@lexaux

Looks great for the usual use case! I think the community would find a use for that.

Not sure how well it fits our exact case, since we really need to use streaming. The active side sometimes pushes files via the 'file upload' way, so the multipart parser is used. We really don't want to cache the entire file in memory to calculate the mac, so we will need to use separate middleware.

Though again, I think it may be useful for the 90% of cases where you need to act on the raw body but don't want to stop the json parser from functioning.

Thanks!

@jonathanong

since we're removing the multipart middleware in connect 3, you'll have to verify that yourself, anyways.

@lexaux

oh my that was quick! Thanks.

Do you think there are any major issues in the way we approached it?

@jonathanong

well if you're using node v0.10, it's easier to just pipe the request. i just really hate that control flow. not a big deal either way.

@jonathanong

@lexaux merged the verify option. would be cool if you can test it out and add more tests if needed

@lexaux

Thanks Jonathan, I have checked that in local env. Seems to be working. Not sure what other tests could one add :)

@jonathanong

yeah i don't know either. w00t! now hopefully you won't have to do anymore hacks like that!

@rdickert rdickert referenced this issue in iron-meteor/iron-router
Closed

Access to the raw request body needed #534
