allow resource filters/callbacks #58

Closed
vvo opened this Issue Jun 8, 2011 · 18 comments

vvo commented Jun 8, 2011

as discussed on twitter, it would be very interesting to be able to hack proxied resources before sending them back to the client.

something like resource filtering, where we'd be able to change the response body/headers

I wonder how to integrate this into node http proxy

Contributor

Marak commented Jun 8, 2011

I think this may be a related ticket: #18

I remember investigating this option a while back. I think one of the main problems is that introducing the ability to modify the response will decrease performance for all regular proxy scenarios. I'm sure it could be implemented intelligently, but I'm not sure how.

vvo commented Jun 8, 2011

Owner

indexzero commented Jun 8, 2011

@vvo @Marak It depends on the level of introspection into the data stream that you're looking to get. For example, gzip works because afaik it's a binary encoding algorithm and can process the raw stream. Rewriting HTTP headers requires a good HTTPParser like the one ryan wrote. It's really a question of exactly what you're trying to hack in there.

A generic stream rewriter could be possible, but everything would have to be capable of parsing streams.
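
To make the "generic stream rewriter" idea concrete, here is a minimal sketch built on Node's `stream.Transform`, an API that postdates this thread. `createReplacer` and `replaceStream` are hypothetical names, not part of node-http-proxy. The sketch assumes an uncompressed UTF-8 text body and a literal search string. It holds back the last `search.length - 1` characters of each chunk so that a match split across two chunks is still found.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: incremental literal search-and-replace over text chunks.
function createReplacer(search, replacement) {
  if (!search) throw new Error('search string must not be empty');
  let tail = ''; // unprocessed text carried over from the previous chunk

  return {
    push(text) {
      const buffered = tail + text;
      let out = '';
      let pos = 0;
      let idx;
      // Replace every complete occurrence found in the buffered text.
      while ((idx = buffered.indexOf(search, pos)) !== -1) {
        out += buffered.slice(pos, idx) + replacement;
        pos = idx + search.length;
      }
      // Keep the trailing characters that could be the start of a match.
      const keep = Math.min(search.length - 1, buffered.length - pos);
      out += buffered.slice(pos, buffered.length - keep);
      tail = buffered.slice(buffered.length - keep);
      return out;
    },
    flush() {
      const out = tail;
      tail = '';
      return out;
    }
  };
}

// Hypothetical wrapper exposing the replacer as a Transform stream.
function replaceStream(search, replacement) {
  const replacer = createReplacer(search, replacement);
  // StringDecoder keeps multi-byte characters intact across chunk boundaries.
  const decoder = new StringDecoder('utf8');
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, replacer.push(decoder.write(chunk)));
    },
    flush(callback) {
      callback(null, replacer.push(decoder.end()) + replacer.flush());
    }
  });
}
```

A stream like this could sit between the upstream response and the client response with `pipe`. Because the body length changes, any `Content-Length` header copied from the upstream response would have to be dropped or recomputed.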

vvo commented Jun 8, 2011

Contributor

Marak commented Jun 8, 2011

I've looked through that codebase before. It's buffering the entire response into memory before sending it. You shouldn't be doing this unless you really plan on modifying the response object.

What I'm thinking is that we have an API option for enabling "middle-wares" that will cause the proxy to buffer the response in memory. The default API option would be to pipe responses like we do now, which I believe to be much more performant.
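
A rough sketch of that split, assuming plain Node `http` request/response objects: piping stays the default, and the body is buffered in memory only when a transform is supplied. `relay` and `options.transformBody` are hypothetical names chosen for illustration; they are not node-http-proxy options.

```javascript
// Hypothetical helper: `upstream` is the response received from the target
// server, `res` is the response going back to the client.
function relay(upstream, res, options) {
  const transformBody = options && options.transformBody;
  const headers = Object.assign({}, upstream.headers);

  if (typeof transformBody !== 'function') {
    // Default path: stream chunks straight through, nothing is held in memory.
    res.writeHead(upstream.statusCode, headers);
    upstream.pipe(res);
    return;
  }

  // Opt-in path: collect the whole body, transform it once, then send it.
  const chunks = [];
  upstream.on('data', (chunk) => chunks.push(chunk));
  upstream.on('end', () => {
    const body = Buffer.from(transformBody(Buffer.concat(chunks), upstream));
    // The length may have changed, so the copied framing headers are replaced.
    delete headers['transfer-encoding'];
    headers['content-length'] = body.length;
    res.writeHead(upstream.statusCode, headers);
    res.end(body);
  });
}
```

Only requests that opt in pay the memory and latency cost of buffering; everything else keeps the streaming behaviour described above.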

vvo commented Jun 8, 2011

This option of having middlewares would truly be awesome. I do understand
that it will kill the performance when used, but this would help in some
cases. Like mine :-)

Owner

indexzero commented Jun 8, 2011

I will accept a patch for this, but we need to stay focused on performance here. Taking any notable hit on requests / second cannot be allowed.

From an implementation standpoint it would be relatively inexpensive to replace these lines:

// For each data `chunk` received from the `reverseProxy`
// `response` write it to the outgoing `res`.
response.on('data', function (chunk) {
  if (req.method !== 'HEAD') {
    res.write(chunk);
  }
});

https://github.com/nodejitsu/node-http-proxy/blob/master/lib/node-http-proxy.js#L451-457

And perform a simple check to see if there are any middlewares, and if so write the mutated chunk. Something like:

// For each data `chunk` received from the `reverseProxy`
// `response` write it to the outgoing `res`, after passing it 
// to any middlewares on this instance 
response.on('data', function (chunk) {
  if (req.method !== 'HEAD') {
    if (self.middlewares.length > 0) {
      for (var i = 0; i < self.middlewares.length; i++) {
        chunk = self.middlewares[i](chunk);
      }
    }

    res.write(chunk);
  }
});
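
For illustration, the per-chunk loop above can be isolated as a small function. This is only a sketch: `applyMiddlewares`, `upperCase`, and `exclaim` are hypothetical names, and the example middlewares assume ASCII text.

```javascript
// Pass a chunk through each middleware in order; each one takes a Buffer
// and returns the Buffer that the next one should receive.
function applyMiddlewares(middlewares, chunk) {
  return middlewares.reduce((current, middleware) => middleware(current), chunk);
}

// Two toy chunk middlewares (hypothetical, ASCII input assumed).
const upperCase = (chunk) => Buffer.from(chunk.toString('utf8').toUpperCase());
const exclaim = (chunk) => Buffer.concat([chunk, Buffer.from('!')]);

const result = applyMiddlewares([upperCase, exclaim], Buffer.from('ok'));
console.log(result.toString()); // prints "OK!"
```

With an empty list the chunk is returned untouched, so the regular proxy path does no extra work beyond the loop check. Two limits are worth keeping in mind: a middleware that sees one chunk at a time cannot match content that spans two chunks, and any middleware that changes the chunk length invalidates a `Content-Length` header already copied from the upstream response.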

vvo commented Jun 28, 2011

Hi, I forked node http proxy and added two events + one option

The readme explains how to use it, tell me what you think:
https://github.com/fasterize/node-http-proxy/blob/master/README.md (bottom of the file)

Delegate option

The delegate option gives you full control over when the proxied data is written to the client. With this option you can modify the data before it is sent to the client.

var http = require('http'),
    httpProxy = require('http-proxy');

httpProxy.createServer(function (req, res, proxy) {

  req.on('proxiedData', function(response, data) {
    // Careful: combining this event with proxiedRequestEnd and the delegate option is fully asynchronous.
    // If your purpose is to hack the full proxied data, use proxiedRequestEnd instead; otherwise chunks could be written to the client in the wrong order.
  });

  req.once('proxiedRequestEnd', function(response, data) {
    // data is a binary buffer, you could do this for example :
    if (/html/.test(response.headers["content-type"])) {
      var html = data.toString();
      html = html.replace('proxied', 'proxied and hacked');
      res.writeHead(response.statusCode, response.headers);
      res.write(html);
      // Will write request successfully proxied and hacked ...
    }
  });

  proxy.proxyRequest(req, res, {
    host: 'localhost',
    port: 9000,
    delegate: true
  });

}).listen(8000);

http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.write('request successfully proxied: ' + req.url +'\n' + JSON.stringify(req.headers, null, 2));
  res.end();
}).listen(9000);

This should only work with a classic HTTP proxy; I didn't test any other cases (https, websockets, forward proxy).
This example is not complete: it will not write non-HTML responses to the client!
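
One way the gap noted above might be closed is to decide per response whether to rewrite the body or pass it through unchanged. The sketch below is not from the fork: `prepareResponse` is a hypothetical helper, and it assumes the body is uncompressed UTF-8 text.

```javascript
// Hypothetical helper: returns the headers and body to send to the client.
// HTML bodies are rewritten with `rewriteHtml`; everything else passes through.
function prepareResponse(headers, data, rewriteHtml) {
  const outHeaders = Object.assign({}, headers);
  let body = data;

  if (/html/.test(headers['content-type'] || '')) {
    body = Buffer.from(rewriteHtml(data.toString('utf8')), 'utf8');
    // The rewritten body has a different size, so fix the framing headers.
    delete outHeaders['transfer-encoding'];
    outHeaders['content-length'] = String(body.length);
  }

  return { headers: outHeaders, body: body };
}

// Possible use inside the fork's handler (assumes the `delegate` option
// leaves both writing and ending the response to the caller):
//
// req.once('proxiedRequestEnd', function (response, data) {
//   var out = prepareResponse(response.headers, data, function (html) {
//     return html.replace('proxied', 'proxied and hacked');
//   });
//   res.writeHead(response.statusCode, out.headers);
//   res.end(out.body);
// });
```

Keeping the decision in a pure function means non-HTML responses still reach the client, and the header fix-up can be checked without running a proxy.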

vvo commented Jun 29, 2011

I updated my fork. Now I do not construct any buffer inside node http proxy; I only use custom events on the "req" object.

Constructing a buffer inside node http proxy was slowing down the proxy for regular users; now it's not.

See an example at the bottom of my updated readme : https://github.com/fasterize/node-http-proxy

vvo commented Jul 18, 2011

Updated the fork, you now have 3 new events on the req object :

  • proxiedHeaders -> when proxied request headers are received
  • proxiedData -> every time proxied request data is received
  • proxiedRequestEnd -> when the proxied request has ended

These 3 new events can help you save the proxied request data.

Also, with the delegate option of the proxy, you'll be able to save the proxied data AND modify it before sending it back to the client.
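
As a sketch of how these events could be used to save the proxied data, the helper below collects chunks until the end event fires. `collectProxiedBody` is a hypothetical name, and the listener signatures are assumptions: the thread only shows `proxiedData` being called with `(response, data)`, so the fork's README remains the authority.

```javascript
// Hypothetical helper: gathers every `proxiedData` chunk emitted on `req`
// and hands the complete body to `onComplete` when the proxied request ends.
function collectProxiedBody(req, onComplete) {
  const chunks = [];

  // Assumed signature: (response, chunk), as in the README example above.
  req.on('proxiedData', (response, chunk) => {
    chunks.push(chunk);
  });

  // Assumed: the upstream response object is the first argument.
  req.once('proxiedRequestEnd', (response) => {
    onComplete(response, Buffer.concat(chunks));
  });
}
```

Because the buffering lives in the caller, users who never register these listeners do not pay for it, which matches the reason given for moving the buffer out of the proxy.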

Can't wait to try this, right now I am monkey patching res.write()

http.createServer(function(req, res) {
  var o_write = res.write;
  res.write = function(chunk, encoding) {
    // concat chunk somewhere (this.body ?) and process it on proxy.on('end')
    o_write.call(this, chunk, encoding);
  };
  // proxy.proxyRequest(...
});
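
A slightly fuller sketch of the same monkey-patching idea is shown below. It also wraps `res.end`, so the collected body can be transformed once and then sent. `captureBody` is a hypothetical helper, not a node-http-proxy API, and it assumes the whole body fits in memory.

```javascript
// Hypothetical helper: holds back everything written to `res`, then sends
// the result of `onBody(fullBody)` when the response is ended.
function captureBody(res, onBody) {
  const chunks = [];
  const originalWrite = res.write;
  const originalEnd = res.end;

  const toBuffer = (chunk, encoding) =>
    Buffer.isBuffer(chunk)
      ? chunk
      : Buffer.from(chunk, typeof encoding === 'string' ? encoding : 'utf8');

  res.write = function (chunk, encoding) {
    chunks.push(toBuffer(chunk, encoding));
    return true; // nothing has been sent to the client yet
  };

  res.end = function (chunk, encoding) {
    if (chunk != null && typeof chunk !== 'function') {
      chunks.push(toBuffer(chunk, encoding));
    }
    // Restore the real methods before sending the transformed body.
    res.write = originalWrite;
    res.end = originalEnd;
    return originalEnd.call(res, onBody(Buffer.concat(chunks)));
  };
}
```

The helper would be called on `res` before handing the request to the proxy. It does not touch headers, so a `Content-Length` set by the proxied response would need to be corrected separately when the transform changes the body size.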

Contributor

Marak commented Jul 25, 2011

I think @dominictarr is close to getting proper middle-wares enabled. Will be reviewing all this in the upcoming weeks.

Thanks!

dominictarr reopened this Jul 25, 2011

Contributor

dominictarr commented Jul 25, 2011

see node-http-proxy with connect middleware support in the middleware branch. In particular see https://github.com/nodejitsu/node-http-proxy/blob/middleware/lib/node-http-proxy.js#L115-235

Contributor

dominictarr commented Jul 25, 2011

it supports connect middlewares, and is also backwards compatible with node-http-proxy < 0.6.0

vvo commented Jul 25, 2011

Hello, I've read the code and it looks very nice.

Could you provide us with an example of how to use it?
In particular, providing a callback AND middlewares to createServer. Those seem very close in your branch.

Thank you.

Contributor

dominictarr commented Jul 25, 2011

here:

https://github.com/nodejitsu/node-http-proxy/blob/middleware/examples/gzip-middleware.js

callback works the same as before. put it after the middlewares.

also, creating the proxy with {router: {...}} or port, host will still work, even when using middlewares.
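
To illustrate the ordering rule ("put it after the middlewares"), here is a minimal sketch of how a connect-style stack is typically walked. `runStack` is a hypothetical stand-in, not the code in the middleware branch; it only assumes the usual connect `(req, res, next)` signature.

```javascript
// Hypothetical runner: calls each layer in order. A layer passes control on
// by calling next(); the final callback simply never calls it.
function runStack(stack, req, res) {
  let index = 0;

  function next(err) {
    if (err) {
      res.statusCode = 500;
      res.end('Internal Server Error');
      return;
    }
    const layer = stack[index++];
    if (layer) layer(req, res, next);
  }

  next();
}

// Two middlewares followed by the final callback.
const order = [];
runStack([
  (req, res, next) => { order.push('first middleware'); next(); },
  (req, res, next) => { order.push('second middleware'); next(); },
  (req, res) => { order.push('callback'); }
], {}, {});
console.log(order.join(' -> ')); // prints "first middleware -> second middleware -> callback"
```

Since each layer runs before the ones after it, a middleware that wants to alter the response has to wrap `res` before the final callback starts writing to it.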

Is there an example of modifying the response using middlewares?

Contributor

dominictarr commented Aug 3, 2011

yes -- https://github.com/nodejitsu/node-http-proxy/blob/master/examples/jsonp-middleware.js is an example of that.

you will be wanting to have a read through the code of https://github.com/steelThread/connect-jsonp

dominictarr closed this Aug 3, 2011
