How to modify the response HTML data going back to the user? #382

Closed
PhilAndrew opened this Issue Mar 13, 2013 · 16 comments

@PhilAndrew

Hi, I really need to know how to get the response HTML as a string and modify it.

This question has been asked before on Stack Overflow, but I don't see how to do it.

http://stackoverflow.com/questions/13596942/using-node-js-to-proxy-http-and-modify-response

In general, I need to get access to all request data and all response data and change the response data in some cases. Thanks!

@indexzero
nodejitsu member

The short answer is you can't. You can observe it, but you cannot modify it: node-http-proxy writes to the response for you. In 0.9.0 we added the proxyResponse event, which allows you to observe the response data for caching purposes, but not to modify it.

e.g.

//
// Create a proxy server with custom application logic
//
httpProxy.createServer(function (req, res, proxy) {
  //
  // Listen to the `proxyResponse` event. Make note that 
  // the first two arguments to this handler are the `req`
  // and `res` in the existing scope and I'm indicating that 
  // they are meaningless by assigning them the var name `_` 
  //
  req.on('proxyResponse', function (_, _, proxyRes) {
    proxyRes.on('data', function (chunk) {
      //
      // This is the data from the target server, but modifying
      // it will not affect the outgoing `res`. 
      // 
    });
  });

  //
  // Put other custom server logic here
  //

  // Now make the proxy request.
  proxy.proxyRequest(req, res, {
    host: 'localhost',
    port: 9000
  });
}).listen(8000);

In general this is really bad practice unless you can do the modification in a streaming manner because you will have to buffer the entire response.

@indexzero indexzero closed this Mar 13, 2013
@PhilAndrew

Hi there,

What I want to do is have a Wordpress website and proxy it with NodeJS; then, when a page comes back from Wordpress at some particular URL, I want to modify its contents to insert a form.

The idea is that it's slightly difficult to develop in Wordpress and easier to develop in NodeJS (for me). So by proxying Wordpress I thought I could add extra things to my Wordpress pages. Wordpress is good for business cases: fast site development at a cheaper price, and easy for users to use as a CRM.

Can you suggest a way to go about doing this? I would really appreciate any thoughts you have on this.

I want to intercept all incoming and outgoing traffic and modify the request/response data to insert my own data, i.e. a form.

Thanks! Philip

@No9

@PhilHongKong Harmon is designed to plug into node-http-proxy https://github.com/No9/harmon
It uses trumpet, so it is stream-based, which works around the buffering problem that @indexzero mentions. It uses an element and attribute selector to enable manipulation of a response.

@indexzero
nodejitsu member

Nice. @No9 could you make a pull-request to README.md about this? We get this question a lot.

@PhilAndrew

Thanks. Yes, my first thought was to google for a proxy for NodeJS. The general question is how someone searching for a way to change the data coming out of a web server through a proxy would find Harmon/Trumpet in the first place; I didn't find Harmon/Trumpet through Google.

@No9

Pull done. Also did a little bit of housekeeping on the repo readme too.

@emanuelsaringan

How about JSON responses? What module similar to this can I use to modify JSON responses?
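
JSON generally cannot be edited safely chunk by chunk without a streaming parser, so one option is to buffer the whole body, parse it, change it, and serialize it again, accepting the buffering cost mentioned earlier in the thread. A minimal sketch, assuming the body is UTF-8 JSON that fits in memory; `rewriteJson` and the sample document are hypothetical and not part of node-http-proxy or harmon:

```javascript
// Hypothetical helper (an assumption for illustration): parse a fully
// buffered JSON body, let `mutate` change it, and return the re-serialized
// bytes together with their new byte length.
function rewriteJson(bodyBuffer, mutate) {
  const parsed = JSON.parse(bodyBuffer.toString('utf8'));
  const result = mutate(parsed);
  // `mutate` may either edit the object in place or return a replacement.
  const updated = result === undefined ? parsed : result;
  const out = Buffer.from(JSON.stringify(updated), 'utf8');
  return { body: out, contentLength: out.length };
}

// Made-up upstream body standing in for what the target server returned.
const upstream = Buffer.from('{"user":"phil","roles":["reader"]}');
const rewritten = rewriteJson(upstream, function (doc) {
  doc.roles.push('editor');
  doc.proxied = true;
});
console.log(rewritten.body.toString('utf8'));
```

If the upstream response carried a `Content-Length` header, it would need to be set to the new `contentLength` before the body is written, since the rewritten body is usually a different size.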

@kltm kltm referenced this issue in geneontology/noctua Oct 8, 2014
Closed

Implement live synced view all the time #16

@Skyross

How about using replacestream?

@akshayl

Harmon / Trumpet allow you to replace an HTML element, but what if you want to insert elements at various points in the document being returned?

@No9

Hi @akshayl

https://github.com/No9/harmon/blob/master/examples/rotate.js
Demonstrates adding an additional style tag in the head without replacing.

Does this fit your use case?

@akshayl

Thanks for your reply @No9
That example seems to replace the entire head tag including the script tag which outputs the message "The piece of javascript also inside the head tag wasn't touched :)"

Before:
<html><head><script>window.onload = function () {document.getElementById("message").innerHTML = "The piece of javascript also inside the head tag wasn't touched :)";}</script></head><body><h3>A simple example of injecting some css to rotate an image into a page before it is rendered.</h3><image src="http://i.imgur.com/fpMGL.png" /><div id="message"></div></body></html>

After:
<html><head><style type="text/css"> img { -webkit-transform: rotate(-90deg); -moz-transform: rotate(-90deg); filter: progid:DXImageTransform.Microsoft.BasicImage(rotation=3);}</style></head><body><h3>A simple example of injecting some css to rotate an image into a page before it is rendered.</h3><image src="http://i.imgur.com/fpMGL.png" /><div id="message"></div></body></html>

@No9

@akshayl ah now I get you.
So https://github.com/No9/harmon/blob/master/examples/doge.js#L22
Shows how you can hold the content of the node and then use it afterwards.
The sample just logs it to console but I think this might be what you are looking for?

@shlomihassan

Hi guys, I am working with the same solution:

//
// Create a proxy server with custom application logic
//
httpProxy.createServer(function (req, res, proxy) {
  //
  // Listen to the proxyResponse event. Make note that
  // the first two arguments to this handler are the req
  // and res in the existing scope and I'm indicating that
  // they are meaningless by assigning them the var name _
  //
  req.on('proxyResponse', function (_, _, proxyRes) {
    proxyRes.on('data', function (chunk) {
      //
      // This is the data from the target server, but modifying
      // it will not affect the outgoing res.
      //
    });
  });

  //
  // Put other custom server logic here
  //

  // Now make the proxy request.
  proxy.proxyRequest(req, res, {
    host: 'localhost',
    port: 9000
  });
}).listen(8000);

But in some cases I am getting only half of the data.
Is there a way to get all the data no matter the size?

shlomi
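
One likely cause of seeing only part of the data is that each `data` event carries a single chunk, and a large response arrives as several of them. A minimal sketch of gathering every chunk until `end`, assuming the whole body fits in memory: `collectBody` is a hypothetical helper (not part of node-http-proxy), and an `EventEmitter` stands in for `proxyRes` so the snippet runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gathers every 'data' chunk from a response-like
// stream and hands the complete body to `done` once 'end' fires.
function collectBody(stream, done) {
  const chunks = [];
  stream.on('data', function (chunk) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  });
  stream.on('end', function () {
    done(null, Buffer.concat(chunks));
  });
  stream.on('error', done);
}

// Stand-in for `proxyRes`, emitting a body split across two chunks.
const fakeProxyRes = new EventEmitter();
let collected = null;
collectBody(fakeProxyRes, function (err, body) {
  if (err) throw err;
  collected = body.toString('utf8');
});
fakeProxyRes.emit('data', Buffer.from('<html><body>first half, '));
fakeProxyRes.emit('data', Buffer.from('second half</body></html>'));
fakeProxyRes.emit('end');
console.log(collected);
```

In the handler above, `proxyRes` would be passed in place of `fakeProxyRes`. This only observes the body; it does not change what is written to `res`.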

@ndarilek

Sorry to hijack this issue, but I want to do exactly what this issue says I should be able to. I want to inject an httpProxy between clients and a JSON API server, only observe the JSON data before passing it back, and perform server-side logic. This seems like it should be easy but it isn't. Here is my code:

proxy.on("proxyRes", (proxyReq, req, res) => {
  res.on("data", (msg) => {
    console.log("msg", msg)
  })
  res.on("end", () => {
    console.log("end")
  })
  res.on("finish", (data) => {
    console.log("finish", data)
  })
})

I just get a bunch of finish events and no data. Using this issue because it was what I found when I searched, and I'd have expected the code above to work but either I'm putting it in the wrong place or this solution is outdated.

I'm stuck on Node 0.10.43 if that makes a difference--not my choice, unfortunately.

@ndarilek

And of course 5 minutes after I post that issue, I spot my typo:

proxy.on("proxyRes", (proxyRes, req, res) => {

I.e. The first parameter to the event callback is a res, not a req. Sorry for the spam folks. The lesson for the day is coffee first, then code.

@Armalon
Armalon commented Aug 10, 2016 edited

I had the same problem, and the only solution I found was one I spotted in the very same harmon module: https://github.com/No9/harmon/blob/master/examples/gzipped.js#L47-L64
When the author creates a server that listens on port 9000, he uses this hack to gzip the response: zlib.Gzip is a Transform stream, and he forwards his regular data through it.
So we can do the same; here is an example:

var http = require('http'),
    connect = require('connect'),
    httpProxy = require('http-proxy');
var replaceStream = require('replacestream');
var app = connect();
var proxy = httpProxy.createProxyServer({});

app.use(function(req, res) {
    // Magic starts HERE
    // Put here any Transform stream you wish, I used replaceStream, 
    // you can write your own using following method 
    // http://codewinds.com/blog/2013-08-20-nodejs-transform-streams.html#creating-transform-stream-which-uppercases-all-text
    var replace = replaceStream('script', 'NOSCRIPT');
    var _write = res.write;
    var _end = res.end;

    replace.on('data', function(buf){
        _write.call(res, buf);
    });
    replace.on('end', function(){
        _end.call(res);
    });

    res.write = function(data){
        replace.write(data);
    };
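    res.end = function(data){
        // Forward a final chunk, if one was passed, before ending the stream
        if (data && typeof data !== 'function') {
            replace.write(data);
        }
        replace.end();
    };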
    // Magic ends HERE

    proxy.web(req, res, { target: 'http://' + req.headers.host });
});

var proxyServer = http.createServer(app);
console.log("listening on port 5050")
proxyServer.listen(5050);

Or you can just use https://github.com/philippotto/transformer-proxy
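
The "write your own Transform stream" option mentioned in the code comments could look roughly like the sketch below. It is an illustration under stated assumptions, not replacestream's actual implementation: `createReplacer` and `replaceTransform` are hypothetical names, the input is assumed to be UTF-8 text, and only plain string matches (no regular expressions) are handled. The point it shows is that a match split across two chunks is still caught, by holding back the last `search.length - 1` characters until more input arrives.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: stateful string replacer that tolerates matches
// split across chunk boundaries.
function createReplacer(search, replacement) {
  if (!search) throw new Error('search must be a non-empty string');
  let tail = ''; // unprocessed raw text that might begin a match
  return {
    push(text) {
      const data = tail + text;
      let out = '';
      let pos = 0;
      let hit = data.indexOf(search, pos);
      while (hit !== -1) {
        out += data.slice(pos, hit) + replacement;
        pos = hit + search.length;
        hit = data.indexOf(search, pos);
      }
      // Keep back just enough raw text to complete a match later.
      const rest = data.slice(pos);
      const keep = Math.min(search.length - 1, rest.length);
      tail = rest.slice(rest.length - keep);
      return out + rest.slice(0, rest.length - keep);
    },
    flush() {
      const rest = tail;
      tail = '';
      return rest;
    }
  };
}

// Wraps the replacer in a Transform so it can sit where `replace` is used
// in the example above. StringDecoder keeps multi-byte characters intact
// when they straddle two chunks.
function replaceTransform(search, replacement) {
  const replacer = createReplacer(search, replacement);
  const decoder = new StringDecoder('utf8');
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, replacer.push(decoder.write(chunk)));
    },
    flush(callback) {
      callback(null, replacer.push(decoder.end()) + replacer.flush());
    }
  });
}

// Synchronous demo: the word "script" is split across the first two chunks.
const demo = createReplacer('script', 'NOSCRIPT');
const demoOutput = demo.push('<scr') + demo.push('ipt>alert(1)</script>') + demo.flush();
console.log(demoOutput);
```

Under these assumptions, `replaceTransform('script', 'NOSCRIPT')` could be used in place of `replaceStream('script', 'NOSCRIPT')` in the example above. Because the replacement changes the body length, any `Content-Length` header copied from the upstream response would no longer match.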
