Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transform HTML body before to send back to the proxied website #85

Closed
devniz opened this issue Mar 13, 2020 · 21 comments
Closed

Transform HTML body before to send back to the proxied website #85

devniz opened this issue Mar 13, 2020 · 21 comments

Comments

@devniz
Copy link

devniz commented Mar 13, 2020

I am trying to proxy any website, do some changes on the returned HTML string from the proxied website and then return it to the Browser. This is how far I've got:

const Fastify = require('fastify');
const server = Fastify();
const { ungzip } = require('node-gzip');
const cheerio = require('cheerio');

export default class Proxy {
    _initProxy(host: string) {
        server.register(require('fastify-http-proxy'), { upstream: host });
        server.addHook('onSend', (request: any, reply: any, payload: any, done: () => void) => {
            const chunks = [];
            payload.on('data', (chunk) => {
                chunks.push(chunk);
            });
            payload.on('end', async () => {
                const buffer = Buffer.concat(chunks);
                if (payload.headers['content-encoding'] === 'gzip') {
                    try {
                        const decompressed = await ungzip(buffer);
                        let $ = null;
                        $ = await cheerio.load(decompressed.toString());
                        const scriptTag = '<script src="my-custom.js"></script>';
                        $('body').append(scriptTag);
                    } catch (e) {
                        console.log(e);
                    }
                }
            })
            done();
        })
        server.listen(5051);
    }
}

You can see that I am adding my-custom.js script inside the BODY tag once I have unzipped and parsed the returned payload. It is working fine, however, the last bit I can't find yet is: How to return the transformed HTML string to the browser?

I have an API responsible for starting the Proxy server:

   _initApi(): void {
        server.get('/', opts, async (request, reply) => {
            return { res: '200 - OK' }
        });

        server.get('/start-proxy', opts, async (request, reply) => {
            this.proxy._initProxy('http://example.org')
            return { res: '200 - PROXY STARTED' }
        });

        server.listen(5050, (err) => {
            if (err) {
                server.log.error(err)
                process.exit(1)
            }
            server.log.info(`server listening on ${server.server.address()}`)
        });
    } 
@Eomm
Copy link
Member

Eomm commented Mar 13, 2020

You need to provide the new payload as 2nd parameter in the done:
https://github.com/fastify/fastify/blob/master/docs/Hooks.md#onsend

@devniz
Copy link
Author

devniz commented Mar 13, 2020

@Eomm

I tried using done to return the new payload but this is actually where I can't see why it is not working as expected. The page will be blank:

const Fastify = require('fastify');
const server = Fastify();
const { ungzip } = require('node-gzip');
const cheerio = require('cheerio');

export default class Proxy {
    private newPayload: string = '';
    _initProxy(host: string) {

        server.register(require('fastify-http-proxy'), { upstream: host });
        server.addHook('onSend', (request: any, reply: any, payload: any, done: any) => {
            const err = null;
            const chunks = [];
            payload.on('data', (chunk) => {
                chunks.push(chunk);
            });
            payload.on('end', async () => {
                const buffer = Buffer.concat(chunks);
                if (payload.headers['content-encoding'] === 'gzip') {
                    try {
                        const decompressed = await ungzip(buffer);
                        let $ = null;
                        $ = await cheerio.load(decompressed.toString());
                        const scriptTag = '<script src="my-custom.js"></script>';
                        $('body').append(scriptTag);
                        this.newPayload = $.html();
                        console.log(this.newPayload) // Will show up fine in my logs.
                        done(err, this.newPayload);
                    } catch (e) {
                        console.log(e);
                    }
                }
            })

        })
        server.listen(5051);
    }
}

@devniz
Copy link
Author

devniz commented Mar 13, 2020

@Eomm Note that right before the done a console.log will show the new payload without any issues.

@devniz
Copy link
Author

devniz commented Mar 13, 2020

@Eomm I think the problem I am having is how to get the new payload once done() is called. I want to get it from the API endpoint:

 _initApi(): void {
        server.get('/', opts, async (request, reply) => {
            return { res: '200 - OK' }
        });

        server.get('/start-proxy', opts, async (request, reply) => {
            this.proxy._initProxy('http://example.org')
            return { res: '200 - PROXY STARTED' }
        });

        server.listen(5050, (err) => {
            if (err) {
                server.log.error(err)
                process.exit(1)
            }
            server.log.info(`server listening on ${server.server.address()}`)
        });
    } 

@Eomm
Copy link
Member

Eomm commented Mar 13, 2020

Your code seems ok, we should check if this is a bug managing the streams

Would you like to give it a shot?
I would start creating a (failing) test for this case and then search for what is not working as expected

@devniz
Copy link
Author

devniz commented Mar 13, 2020

@Eomm Sure thing, I let you know how it goes.

@devniz
Copy link
Author

devniz commented Mar 15, 2020

@Eomm

It appears to me that whenever I am trying to pass some parameters in the done() method, the browser won't render the proxied web app. I am not yet entirely sure how this is happening, but it could be a potential bug to fix. Have you got any luck with your failing tests?

@Eomm
Copy link
Member

Eomm commented Mar 15, 2020

I understood that you were glad to work on it so I didn't allocate time 😅

@devniz
Copy link
Author

devniz commented Mar 15, 2020

I understood that you were glad to work on it so I didn't allocate time

No worries, I'm still on it anyway. So far I have it working very well without using fastify-http-proxy middleware. It's good for progressing on my project but not helpful for that specific issue. Basically I'm just using http-proxy and it is works fine. This let me think that we may have a bug to fix and I will try to find out more in the upcoming days.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@Eomm

It appears to me that this issue has nothing to do with fastify-http-proxy plugin. Basically my proxied connection just hang up due to wrong order of middleware usage. The issue that I have is very similar to this one described here: http-party/node-http-proxy#180 (comment) I suggest to close the ticket for now and will open another one if I feel like there is a proper bug we need to fix. Also as a separate comment, it looks like node-http-proxy dies a lot and it is not reliable. Would like to see an example of production usage of fastify-http-proxy, I may have to consider doing my reverse-proxy server in Java instead. Thanks for your help.

@devniz devniz closed this as completed Mar 17, 2020
@jsumners
Copy link
Member

Also as a separate comment, it looks like node-http-proxy dies a lot and it is not reliable. Would like to see an example of production usage of fastify-http-proxy, I may have to consider doing my reverse-proxy server in Java instead.

Please provide supporting evidence for such statements and list versions being used. There are plenty of us using this in production without issue.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@jsumners Agreed with you, it was my very personal opinion here, while going through Stackoverflow and Googling around, I found different issues mentioning the one that I raised, where node-http-proxy dies after multiple runs. Most of the time it is related to bad usage (Having wrong order of middleware), so I am not saying this is unusable but quite tricky to get it right.

@mcollina
Copy link
Member

fastify-http-proxy and node-http-proxy are two seperate modules.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@jsumners I'm pretty sure there is lot of usage of fastify-http-proxy middleware, I was more asking for a solid example of usage in production.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

fastify-http-proxy and node-http-proxy are two seperate modules.

@mcollina Yes, I was originally commenting about node-http-proxy, not fastify-http-proxy.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@mcollina To be more explicit, using an HTTP proxy is not only for solving CORS issue or just for the sake of proxying an external URL. A common case is to modify the body at the end and I did not succeed with node-http-proxy package. (It works but again it dies after multiple runs). I think it would be great to have a working example of modifying the body, somewhere in the documentation of fastify-http-proxy. If I get it to work, I will be happy to push a PR. Note that node-http-proxy does have an example of modifying the body.

@jsumners
Copy link
Member

A common case is to modify the body at the end...

I'd like to see supporting evidence of that as well. A proxy is meant to facilitate communication between two systems. That can mean altering the metadata around the communication (e.g. HTTP headers), but that does not imply the content has been changed.

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@jsumners Well, I've got a requirement to inject custom JavaScript in any web app (A lot of testing framework do that to inject selenium drivers for example, ..). I went with the approach of using a reverse-proxy since there is not a thousand solution to that problem and the injection works pretty well. Not entirely sure what kind of "supporting evidence" you are looking for here?

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@jsumners Please read "Page proxying" section: https://devexpress.github.io/testcafe/documentation/how-it-works/ They use nodeJS to reverse proxy the web app under test and inject their own JS for automation.

@Eomm
Copy link
Member

Eomm commented Mar 17, 2020

I'm using fastify-reply-from in prod, that it is used under the hood by this plugin

So for sure, we could improve this module by adding new features like:

  • change the body output for some routes
  • manage errors in a custom way
  • hide some headers
    (these were my needs)

@devniz
Copy link
Author

devniz commented Mar 17, 2020

@Eomm Thanks for sharing, it is useful, will have a look at it. And yes, I strongly think we could improve this feature of modifying body output, this is a common case for node http proxy middleware.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants