According to the HTTP standard, a request's URI (the second field in the first line of an HTTP request, after the verb, and which gets into req.url) may be relative or absolute.
A request to a proxy must always have an absolute URI, and indeed when we receive a request, req.url is always absolute. We just forward the request in proxyRequest and copy the URI into the new request.
The receiving end of our new request will in turn get an absolute URI. This behaviour, however standard, is breaking some sites which naively expect the path to be relative to the hostname (starting with a "/"). They are wrong, but still node-http-proxy could be "fixed" so as to not break this assumption of theirs.
By placing "req.url = req.url.replace(/.*?:\/\/.*?\//, '/')" (an ugly regexp, granted) before calling proxyRequest, I was able to conform to the naiveté of WordPress and other software which is out there being used and does not implement the standard.
Thanks for building this nice, flexible proxy.
Here is my changeset to my front-end dev tool using node-http-proxy which fixes this assumption.
This is also the cause of incompatibility with socket.io.
Add "all express sites using the express.sessions middleware," also known as "all sites using the connect sessions middleware."
if (0 != req.originalUrl.indexOf(cookie.path || '/')) return next();
That check expects the URL to start with the cookie path, which is "/" in the default configuration. An absolute URL never starts with "/", so the check always fails and the session middleware is skipped.
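A minimal sketch of why that check misfires, in plain JavaScript. The function name `skipsSession` and the sample URLs are made up for illustration; only the comparison itself mirrors the line quoted above.

```javascript
// Hypothetical stand-in for the quoted check. `cookiePath` plays the role of
// `cookie.path`, which falls back to '/' when it is not configured.
function skipsSession(originalUrl, cookiePath) {
  // True when the URL does NOT begin with the cookie path, which is the
  // condition under which the middleware calls next() without a session.
  return 0 != originalUrl.indexOf(cookiePath || '/');
}

// Relative form: the first '/' is at index 0, so the session is handled.
console.log(skipsSession('/account', undefined)); // false

// Absolute form: the first '/' is at index 5 (inside "http://"), so the
// middleware bails out.
console.log(skipsSession('http://example.com/account', undefined)); // true
```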
There are just too many commonly used frameworks, and therefore sites, that don't work with this behavior.
@fabiosantoscode Thanks for that simple solution, it should have occurred to me since I'm doing custom proxying with proxy.web too.
Here's a better regexp that should not experience false positives:
req.url = req.url.replace(/^\w+:\/\/.*?\//, '/');
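To illustrate what "false positives" means here, the sketch below runs the anchored regexp against a URL that is already relative but carries another URL in its query string. The helper names and sample URLs are assumptions for illustration, and the unanchored pattern is included only as a comparison of the same general shape.

```javascript
// Hypothetical helper names; only the anchored regexp is taken from the thread.
function toOriginForm(url) {
  // Anchored at the start: only a leading scheme and host are stripped.
  return url.replace(/^\w+:\/\/.*?\//, '/');
}

// An unanchored pattern of the same shape, for comparison only.
function toOriginFormUnanchored(url) {
  return url.replace(/.*?:\/\/.*?\//, '/');
}

const absolute = 'http://example.com/wp-admin/index.php';
const relative = '/login?next=http://other.example/home';

console.log(toOriginForm(absolute));           // '/wp-admin/index.php'
console.log(toOriginForm(relative));           // unchanged
console.log(toOriginFormUnanchored(relative)); // '/home' (false positive)
```

Note that an absolute URI with no path at all, such as "http://example.com", has no slash after the host, so neither pattern matches it and it passes through unchanged.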
@boutell and @fabiosantoscode if you haven't already, check out the caronte branch as that will be node-http-proxy 1.0 in the near future.
@boutell that was not my implication, I just wanted to make sure you guys were trying the newest code :). I'm sure @yawnt will be on it. If you can post a gist of the smallest reproducible case, this will be extremely helpful in developing a test and figuring out a solution!
Smallest reproducible case:
It's unfortunate that the API will be one option more complicated just because there's too much stuff not implementing this correctly.
We should report this problem when we see it in the real world.
Real world cases cited here so far:
Express/Connect session middleware (many node-powered sites)
socket.io's built-in asset server
This is a duplicate of #416 which cites pages on dailymotion.
Re: the spec, that bit says in full:
"To allow for transition to absoluteURIs in all requests in future versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI form in requests, even though HTTP/1.1 clients will only generate them in requests to proxies."
Emphasis on that last bit. The servers that don't like the absolute URI are most in the wrong here, but we're not doing all that hot either because we're generating absolute URIs when not talking to (another) proxy.
[fix] closes #529
apologies for the delay, should be fixed in 9e74a63
@fabiosantoscode i used the test case you provided and now it's correctly sending "/" instead of the full path :)
With this closed, do we need to open a new issue on support for being downstream from another proxy? That has the opposite issue: an absolute URL is required.
i would be happy to accept a pull request regarding that, or a test case which outlines the issue :)
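As a starting point for such a test case, here is a rough sketch of the opposite transformation: rebuilding an absolute URI when the next hop is itself a proxy. The helper name `toAbsoluteForm`, its parameters, and the default scheme are all assumptions, not part of node-http-proxy.

```javascript
// Hypothetical helper: produce the absolute form of a request URI.
// `host` would typically be taken from the request's Host header.
function toAbsoluteForm(url, host, scheme = 'http') {
  // Leave the URI alone if it already starts with a scheme.
  if (/^\w+:\/\//.test(url)) return url;
  return scheme + '://' + host + url;
}

console.log(toAbsoluteForm('/index.html', 'example.com'));
// 'http://example.com/index.html'
console.log(toAbsoluteForm('http://example.com/index.html', 'ignored.example'));
// unchanged
```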