Incorrect Location header when proxy upstream responds with a redirect #1011

Closed
limeburst opened this issue Aug 7, 2016 · 19 comments

Comments

@limeburst
commented Aug 7, 2016

1. What version of Caddy are you running (caddy -version)?

Caddy 0.9.0

2. What are you trying to do?

Proxy a redirect.

3. What is your entire Caddyfile?

example.com
proxy / unix:/var/run/example.sock

example.com
proxy / localhost:5000

4. How did you run Caddy (give the full command and describe the execution environment)?

sudo caddy

5. What did you expect to see?

Location header should be set correctly.

6. What did you see instead (give full error messages and/or log)?

$ curl -v https://example.com/number/random/
*   Trying 0.0.0.0...
* Connected to example.com (0.0.0.0) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate: example.com
* Server certificate: Let's Encrypt Authority X3
* Server certificate: DST Root CA X3
> GET /number/random/ HTTP/1.1
> Host: example.com
> User-Agent: curl/7.49.1
> Accept: */*
> 
< HTTP/1.1 302 Found
< Content-Length: 235
< Content-Type: text/html; charset=utf-8
< Date: Sun, 07 Aug 2016 18:19:27 GMT
< Location: http://socket/number/42/
< Server: Caddy
< Server: gunicorn/17.5
< 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
* Connection #0 to host example.com left intact
<p>You should be redirected automatically to target URL: <a href="/number/42/">/number/42/</a>.  If not click the link.

$ curl -v https://example.com/number/random/
*   Trying 0.0.0.0...
* Connected to example.com (0.0.0.0) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate: example.com
* Server certificate: Let's Encrypt Authority X3
* Server certificate: DST Root CA X3
> GET /number/random/ HTTP/1.1
> Host: example.com
> User-Agent: curl/7.49.1
> Accept: */*
> 
< HTTP/1.1 302 Found
< Content-Length: 235
< Content-Type: text/html; charset=utf-8
< Date: Sun, 07 Aug 2016 18:19:27 GMT
< Location: http://localhost:5000/number/42/
< Server: Caddy
< Server: gunicorn/17.5
< 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
* Connection #0 to host example.com left intact
<p>You should be redirected automatically to target URL: <a href="/number/42/">/number/42/</a>.  If not click the link.

7. How can someone who is starting from scratch reproduce this behavior as minimally as possible?

Use the above Caddyfile, and set up a simple redirect application.

from flask import Flask, redirect, url_for


app = Flask(__name__)


@app.route('/number/random/')
def random():
    return redirect(url_for('number', number=42))


@app.route('/number/<number>/')
def number(number):
    return number

@mholt

Member

commented Aug 7, 2016

Hi, thanks for your issue. But:

Location header should be set correctly.

I don't know what you mean by this. What is "correctly"? Is it that the hostname portion of the URL is different? Try the transparent preset.

@limeburst

Author

commented Aug 8, 2016

Thank you for the fast reply. Yes, the hostname portion of the URL is different. Here's an example of a correct response, generated by nginx.

$ curl -v https://example.com/number/random/
*   Trying 0.0.0.0...
* Connected to example.com (0.0.0.0) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate: example.com
* Server certificate: Let's Encrypt Authority X3
* Server certificate: DST Root CA X3
> GET /number/random/ HTTP/1.1
> Host: example.com
> User-Agent: curl/7.49.1
> Accept: */*
> 
< HTTP/1.1 302 FOUND
< Server: nginx/1.4.6 (Ubuntu)
< Date: Mon, 08 Aug 2016 06:04:57 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 235
< Connection: keep-alive
< Location: http://example.com/number/42/
< 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
* Connection #0 to host example.com left intact
<p>You should be redirected automatically to target URL: <a href="/number/42/">/number/42/</a>.  If not click the link.

Setting the transparent preset for proxy doesn't seem to work. Instead, when using the unix socket upstream, it changes the hostname part of the Location header in the reply from socket to unix.

@mholt

Member

commented Aug 8, 2016

Hmm, can you reproduce this without Python? Lacking a gunicorn backend, I tried with a Caddy backend and am not seeing the same problem.

Front:

localhost
proxy / localhost:5000
log stderr

Back:

localhost:5000
redir / /asdf 307
log stderr

$ curl -v http://localhost:2015/foo
*   Trying ::1...
* Connected to localhost (::1) port 2015 (#0)
> GET /foo HTTP/1.1
> Host: localhost:2015
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 307 Temporary Redirect
< Content-Length: 41
< Content-Type: text/html; charset=utf-8
< Date: Mon, 08 Aug 2016 19:22:18 GMT
< Location: /asdf
< Server: Caddy
< Server: Caddy
< 
<a href="/asdf">Temporary Redirect</a>.

Notice that Caddy is not inserting the hostname at all, let alone the wrong one. What exactly is your Python program doing?

@hogmoru

commented Aug 8, 2016

Hello,

I have an issue that might be the same, with a different scenario that might be easier to reproduce. If it's indeed the same root cause, maybe it will help clarify the issue at hand :)

In my Caddyfile:

https://mycaddy.com {
  ...
  proxy /foo http://1.2.3.4:8181 {
    without /foo
  }
}

So my Caddy front-end proxies /foo/* to http://1.2.3.4:8181/*.
It works... except that if upstream sends a 301 redirection, then the Location header contains the upstream address.
It seems to me that Caddy should substitute this.

Example:

$ curl -I https://mycaddy.com/foo/bar
HTTP/1.1 301 Moved Permanently
Date: Mon, 08 Aug 2016 22:29:45 GMT
Location: http://1.2.3.4:8181/bar/
Server: Caddy
Server: Upstream
X-Frame-Options: DENY
Content-Type: text/plain; charset=utf-8

Then my browser happily goes directly to http://1.2.3.4:8181/bar/ :(
I think the upstream server should not be exposed (whether it is public or not), and Caddy should substitute this with the corresponding "downstream" URL.

To reproduce, you need any upstream "flat files" HTTP server that redirects /xxx to /xxx/ when /xxx is a directory (which I believe many HTTP servers do, Caddy included, except maybe at the server root).

So the general problem seems to be that Caddy leaves the original "incorrect" Location: header, whereas the "correct" header should be the "proxied" location.

So, does this help, or is it a completely different issue, and have I missed something obvious in the options?

Edit: using Caddy 0.9.0
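
For readers following along, the substitution requested in this comment can be sketched in a few lines of Python. This is only an illustration of the idea (similar in spirit to nginx's proxy_redirect directive), not something Caddy is known to do; `rewrite_location` and its parameters are hypothetical names, and the URLs are the ones from the example above.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_location(location, upstream_base, public_base):
    """Hypothetical helper: map an upstream Location onto the public URL.

    Only absolute URLs whose scheme and host match upstream_base are
    rewritten; anything else (e.g. a relative "/asdf") is returned unchanged.
    """
    loc = urlsplit(location)
    up = urlsplit(upstream_base)
    pub = urlsplit(public_base)
    if (loc.scheme, loc.netloc) != (up.scheme, up.netloc):
        return location
    # Re-attach the public path prefix that `without /foo` stripped on the way in.
    path = pub.path.rstrip("/") + loc.path
    return urlunsplit((pub.scheme, pub.netloc, path, loc.query, loc.fragment))


print(rewrite_location("http://1.2.3.4:8181/bar/",
                       "http://1.2.3.4:8181",
                       "https://mycaddy.com/foo"))
# https://mycaddy.com/foo/bar/
```

The sketch assumes the proxy knows both the upstream base and the public prefix; a Location pointing at any other host is deliberately left alone.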

@mholt

Member

commented Aug 8, 2016

@hogmoru What are you using upstream (1.2.3.4:8181)? And what is its configuration?

@hogmoru

commented Aug 8, 2016

Wow you're fast.
It's lighttpd. Actually, I was looking at its config because I just realized it should simply redirect to Location: /foo/ (and then Caddy would need to do nothing), but it redirects to Location: http://1.2.3.4:8181/foo/, which is weird.

@mholt

Member

commented Aug 8, 2016

@hogmoru If you use transparent in the Caddy proxy config, does it change anything?

@hogmoru

commented Aug 8, 2016

@mholt: no, it changes nothing.
But I need to find out why lighttpd includes the host in the Location header (if I could fix that then it would solve my problem).
In @limeburst's setup, the URLs in the Location headers also include the host, so maybe his problem is also in his upstream server?
In your example when you tried to reproduce earlier, they did not (using Caddy upstream, so Caddy 1, lighttpd 0 ? 😄 ).

@mholt

Member

commented Aug 8, 2016

@hogmoru Can you build Caddy from source and try again? There's potentially another bug fix on master, not yet released, that might make transparent help the situation.

@hogmoru

commented Aug 8, 2016

@mholt I can try that tomorrow (either in 10h or 20h), and will let you know.
Found this interesting discussion... Looks like absolute URLs are to be expected in Location header... maybe lighttpd is right, as it does return its preferred URI.
I hope transparent does the trick, will see about that tomorrow... Going to bed now (it's late in Western Europe, hehe).
Thanks!

@hogmoru

commented Aug 8, 2016

Could not wait... Tried it.
Well, the good news is now the Location header is translated. Progress!
The bad news: in my case where I want https://mycaddy.com/foo/ to map to http://1.2.3.4:8181/, I use the without /foo directive, and then the following happens:

$ curl -s -I https://mycaddy.com/foo/bar | grep ^Location:
Location: https://mycaddy.com/bar/

It should be https://mycaddy.com/foo/bar/

That is with both without /foo and transparent.
If I remove transparent I get the original upstream URL.
If I remove without /foo I get a 404 because Caddy sends a request to http://1.2.3.4:8181/foo/bar which does not exist.

@mholt

Member

commented Aug 8, 2016

I'm guessing this might have something to do with it.

@mholt

Member

commented Aug 8, 2016

Hm, yours seems to be the same issue as @limeburst is having; the transparent preset fix rolls out with the next update.

I think it's up to your backend to set the right Location header; I don't believe Caddy should be messing with it; it just proxies the results back downstream.

@hogmoru

commented Aug 9, 2016

Hmmm my backend does set the right Location header.
This works (now with latest sources, thanks):

  proxy /bar http://1.2.3.4:8181 {
     transparent
  }

Then mycaddy.com/bar returns the content of http://1.2.3.4:8181/bar
But what I wanted was to put a /foo prefix so that mycaddy.com/foo/bar would get me the content of http://1.2.3.4:8181/bar
... but can't find a way to do it.

But I'm not stuck, I can move my stuff around and get a working system, I'm just pointing at a small limitation (unless I'm missing another option) about mixing without xxx and transparent.
The original issue, which was indeed incorrect handling of absolute URIs in redirection headers, seems to be fixed.
Glad we've clarified the original problem.
Thanks!

@mholt

Member

commented Aug 9, 2016

Well, if you're redirecting a client to /foo/bar, then the client will request /foo/bar, but if you only redirect to /bar, the client doesn't know to add /foo before it. And why should Caddy assume that's what is supposed to happen? Does that make sense?
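
The point about the client can be checked with Python's standard `urllib.parse.urljoin`, which resolves a Location value against the requested URL in the same way an HTTP client would. This is just a small sketch using the URLs from this thread; the variable name `requested` is made up for the example.

```python
from urllib.parse import urljoin

# The URL the client originally asked the front-end for.
requested = "https://mycaddy.com/foo/bar"

# A path-absolute Location replaces the whole path, so the /foo prefix is lost.
print(urljoin(requested, "/bar/"))      # https://mycaddy.com/bar/

# Only a Location that already carries the prefix keeps it.
print(urljoin(requested, "/foo/bar/"))  # https://mycaddy.com/foo/bar/
```

Nothing in the response tells the client that /bar/ was meant to live under /foo, so the prefix has to come from whoever writes the Location header.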

@hogmoru

commented Aug 9, 2016

Oh... I think you're right (did I say it was late here?), what I wanted to do cannot work, sorry.
But that was a small detail of my (ill-defined) use case, only slightly relevant to the main issue.
Thanks again.

@mholt

Member

commented Aug 9, 2016

Sure, no problem. Glad we had this discussion. Will close this then, since it seems to be an upstream issue.

@mholt mholt closed this Aug 9, 2016

@mholt mholt added the discussion label Aug 9, 2016

@mixmastamyk

commented Jun 24, 2017

Thanks, this helped me figure out why this was happening to me on a login redirect.

TL;DR - the proxy.transparent directive sends headers back to gunicorn (and e.g. Flask) so that it can respond with a proper Location header.

It works correctly, but it seems inefficient to do this on every single request.
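
As a rough illustration of why the forwarded Host header matters, here is a minimal sketch using only Python's standard `wsgiref` module. It is not Flask's or gunicorn's actual code: `absolute_redirect_target` and `make_environ` are hypothetical helpers, and the environ values are assumptions chosen to mirror the requests earlier in this thread. A WSGI server exposes the request's Host header as `HTTP_HOST`, and an application that builds an absolute Location typically starts from it.

```python
from wsgiref.util import application_uri


def absolute_redirect_target(environ, path):
    """Hypothetical helper: build an absolute URL for a redirect.

    application_uri() prefers the Host header (HTTP_HOST) and falls back
    to SERVER_NAME/SERVER_PORT when it is missing.
    """
    return application_uri(environ).rstrip("/") + path


def make_environ(host):
    # Minimal WSGI environ with only the keys application_uri() reads.
    return {
        "wsgi.url_scheme": "http",
        "HTTP_HOST": host,
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "5000",
        "SCRIPT_NAME": "",
    }


# Assumed case: the backend receives the upstream address as Host.
print(absolute_redirect_target(make_environ("localhost:5000"), "/number/42/"))
# http://localhost:5000/number/42/

# Assumed case: the original Host is forwarded to the backend.
print(absolute_redirect_target(make_environ("example.com"), "/number/42/"))
# http://example.com/number/42/
```

The two results match the Location values reported above for the localhost:5000 upstream and for nginx respectively, which is consistent with the explanation that the backend simply echoes whatever Host it was given.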

@CristianCantoro

commented Sep 7, 2019

I am experiencing this problem with Caddy v1.0.3, with http.ratelimit and dns/ovh plugins.

$ ./caddy -version
Caddy v1.0.3 (h1:i9gRhBgvc5ifchwWtSe7pDpsdS9+Q0Rw9oYQmYUTw1w=)

Context

For a little more than two years, I have been maintaining https://vikiansiklopedi.org, a proxy of Wikipedia in many languages, first set up in response to the block of Wikipedia in Turkey ("vikiansiklopedi" means "wiki-encyclopedia" in Turkish).

The proxy uses a Caddy server configured to pass requests to Wikipedia's servers; see the configuration template. I also proxy the mobile domains.

See also: https://phabricator.wikimedia.org/T232213

Steps to reproduce

It looks like Caddy isn't setting the location header properly:

Sams-MBP:~ reedy$ curl -I https://en.wikipedia.org/wiki/Special:Random | grep location
location: https://en.wikipedia.org/wiki/Shea_Hillenbrand
Sams-MBP:~ reedy$ curl -I https://en.vikiansiklopedi.org/wiki/Special:Random | grep location
location: https://en.wikipedia.org/wiki/KTFO-CD