Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

X-Accel-Redirect setup doesn't work due to strange behaior of rewrite #5208

Open
olivierdalang opened this issue Nov 18, 2022 · 21 comments · May be fixed by #5438
Open

X-Accel-Redirect setup doesn't work due to strange behaior of rewrite #5208

olivierdalang opened this issue Nov 18, 2022 · 21 comments · May be fixed by #5438
Labels
discussion 💬 The right solution needs to be found

Comments

@olivierdalang
Copy link

olivierdalang commented Nov 18, 2022

Hey ! I'm trying to configure a pretty simple X-Accel-Redirect kind of setup under Caddy 2.6.2, but think I encountered an issue that prevents me to do so.

My stripped down Caddyfile looks like this:

localhost {
    reverse_proxy service_a:5000 {
        @accel header X-Accel-Redirect *
        handle_response @accel {
            rewrite * {rp.header.X-Accel-Redirect}
            reverse_proxy service_b:5000
        }
    }
}

The service_a returns a response with a header X-Accel-Redirect: /hello?some=param&some_other=param. This is be picked by Caddy and rewritten so that this query is handled by service_b. The rewrite of the path works, but the GET params get swallowed and are not visible to service_b where I only see /hello.

It works as expected when I hardcode the same value in the caddy file like this (which IMO should be 100% equivalent).

localhost {
    reverse_proxy service_a:5000 {
        @accel header X-Accel-Redirect *
        handle_response @accel {
            rewrite * /hello?some=param&some_other=param
            reverse_proxy service_b:5000
        }
    }
}

Just to understand what's happening, I tried with some variants such as rewrite * /hello?{rp.header.X-Accel-Redirect} with just the paramters in the header. In this case, service_b sees http://localhost/hello?some%3Dparam%26some_other%3Dparam, so not good either.

Seems some rogue urlencoding happens, but I couldn't find anything in the doc about that.

Any pointer about how to solve this ? I searched through the issues but could not find anything about this exactly.

@francislavoie
Copy link
Member

Hmm, good question. I think the rewrite module decides whether it should rewrite the query based on the original string being input, before doing placeholder replacement, by checking if it contains a ?.

This is definitely tricky. There are security considerations here, because not all placeholders contain trusted values, and in many configs there are implicit rewrite * {path} that happen (e.g. with try_files and via php_fastcgi) so we need to be very careful here about not allowing arbitrary clients to inject a ? to affect the behaviour of the rewrite.

@mholt mholt added the discussion 💬 The right solution needs to be found label Nov 18, 2022
@mholt
Copy link
Member

mholt commented Nov 19, 2022

Hm, if the rewrite destination has a ? it should treat what comes after as a query string. Maybe it should work as long as the placeholder doesn't span different parts of the URI (e.g. path and also QS).

@francislavoie
Copy link
Member

As a workaround, would it be possible for your app to send the path and query as different headers for now? Then it should be easy to merge them together with a rewrite with Caddy.

@olivierdalang
Copy link
Author

Thanks for your answers.

@francislavoie Unfortunately, even treating path and query with different header does not solve it, as the query part still gets URL-encoded(rewrite * /hello?{rp.header.X-Accel-Redirect} -> http://localhost/hello?some%3Dparam%26some_other%3Dparam)

I agree automatic encoding makes sense for untrusted variables. Some ideas about how this could be implemented:

  1. have a way to mark variables as trusted, and have trusted vars not escaped/sanitized by further directives (something like trust_var {rp.header.X-Accel-Redirect} trusted_redirect_string, then rewrite * trusted_redirect_string)
  2. add a new trusted argument to rewrite allowing to specify that the parameters are to be trusted

Still very much interested about any ideas for workaround if you have any (at this point, the only thing I found is url-decoding the string manually with

uri replace %3D =
uri replace %26 &
uri replace %2C ,

after my rewrite block, but that's super dirty and will break as soon as there are new special characters in my query strings.

@mvaled
Copy link

mvaled commented Jan 19, 2023

l'm trying to do something similar. I have a Python app, which stores files in a S3-like service. I want to have permanent links to those files, but the S3 service uses transient URLs like: https://cloud.canary.kaikosystems.com/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=REDACTED%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T143400Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=redacted

So I have configured Caddy with

(s3_media) {
	# {args.0}: is the name of the Environment. Ex: production, demo, staging
	handle_path /media/* {
		# See https://github.com/caddyserver/caddy/issues/5208
		uri replace %3D =
		uri replace %26 &
		uri replace %2C ,
		uri replace %2D -
		uri replace %2F /

		header X-S3-URI {uri}
		reverse_proxy https://kaiko-{args.0}.s3.fr-par.scw.cloud {
			header_up Host kaiko-{args.0}.s3.fr-par.scw.cloud
		}
	}
}

(dashboard_permalinks) {
	# {args.0}: is the name of the Environment. Ex: production, demo, staging
	encode zstd gzip

	# this settings is something related to the DashOnBoard
	# not strictly neccesary
	header +Vary Origin

	log {
		format console
		level INFO
	}

	import robots

	@dyn not path /static/* /media/* /.well-known/*

	reverse_proxy @dyn {args.0}_web:8000 {
		header_up X-Real-IP {remote}

		@accel header X-Accel-Redirect *

		handle_response @accel	{
			rewrite * {rp.header.X-Accel-Redirect}
			header X-Original-Accel-Redirect {rp.header.X-Accel-Redirect}
			import s3_media {args.0}
		}
	}

	handle_path /static/* {
		reverse_proxy {args.0}_statics:80
	}
	import s3_media {args.0}
}

cloud.canary.kaikosystems.com {
	import dashboard_permalinks canary
}

The Python returns a response with both a Location and the X-Accel-Redirect. The permalink URL has the following path pattern /attachment/<uuid>.

When I request one of those URLs, I'm getting a 403 with the following headers:

< x-original-accel-redirect: /media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=REDACTED2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T143855Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=REDACTED
< x-s3-uri: /attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185

I notice that header X-S3-URI {uri} doesn't pick up the rewrite. Is it because header directives are processed before any rewriting?

@francislavoie
Copy link
Member

Directives are sorted according to this order: https://caddyserver.com/docs/caddyfile/directives#directive-order. Notice that uri is after header. You can either set the defer option of the header directive so that the header operation is done on the way out of the middleware chain, or wrap it in a route to override the order.

@mvaled
Copy link

mvaled commented Jan 19, 2023

@francislavoie Thanks for your response.

Unfortunately, using defer like this:

(s3_media) {
	# {args.0}: is the name of the Environment. Ex: production, demo, staging
	handle_path /media/* {
		# See https://github.com/caddyserver/caddy/issues/5208
		uri replace %3D =
		uri replace %26 &
		uri replace %2C ,
		uri replace %2D -
		uri replace %2F /

		header X-S3-URI {uri} {defer}
		reverse_proxy https://kaiko-{args.0}.s3.fr-par.scw.cloud {
			header_up Host kaiko-{args.0}.s3.fr-par.scw.cloud
		}
	}
}

has the effect to remove the header entirely from the response.

@mholt
Copy link
Member

mholt commented Jan 19, 2023

@mvaled That:

header X-S3-URI {uri} {defer}

replaces all instances of the {uri} with the contents of {defer}, which would be nothing unless you've defined a {defer} placeholder somewhere.

You probably meant:

header X-S3-URI {uri} {
   defer
}

@mvaled
Copy link

mvaled commented Jan 19, 2023

@mholt Thanks!

I did like you suggested and now I'm getting the X-S3-URI again, but once again, with the non-rewritten URI.

@mvaled
Copy link

mvaled commented Jan 19, 2023

I activated the global debug to the URI the reverse was being sent:

  1. The first request which the Python server responds with the X-Accel-Redirect:
{
  "level": "debug",
  "ts": 1674160975.7986755,
  "logger": "http.handlers.reverse_proxy",
  "msg": "upstream roundtrip",
  "upstream": "canary_web:8000",
  "duration": 0.087298235,
  "request": {
    "remote_ip": "51.15.194.75",
    "remote_port": "56242",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "cloud.canary.kaikosystems.com",
    "uri": "/attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185",
    "headers": {
      "X-Real-Ip": [
        "51.15.194.75:56242"
      ],
      "Accept": [
        "*/*"
      ],
      "User-Agent": [
        "curl/7.87.0"
      ],
      "X-Forwarded-For": [
        "51.15.194.75"
      ],
      "X-Forwarded-Proto": [
        "https"
      ],
      "X-Forwarded-Host": [
        "cloud.canary.kaikosystems.com"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "server_name": "cloud.canary.kaikosystems.com"
    }
  },
  "headers": {
    "Content-Length": [
      "0"
    ],
    "X-Content-Type-Options": [
      "nosniff"
    ],
    "Cross-Origin-Opener-Policy": [
      "same-origin"
    ],
    "Vary": [
      "Origin"
    ],
    "Server": [
      "uvicorn"
    ],
    "Content-Type": [
      "text/html; charset=utf-8"
    ],
    "X-Frame-Options": [
      "DENY"
    ],
    "Referrer-Policy": [
      "same-origin"
    ],
    "Date": [
      "Thu, 19 Jan 2023 20:42:54 GMT"
    ],
    "Location": [
      "https://cloud.canary.kaikosystems.com/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCW9MDFB3PJGAQ5GBWZA%2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T203023Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=98c2380284b8f5b886decb676e1e4f8108a3dfc0bd633db3a9aa862f5633008b"
    ],
    "X-Accel-Redirect": [
      "/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCW9MDFB3PJGAQ5GBWZA%2F20230119%2Ffr-par%2Fs3%2Faws4_request&X-Amz-Date=20230119T203023Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=98c2380284b8f5b886decb676e1e4f8108a3dfc0bd633db3a9aa862f5633008b"
    ]
  },
  "status": 302
}

Then come two logs from the rewriter, and we can see the problem there. The first rewrite is our rewrite * {rq.header.X-Accel-Redirect}. It already lacks the query string:

{
  "level": "debug",
  "ts": 1674160975.7993755,
  "logger": "http.handlers.rewrite",
  "msg": "rewrote request",
  "request": {
    "remote_ip": "51.15.194.75",
    "remote_port": "56242",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "cloud.canary.kaikosystems.com",
    "uri": "/attachment/01835a39-af47-7a7e-a7d5-6d7063fb7185",
    "headers": {
      "Accept": [
        "*/*"
      ],
      "User-Agent": [
        "curl/7.87.0"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "server_name": "cloud.canary.kaikosystems.com"
    }
  },
  "method": "GET",
  "uri": "/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg"
}

The second rewrite simply removes the '/media' from the URL:

{
  "level": "debug",
  "ts": 1674160975.799484,
  "logger": "http.handlers.rewrite",
  "msg": "rewrote request",
  "request": {
    "remote_ip": "51.15.194.75",
    "remote_port": "56242",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "cloud.canary.kaikosystems.com",
    "uri": "/media/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg",
    "headers": {
      "Accept": [
        "*/*"
      ],
      "User-Agent": [
        "curl/7.87.0"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "server_name": "cloud.canary.kaikosystems.com"
    }
  },
  "method": "GET",
  "uri": "/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg"
}

Finally comes the internal redirect:

{
  "level": "debug",
  "ts": 1674160976.0522823,
  "logger": "http.handlers.reverse_proxy",
  "msg": "upstream roundtrip",
  "upstream": "kaiko-canary.s3.fr-par.scw.cloud:443",
  "duration": 0.2523489,
  "request": {
    "remote_ip": "51.15.194.75",
    "remote_port": "56242",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "kaiko-canary.s3.fr-par.scw.cloud",
    "uri": "/kaiko-canary/artifacts/2022/09/20/file_au74XBv.jpg",
    "headers": {
      "X-Forwarded-Proto": [
        "https"
      ],
      "X-Forwarded-Host": [
        "cloud.canary.kaikosystems.com"
      ],
      "User-Agent": [
        "curl/7.87.0"
      ],
      "Accept": [
        "*/*"
      ],
      "X-Forwarded-For": [
        "51.15.194.75"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "server_name": "cloud.canary.kaikosystems.com"
    }
  },
  "headers": {
    "Content-Type": [
      "application/xml"
    ],
    "Date": [
      "Thu, 19 Jan 2023 20:42:56GMT"
    ],
    "X-Amz-Id-2": [
      "tx115f1623206f43fa8f651-0063c9ab4f"
    ],
    "X-Amz-Request-Id": [
      "tx115f1623206f43fa8f651-0063c9ab4f"
    ]
  },
  "status": 403
}

@mvaled
Copy link

mvaled commented Jan 19, 2023

A little trick helps:

   rewrite * {rp.header.X-Accel-Redirect}?

@francislavoie
Copy link
Member

I did some thinking, and I think the best solution is to provide a new option to rewrite to force-allow query replacement. See #5438, looks like this:

rewrite * {rp.header.X-Accel-Redirect} {
	modify_query
}

@ottenhoff
Copy link
Contributor

I just wanted to add a working Caddyfile snippet I am using with Caddy 2.7.6:

                @sendfile header X-Accel-Redirect *
                handle_response @sendfile {
                        rewrite * {rp.header.X-Accel-Redirect}?
                        reverse_proxy https://dev-bucket.s3.amazonaws.com {
                                header_up Host {http.reverse_proxy.upstream.host}
                                header_down -Content-Encoding
                                header_down -Content-Type
                                header_down -Accept-Ranges
                                header_down -Cache-Control
                                header_down -Server
                                header_down -X-Amz-*
                        }
                }

@francislavoie
Copy link
Member

The problem with that is it'll append another ? at the end of the query which is invalid syntax (it might cause the last query param to have ? added to the value).

@ottenhoff
Copy link
Contributor

ottenhoff commented Feb 20, 2024

That's not what I see in my testing with Caddy 2.7.6 or master @francislavoie.

rewrite * {rp.header.X-Accel-Redirect} produces a URI that is missing all query params

rewrite * {rp.header.X-Accel-Redirect}? produces a correct URI with proper query params (and without a ? at the end of the query string).

@mholt
Copy link
Member

mholt commented Feb 21, 2024

That might have to do with how rewrite decides whether to rewrite the query or not. If you put a question mark at the end it signals to the rewrite handler to also rewrite the query string. That's my guess :)

@ottenhoff
Copy link
Contributor

any tips on where I can add a test so we're all on the same page with this rewrite behavior (and so it doesn't get altered without a bit of discussion)? this X-Accel-Redirect with S3 seems like a pretty common use case

@mholt
Copy link
Member

mholt commented May 11, 2024

@ottenhoff You mean like a rewrite that is {placeholder}? ? That is already tested here:

{
rule: Rewrite{URI: "{http.request.uri.path}?"},
input: newRequest(t, "GET", "/foo/bar?a=b&c=d"),
expect: newRequest(t, "GET", "/foo/bar"),
},

@francislavoie
Copy link
Member

@mholt that's not quite the same because {path} is guaranteed to never contain a query, but {rp.header.X-Accel-Redirect} could contain a query.

@ottenhoff
Copy link
Contributor

I'm still trying to figure out why this config works and produces a transparent redirect to the S3 asset with correct GET params (?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKQS...)

                @sendfile header X-Accel-Redirect *
                handle_response @sendfile {
                        rewrite * {rp.header.X-Accel-Redirect}?
                        reverse_proxy https://dev-bucket.s3.amazonaws.com {
                                header_up Host {http.reverse_proxy.upstream.host}

and this rewrite results in /path/to/s3/asset with all necessary AWS GET params stripped:

                @sendfile header X-Accel-Redirect *
                handle_response @sendfile {
                        rewrite * {rp.header.X-Accel-Redirect}
                        reverse_proxy https://dev-bucket.s3.amazonaws.com {
                                header_up Host {http.reverse_proxy.upstream.host}

@mholt
Copy link
Member

mholt commented May 11, 2024

@francislavoie Right; that part still needs a patch (I think you've had a PR open on it but I just haven't put the energy into reviewing it yet 😅 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion 💬 The right solution needs to be found
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants