Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reverseproxy: Implement trusted proxies for X-Forwarded-* headers #4507

Merged
merged 6 commits into from Mar 6, 2022

Conversation

francislavoie
Copy link
Member

Addresses concerns brought up in #4503

Changes the default behaviour of reverse_proxy to no longer use values from X-Forwarded-For and X-Forwarded-Proto from the incoming request implicitly. Now, the existing values will only be used if trusted_proxies is configured and the incoming req.RemoteAddr matches.

The CIDR parsing logic in provisioning is taken from MatchRemoteIP.

I also added a shorthand in the Caddyfile to quickly trust requests from private IP ranges (i.e. the ranges from https://en.wikipedia.org/wiki/Private_network plus "localhost" IPs)

/cc @dunglas for review since you've done some related work previously

@francislavoie francislavoie added feature ⚙️ New feature or request under review 🧐 Review is pending before merging labels Jan 4, 2022
@francislavoie francislavoie added this to the v2.5.0 milestone Jan 4, 2022
modules/caddyhttp/reverseproxy/caddyfile.go Outdated Show resolved Hide resolved
// though, unless some other module manipulated the request's
// remote address and used an invalid value.
clientIP, _, err := net.SplitHostPort(req.RemoteAddr)
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we remove the existing XFF headers coming from untrusted servers even in this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added .Del() calls here.

I'm still not 100% sure just "return early" is the right thing to do here, I'd like @mholt to chime in on what we should do if req.RemoteAddr can't be split 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addresses that can't be split are almost always missing a port. I usually disregard these errors mostly, except that I assume the entire input was already just the host.

But I'm not sure what the implications are if a RemoteAddr can't be split, especially in this context. We could try this current logic and see if anyone complains... I'm not sure I've ever really seen a RemoteAddr without a port (or a malformed RemoteAddr in general).

if !omit {
req.Header.Set("X-Forwarded-For", clientIP)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for X-Forwarded-Host is missing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually let users add that one on their own, because it's usually redundant since we keep Host the same as the incoming request. It's only helpful if you're proxying to an HTTPS upstream where the Host needs to be the same as the upstream domain name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm considering that maybe we should add X-Forwarded-Host support.

It's not uncommon to add it when proxying over HTTPS, less common over HTTP since Host is preserved, but there's no harm in always having it anyways (other than the handful of bytes added to upstream requests I guess).

WDYT @mholt ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I added this in the latest commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much you wanna bet we'll get a report that this breaks someone because the incoming Host is sensitive for some reason, and they want the proxy to NOT send it on to the upstream, which may be an arbitrary user application (or something untrusted)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's probably fine, they can just do header_up -X-Forwarded-Host to omit it. (I just double checked, and that does work since header_up is applied after this logic)

@francislavoie francislavoie force-pushed the trusted-proxy-headers branch 3 times, most recently from ca17c9a to 14649e2 Compare January 20, 2022 05:17
@francislavoie
Copy link
Member Author

francislavoie commented Mar 6, 2022

After reading https://adam-p.ca/blog/2022/03/x-forwarded-for/, I think I need to do an additional change here - we were using [] notation on the headers (same as .Value) which gives us a slice of all the header values. We should be grabbing only the last one in case the original client set the header to try to spoof but then some downstream proxy only added a header and didn't replace it.

@mholt
Copy link
Member

mholt commented Mar 6, 2022

I was just reading that too... 🧐

@adam-p
Copy link

adam-p commented Mar 6, 2022

We should be grabbing only the last one in case the original client set the header to try to spoof but then some downstream proxy only added a header and didn't replace it.

I haven't read into your code, but a warning, for completeness...

If you are taking the absolute rightmost XFF IP, using only the last header is fine. It's also fine if you added that header and you're sure you added only that header (like, you didn't split the stuff you added across multiple instances of the header).

When it's not okay -- theoretically -- is if you're counting backwards from the rightmost. For example, if you're using one of these strategies:

  • The user configured that there are N reverse proxies, so you use the XFF IP that is N-1 from the rightmost. (Because that's how far back the external IP is after each reverse proxy added the IP before it.)
  • The user configured a set of "trusted proxy" IPs (or ranges). Then you search from the rightmost for the first IP that is not in that set of trusted IPs.
  • You are searching from the right for the first non-private/non-internal IP address.

In short, if you're "searching from the right" rather than "taking the absolute rightmost" then you need to consider that the IP you want might not actually be the last XFF header.

(In the "multiple headers" section of the blog post I give a very hypothetical example of an attack against using the last header instead of the full set.)

@francislavoie
Copy link
Member Author

@adam-p We're a reverse proxy, we keep the existing incoming values, but only if trusted_proxies is configured with a list of CIDRs and .RemoteAddr matches one in that list.

The change I made was that we were grabbing all the incoming XFF values (if multiple headers), but I changed it to only look at the last header value.

But maybe that's not correct as a reverse_proxy and if the downstream is trusted then we should assume it did the right thing and prevented spoofing itself. I'm not sure.

@adam-p
Copy link

adam-p commented Mar 6, 2022

@francislavoie

But maybe that's not correct as a reverse_proxy and if the downstream is trusted then we should assume it did the right thing and prevented spoofing itself. I'm not sure.

I think trusting the things you're configured to trust is probably correct.

An (hypothetical) example of what can go wrong with taking only the last header... Let's say there are three well-configured and well-behaved reverse proxies: Caddy1, OtherProxy2, then Caddy3. And then this happens:

  1. Caddy1 sees that RemoteAddr is not trusted. It overwrites any existing XFF with RemoteAddr.
  2. OtherProxy2 creates a new XFF header and adds the IP of the first proxy to it. I know this sounds weird, but it's valid per the HTTP/1.1 RFC.
  3. Caddy3 gets the last XFF header and... the client IP isn't there.

I am of the opinion that all XFF headers should be considered as one list. (This is what NodeJS's message.headers does.)

Tangential, but: I don't love the approach of overwriting XFF and then taking the leftmost (Go's httputil.ReverseProxy seems to be going this way as well). While it certainly prevents a lot of accidental footguns, it also prevents some legitimate uses of the original leftmost value. Additionally, if the first trusted proxy doesn't actually overwrite the list (misbehaviour, misconfiguration, misunderstanding), then the leftmost XFF IP is insecure. I think a better approach -- especially if you already have trusted CIDRs configured -- is to append to the original XFF and then search from the right for the first non-trusted IP.

ETA: I think that adding a header like X-Real-IP is better than overwriting XFF. It feels like you're trying to shoehorn the behaviour of the former onto the latter. (With some exceptions, I'm sure. Maybe there are times when the list of reverse proxy IPs is useful.)

@mholt
Copy link
Member

mholt commented Mar 6, 2022

I think the primary reason for "shoehorning" this behavior into XFF is for compatibility (which is unfortunate).

I think Envoy's docs explain their solution to the issue very well, and seems pretty similar to this PR: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/headers#x-forwarded-for

@francislavoie
Copy link
Member Author

I am of the opinion that all XFF headers should be considered as one list.

Fair point -- I'll revert that change, and use all the XFF values.

I think it still makes sense to use the last header value for X-Forwarded-Proto and X-Forwarded-Host (which we also support) since those don't have comma support.

@francislavoie
Copy link
Member Author

I amended the last commit with that change.

Thanks for the comments @adam-p, much appreciated! The code is pretty simple if you'd like to give it a quick look-over if you find the time.

@adam-p
Copy link

adam-p commented Mar 6, 2022

@francislavoie As proof that I read it over, I have some nitpick comments (that I don't think are worth mentioning inline):

  • When parsing the CIDRs ahead of time: The len(ip) for an IPv4 is still 16, so the mask ends up being net.CIDRMask(128, 128). Which sort of isn't right for a /32. But... it seems to work just fine: https://go.dev/play/p/A447Q4DpJ0-
  • In addForwardedHeaders I probably wouldn't use clientIP as the variable that also holds the XFF list. I feel like some later code added lower down could get confused about it. I'd use a new variable.
  • I would probably have allHeaderValues and lastHeaderValue call http.CanonicalHeaderKey(field) before indexing h. The keys in http.Header are guaranteed to be canonicalized and yeah, I can see that the strings you're supplying are in the canonical form, but... I'd still do it to avoid a possible future mistake (maybe reuse of the function with user input). (Note that http.Get() and http.Values() also automatically canonicalize the input header name.)

That's it. I don't see anything actually incorrect.

@francislavoie
Copy link
Member Author

francislavoie commented Mar 6, 2022

When parsing the CIDRs ahead of time: The len(ip) for an IPv4 is still 16, so the mask ends up being net.CIDRMask(128, 128). Which sort of isn't right for a /32. But... it seems to work just fine: go.dev/play/p/A447Q4DpJ0-

Interesting, yeah I think I have a fix for correctness, we can do ip.To4() and if it's non-nil then we got the ipv4 representation (4 bytes instead of 16 with leading zeroes) so we can do ip = ipv4 to replace it, then len(ip) * 8 will be correct. Thanks for the playground, that was helpful.

In addForwardedHeaders I probably wouldn't use clientIP as the variable that also holds the XFF list. I feel like some later code added lower down could get confused about it. I'd use a new variable.

Hmm, fair enough. I'll make a clientXFF that will hold the final header value.

I would probably have allHeaderValues and lastHeaderValue call http.CanonicalHeaderKey(field) before indexing h.

Good point. Will do.

Note that http.Get() and http.Values() also automatically canonicalize the input header name.

Yeah, I'm aware -- we're skipping those because when we use [] syntax we can get ok which lets us check if the header was explicitly set to nil. This approach was taken from how its done in stdlib to allow something earlier to cause the header not to get set at all (and it's safe because if it's already nil, it's impossible for a spoofed value to be there anyways).

This functionality isn't strictly necessary, but I think it might be useful in case someone wants to take control and write their own X-Real-IP or something and omit Caddy's automatic XFF behaviour.

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'd be okay with merging in this state and improving on it / tweaking it later if needed.

Thanks for all you do Francis! As usual :)

@adam-p
Copy link

adam-p commented Mar 6, 2022

we're skipping those because when we use [] syntax we can get ok which lets us check if the header was explicitly set to nil

I understood that. I didn't mean to argue that you should use .Values -- I was just using it as part of the argument for canonicalizing first.

@francislavoie
Copy link
Member Author

I was just using it as part of the argument for canonicalizing first.

Right, sounds good.

Thanks so much!

@francislavoie francislavoie merged commit c50094f into master Mar 6, 2022
@francislavoie francislavoie deleted the trusted-proxy-headers branch March 6, 2022 23:51
@mholt
Copy link
Member

mholt commented Mar 7, 2022

And thank you very much for your extended review and feedback, @adam-p -- straight from the man himself! I think that helped make this an even better change.

@mholt mholt removed the under review 🧐 Review is pending before merging label Mar 7, 2022
@adam-p
Copy link

adam-p commented Mar 22, 2022

@francislavoie What was your rationale for stripping the zone out of IPv6 addresses?

I'm trying to implement some similar code and I can't decide what the right approach is.

@francislavoie
Copy link
Member Author

francislavoie commented Mar 22, 2022

@adam-p it relates with the findings in #4597 -- essentially, Go stdlib may keep the zone in RemoteAddr, but I don't think that's ever relevant information for upstream apps (why would they need to care what network adapter the request came in on?), and ParseIP would fail if it contains a % part in there (it expects specifically just the IP, no zone). It's a quirk of the Go APIs that the remote address and SplitHostPort may return a zone but other net functions don't deal with that gracefully, so we need to strip it out by hand.

@adam-p
Copy link

adam-p commented Mar 22, 2022

@francislavoie Thanks. That's pretty much what I expected.

I just finished a couple more blog posts related to this stuff that you might be interested in (not nearly as long as the XFF one):
A tiny flaw in Go's netip design
Should you strip the IPv6 zone?

@francislavoie
Copy link
Member Author

@adam-p so, uh, possible additional addendum to your blog post, turns out CloudFlare does not do the right thing by default with XFF, and by default they allow spoofing. https://developers.cloudflare.com/fundamentals/get-started/reference/http-request-headers/#x-forwarded-for

But it's possible to fix it by explicitly configuring a rule to remove the XFF header on incoming requests to CloudFlare, and they'll re-add the header with the correct value afterwards (as their built-in behavior).

That fix is probably the best way to go for Caddy users because configuring Caddy to use CF-Connecting-IP authoritatively and correctly (by appending its own client's IP to XFF using their custom header's value) is messy. It is possible but awkward, by using the request_header directive to replace the incoming XFF with the CloudFlare header, but only when the client is known to actually be CloudFlare (with a remote_ip matcher probably).

@timthelion
Copy link

timthelion commented Feb 17, 2023

Hello,

we're currently having some challenges with this code on Caddy Ingress using a Digital Ocean K8S Load balancer.

The issue is that there is a logic ordering bug. In the code as it stands, first the X-Forwarded-For header is deleted and then it is read, if a trusted proxy is being used.

Here it gets deleted: https://github.com/caddyserver/caddy/pull/4507/files#diff-a248c9a1ec018edea8b377d155bc1df1a642bf79d00ababb5cdacc6b86c5733dR579

Here it gets read: https://github.com/caddyserver/caddy/pull/4507/files#diff-a248c9a1ec018edea8b377d155bc1df1a642bf79d00ababb5cdacc6b86c5733dR608

I believe that the fix is to move line 608 up above line 575. Unfortunately that makes the code a little harder to read, but I'm having trouble coming up with a better correct solution. I will be doing a test of this change over the coming days and if it fixes the issue I will send a PR.

@francislavoie
Copy link
Member Author

@timthelion I don't think that's true. That's only the case if the remote address is invalid. Which would only happen if something tampered with it before it reaches this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants