Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reverseproxy: Prevent copying the response if a response handler ran #4388

Merged
merged 1 commit into from Oct 18, 2021

Conversation

francislavoie
Copy link
Member

@francislavoie francislavoie commented Oct 17, 2021

This should fix #4298.

Pretty obvious fix, just using a boolean to track whether we did close it ahead of time or not, before reaching the h.copyResponse() part.

I also did a slight re-ordering, moving res.Body.Close() to after the response handler route execution, because I think we should probably put the res on the request before passing it through the routes.

I think we should actually use the http.Request.Response field for this. Might be a bit "hacky", but idk, seems like a good way to do it maybe. Then, we could introduce a dummy HTTP handler that is only effectual in handle_response (I figure it should error out if req.Request.Response is missing) which does a copy of the response body (taking req.Request.Response and copying it to the ResponseWriter), allowing stuff like replace_response to conditionally run based on response matchers.

WDYT about that approach? I could do that in this PR or in another 🤷‍♂️

This should fix #4298.

Pretty obvious fix, just using a boolean to track whether we did close it ahead of time or not, before reaching the `h.copyResponse()` part.
@francislavoie francislavoie added the bug 🐞 Something isn't working label Oct 17, 2021
@francislavoie francislavoie added this to the v2.4.6 milestone Oct 17, 2021
Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this patch, Francis. Let's give it a try and see if that helps a greater number of use cases.

I think we should actually use the http.Request.Response field for this. Might be a bit "hacky", but idk, seems like a good way to do it maybe. Then, we could introduce a dummy HTTP handler that is only effectual in handle_response (I figure it should error out if req.Request.Response is missing) which does a copy of the response body (taking req.Request.Response and copying it to the ResponseWriter), allowing stuff like replace_response to conditionally run based on response matchers.

I'm open to seeing this in another PR; given the limited use of http.Request.Response on the server side, that might be an acceptable hack if we document this in comments and have a fallback plan if the semantics of that field change and our hack fails in a future version of Go.

@francislavoie francislavoie merged commit b092061 into master Oct 18, 2021
@francislavoie francislavoie deleted the handle-response-body-close-fix branch October 18, 2021 18:00
francislavoie added a commit that referenced this pull request Oct 19, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Oct 19, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Oct 20, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Oct 31, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Nov 30, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Dec 19, 2021
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Mar 7, 2022
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
francislavoie added a commit that referenced this pull request Mar 7, 2022
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
mholt added a commit that referenced this pull request Mar 9, 2022
…onse routes (#4391)

* reverseproxy: New `copy_response` handler for `handle_response` routes

Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.

* Delete the "Content-Length" header if we aren't copying

Fixes a bug where the Content-Length will mismatch the actual bytes written if we skipped copying the response, so we get a message like this when using curl:

```
curl: (18) transfer closed with 18 bytes remaining to read
```

To replicate:

```
{
	admin off
	debug
}

:8881 {
	reverse_proxy 127.0.0.1:8882 {
		@200 status 200
		handle_response @200 {
			header Foo bar
		}
	}
}

:8882 {
	header Content-Type application/json
	respond `{"hello": "world"}` 200
}
```

* Implement `copy_response_headers`, with include/exclude list support

* Apply suggestions from code review

Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

handle_response errors with "read on closed response body"
2 participants