-
Notifications
You must be signed in to change notification settings - Fork 111
Closed
cloudfoundry/gorouter
#366Description
Is this a security vulnerability?
No.
Issue
- With the attempt details commit the calculation for the app backend time was moved from the proxy to proxy_round_tripper
- This had a subtle side effect.
- In the original code, the
AppRequestFinishedAttimestamp was set only after all of the middleware had finished, including ReverseProxy - ReverseProxy first calls out to Transport (which in our case is proxy_round_tripper) just to send the request and receive a response object. -> By this time we had mistakingly set
AppRequestFinishedAtalready <- - Then, the response body is copied to the client which blocks until the entire body has been sent. -> This is where the actual time is spent on the backend, if a response body exists <-
- In the new code, we set the
AppRequestFinishedAttimestamp early on in proxy_round_tripper (which is actually the Transport used for ReverseProxy). - This causes the time that is needed to copy the response body to client to be attributed to gorouter_time because we subtract
appTimefromroundTripTimeto getgorouter_time. SinceappTimeis so little as it misses the response body copy time, this skewsgorouter_timeto be much larger than it actually is.
Affected Versions
Routing-Release v0.272.0 and greater
CF-Deployment v30.6.0 and greater
Steps to Reproduce
You can easily reproduce this issue by pushing two apps that are slow in different ways:
App A is "slow" because it waits for 10s before sending a response:
func slowResponse(w http.ResponseWriter, r *http.Request) {
time.Sleep(10 * time.Second)
_, _ = w.Write([]byte("OK after 10s"))
}
This will yield an access_log such as this:
(...) gorouter_time:0.001506 backend_time:10.016147
App B is "slow" because it waits for 10s before sending a body:
func slowBody(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
if flusher, ok := w.(http.Flusher); ok {
flusher.Flush()
}
time.Sleep(10 * time.Second)
_, _ = w.Write([]byte("OK after 10s"))
}
This will yield an access_log such as this:
(...) gorouter_time:10.010651 backend_time:0.006369
Expected result
gorouter_timeis very small andbackend_timeis about 10s for both app A and app B
Current result
gorouter_timeis very small for app Agorouter_timeis about 10s for app B
Possible Fix
- Gorouter needs to take the time for transmitting the response body into account again
- This can be done by moving the calculation of app_time from
proxy_round_tripperback toproxyas before - The cases where retries happened during
RoundTripbecause one, several or all backend instances were unavailable will also have to be taken into account. A possible solution will be to calculate thisretry timeseparately and subtract it from app_time to get the "real" time spent at the backend app. - If none of the attempts worked (i.e. the app could not be reached at all), we must make sure that app_time is zero.
maxmoehl
Metadata
Metadata
Assignees
Labels
No labels