Revert to using timer-based cancellation for proxied requests #59
Thinking about this, I think this has the problem that it leaks the requests that didn't make it. They'd get closed when the upstream request finishes, but that's not exactly what we want.
Any chance I could get you to use a custom http.Transport / http.Client and to set a Timeout on the Transport's Dialer?
Here's the default one, if you want to cargo-cult: https://github.com/golang/go/blob/3107c91e2d390771888df6b47fd6f8fc7a364cd3/src/net/http/transport.go#L34-L50
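For reference, a minimal sketch of what that suggestion could look like. The function name `newProxyClient` and the durations are hypothetical, and `ResponseHeaderTimeout` is an addition of this sketch rather than part of the suggestion above. These limits are fixed per Transport, so they cannot vary per request.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newProxyClient (hypothetical name) builds a client whose Transport bounds
// connection setup and the wait for response headers, but not the body read.
func newProxyClient(dialTimeout, headerTimeout time.Duration) *http.Client {
	transport := &http.Transport{
		Proxy: http.ProxyFromEnvironment,
		DialContext: (&net.Dialer{
			Timeout:   dialTimeout, // limit on establishing the TCP connection
			KeepAlive: 30 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout: 10 * time.Second,
		// Assumption of this sketch: also bound the wait for response headers.
		ResponseHeaderTimeout: headerTimeout,
	}
	// Client.Timeout is left at zero on purpose: it would also cover reading
	// the response body, which is the behavior this PR is trying to avoid.
	return &http.Client{Transport: transport}
}

func main() {
	client := newProxyClient(100*time.Millisecond, 250*time.Millisecond)
	fmt.Println("whole-request timeout:", client.Timeout)
}
```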
ctx, _ := context.WithDeadline(r.Context(),
    time.Now().Add(vs.sequins.config.Sharding.ProxyTimeout.Duration))
totalTimeout := time.NewTimer(vs.sequins.config.Sharding.ProxyTimeout.Duration)
ctx, cancel := context.WithCancel(r.Context())
I think you want to defer cancel(); otherwise I'm pretty sure you leak resources.
defer cancel() is what causes the problem; that cancels the request when we return out of the method, but we want to keep it open so we can stream data off of it.
maybe I should just move this logic one method up the chain, or something. hrm
Failing to call the CancelFunc leaks the child and its children until the parent is canceled or the timer fires. The go vet tool checks that CancelFuncs are used on all control-flow paths.
Since the parent is the incoming client request, and those only last a couple milliseconds, I think this is bad (or at least unhygienic) but not critically so.
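To make the `defer cancel()` problem from this thread concrete, here is a small sketch. It is not the code in this PR, and the names `readAfterCancel` and `demo` are hypothetical. It cancels the request context right after the headers arrive, as a `defer cancel()` in the proxy method would on return, and then tries to read the body. The read should fail, because for an outgoing request the context covers the body as well as the connection and headers.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// readAfterCancel (hypothetical name) issues a GET, cancels the request
// context once the headers are in, and then attempts to read the body.
func readAfterCancel(url string) error {
	ctx, cancel := context.WithCancel(context.Background())
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		cancel()
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		cancel()
		return err
	}
	defer resp.Body.Close()

	cancel() // stands in for a defer cancel() firing when the proxy method returns

	_, err = io.ReadAll(resp.Body)
	return err
}

// demo runs readAfterCancel against a test server that sends its headers
// immediately and holds the body back until the client goes away.
func demo() error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.(http.Flusher).Flush()
		select {
		case <-r.Context().Done(): // client went away
		case <-time.After(2 * time.Second):
		}
		fmt.Fprint(w, "value")
	}))
	defer srv.Close()
	return readAfterCancel(srv.URL)
}

func main() {
	fmt.Println("body read error:", demo())
}
```

The flip side is the leak discussed above: if the CancelFunc is never called, the child context lives until the parent (the incoming client request) is done.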
I don't think that does what I want, since the whole idea here is to have somewhat dynamic timeouts and backup requests. But maybe I'm misunderstanding what you're suggesting? Edit to add: I should correct myself. I suggested that a connect timeout is what I want, but what I really want is a timeout covering connection + headers.
In c6eb0b, we switched to using context to manage proxying requests to peers and the timeouts therein. Using context still makes this flow easier, but using a context with a deadline had the unintended side effect that it acts as a deadline for the entire request, not just to connect. Since we return the still-open http.Response from the proxy method, we want to leave it open indefinitely from that point on. In this way, it was also inadvertently working as a server timeout; we probably want one of those, but that should be addressed separately (and not be exclusive to proxied requests).
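A rough sketch of the timer-based idea described in this message, not the actual sequins code. The names `getWithHeaderTimeout` and `demo` are hypothetical, and `time.AfterFunc` is used here as one possible way to wire a timer to a CancelFunc. The timer only bounds the wait for headers; once they arrive it is stopped, and the CancelFunc is handed back so the caller can cancel after streaming the body.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithHeaderTimeout (hypothetical name) cancels the request if headers do
// not arrive within timeout. The caller must call the returned CancelFunc
// once it has finished reading the body.
func getWithHeaderTimeout(client *http.Client, url string, timeout time.Duration) (*http.Response, context.CancelFunc, error) {
	ctx, cancel := context.WithCancel(context.Background())
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		cancel()
		return nil, nil, err
	}

	// The timer cancels the request only if it fires before it is stopped.
	timer := time.AfterFunc(timeout, cancel)
	resp, err := client.Do(req)
	if err != nil {
		timer.Stop()
		cancel()
		return nil, nil, err
	}
	if !timer.Stop() {
		// The timer fired just as Do returned, so the context is already
		// cancelled and the body would be unreadable.
		resp.Body.Close()
		return nil, nil, context.Canceled
	}
	return resp, cancel, nil
}

// demo reads a body that arrives after the header timeout has elapsed.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.(http.Flusher).Flush()           // headers go out immediately
		time.Sleep(300 * time.Millisecond) // body arrives after the timeout
		fmt.Fprint(w, "value")
	}))
	defer srv.Close()

	resp, cancel, err := getWithHeaderTimeout(srv.Client(), srv.URL, 200*time.Millisecond)
	if err != nil {
		return "", err
	}
	defer cancel() // safe here: the body is fully read before demo returns
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	fmt.Println(demo())
}
```

With `context.WithDeadline` and the same 200ms budget, the body read in `demo` would be cut off, which is the side effect this message describes.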
bb6f6f8 to c391e13
I added some logic to make sure other outstanding requests are canceled when one succeeds, and added a comment to clarify why I'm not
Just so I'm reading this correctly: this is in the proxy fanout, where you fire off a bunch of goroutines to ask ~all of the other nodes for the key. You want to cancel all of them but the one that came back first, because you still need the http.Response object from that one, which you'll cancel after you're done with it (or when the context gets GC'd). Seems legit!
Yup, that's right!
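For readers following along, the flow confirmed above could be sketched roughly like this. It is an illustration only, not the code in this PR; `firstResponse`, `result`, and `demo` are hypothetical names, and any response counts as a win regardless of status code. Every peer gets its own cancellable context, the first response wins, the losers are cancelled, and the winner's CancelFunc is returned so the caller can cancel once the body has been streamed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

type result struct {
	idx  int
	resp *http.Response
	err  error
}

// firstResponse (hypothetical name) asks all urls concurrently and returns
// the first response along with the CancelFunc for that request.
func firstResponse(client *http.Client, urls []string) (*http.Response, context.CancelFunc, error) {
	results := make(chan result, len(urls)) // buffered: no goroutine blocks on send
	cancels := make([]context.CancelFunc, len(urls))
	for i, u := range urls {
		ctx, cancel := context.WithCancel(context.Background())
		cancels[i] = cancel
		go func(i int, u string) {
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
			if err != nil {
				results <- result{idx: i, err: err}
				return
			}
			resp, err := client.Do(req)
			results <- result{idx: i, resp: resp, err: err}
		}(i, u)
	}

	lastErr := errors.New("no peers to ask")
	for n := 0; n < len(urls); n++ {
		r := <-results
		if r.err != nil {
			cancels[r.idx]() // release the failed request's context
			lastErr = r.err
			continue
		}
		// Winner found: cancel every other request, keep this one open.
		for j, c := range cancels {
			if j != r.idx {
				c()
			}
		}
		// Close any losing responses that arrived before the cancel landed.
		remaining := len(urls) - n - 1
		go func() {
			for k := 0; k < remaining; k++ {
				if late := <-results; late.resp != nil {
					late.resp.Body.Close()
				}
			}
		}()
		return r.resp, cancels[r.idx], nil
	}
	return nil, nil, lastErr
}

// demo races a slow peer against a fast one and returns the winning body.
func demo() (string, error) {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done(): // cancelled by the fanout
		case <-time.After(2 * time.Second):
		}
		fmt.Fprint(w, "slow")
	}))
	defer slow.Close()
	fast := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "fast")
	}))
	defer fast.Close()

	resp, cancel, err := firstResponse(http.DefaultClient, []string{slow.URL, fast.URL})
	if err != nil {
		return "", err
	}
	defer cancel() // only after the body has been read
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	fmt.Println(demo())
}
```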
Apologies, I'm not sure I fully understood the control flow here until my second read-through. One thing that might help is some kind of concise (documented, tested, etc.) abstraction around this "first request past the post" idea. To be honest I'm not really sure what this should look like, and I'm not even sure what level it should act at ( Here's an untested implementation of the second idea (using
(but
r? @zenazn
cc @kitchen