Skip to content
This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Revert to using timer-based cancellation for proxied requests #59

Merged
merged 1 commit into from
Oct 17, 2016

Conversation

colinmarc
Copy link
Contributor

@colinmarc colinmarc commented Oct 5, 2016

In c6eb0b, we switched to using context to manage proxying requests to peers and the timeouts therein.

Using context still makes this flow easier, but using a context with a deadline had the unintended side effect that it acts as a deadline for the entire request, not just to connect. Since we return the still-open http.Response from the proxy method, we want to leave it open indefinitely from that point on.

In this way, it was also inadvertently working as a server timeout; we probably want some of those, but that should be addressed separately (and not be exclusive to proxied requests).

r? @zenazn
cc @kitchen

@colinmarc
Copy link
Contributor Author

Thinking about this, I think this has the problem that it leaks the requests that didn't make it. They'd get closed when the upstream request finishes, but that's not exactly what we want.

Copy link

@zenazn zenazn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance I could get you to use a custom http.Transport / http.Client and to set a Timeout on the Transport's Dialer?

Here's the default one, if you want to cargo-cult: https://github.com/golang/go/blob/3107c91e2d390771888df6b47fd6f8fc7a364cd3/src/net/http/transport.go#L34-L50

ctx, _ := context.WithDeadline(r.Context(),
time.Now().Add(vs.sequins.config.Sharding.ProxyTimeout.Duration))
totalTimeout := time.NewTimer(vs.sequins.config.Sharding.ProxyTimeout.Duration)
ctx, cancel := context.WithCancel(r.Context())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to defer cancel()—otherwise I'm pretty sure you leak resources

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer cancel() is what causes the problem; that cancels the request when we return out of the method, but we want to keep it open so we can stream data off of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I should just move this logic one method up the chain, or something. hrm

Copy link
Contributor Author

@colinmarc colinmarc Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing to call the CancelFunc leaks the child and its children until the parent is canceled or the timer fires. The go vet tool checks that CancelFuncs are used on all control-flow paths.

Since the parent is the incoming client request, and those only last a couple milliseconds, I think this is bad (or at least unhygienic) but not critically so.

@colinmarc
Copy link
Contributor Author

colinmarc commented Oct 5, 2016

Any chance I could get you to use a custom http.Transport / http.Client and to set a Timeout on the Transport's Dialer?

I don't think that does what I want, since the whole idea here is to have somewhat dynamic timeouts and backup requests. But maybe I'm misunderstanding what you're suggesting?

Edit to add: I should correct myself, suggesting that a connect timeout is what I want; What I really want is connection + headers.

In c6eb0b, we switched to using context to manage proxying requests to
peers and the timeouts therein.

Using context still makes this flow easier, but using a context with a
deadline had the unintended side effect that it acts as a deadline for
the entire request, not just to connect. Since we return the still-
open http.Response from the proxy method, we want to leave it open
indefinitely from that point on.

In this way, it was also inadvertently working as a server timeout; we
probably want one of those, but that should be addressed separately
(and not be exclusive to proxied requests).
@colinmarc
Copy link
Contributor Author

I added some logic to make sure other outstanding requests are canceled when one succeeds, and added a comment to clarify why I'm not defer cancel()ing.

@kitchen
Copy link

kitchen commented Oct 6, 2016

Just so I'm reading this correctly:

this is in the proxy fanout, where you fire off a bunch of goroutines to ask ~all of the other nodes for the key. You want to cancel all of them but the one that came back first, because you still need the http.Response object from that one, which you'll cancel after you're done with it (or the context gets GC'd)

Seems legit!

@colinmarc
Copy link
Contributor Author

Yup, that's right!

@zenazn
Copy link

zenazn commented Oct 10, 2016

Apologies—I'm not sure I fully understood the control flow here until my second read through.

One thing that might help is some kind of concise (documented, tested, etc.) abstraction around this "first request past the post" idea. To be honest I'm not really sure what this should look like, and I'm not even sure what level it should act at (http.Request and a hijacked ReadCloser vs. context.Context being two options that come to mind), but I think it would do a lot for understandability.

Here's an untested implementation of the second idea (using context.Context) that I'll offer as a starting point for discussion: https://gist.github.com/zenazn/85ec533af61e45448f9d64bc091b70d3

@zenazn
Copy link

zenazn commented Oct 13, 2016

(but
LGTM
I believe these changes, as written, are likely correct)

@stripe-ci stripe-ci assigned colinmarc and unassigned zenazn Oct 13, 2016
@colinmarc colinmarc merged commit 375cc68 into master Oct 17, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants