New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retries #326

Merged
merged 39 commits into from Jul 2, 2014

Conversation

Projects
None yet
5 participants
@kevinburke
Contributor

kevinburke commented Jan 28, 2014

Okay:

  • Adds support for retries, per the spec we laid out (mostly).
  • Adds docs etc.
  • Splits out util.py into submodules - I need to know what you want to do here because as written it would change from urllib3.util.Timeout into urllib3.util.timeout.Timeout which is not good.
  • adds a new /successful_retry handler to the DummyTestCase which keys based on a test-name header and returns 200 only after the request has been retried once.
  • I believe there are some API changes here.
    • some subclasses of httplib.HTTPException can actually be raised as connection errors because they imply the request never actually got sent to the server.
    • urlopen previously would retry on read timeouts, which violated the urlopen contract (as I understood it) of only retrying things that couldn't possibly be side effecting. this code does not retry read timeouts by default.

I am also testing this in two new environments - in the office which places my IP on 10.* subnet and I think has weird/different behavior when connecting to TARPIT_HOST than do standard wifi networks, and without an Internet connection, in which case a bunch of tests fail. Also, it's difficult to test some of these exceptional cases because the errors raised rely on the network stack, which (I think) is why the tests are failing on the branch. I'm still looking into it.

Either way I am losing some confidence in the connection timeout tests; getting a network to reliably generate ECONNREFUSED, or not generate it and tie up a connection, is tricky, unless we want to go down a path like this: http://bugs.python.org/file26890/test_timeout.patch

[Edit: This is an implementation of #260]

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Jan 28, 2014

Contributor

This is a lot! Sorry!

Contributor

kevinburke commented Jan 28, 2014

This is a lot! Sorry!

@kevinburke kevinburke closed this Jan 28, 2014

@kevinburke kevinburke deleted the kevinburke:retries branch Jan 28, 2014

@kevinburke kevinburke reopened this Jan 28, 2014

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Jan 28, 2014

Contributor

Also open to ideas about how to make this more manageable - looking at the util.py split into smaller modules is a good candidate.

Contributor

kevinburke commented Jan 28, 2014

Also open to ideas about how to make this more manageable - looking at the util.py split into smaller modules is a good candidate.

@@ -168,6 +170,21 @@ def encodingrequest(self, request):
def headers(self, request):
return Response(json.dumps(request.headers))
def successful_retry(self, request):

This comment has been minimized.

@shazow

shazow Jan 28, 2014

Collaborator

IMO these things are easier to do in socket-level tests (we already have some similar ones). Any particular reason why not do it that way?

@shazow

shazow Jan 28, 2014

Collaborator

IMO these things are easier to do in socket-level tests (we already have some similar ones). Any particular reason why not do it that way?

This comment has been minimized.

@kevinburke

kevinburke Jan 28, 2014

Contributor

I guess I just preferred to use HTTP as server interface, and I am not too solid on the semantics of when to call accept(), close(), whether I have to close the server, whether you can reuse sockets across requests, etc.

There's also a standing issue with DummyServer where the tests hang if an exception is raised during the run.

@kevinburke

kevinburke Jan 28, 2014

Contributor

I guess I just preferred to use HTTP as server interface, and I am not too solid on the semantics of when to call accept(), close(), whether I have to close the server, whether you can reuse sockets across requests, etc.

There's also a standing issue with DummyServer where the tests hang if an exception is raised during the run.

This comment has been minimized.

@shazow

shazow Jan 29, 2014

Collaborator

There are lots of example tests written. Here's one that does exactly what you want: https://github.com/shazow/urllib3/blob/master/test/with_dummyserver/test_socketlevel.py#L134

Up to you though. We can keep it the way it is if you're happy with it.

@shazow

shazow Jan 29, 2014

Collaborator

There are lots of example tests written. Here's one that does exactly what you want: https://github.com/shazow/urllib3/blob/master/test/with_dummyserver/test_socketlevel.py#L134

Up to you though. We can keep it the way it is if you're happy with it.

This comment has been minimized.

@Lukasa

Lukasa Jan 29, 2014

Contributor

I'm +0.5 on the socket-level tests: I find them substantially easier to follow when I'm coming in to the library because all the logic is in one place. Take that under advisement, though. =)

@Lukasa

Lukasa Jan 29, 2014

Contributor

I'm +0.5 on the socket-level tests: I find them substantially easier to follow when I'm coming in to the library because all the logic is in one place. Take that under advisement, though. =)

Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/secure.py
@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jan 28, 2014

Collaborator

Thanks for this Kevin! That's a lot of really useful work. :) I really appreciate it.

Some feedback:

  • Adds support for retries, per the spec we laid out (mostly).

I noted a few questions in code comments.

  • Splits out util.py into submodules - I need to know what you want to do here because as written it would change from urllib3.util.Timeout into urllib3.util.timeout.Timeout which is not good.

In the future, a separate change might have been more productive, but we can try to plow through this. Could you add the relevant imports to urllib3/util/__init__.py such that all the existing urllib3.util.Timeout and such continue to work?

  • adds a new /successful_retry handler to the DummyTestCase which keys based on a test-name header and returns 200 only after the request has been retried once.

As I mentioned, I think this would be better done in socket-level tests, unless you're fully satisfied with what you have now.

  • urlopen previously would retry on read timeouts, which violated the urlopen contract (as I understood it) of only retrying things that couldn't possibly be side effecting. this code does not retry read timeouts by default.

Isn't that exactly why we have method_whitelist?

Thanks again for doing this. You're much more thorough than I would have been in my first implementation. :)

P.S. Worth noting that there's a Py3 failure right now.

Collaborator

shazow commented Jan 28, 2014

Thanks for this Kevin! That's a lot of really useful work. :) I really appreciate it.

Some feedback:

  • Adds support for retries, per the spec we laid out (mostly).

I noted a few questions in code comments.

  • Splits out util.py into submodules - I need to know what you want to do here because as written it would change from urllib3.util.Timeout into urllib3.util.timeout.Timeout which is not good.

In the future, a separate change might have been more productive, but we can try to plow through this. Could you add the relevant imports to urllib3/util/__init__.py such that all the existing urllib3.util.Timeout and such continue to work?

  • adds a new /successful_retry handler to the DummyTestCase which keys based on a test-name header and returns 200 only after the request has been retried once.

As I mentioned, I think this would be better done in socket-level tests, unless you're fully satisfied with what you have now.

  • urlopen previously would retry on read timeouts, which violated the urlopen contract (as I understood it) of only retrying things that couldn't possibly be side effecting. this code does not retry read timeouts by default.

Isn't that exactly why we have method_whitelist?

Thanks again for doing this. You're much more thorough than I would have been in my first implementation. :)

P.S. Worth noting that there's a Py3 failure right now.

Show outdated Hide outdated urllib3/response.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated test/__init__.py
Show outdated Hide outdated test/test_retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jan 31, 2014

Collaborator

... a Retry object is not safe to use in multiple calls to request() because the state of the object gets modified when an error condition is hit.

I think that's fine to start with. Let's add a note to the documentation that people need to be aware of this. I want to make sure the pattern we're promoting is pool.request(..., retry=Retry(...)).

We could add an easy .clone() or .new() function or something if people want to do...

retry = Retry(...)
pool.request(..., retry=retry.new())
pool.request(..., retry=retry.new())
...

But that can happen in a future PR.

Collaborator

shazow commented Jan 31, 2014

... a Retry object is not safe to use in multiple calls to request() because the state of the object gets modified when an error condition is hit.

I think that's fine to start with. Let's add a note to the documentation that people need to be aware of this. I want to make sure the pattern we're promoting is pool.request(..., retry=Retry(...)).

We could add an easy .clone() or .new() function or something if people want to do...

retry = Retry(...)
pool.request(..., retry=retry.new())
pool.request(..., retry=retry.new())
...

But that can happen in a future PR.

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Feb 2, 2014

Contributor

I actually ended up fixing this by making the Retry object immutable, e.g. you update the state by creating a new Retry object. See kevinburke/urllib3@4a1849e...3eebd06 for details. I actually like it a fair amount, it removes a lot of the complexity in the Retry class, but can back it out if you wish.

I'm going to squash commits at some point, but I'm not sure how you review updates to existing PR's so have left it as is for the moment.

Contributor

kevinburke commented Feb 2, 2014

I actually ended up fixing this by making the Retry object immutable, e.g. you update the state by creating a new Retry object. See kevinburke/urllib3@4a1849e...3eebd06 for details. I actually like it a fair amount, it removes a lot of the complexity in the Retry class, but can back it out if you wish.

I'm going to squash commits at some point, but I'm not sure how you review updates to existing PR's so have left it as is for the moment.

Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
Show outdated Hide outdated urllib3/util/retry.py
@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Feb 19, 2014

Collaborator

@kevinburke How are we feeling about this? I'd love to include this in urllib3 v1.8, but not a big rush. :)

Collaborator

shazow commented Feb 19, 2014

@kevinburke How are we feeling about this? I'd love to include this in urllib3 v1.8, but not a big rush. :)

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Feb 19, 2014

Contributor

I need to work on it, sorry :(

Kevin Burke | Twilio
phone: 925.271.7005 | kev.inburke.com

On Tue, Feb 18, 2014 at 6:41 PM, Andrey Petrov notifications@github.comwrote:

@kevinburke https://github.com/kevinburke How are we feeling about
this? I'd love to include this in urllib3 v1.8, but not a big rush. :)

Reply to this email directly or view it on GitHubhttps://github.com/shazow/urllib3/pull/326#issuecomment-35460439
.

Contributor

kevinburke commented Feb 19, 2014

I need to work on it, sorry :(

Kevin Burke | Twilio
phone: 925.271.7005 | kev.inburke.com

On Tue, Feb 18, 2014 at 6:41 PM, Andrey Petrov notifications@github.comwrote:

@kevinburke https://github.com/kevinburke How are we feeling about
this? I'd love to include this in urllib3 v1.8, but not a big rush. :)

Reply to this email directly or view it on GitHubhttps://github.com/shazow/urllib3/pull/326#issuecomment-35460439
.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Feb 19, 2014

Collaborator

No worries. Good things take time. :)

Collaborator

shazow commented Feb 19, 2014

No worries. Good things take time. :)

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Feb 26, 2014

Collaborator

I feel we need to recruit an extra pair of eyes to look over this. :)

@kevinburke Is there anything left to do here as far as you're concerned?

Collaborator

shazow commented Feb 26, 2014

I feel we need to recruit an extra pair of eyes to look over this. :)

@kevinburke Is there anything left to do here as far as you're concerned?

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Feb 27, 2014

Contributor

Agreed; it's a lot of code and would be easy to introduce errors. If you're happy with the redirect behavior as is then yes I am happy with the code.

There are reference docs but no "user-guide" docs. These would be nice to have, though I am reluctant to just chuck another section onto the end of the frontpage.

Contributor

kevinburke commented Feb 27, 2014

Agreed; it's a lot of code and would be easy to introduce errors. If you're happy with the redirect behavior as is then yes I am happy with the code.

There are reference docs but no "user-guide" docs. These would be nice to have, though I am reluctant to just chuck another section onto the end of the frontpage.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Feb 27, 2014

Collaborator

@kevinburke I'll ping some people. If you know anyone who'd like some practice doing Python code reviews, toss them this way. :)

Collaborator

shazow commented Feb 27, 2014

@kevinburke I'll ping some people. If you know anyone who'd like some practice doing Python code reviews, toss them this way. :)

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Mar 6, 2014

Contributor

There is one thing I noticed which is kind of not good. This interface couples HTTP methods and status codes for retries, so you can't say something like, "retry all HTTP methods if it's a 503 or 429, but only retry GET's and DELETE's for 502, 504, etc." Which I feel is a valid use case as 429 generally means the server didn't do any processing and the request is safe to retry.

I don't know a better way around it, besides some kind of ugly map structure.

Contributor

kevinburke commented Mar 6, 2014

There is one thing I noticed which is kind of not good. This interface couples HTTP methods and status codes for retries, so you can't say something like, "retry all HTTP methods if it's a 503 or 429, but only retry GET's and DELETE's for 502, 504, etc." Which I feel is a valid use case as 429 generally means the server didn't do any processing and the request is safe to retry.

I don't know a better way around it, besides some kind of ugly map structure.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jun 27, 2014

Collaborator

Some of these cannot be triggered with the way we're using httplib.

The rest, I really don't want to care about. Still working on a way to prune them by depending more on urllib3 internal state.

Collaborator

shazow commented on urllib3/util/retry.py in a152a46 Jun 27, 2014

Some of these cannot be triggered with the way we're using httplib.

The rest, I really don't want to care about. Still working on a way to prune them by depending more on urllib3 internal state.

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Jun 27, 2014

Contributor

Fair enough... just wanted to point out that they could be raised, and some of them implied side effects & some didn't :)

Contributor

kevinburke replied Jun 27, 2014

Fair enough... just wanted to point out that they could be raised, and some of them implied side effects & some didn't :)

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jun 27, 2014

Collaborator

Yea, we already try to wrap every httplib exception with our own. If any httplib thing gets through, that's a bug in urllib3.

Collaborator

shazow replied Jun 27, 2014

Yea, we already try to wrap every httplib exception with our own. If any httplib thing gets through, that's a bug in urllib3.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jun 27, 2014

Collaborator

I don't think the immutable Retry strategy is going to work. (See refactoring in #421 for broken details.)

One big problem is that we have no way to retain Retry state between intra-host retries and inter-host retries in a PoolManager.

That is, we pass in a Retry object to a PoolManager.urlopen, it passes it to a ConnectionPool, which might do some retries and create immutable clones of the Retry object with new state. Now say 3 retries later, there is a cross-host redirect so it bubbles up to the PoolManager layer—all of the immutable Retry state from the ConnectionPool is lost as it enters a new ConnectionPool.

// Note to self: Write a test for this.

Not quite sure how we should deal with this. Bubbling up the Retry state with the response object does not seem like a good idea either. :'(

Collaborator

shazow commented Jun 27, 2014

I don't think the immutable Retry strategy is going to work. (See refactoring in #421 for broken details.)

One big problem is that we have no way to retain Retry state between intra-host retries and inter-host retries in a PoolManager.

That is, we pass in a Retry object to a PoolManager.urlopen, it passes it to a ConnectionPool, which might do some retries and create immutable clones of the Retry object with new state. Now say 3 retries later, there is a cross-host redirect so it bubbles up to the PoolManager layer—all of the immutable Retry state from the ConnectionPool is lost as it enters a new ConnectionPool.

// Note to self: Write a test for this.

Not quite sure how we should deal with this. Bubbling up the Retry state with the response object does not seem like a good idea either. :'(

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jun 30, 2014

Collaborator

Thinking about it some more, this is actually the same behaviour we always had, so probably okay to keep it. Will document that each cross-host redirect cycle is independent. Will need to make sure we don't get infinite cross-host redirects, though.

Collaborator

shazow commented Jun 30, 2014

Thinking about it some more, this is actually the same behaviour we always had, so probably okay to keep it. Will document that each cross-host redirect cycle is independent. Will need to make sure we don't get infinite cross-host redirects, though.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jul 1, 2014

Collaborator

Did some pretty big changes, fairly happy with where it's going. Thoughts appreciated.

Thinking of removing Retries._observed_errors in place of somehow tracking the retry history in the response object and relying on those instead?

Collaborator

shazow commented Jul 1, 2014

Did some pretty big changes, fairly happy with where it's going. Thoughts appreciated.

Thinking of removing Retries._observed_errors in place of somehow tracking the retry history in the response object and relying on those instead?

HOSTv6 = "::1"
def find_unused_port(family=socket.AF_INET, socktype=socket.SOCK_STREAM):
"""Returns an unused port that should be suitable for binding. This is

This comment has been minimized.

@sigmavirus24

sigmavirus24 Jul 1, 2014

Member

Holy docstring Batman!

@sigmavirus24

sigmavirus24 Jul 1, 2014

Member

Holy docstring Batman!

This comment has been minimized.

@shazow

shazow Jul 1, 2014

Collaborator

Yea it's just copypasta from the stdlib I think.

@shazow

shazow Jul 1, 2014

Collaborator

Yea it's just copypasta from the stdlib I think.

@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 Jul 1, 2014

Member

LGTM. :shipit:

Member

sigmavirus24 commented Jul 1, 2014

LGTM. :shipit:

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jul 2, 2014

Collaborator

I hope I don't regret this... :P

Collaborator

shazow commented Jul 2, 2014

I hope I don't regret this... :P

shazow added a commit that referenced this pull request Jul 2, 2014

@shazow shazow merged commit c858e19 into urllib3:master Jul 2, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 Jul 2, 2014

Member

@shazow do you regret it yet? =P

Member

sigmavirus24 commented Jul 2, 2014

@shazow do you regret it yet? =P

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jul 2, 2014

Collaborator

We'll see when v1.9 goes out. :)

Collaborator

shazow commented Jul 2, 2014

We'll see when v1.9 goes out. :)

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Jul 3, 2014

Contributor

Mind adding more comment on the critical flaws discovered?

Contributor

kevinburke commented on c08c7c1 Jul 3, 2014

Mind adding more comment on the critical flaws discovered?

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Jul 3, 2014

Collaborator

I think I was referring to c08c7c1#diff-0401b3595c5b19c0b003447baaaf35d0R172 but later decided that it wasn't a big deal.

Collaborator

shazow replied Jul 3, 2014

I think I was referring to c08c7c1#diff-0401b3595c5b19c0b003447baaaf35d0R172 but later decided that it wasn't a big deal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment