
Added AnyIO support #169

Merged: 7 commits merged into encode:master on Sep 16, 2020

Conversation

agronholm
Contributor

No description provided.

@tomchristie
Member

This is fantastic @agronholm, thank you!

There's a difficult decision to take here wrt which of the following we'd want to proceed with...

  • Use AnyIOBackend to outright replace the backend-native implementations.
  • Use it to replace only a subset of the backend-native implementations.
  • Pull AnyIOBackend in, but keep the backend-native implementations as the default. (We don't currently have any controls for selecting which backend to use, but we could perfectly well add one.)

My main concern about pulling it in as an outright replacement for our backend implementations would be that we're taking a small amount of surfaced code, and replacing it with a larger amount of submerged code, in the form of a dependency. That gets really awkward in particular when we're dealing with issues such as bugs in the asyncio implementation that we're having to work around, because there's an extra layer in there for us to consider. And the extra layer in the stack is just generally something I'd prefer to avoid if possible, even if it means we have a little more code than we might otherwise.

Although I am super keen on us being able to share expertise - this implementation comes out really nice & clean, and there are one or two nice little nuggets in there, such as using select() for connection-dropped detection in all cases, which is really interesting.

One other thought here... this PR and #168 both, unsurprisingly, change the .release() and .time() signatures throughout, because curio adopts a different style to trio and asyncio there. It might make sense for someone to issue a pull request which just changes that aspect first, since that's a pretty unambiguous internal change that we clearly want to make, and is relied on by both PRs.

@florimondmanca left a comment
Member

Very interesting, thank you so much for pulling this off!

I'm curious about something... What if, instead of adding a new and separate AnyIOBackend, we changed the implementation of AutoBackend to use anyio (and removed the other backends since they're not needed anymore)? So, no new backend class, but just an overhaul of AutoBackend so that it uses anyio instead of using our own backend classes (all of which is an implementation detail)?

Eventually we could very much just have a direct AsyncBackend, and drop the rest of the base classes and interfaces (since the async environment abstraction is done by anyio now).

I'm not saying we should do this in this PR or right now, but it's interesting to see how much code could actually be dropped if we switched to anyio entirely (which I guess is the eventual goal of this work?).

@agronholm
Contributor Author

I'm curious about something... What if, instead of adding a new and separate AnyIOBackend, we changed the implementation of AutoBackend to use anyio (and removed the other backends since they're not needed anymore)? So, no new backend class, but just an overhaul of AutoBackend so that it uses anyio instead of using our own backend classes (all of which is an implementation detail)?

Fine by me. My next goal was indeed to do away with all the code rendered useless by this PR.

Eventually we could very much just have a direct AsyncBackend, and drop the rest of the base classes and interfaces (since the async environment abstraction is done by anyio now).

Also, do you plan on having other sync backends too? I was wondering why there is a lone sync backend. Was it just because there were multiple async backends? Maybe you want to support gevent directly or something?

I'm not saying we should do this in this PR or right now, but it's interesting to see how much code could actually be dropped if we switched to anyio entirely (which I guess is the eventual goal of this work?).

Quite a lot, I'd say :)

@agronholm
Contributor Author

My main concern about pulling it in as an outright replacement for our backend implementations would be that we're taking a small amount of surfaced code, and replacing it with a larger amount of submerged code, in the form of a dependency. That gets really awkward in particular when we're dealing with issues such as bugs in the asyncio implementation that we're having to work around, because there's an extra layer in there for us to consider. And the extra layer in the stack is just generally something I'd prefer to avoid if possible, even if it means we have a little more code than we might otherwise.

The primary purpose of AnyIO was to relieve library authors of the burden of having to support multiple async backends. Generally AnyIO also works around those bugs on your behalf. I get that the project is relatively new (2018-) so it doesn't have much of a track record. I can only hope that I'll be able to convince you.

@cdeler
Member

cdeler commented Aug 25, 2020

@tomchristie you said:

One other thought here... this PR and #168 both, unsurprisingly, change the .release() and .time() signatures throughout, because curio adopts a different style to trio and asyncio there. It might make sense for someone to issue a pull request which just changes that aspect first, since that's a pretty unambiguous internal change that we clearly want to make, and is relied on by both PRs.

so I created #170

@tomchristie
Member

Right, probably worth updating this against master now that #170 is in, since it'll have a slightly smaller footprint now.

@agronholm
Contributor Author

I've rebased against the latest master and wasted at least 30 minutes trying to play whack-a-mole with 4 different linters.

@agronholm
Contributor Author

A tox configuration file would be nice so that I wouldn't have to wait for GitHub Actions to tell me what I did wrong.

@tomchristie
Member

Running scripts/test locally should be the only thing you need to do. That'll run through the linting checks and test suite.

If there are linting issues then you can run scripts/lint, which applies black/isort and runs the unasync'ing script that ensures the async & sync code is mirrored.

@agronholm
Contributor Author

It doesn't seem like black is respecting its own configuration.

@tomchristie
Member

From the test runs, the issue looks like the async/sync codebases being out of sync with each other.

If you want to resolve that without running any of the other linting, then run scripts/unasync

@agronholm
Contributor Author

It seems that I needed to edit the unasync script, which I hadn't done. Now there seems to be a new method in the interface I have to implement. I'll get back to you on that.

@agronholm
Contributor Author

The test suite does not pass yet. Something unexpected happens on the asyncio backend.

@agronholm
Contributor Author

There are two tests that fail almost consistently: the test_connection_pool_get_connection_info test on asyncio with a 0-second keepalive_expiry. For some reason, they report {'https://example.org': ['HTTP/2, IDLE, 0 streams']} (for HTTP/2; the HTTP/1.1 variant reports the corresponding HTTP/1.1 entry) where an empty dict is expected.

@agronholm
Contributor Author

I checked the results of _keepalive_sweep(), and this is what I got:

expires_at: None now: 193853.711
expires_at: None now: 193854.236
expires_at: None now: 193854.236
expires_at: 193854.239 now: 193854.239
expires_at: 193854.238 now: 193854.239

What struck me as odd was the accuracy of the timestamps: they were only accurate to the millisecond. The stdlib asyncio event loop demonstrably produces timestamps accurate to the microsecond, but uvloop limits them to the millisecond.

Since the condition is now > connection.expires_at, those connections are not pruned. In short, the code runs so fast that the operations are less than a millisecond apart, so it cannot tell the difference in time when uvloop is used.

I can modify the backend options so that asyncio tests don't use uvloop, but would it be possible to modify the condition to >=? That would easily solve the problem.
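As a rough illustration of the difference (a minimal sketch with made-up names, not the actual httpcore pool code; only expires_at and now mirror the log output above):

    def should_prune(expires_at, now):
        # With ">", a connection whose expiry equals the current clock reading is
        # kept alive; uvloop's millisecond-resolution clock makes that equality
        # common. ">=" prunes the connection in that case as well.
        return expires_at is not None and now >= expires_at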

@tomchristie
Member

but would it be possible to modify the condition to >=

Sounds sensible, yup.

@agronholm
Contributor Author

What's the next step?

@agronholm marked this pull request as ready for review August 28, 2020 11:32
@florimondmanca
Member

florimondmanca commented Aug 28, 2020

There are a couple of unrelated changes here that ideally should be dealt with in separate, isolated PRs:

  • Picking up Black 20
  • The >= fix to improve robustness of the test discussed above.

Also, what do we think of this...? Instead of removing the usage of AutoBackend, just change how it selects the backend implementation so that it always returns an AnyIOBackend. This should result in an even smaller change footprint while keeping the intent that "in the async case we automatically select which async library to use" (with anyio staying an implementation detail, at least for now).
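As a hypothetical sketch of that idea (property and attribute names here are assumptions for illustration, not the actual httpcore code; AnyIOBackend is the class added in this PR):

    from httpcore._backends.anyio import AnyIOBackend  # module path assumed for illustration

    class AutoBackend:
        @property
        def backend(self):
            # Keep the existing public role ("pick an implementation automatically"),
            # but always hand back the AnyIO-based backend and let anyio detect
            # whether asyncio, trio or curio is running underneath.
            if not hasattr(self, "_backend_implementation"):
                self._backend_implementation = AnyIOBackend()
            return self._backend_implementation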

Thanks so much for championing this; as is often the case with relatively major changes there are a couple of small adjustments to make as preliminary work, but we'll get there! :)

@agronholm
Contributor Author

Also, there is one planned API change in AnyIO before v2.0 final. That will affect this PR, so it will have to be modified.

@agronholm marked this pull request as draft August 29, 2020 08:25
@agronholm
Contributor Author

One thing out of the way: #171

@florimondmanca
Member

And another one for fixing Black havoc: #172

@agronholm
Contributor Author

Rebased against master again.

@agronholm
Contributor Author

Which classes are we talking about here? The ones in httpcore that currently take a backend argument?

@tomchristie
Member

@agronholm @florimondmanca

Okay, so #179 adds a backend=<"auto"|"asyncio"|"trio"|"curio"> parameter to the AsyncConnectionPool and AsyncHTTPProxy classes.

That'd allow users to override the current auto-detection with an explicit selection. It's also useful at this point in time, because it'd allow us to add "anyio" as an option without making a hard switch to it.
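For illustration, the explicit selection (including the "anyio" option this PR would add) might look like this; a sketch only, where just the backend keyword and its values come from the discussion above:

    import httpcore

    # Opt out of auto-detection and explicitly select the AnyIO-based
    # implementation by name; "auto", "asyncio", "trio" or "curio" would
    # select the other implementations.
    pool = httpcore.AsyncConnectionPool(backend="anyio")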

Once that's merged I think the route to getting this pull request in would be...

  • Update the PR so that it only does the following...
    • Add the AnyIOBackend implementation.
    • Update httpcore._backends.base.lookup_async_backend to include an "anyio" option.

For the purposes of keeping the PR tested with minimal changes I'd also suggest that the PR could then switch the test cases in tests/async_tests/test_interfaces.py so that anywhere httpcore.AsyncConnectionPool() or httpcore.AsyncHTTPProxy() is used, an explicit backend="anyio" argument is used.

There shouldn't be any other changes in the test suite, such as switching to pytestmark = pytest.mark.anyio, or anything like that. (We might want to pull in some of that work later if anyio can help us tidy up the test suite, but it should be treated as completely isolated from this work.)

The last change after that would be parameterizing the test suite so that the async test cases end up running both backend="auto" and backend="anyio". (Except the test_explicit_backend_name test case, which tests using an explicit name with each of backend="asyncio", backend="trio", backend="curio")
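A rough sketch of what that parameterization could look like (illustrative only; the test name is a placeholder and the suite's existing async setup and live-server fixtures are elided):

    import pytest
    import httpcore

    # Run each async test case once per backend selection.
    @pytest.mark.parametrize("backend", ["auto", "anyio"])
    async def test_get_request(backend: str) -> None:
        pool = httpcore.AsyncConnectionPool(backend=backend)
        ...  # issue a request via the pool and assert on the response, as the existing tests do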

@agronholm
Contributor Author

This means we'll be running the tests against every backend twice: once with the direct backends and once via anyio (6 variations in total).

@florimondmanca
Member

This means we'll be running the tests against every backend twice: once with the direct backends and once via anyio (6 variations in total).

Eventually yes, and the existing approach of testing against a live server means the test suite will end up 2x slower than it already is, but I guess we can roll with that for now (we're tracking the need to switch to a local server via #109).

@tomchristie
Member

tomchristie commented Sep 9, 2020

@agronholm - Pushed up #180 to show how I think we ought to add anyio support in incrementally, with a PR that has a minimal-as-possible change footprint. I guess the sensible thing to do would be to bring this one up to date in line with #180, and then add in the last little bit of parameterisation.

(Tho I'd also be happy to finish up the work directly in #180 if that's easier from your POV)

@tomchristie
Member

tomchristie commented Sep 9, 2020

Things that we can then keep as separate, isolated PRs...

  • PR making the test suite neater by using anyio. (Eg. tidying up the conftest as you've done here.)
  • Potentially a PR demonstrating removing the native backend implementations, and only using AnyIO backend. (So that we're considering that option in isolation.)

And thanks so much for being patient with all this work. I know that this gradual incremental approach is a bit cautious, but it's a really good way for us to have a high degree of confidence at each step. 🙂

@agronholm
Contributor Author

PR making the test suite neater by using anyio. (Eg. tidying up the conftest as you've done here.)

I've created a PR to address this: #181

@agronholm changed the title from "Replaced other async backends with AnyIOBackend" to "Added AnyIO support" on Sep 11, 2020
@agronholm
Contributor Author

The footprint of this PR will go down somewhat when #183 is merged.

@florimondmanca left a comment
Member

Looking great!

Comment on lines +88 to +91
def is_connection_dropped(self) -> bool:
    # Zero-timeout select() on the raw socket: an idle connection that reports
    # as readable has either been closed by the peer or has unexpected data.
    raw_socket = self.stream.extra(SocketAttribute.raw_socket)
    rready, _wready, _xready = select.select([raw_socket], [], [], 0)
    return bool(rready)
Member

We might want to look into #182 some more before deciding that this trick is worth committing to. 🤔

Member

Would it make sense for us to approach it in this order?...

  1. Merge this PR.
  2. Update #185 (Tweak dropped connection detection) to include curio and anyio.
  3. Get #185 merged.

Member

Sure yup, #185 isn't blocking us from merging this, just wanted to clarify that this select() approach needed a bit of refining. :)

agronholm and others added 3 commits September 13, 2020 13:04
Co-authored-by: Florimond Manca <florimond.manca@gmail.com>
Co-authored-by: Florimond Manca <florimond.manca@gmail.com>
@agronholm
Contributor Author

Is there anything more you need from me here?

@florimondmanca left a comment
Member

I think we're good rolling with this one now, thank you so much @agronholm!

@florimondmanca merged commit 21f60dd into encode:master on Sep 16, 2020
@agronholm deleted the anyio branch September 16, 2020 12:32
@florimondmanca
Member

I'll go back to #185 and include the tweak for the curio and anyio backends. :)

@tomchristie
Member

🎉

@tomchristie mentioned this pull request Sep 22, 2020