
Add SyncBackend and use it on Client #525

Closed
wants to merge 2 commits

Conversation

florimondmanca
Member

@florimondmanca florimondmanca commented Nov 13, 2019

Okay so, related to #508 and considering the current discussion to drop sync support #522, I figured I'd give the idea of a sync backend suggested by @pquentin a shot.

This PR adds a SyncBackend that performs blocking I/O and synchronization operations using the socket and threading modules.
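
Roughly, the shape it takes is something like this (a hedged sketch with illustrative names, not the PR's exact API): plain blocking sockets for I/O, and threading primitives for synchronization.

```python
import socket
import threading

class SyncSocketStream:
    def __init__(self, sock):
        self.sock = sock
        self.write_lock = threading.Lock()  # serialize concurrent writers

    def read(self, n, timeout):
        self.sock.settimeout(timeout)
        return self.sock.recv(n)

    def write(self, data, timeout):
        with self.write_lock:
            self.sock.settimeout(timeout)
            self.sock.sendall(data)

class SyncBackend:
    def connect(self, hostname, port, timeout):
        sock = socket.create_connection((hostname, port), timeout=timeout)
        return SyncSocketStream(sock)
```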

The idea is that we keep the async-annotated ConcurrencyBackend interface, but require all async-annotated methods to actually be synchronous. This constraint is implemented by run_secretly_sync_async_function.
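
As a minimal sketch of how that constraint can be enforced (not the PR's exact code), the trick is to drive the coroutine a single step and require it to finish immediately:

```python
def run_secretly_sync_async_function(async_fn, *args, **kwargs):
    coro = async_fn(*args, **kwargs)
    try:
        # A truly synchronous coroutine finishes on its first step.
        coro.send(None)
    except StopIteration as exc:
        return exc.value
    raise RuntimeError("This async function tried to perform real async I/O")
```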

There's a bunch of code adapted from urllib3, in particular sync_backend.py, taking into account that we can use modern Python features.

There's still some polishing left, but what strikes me is how little existing code had to be modified. Only the Client was modified to use SyncBackend instead of AsyncioBackend. This is interesting w.r.t. the sync code generation idea: we could generate the sync client from the async one, and tweak the concurrency backend (as well as some other parts — I don't think dropping async annotations will be enough) and things should run smoothly.

Still, I think this PR has value on its own. We currently support sync, and it's definitely a good idea to have the sync client perform sync I/O at all times.

@florimondmanca florimondmanca requested a review from a team November 13, 2019 10:14
@florimondmanca florimondmanca force-pushed the feat/sync-backend branch 2 times, most recently from 39ebcc2 to 0f1a67a on November 13, 2019 10:15
@florimondmanca
Member Author

florimondmanca commented Nov 13, 2019

Oh and, here's proof that this allows the sync client to work seamlessly in async environments, which would fix #508:

```python
>>> import asyncio
>>> import httpx
>>> async def main():
...     r = httpx.get("https://example.org")
...     print(r)
...
>>> asyncio.run(main())
<Response [200 OK]>
```

I'd need to add tests for this as in #513, though. Edit: done.

@tomchristie
Member

Great work!

I'm not sure we'd want to merge this on its own? Granted, it resolves #508, but it's a lot of extra complexity in order to get there.

I think it's more likely we'd want this alongside dropping the async-to-sync bridging code, and have this plugged into a code-gen'ed sync variant of httpx.

A couple of initial thoughts here...

  • There are likely potential race conditions inside the client, connection pools, and connections when used in a threaded context. We'd need to think about that anyplace that multiple threads will be sharing a single client.
  • Our sync implementation here uses a thread pool for the "read and write concurrently" case (a rough sketch of what I mean is just below). I'm interested in how the existing urllib3 achieves read-and-write-for-a-bit, because I'm assuming that it's not by spawning a new thread?
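
For reference, here is roughly what that thread pool approach looks like (a hedged sketch over a raw socket, not the PR's actual backend code): the request body is sent from a worker thread while the calling thread reads the response, so a server that replies early can't deadlock the upload.

```python
from concurrent.futures import ThreadPoolExecutor

def send_while_reading(sock, request_body, bufsize=65536):
    # Sketch only: send the body from a worker thread while this thread reads.
    with ThreadPoolExecutor(max_workers=1) as pool:
        send_future = pool.submit(sock.sendall, request_body)
        chunks = []
        while True:
            data = sock.recv(bufsize)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
        send_future.result()  # re-raise any error from the sending thread
    return b"".join(chunks)
```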

@florimondmanca
Member Author

florimondmanca commented Nov 13, 2019

a code-gen'ed sync variant of httpx

To be fair, I'm not sure that path would lead us to a less complex setup. :-) Besides, I'm afraid we're putting too much hope in this generated sync version of HTTPX. There are aspects we'll need to deal with anyway — like, how do we actually talk to a socket at a high enough level if we don't use asyncio or trio? Those aspects are encapsulated in this backend, and even with code gen I think we won't be able to get around them. (Maybe this is what you envision too; I'm not sure from your comment.)

There are likely potential race conditions inside the client,

Hmm, probably, yes. Is this a general issue we'd need to address, or do you think this particular PR makes things worse?

I'm assuming that it's not by spawning a new thread?

Yup, urllib3's implementation is even named send_and_receive_for_a_while(). This allows urllib3 to ditch any notion of BackgroundManager as we currently have, and instead manually alternate between sending and receiving data. We could switch to this style, but it would have to be a global change, not something specific to this sync backend.
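
To illustrate that style (a rough sketch using the standard selectors module, purely illustrative and not the fork's actual code), the idea is to interleave sends and receives on one socket without any background thread:

```python
import selectors

def send_and_receive_for_a_while(sock, data, bufsize=65536):
    sock.setblocking(False)
    sel = selectors.DefaultSelector()
    # Only wait for writability while there is still data left to send.
    sel.register(sock, selectors.EVENT_READ | (selectors.EVENT_WRITE if data else 0))
    received = []
    try:
        while True:
            for _, events in sel.select():
                if events & selectors.EVENT_WRITE and data:
                    sent = sock.send(data)
                    data = data[sent:]
                    if not data:
                        sel.modify(sock, selectors.EVENT_READ)
                if events & selectors.EVENT_READ:
                    chunk = sock.recv(bufsize)
                    if not chunk:  # peer closed the connection
                        return b"".join(received)
                    received.append(chunk)
    finally:
        sel.close()
        sock.setblocking(True)
```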

@tomchristie
Member

Yup, urllib3's implementation is even named send_and_receive_for_a_while().

W.r.t. that question, I was referring specifically to the actual "urllib3" package, rather than to the trio fork. I'm interested in how the existing thread-based library handles that case.

@florimondmanca
Member Author

W.r.t. that question, I was referring specifically to the actual "urllib3" package, rather than to the trio fork. I'm interested in how the existing thread-based library handles that case.

Well, as of today, it seems urllib3 doesn't do any kind of concurrency at that point: it sends the request, then reads the response - source. Perhaps @pquentin can confirm?

@pquentin
Contributor

pquentin commented Nov 14, 2019

Well, as of today, it seems urllib3 doesn't do any kind of concurrency at that point: it sends the request, then reads the response - source. Perhaps @pquentin can confirm?

Right. urllib3 is thread-safe: if you want more concurrency, you should open a pool manager and use it from different threads, and it will reuse connections efficiently. For what it's worth, this is why I was surprised to see that httpx uses threads, even in the trio version, as it makes it hard to cancel requests correctly.
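
For example, something along these lines (illustrative only):

```python
from concurrent.futures import ThreadPoolExecutor
import urllib3

http = urllib3.PoolManager()  # one shared pool, safe to use from many threads

def fetch(url):
    return http.request("GET", url).status

with ThreadPoolExecutor(max_workers=4) as pool:
    statuses = list(pool.map(fetch, ["https://example.org"] * 4))
```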

@tomchristie
Member

"I was surprised to see that httpx uses threads, even in the trio version" - That might(?) be a misunderstanding. We don't use threads for the async cases. (With the exception of some very limited threadpooling stuff, such as "read the SSL context from a file" where we use a threadpool so as to not block the task.)

@tomchristie
Member

it sends the request, then reads the response

Okay, that's surprising. We'd been doing that in httpx, until it turned out that it doesn't actually cover everything correctly, and we were forced to switch to the more complicated "read and write concurrently", in the same way that trio's urllib3 fork has "read and write for a bit".

I don't recall the full details of that, but it's not at all clear to me why that's not necessary in urllib3's case. (I don't have the headspace to dive back into it, but perhaps it has to do with buffering and closed connections working differently in the sync case?)

In any case we're going off topic on this issue.

This PR is a "close" to me. There's significant extra complexity, and insufficient benefit.

The edge case of "you want to use httpx in a sync context, but there's also an existing async loop" only occurs in the slightly weird context of being in a REPL with a fuzzy "you're not really in sync or async mode" state. The simplest, most sensible way to deal with that is to have the sync client transparently run the request within a threadpool in that case.
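
Roughly, I mean something along these lines (a sketch of the idea, not a concrete proposal for the client internals; `make_request` stands in for whatever blocking call the sync client performs):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

_fallback_pool = ThreadPoolExecutor(max_workers=1)

def request_blocking(make_request, *args, **kwargs):
    # If no event loop is running, just make the blocking call directly.
    # If one *is* running (e.g. an async REPL), run the call in a worker thread
    # so it can drive its own event loop without colliding with the running one.
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return make_request(*args, **kwargs)
    return _fallback_pool.submit(make_request, *args, **kwargs).result()
```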
