Use the existing runloop for sync calls in async handler #179

bennovakovic · 2019-07-31T10:37:09Z

Hi, this library looks awesome and I really want to use it; but I seem to have hit a blocker and can't seem to find a solution in the docs.

TL;DR: Is there any way to provide our own runloop when using the sync calls as looking through the source the sync calls are actually using the async request call.

Basically I'm writing a client api that provides 2 interfaces; an async and sync interface.

The issue that i am having is that if I try to use the sync methods with an existing runloop (provided by Sanic in this scenario), it complains with the exception: Cannot run the event loop while another loop is running.

Essentially I'm just calling:

import httpx
...
class SomeClass(object):
    def get_some_url(self):
        r = httpx.get('some url')

note that nothing here is async; apart from the fact that the .get_some_url call is made from within an async function like so:

import SomeClass

async def handler(request):
    o = SomeClass()
    o.get_some_url()

This works fine when using the async calls like this:

import httpx
...
class SomeClass(object):
    async def get_some_url(self):
        client = httpx.AsyncClient()
        r = await client.get('some url')

import SomeClass

async def handler(request):
    o = SomeClass()
    await o.get_some_url()

So I was just wondering if its possible to provide our own runloop when using the sync calls? Sanic uses uvloop internally, would be nice to be able to just pass that straight in somehow when constructing the httpx object, or setting it as a property after import.

Thanks!

The text was updated successfully, but these errors were encountered:

bennovakovic · 2019-07-31T10:45:12Z

FWIW; I tried the following in a vein hope that it would do what I wanted:

import httpx
client = httpx.Client(backend=<my app uvloop>)

But alas, no success.. this the end of the stacktrace.

   self.client = httpx.Client(backend=kwargs.get('runloop'))
  File "/usr/local/lib/python3.6/site-packages/httpx/client.py", line 85, in __init__
    backend=backend,
  File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/connection_pool.py", line 97, in __init__
    self.max_connections = self.backend.get_semaphore(pool_limits)
AttributeError: 'Loop' object has no attribute 'get_semaphore'

bennovakovic · 2019-07-31T10:52:13Z

In the meantime; I'll just use requests for the sync stuff, it feels odd passing in a runloop when I'm doing sync calls; purely because internally you're using async under the hood even for sync requests, or am I mistaken?

httpx/httpx/client.py

Line 625 in 66754ad

request = AsyncRequest(

cansarigol · 2019-07-31T11:04:13Z

FWIW; I tried the following in a vein hope that it would do what I wanted:

import httpx
client = httpx.Client(backend=<my app uvloop>)

But alas, no success.. this the end of the stacktrace.

   self.client = httpx.Client(backend=kwargs.get('runloop'))
  File "/usr/local/lib/python3.6/site-packages/httpx/client.py", line 85, in __init__
    backend=backend,
  File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/connection_pool.py", line 97, in __init__
    self.max_connections = self.backend.get_semaphore(pool_limits)
AttributeError: 'Loop' object has no attribute 'get_semaphore'

Hi to use backend parameter, it has to be derived by ConcurrencyBackend.

bennovakovic · 2019-07-31T11:05:38Z

Yeah, I figured that after I got the exception. I figured it would handle a standard runloop.

cansarigol · 2019-07-31T11:06:04Z

you can check sample class from

httpx/tests/dispatch/utils.py

Line 19 in 66754ad

class MockHTTP2Backend(AsyncioBackend):

bennovakovic · 2019-07-31T11:07:34Z

Yeah, unfortunately I can't modify my existing runloop to meet the requirements of the backend parameter; it is given to me by sanic directly.

cansarigol · 2019-07-31T11:12:25Z

Maybe runloop would be added with a loop params rather backend. I'd want to :)

yeraydiazdiaz · 2019-07-31T11:13:28Z

There is no API to provide the event loop yourself, however there is a loop property in the AsyncioConcurrencyBackend that checks an existing private _loop instance variable you can potentially set to the Sanic event loop.

We should consider adding this as a feature though.

yeraydiazdiaz · 2019-07-31T11:20:53Z

I'm not sure about the context you're working with but there's some subtleties around using the sync API in an async application.

The sync API internally uses the loop's run_in_executor which will create a ThreadPoolExecutor and run your request there which might hinder performance or just be unnecessary. If possible I'd suggest staying in async land and use the AsyncClient, but, again, it depends on your context.

tomchristie · 2019-07-31T14:00:27Z

The issue that i am having is that if I try to use the sync methods with an existing runloop (provided by Sanic in this scenario), it complains with the exception: Cannot run the event loop while another loop is running.

Okay, that's a problematic thing to be doing.

If you're running in the context of an event loop, but then making blocking style network calls you're going to block the event loop. You don't want to be doing that.

We've got a relevant issue here about how we describe that most clearly in any raised exception, but it's a fundamental property of the differing concurrency model, rather than a bug (at least as I read it here)

tomchristie · 2019-07-31T14:02:13Z

I'd also echo what @yeraydiazdiaz's comment: if you're running in an async context, then you need to be using the async client.

(Tho it might be useful to understand why you're not doing that, in order to provide better support for other users who try the same thing)

bennovakovic · 2019-07-31T22:52:34Z

@yeraydiazdiaz Yeah I figured; I was just trying to make the api client library idiot proof; and also give them the option to use a sync client even in an async application (for whatever reason that might be). I totally agree; staying in async land is definitely where I want to be too; but I can't assume that everyone is in async land; in fact my experience is that most aren't in the python world, unfortunately.

bennovakovic · 2019-07-31T22:56:30Z

@tomchristie yeah I completely agree & understand; it was more a question of is it possible rather than what am I doing wrong. I understand how its working; but was seeing if the library could behave the way I was hoping based on the way I'm building out the api client library. No big deal though; I can just revert to the requests library for the sync client, and httpx for the async.
FWIW; this is how the user would create an instance of the client library:

client = MyApiClient.create('v1.async', apikey='foo')
# or 
client = MyApiClient.create('v1', apikey='foo')

Then any following calls would either need to be using await if its the async api version, or just regular method calls in the sync scenario.

tomchristie · 2019-08-01T08:11:41Z

No big deal though; I can just revert to the requests library for the sync client, and httpx for the async.

That's not really the right resolution here.

httpx provides a sync client, for sync usage, and an async client, for async usage, so you have both cases covered.

The error you're seeing only occurs when you try to call into the sync interface from within an async context which a broken thing to do for your users, regardless of if that's implemented in requests, httpx or whatever. (*)

There are technical ways we could make it not error in httpx and carry on regardless, but it's far more sensible that we error loudly. I think it'd be worth us wrapping the Cannot run the event loop while another loop is running exception in our own exception class so that we can point to a relevant part in the async documentation explaining why that's not a valid thing to do.

(*) Unless you're switching over to sync portions of the codebase within a seperate thread, in which case httpx wouldn't raise the error you're seeing.

tomchristie · 2019-08-01T08:14:04Z

I was just trying to make the api client library idiot proof; and also give them the option to use a sync client even in an async application (for whatever reason that might be)

I guess this is the core bit. Making the library robust here ought to mean raising a nice clean, explanatory error when it's used in a broken way, rather than allowing that to pass silently.

pquentin · 2019-08-01T20:14:25Z

I guess this is the core bit. Making the library robust here ought to mean raising a nice clean, explanatory error when it's used in a broken way, rather than allowing that to pass silently.

Exactly! Trying to make an API both sync and async is an attractive nuisance. It does not even help end users who write applications.

On a related note, here's one way for httpx to avoid using asyncio and its event loop (and avoid this error):

Support a synchronous backend that uses standard blocking sockets
Use this trivial loop runner

But I don't know if it's worth it or not.

florimondmanca · 2019-10-17T08:33:30Z

Hey, seems like this issue has drifted away. :) Is there anything actionable we could do about it?

Adding a warning in the docs against using the sync API in an async context?
Trying to detect when the user is doing that and raise an explanatory error?

I’m in an issue closing mood today, and I think this one could just be considered as a constraint and documented as such (ie option 1).

tomchristie · 2019-10-17T13:16:10Z

I'd say documented constraint.

There's a million and one other broken blocking operations that a user could do, that won't autodetect. Unless there ends up being some backed in support in the stdlib guarding against this (unlikely) then this really just comes down to "if you're using async you need to be attentive to avoiding blocking operations".

iluxonchik · 2019-11-17T01:28:33Z

I've just encountered exactly the same issue. In my understanding, there are cases where it makes sense to make sync calls inside a coroutine. Let's say that you have a command-line tool that reads a JSON file from a url containing a website name: website url key-value pairs. In your command line tool you want to go through each one of the urls in that JSON file and parse some information from its HTML. Thus, you will have a code similar to this:

async def parse_info_from_urls(name, url, client):
    # this part of the code uses async httpx requests
    # the obtained HTML parsing is done here

async def main(url):
    # this part is sync and making it async would only add overhead
    json_res = https.get(url).json()
    name_url_list = [(name, url) for name, url in json_res.items()]
    async with httpx.AsyncClient() as client:
        res = await asyncio.gather(*(parse_info_from_urls(elem[0], elem[1], client) for elem in name_url_list)

if __name__ == '__main__':
    asyncio.run(main('http://some-url-to-the-json-file.com'))

Of course I can make the sync requests outside of main(), but that would make the code dirtier.
I would suggest doing one of the following:

Add support to specify custom runloop
Use a different runloop in the sync case
Keeping everything as it is, but documenting the behaviour and providing more explicit error messages

florimondmanca · 2019-11-17T09:30:05Z

@iluxonchik

# this part is sync and making it async would only add overhead

I think your comment is mistaken here, for two reasons:

The only sensible thing we could do to support the sync call would be to spawn a new thread (since you can't just have multiple event loops running in the same thread). So, that would definitely be worse overhead than scheduling a coroutine on the current event loop.
If there was a way to make the sync call without using the event loop (e.g. using a sync backend, which was rejected in Add SyncBackend and use it on Client #525), the sync call would block the event loop completely, which is almost always not what you want.

In your case, it seems you have to use async (as most users do when they're using HTTPX in an async context), e.g. by wrapping all the code inside the async with AsyncClient() statement. Running coroutines is way cheaper than you might think. :-)

I think this issue is a duplicate of #508 (or vice versa). I'll go ahead a push a PR that sync calls are not supported in an async environment; as discussed earlier this should at least address this PR.

iluxonchik · 2019-11-17T13:29:30Z

@florimondmanca

the sync call would block the event loop completely, which is almost always not what you want

I completely agree, however, in the case of a simple script that needs to get some information from a url, before using that information to distribute it across multiple coroutines, blocking the event loop does no harm at all. In my example, the sync call happens when there is only one coroutine running on the event loop, so it makes no difference whether I make blocking calls or not.

Running coroutines is way cheaper than you might think. :-)

Oh trust me, I'm not worried about that overhead, which, when you account for the I/O stuff (network requests), is most likely will not going to be noticeable ever.

Just to be clear, I'm not alleging that you should break the philosophy of the library, because it makes sense so for a limited number of cases (although in web scraping that could be a common scenario). I'm just saying that it should bee made explicit 😄

For me, the issue was assuming that httpx.AsyncClient() is equivalent to aiohttp.Session(), which it is not.

florimondmanca added the concurrency Issues related to concurrency and usage of async libraries label Nov 17, 2019

florimondmanca mentioned this issue Nov 17, 2019

Document when to use AsyncClient #534

Merged

florimondmanca closed this as completed in #534 Nov 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use the existing runloop for sync calls in async handler #179

Use the existing runloop for sync calls in async handler #179

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

yeraydiazdiaz commented Jul 31, 2019

yeraydiazdiaz commented Jul 31, 2019 •

edited

Loading

tomchristie commented Jul 31, 2019

tomchristie commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019 •

edited

Loading

tomchristie commented Aug 1, 2019

tomchristie commented Aug 1, 2019

pquentin commented Aug 1, 2019 •

edited

Loading

florimondmanca commented Oct 17, 2019

tomchristie commented Oct 17, 2019

iluxonchik commented Nov 17, 2019 •

edited

Loading

florimondmanca commented Nov 17, 2019

iluxonchik commented Nov 17, 2019 •

edited

Loading

Use the existing runloop for sync calls in async handler #179

Use the existing runloop for sync calls in async handler #179

Comments

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

cansarigol commented Jul 31, 2019

yeraydiazdiaz commented Jul 31, 2019

yeraydiazdiaz commented Jul 31, 2019 • edited Loading

tomchristie commented Jul 31, 2019

tomchristie commented Jul 31, 2019

bennovakovic commented Jul 31, 2019

bennovakovic commented Jul 31, 2019 • edited Loading

tomchristie commented Aug 1, 2019

tomchristie commented Aug 1, 2019

pquentin commented Aug 1, 2019 • edited Loading

florimondmanca commented Oct 17, 2019

tomchristie commented Oct 17, 2019

iluxonchik commented Nov 17, 2019 • edited Loading

florimondmanca commented Nov 17, 2019

iluxonchik commented Nov 17, 2019 • edited Loading

yeraydiazdiaz commented Jul 31, 2019 •

edited

Loading

bennovakovic commented Jul 31, 2019 •

edited

Loading

pquentin commented Aug 1, 2019 •

edited

Loading

iluxonchik commented Nov 17, 2019 •

edited

Loading

iluxonchik commented Nov 17, 2019 •

edited

Loading