Sans-I/O client middleware stack #783

florimondmanca · 2020-01-20T13:13:47Z

So…

Implementing retries naturally involved adding a .send_handling_retries() method to clients. This has resulted in creating a chain of 3 method calls:

send_handling_retries() -> send_handling_redirects() -> send_handling_auth() -> send_single_request()

With #778, both retries and auth would be using a generator-based API. We don't use one for redirects, but we very much could.

So all of this looked an awful lot like a middleware stack to me, so…

This PR moves the redirects and auth logic to a dedicated httpx.middleware, which defines a generator-based middleware API. As a result, Client and AsyncClient are really just pretty type-hinted shells around a dispatcher and a stack of middleware.

A no-op middleware looks like this:

class NullMiddleware:
    def __init__(self, parent):
        self.parent = parent

    def __call__(self, request, context, timeout) -> Response:
        return yield from self.parent(request, context, timeout)

The question obviously is how can this API work both in the sync and async cases? For example, the redirect middleware needs to call response.read()/await response.aread()...

The solution is to defer the awaiting of yielded value right to the edge... This is what these two helper functions allow to do:

consume_generator()
consume_generator_of_awaitables()

They allow writing generators based on functions that may or may not return awaitables. For example, calling dispatcher.send() will return a coroutine in the async case (because we have an AsyncDispatcher), but a plain response in the sync case (because we have a SyncDispatcher). So if we yield dispatcher.send(), and make sure the generator is passed through consume_generator() in the sync case, and await consume_generator_of_awaitables() in the async case, then we'll have the coroutine awaited in the async case, and the plain value left as-is in the sync case. (Makes sense?)

For cases when the API isn't exactly the same in the sync and async cases, we introduce a special SyncOrAsync container that defines what we should call in both situations, and then the two helper functions do the unwrapping when the stumble upon an instance of this container. So this allows doing the switch between e.g. response.read() and response.aread().

And… tada. ✨

Yes, this is generator black magic pushed to the limit.

But, I think this middleware API could actually be a viable one, which is a pretty big deal.

The real killer thing IMO is that it allows all of the following:

Writing code for both the sync and the async clients.
Writing functionality that potentially requires sending multiple requests.
Dynamically expanding the control knobs of the client, i.e. not being restricted to what can be passed to client.send()[^0]. This is done with a context that works a bit like scope in ASGI.

([^0] This bit would just require adding a **kwargs everywhere on the client API…)

Eventually, we could aim for something like this…

class MyMiddleware:
    def __init__(self, parent, initial: dict):
        self.initial = initial

    def __call__(self, request, context, timeout):
        my_data = context.get("my_data", {})
        request.headers["X-My-Data"] = ", ".join(f"{key}={value}" for key, value in my_data.items())
        return yield from self.parent(request, context, timeout)


client = httpx.Client(
    middleware=[
        # v *Maybe* make these optional…
        httpx.Middleware(AuthMiddleware, auth=("user", "pass")),
        httpx.Middleware(RedirectMiddleware, max_redirects=3),
        # ^
        httpx.Middleware(MyMiddleware, initial={"message": "Hello, world!"}),
)

response = client.get("https://example.org", my_data={"foo": "bar"})

florimondmanca · 2020-01-29T03:32:32Z

I think we should get rid of the context thingy on the middleware callable API, and replace it with a **kwargs. Then it’s even more natural to know « this middleware requires these parameters », because they’d be defined as actual parameters on the .__call__() implementation. And we wouldn’t need to manipulate a dict to extract them/set default values. Would also fit better how those parameters would be passed to the client — as that can’t be sensibly done any other way than via a **kwargs on client methods.

tomchristie · 2020-01-29T16:13:35Z

So, probably no surprise that I'm a bit cautious about us introducing something here.

Last time around adding a middleware stack felt like an abstraction too far, that masked the actual workings of the the Client, when a little more flatness was actually sufficient, and more readable.

Something I'd suggest for any exploratory work here would be to add a middleware API without also moving around our existing implementation into middleware itself. That'd be a more minimal footprint to look over, and would let us focus more on "is this the right API to expose to users" vs. "is this feasible, and how does this affect our implementation".

At this point it's not even clear to me that we neccessarily need a middleware API.

For example, one keep-it-simple alternative here would be to make sure that we're documenting the build_request/send pair of functions as our lowest level of public API on the Client instance, and if folks want to wrap some behaviour around send, then well, great, subclass Client or AsyncClient, and implement a send() method that calls into the parent class.

Granted, those wouldn't be modular drop-me-in-for-reuse classes, but that might be okay.

Alternately, it could be that a more constrained "event hooks" API would be sufficient (again along with adding good documentation for send() for more advanced stuff.)

I guess my rough take at the moment on this would be to push back on it for as long as possible while gathering example use-cases. Documenting send() in the meantime could be a sufficient and conservative choice with less potential to bite us?

florimondmanca · 2020-01-29T19:14:58Z

@tomchristie Cool, thanks for the feedback. Obviously I would have been very surprised that this would go through without push back or caution. :)

I can try to isolate the addition of the middleware API in a separate PR. Not using it for our internal layers might end up making it seem like it’s not thorough enough, when I actually think it’s a pretty meaningful API that’s just at the right level of abstraction this time...

Your point about « let’s just have users subclass and wrap .send() » is sensible, I just have the feeling we might be able to do better if this proves to be a right way to go.

But right, let’s do that and take it from there.

florimondmanca · 2020-02-02T10:28:02Z

Closing, followed up with #800.

Sans-I/O client middleware stack

cd051de

florimondmanca added the refactor Issues and PRs related to code refactoring label Jan 20, 2020

florimondmanca mentioned this pull request Jan 20, 2020

Add retries #778

Closed

florimondmanca added 4 commits January 21, 2020 07:30

Remove .contribute_to_context() API

8806b85

Move auth building to AuthMiddleware

e815ad8

Cleanup context building

5bd5e0f

Refine generator runners implementation

1a5fe2b

florimondmanca force-pushed the sansio-middleware branch from 0af9cb7 to 1a5fe2b Compare January 21, 2020 13:41

This was referenced Jan 23, 2020

Support for Event Hooks #790

Closed

Retries for connection failures #784

Closed

florimondmanca mentioned this pull request Feb 2, 2020

Middleware API design doc #800

Closed

florimondmanca closed this Feb 2, 2020

florimondmanca deleted the sansio-middleware branch February 2, 2020 10:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sans-I/O client middleware stack #783

Sans-I/O client middleware stack #783

florimondmanca commented Jan 20, 2020 •

edited

Loading

florimondmanca commented Jan 29, 2020

tomchristie commented Jan 29, 2020

florimondmanca commented Jan 29, 2020

florimondmanca commented Feb 2, 2020

Sans-I/O client middleware stack #783

Sans-I/O client middleware stack #783

Conversation

florimondmanca commented Jan 20, 2020 • edited Loading

florimondmanca commented Jan 29, 2020

tomchristie commented Jan 29, 2020

florimondmanca commented Jan 29, 2020

florimondmanca commented Feb 2, 2020

florimondmanca commented Jan 20, 2020 •

edited

Loading