Community discussion #78
Although I've not yet done much in the way of documentation or branding this package up, it's quickly narrowing in on a slice of functionality that's lacking in the Python HTTP landscape.
I've also some thoughts on allowing the Request and Response models to provide werkzeug-like interfaces, allowing them to be used either client-side or server-side. One of the killer-apps of the new async+HTTP/2 functionality is allowing high-throughput proxy and gateway services to be easily built in Python. Having a "requests"-like package that can also use the models on the server side is something I may want to explore once all the other functionality is sufficiently nailed down.
Since "requests" is an essential & necessary part of the Python ecosystem, and since this package is aiming to be the next step on from that, I think it's worth opening up a bit of community discussion here, even if it's early days.
I'd originally started out expecting
Ownership, Funding & Maintenance
Given how critical a requests-like HTTP client is to the Python ecosystem as a whole, I'd be amenable to community discussions around ownership & funding options.
I guess that I need to start out by documenting & pitching this package in its own right, releasing it under the same banner and model as all the other Encode work, and then take things from there if and when it starts to gain any adoption.
I'm open to ideas from the urllib3 or requests teams, if there are alternatives that need to be explored early on.
The functionality that this package is homing in on meets the requirements for the proposed "Requests III". Perhaps there's something to be explored there, if the requests team is interested, and if we can find a good community-focused arrangement around funding & ownership.
The urllib3 team obviously have a vast stack of real-world usage expertise that'd be important for us to make use of. There are bits of work that urllib3 does, that
Something else that could well be valuable would be implementing a urllib3 dispatch class alongside the existing h11/h2/async dispatch. Any urllib3 dispatch class would still be built on top of the underlying async structure, but would dispatch the urllib3 calls within a threadpool.
Doing so would allow a couple of useful things, such as being able to isolate behavioral differences between the two implementations, or perhaps allowing a more gradual switchover for critical services that need to take a cautious approach to upgrading to a new HTTP client implementation.
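A urllib3-backed dispatch class along those lines could be sketched roughly like this. All names here are illustrative, not the actual httpcore API, and `sync_request` stands in for a urllib3 `PoolManager.request`-style callable:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

class ThreadedDispatch:
    """Hypothetical urllib3-style dispatcher: the blocking call runs in
    a worker thread, behind the same async interface as the native
    h11/h2 dispatch classes."""

    def __init__(self, sync_request, max_workers=10):
        # `sync_request` is a blocking callable, e.g. wrapping urllib3.
        self._sync_request = sync_request
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    async def request(self, method, url, **kwargs):
        loop = asyncio.get_running_loop()
        # Hand the blocking call off to the thread pool, so the event
        # loop stays free to service other connections in the meantime.
        return await loop.run_in_executor(
            self._pool, lambda: self._sync_request(method, url, **kwargs)
        )
```

Running the two dispatchers side by side like this would make it straightforward to diff their behavior on the same request.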
I think httpcore as currently delivered makes it fairly easy to deliver a trio-based concurrency backend. It's unclear to me if supporting that in the package itself is a good balance, or if it would be more maintainable to ensure that the trio team have the interfaces they need, but that any implementation there would live within their ecosystem.
(I'd probably tend towards the latter case there.)
I guess that an HTTP/2 client would probably be useful to the Twisted team. I don't really know enough about Twisted's style of concurrency API to take a call on if there's work here that could end up being valuable to them.
It'll be worth us keeping an eye on https://github.com/aiortc/aioquic
Having a QUIC implementation isn't the only thing that we'd need in order to add HTTP/3 support, but it is a really big first step.
We currently have connect/reader/writer interfaces. If we added QUIC support then we'd want our protocol interfaces to additionally support operations like "give me a new stream", and "set the flow control", "set the priority level".
For standard TCP-based HTTP/2 connections, "give me a new stream" would always just return the existing reader/writer pair. For QUIC connections it'd return a new reader/writer pair for a protocol-level stream.
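That split might look something like the following sketch (hypothetical names, not the actual httpcore interfaces; `quic.open_stream()` is a placeholder for whatever the QUIC implementation, e.g. aioquic, would expose):

```python
class TCPConnection:
    """HTTP/1.1 and HTTP/2 over TCP have a single byte stream, so every
    "give me a new stream" request hands back the same reader/writer
    pair; HTTP/2 does its own stream framing above this layer."""

    def __init__(self, reader, writer):
        self._pair = (reader, writer)

    async def new_stream(self):
        return self._pair

class QUICConnection:
    """Over QUIC, streams are a transport-level concept, so each call
    opens a genuinely new protocol-level stream."""

    def __init__(self, quic):
        self._quic = quic

    async def new_stream(self):
        return await self._quic.open_stream()
```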
This is getting way ahead of ourselves, but I think we've probably got a good basis here to be able to later support HTTP/3.
One big blocker would probably be whatever HTTP-level changes are required between HTTP/2 and HTTP/3. The difference between QPACK and HPACK is one case here, but there are likely also differences given that the stream framing in HTTP/2 is at the HTTP level, whereas the stream framing in HTTP/3 is at the transport level.
It's unclear to me if these differences are sufficiently incremental that they could fall into the scope of a future
One important point to draw out here is that the growing complexities from HTTP/1.1, to HTTP/2, to HTTP/3, mean that the Python community is absolutely going to need to tackle work in this space as a team effort - the layers in the stack need expertise in various differing areas.
Right now we're using
Any other feedback?
I'm aware that much of this might look a bit premature, but the work is pretty far progressed, even if I've not yet started focusing on any branding and documentation around it.
Are there other invested areas of the Python community that I'm not yet considering here?
Where are the urllib3, trio, requests, aiohttp teams heading in their own work in this space? Is there good scope for collaboration, and how do you think that could/should work?
What else am I missing?
Other scattered thoughts:
@tomchristie: this is super cool, and thanks for starting the conversation.
I'll start by summarizing what's happening with the async-urllib3 work and what we've been thinking about there, so we can start figuring out how these different initiatives relate.
The async-urllib3 fork
For the last few years, me & @pquentin & @RatanShreshtha have been slowly working on adding async support to urllib3 (also incorporating some older work by @Lukasa). The repo and issue tracker is here, and the basic approach is described here: urllib3/urllib3#1323
What we've done so far
What's left to do
In general, my feeling is that the core HTTP functionality here is really solid. I think I heard @Lukasa say once that it's easy to write 90% of an HTTP client; the last 10% is where all the work is. (I guess this is true of everything, but even more so for HTTP.) The async-urllib3 branch doubtless has exciting new bugs we haven't found yet, but overall this is not a quick proof of concept, it's a serious attempt at a production library that handles almost all the edge cases I know about, including things that urllib3 has only figured out within the last few months. It even handles early server responses (which is a known problem with classic urllib3, and required multiple iterations to figure out how to make it supportable across multiple networking backends). Though, we do still need to figure out what to do about header casing – python-hyper/h11#31.
There are a bunch of minor things we need to do (e.g. docs, asyncio backend), and also two major ones:
urllib3 vs async-urllib3 vs httpcore vs requests vs request3 vs idek
OK, so that's what we've been working on and the issues we've found. What about the larger strategy? First, just to lay out my general assumptions:
If it's at all possible, our goal should be to converge on a single implementation of the core code for making HTTP requests, that almost everyone uses (either directly or via wrappers like
I still have hope that we can switch
I don't have a strong opinion on Python 2 support right now. It's obviously getting less important every day. But the last stragglers are going to be projects like pip and botocore, which need an HTTP client, and would really like to have access to async support. Maybe they'll be happy with using different clients on py2 and py3 (and in pip's case, vendoring multiple clients)? I'm assuming
I'm not super interested in ASGI/WSGI integration – it's a neat feature that people will like, but not my main focus (and Trio will have the ability to mock out the network itself for testing, so you don't necessarily need this kind of support inside individual libraries). I do wonder how you'll provide an async API to WSGI apps or a sync API to ASGI apps, though?
I think talking about HTTP/2 is kinda premature, honestly. I looked at
(BTW, we might also want to think about websocket client support eventually too – with HTTP/2 you can have HTTP and WS traffic over a single connection.)
Anyway. Looking at httpcore, my overall impression is ... surprisingly complementary to the async-urllib3 work? The async-urllib3 stuff is really strong on low-level protocol stuff, but the public API has a decade of accumulated cruft. httpcore feels like it's a few years away from handling all the gnarly edge cases, but the overall API and structure seem way more thought-through. I wonder if there's any way to combine forces on that basis?
I believe beyond technical details we should establish a new PSF Work Group for HTTP to better acquire resources and funding to pay all of you to solve this problem.
There is no reason why we should be unorganized or alone in this. A PSF Work Group would allow us to better leverage fiscal sponsorship, governance, and cross-maintenance of projects.
From a technical perspective, my ideal world would be:
Seems very reasonable, yup. I'm not personally blocked by funding, since Encode's model is proving sufficient for my time at the moment, but certainly in terms of maintenance and the long term I think it's super important. I don't really know how the working groups function, but it'd likely be to everyone's benefit that whoever's heading up the governance aspects shouldn't also be the primary lead maintainer - keeping a clear division of responsibilities there is really helpful on both sides.
100%. That's the right level of separation, and I'd be in favour of that even if we were only working with thread-concurrency wrappers on top, since the Sans-I/O model is just that much more clear and testable.
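To make the Sans-I/O point concrete, here's a toy h11-style parser sketch: it never touches a socket - the caller feeds it bytes and pulls events out - so the same state machine works identically under threads, asyncio, or trio. The names are illustrative, not h11's actual API:

```python
class ResponseParser:
    """Toy sans-I/O parser: accumulates bytes, emits one event per
    complete CRLF-terminated line. No I/O anywhere in the class."""

    def __init__(self):
        self._buffer = b""
        self._events = []

    def receive_data(self, data: bytes) -> None:
        # The I/O layer (whatever it is) pushes raw bytes in here.
        self._buffer += data
        while b"\r\n" in self._buffer:
            line, _, self._buffer = self._buffer.partition(b"\r\n")
            self._events.append(line.decode("ascii"))

    def next_event(self):
        # The caller pulls parsed events out; None means "need more data".
        return self._events.pop(0) if self._events else None
```

Because the protocol logic is pure, it can be tested exhaustively with byte strings, with no sockets or event loops involved.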
My personal take would probably be to lean strongly towards the importance of API compatibility w/ requests. Not everywhere, but sufficiently so that teams ought to be able to switch over painlessly. I'd tend to think that the user expectations, brand, and ecosystem of requests would mean that a "requests" v3 or a "requests3" release would be a huge advantage, but I'm also okay with exploring a non-requests brand naming. Either way it's a conversation that we can defer any hard decisions on for the time being, until we've got something release-ready.
Same. The work in this package is aiming towards that. There's a few differences in places, such as:
Sure. Agree it's not on the critical path, though I'm more bullish than yourself on achieving it. The implementation does handle stream multiplexing, and the connection pool takes account of HTTP/2 vs. HTTP/1.1 connections accordingly, tho yes - no per-stream flow control / ping support yet etc. The existing http/2 module weighs in at only 150 lines, since
Sure. I'm finding it important for one thing because it's thrashing out some more underlying functionality that is a critical requirement - the ability to write either async or sync dispatch classes, and have the client be able to bridge to them seamlessly. I'm working this through at the moment, and believe I have it nailed, tho it's more involved than the initial pass which was just "we need a sync client and an async client".
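As a rough illustration of that bridging (hypothetical names; a real client would reuse a loop or portal rather than creating a fresh event loop per request):

```python
import asyncio

class SyncClient:
    """Hypothetical sync facade over an async dispatch callable, so a
    single async implementation can serve both a sync API and an async
    API."""

    def __init__(self, async_dispatch):
        self._dispatch = async_dispatch

    def request(self, method, url):
        # Drive the async dispatcher to completion from sync code.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(self._dispatch(method, url))
        finally:
            loop.close()
```

The async client would simply `await self._dispatch(...)` directly, so the dispatch implementation is written exactly once.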
Most obvious potential points of collaboration from my POV would be:
Anyways, lots of great stuff here, thanks all.
Okay, so I think we're far enough along the road here that it's time to plant a stake in the ground and say "yeah, this is the direction we're going".
There's still various technical aspects to work on. In particular, stuff like:
I've taken on some of the awkward bits, such as the "early response handling", which as @njsmith noted is really quite fiddly. (For example, to get timeouts right, you want to try both reading and writing concurrently, but initially starting with only enforcing write timeouts, and later switching over to only enforcing read timeouts once you've either sent the entire request, or have started getting an early response.)
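That phased-timeout dance might be sketched like this with asyncio. This is illustrative only: `send` and `receive` are stand-in coroutine functions, and real code would also need to propagate exceptions from the send task:

```python
import asyncio

async def request_with_phased_timeouts(send, receive, write_timeout, read_timeout):
    """Run send and receive concurrently. While the request body is
    still being written, only the write timeout is enforced; once the
    request is fully sent (or an early response arrives), switch over to
    enforcing only the read timeout."""
    send_task = asyncio.ensure_future(send())
    recv_task = asyncio.ensure_future(receive())
    try:
        # Phase 1: writing. Wake on whichever task finishes first.
        done, _ = await asyncio.wait(
            {send_task, recv_task},
            timeout=write_timeout,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if not done:
            raise TimeoutError("request write timed out")
        # Phase 2: reading. Entered either because the request was fully
        # sent, or because an early response already arrived.
        if not recv_task.done():
            await asyncio.wait_for(asyncio.shield(recv_task), read_timeout)
        return recv_task.result()
    finally:
        # Clean up whichever task is still pending (e.g. an unfinished
        # send after an early response). Cancelling a done task is a no-op.
        for task in (send_task, recv_task):
            task.cancel()
```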
I'm also still wary of everything we're trying to take on here. Supporting HTTP/2, HTTP/3, seamless async+sync, and multiple concurrency backends is a fair chunk of extra complexity on top of what the existing requests+urllib3 needs to deal with. The only way I can see of mitigating that is by really making sure that we're taking this on as a community endeavour. I really like @theacodes' suggestion of an HTTP working group there.
I'm also not precluding the possibilities that we could also lean more on the urllib3 work, by working on either or both of a threaded urllib3 implementation, using the
There's a bunch of conversations that'd need to happen around eg. what GitHub organisation the project should be on, domain naming, docs branding, etc. but I think the first blocking thing that I'd really like to see happen is for this work to adopt the
Personally I think we should leave urllib3 to its long-term support maintenance state instead of trying to revive it into the spotlight via a next-gen HTTP client library. When requests depended on urllib3 for dispatch we were essentially (and actually) treated as internal code. Why not just house that complexity within the client library itself rather than shelling it out and dealing with packaging synchronization problems? It'll certainly take more planning and careful design choice.
@sethmlarson - Agreed. I guess what I really meant to say there was that it might be helpful at some point if we had an old-school urllib3 dispatcher available as a third-party package or whatever, so that we can more easily isolate any behavioral differences when dealing with any gnarly edge-case-ish behaviors.
I've slung together a requests3 branch, which demos how our existing docs would look if the project did take over the mantle of Requests III.
Docs build is here: http://www.encode.io/requests3-demo/
I'm conflicted about it, but I think that encode taking over ownership and responsibility for delivering
The project is already what you'd expect from a requests v3 release, and is API compatible most of the way through, with some documented exceptions, and a few bits of work outstanding.
I guess the proposal looks something like this:
As painful as it is, I think that'd probably also need to come alongside some kind of reasonable statement on the over-promise and under-delivery of the Requests III fundraiser. Failing to deliver in itself isn't exactly the main issue, but being unable to be open & transparent about it is.
If that's not something that we can agree on then we'll just need to push on with this project under its current naming, which is fine, and will work out in the long term, tho we should expect much more gradual adoption. (Also there's no realistic way that "requests3" is going to actually land under that scenario.)
I'm happy to take feedback on any of the above, so long as folks keep in mind that it's a loaded topic, and pretty emotionally draining for all concerned.
On reflection I'm less sure now that trying to pursue "requests3" for the sake of continuity is necessarily the best option.
A fresh project under the umbrella of
Also open to the question of "should this live under
That's related to a couple of other questions - eg. are we expecting to move dependencies, such as
Might raise some of this over on https://github.com/python-http/python-http.org instead.