Fix tunnel proxy: HTTP requests only #57
First pass… :-)
@yeraydiazdiaz I'm actually not sure I understand the original bug in #54. My hypothesis is that closing the proxy pool calls |
@yeraydiazdiaz Okay, after playing with this myself, I came to the same conclusion: because |
So there were several problems before. The first one was an assertion in the Working around that I also found that consuming the response body closed the connection and the socket. Eventually arriving at we discussed above. |
Yep. Followed the same series of thoughts. :-) |
@yeraydiazdiaz Are you able to proxy an HTTP request via Squid/mitmproxy with the current state of this PR? Right now on Squid I get a 503 on the
[(b'server', b'squid/3.5.27'),
 (b'mime-version', b'1.0'),
 (b'date', b'Sun, 12 Apr 2020 17:03:08 GMT'),
 (b'content-type', b'text/html;charset=utf-8'),
 (b'content-length', b'3481'),
 (b'x-squid-error', b'ERR_CONNECT_FAIL 99'),
 (b'vary', b'Accept-Language'),
 (b'content-language', b'en')]
Traceback (most recent call last):
  File "debug/client.py", line 16, in <module>
    asyncio.run(main())
  File "/Users/florimond/.pyenv/versions/3.8.2/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/Users/florimond/.pyenv/versions/3.8.2/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "debug/client.py", line 10, in main
    response = await http.request(b"GET", url)
  File "/Users/florimond/Developer/python-projects/httpcore/httpcore/_async/http_proxy.py", line 83, in request
    return await self._tunnel_request(
  File "/Users/florimond/Developer/python-projects/httpcore/httpcore/_async/http_proxy.py", line 175, in _tunnel_request
    raise ProxyError(msg)
httpcore._exceptions.ProxyError: 503 Service Unavailable
Simplified
import asyncio
import httpcore

proxy_origin = (b"http", b"localhost", 3128)
url = (b"http", b"localhost", 8000, b"/")

async def main() -> None:
    async with httpcore.AsyncHTTPProxy(proxy_origin, proxy_mode="TUNNEL_ONLY") as http:
        response = await http.request(b"GET", url)
        print(response)
        status_code = response[1]
        assert status_code == 200, status_code

asyncio.run(main())
Uvicorn server on
from starlette.responses import PlainTextResponse

app = PlainTextResponse("Hello, world!") |
Your example using Starlette does not work using Squid, but it does when using mitmproxy without any options.
It also works fine when fetching example.org. Squid doesn't really log a lot with our current configuration so it's hard to say what's wrong there. |
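For readers following the traceback above: the error is raised when the proxy rejects the tunnel setup. The following is only a minimal sketch of that kind of check, not httpcore's actual code. It assumes a tunnel counts as established only when the proxy answers `CONNECT` with a 2xx status; `check_connect_response` and the local `ProxyError` class are hypothetical stand-ins.

```python
class ProxyError(Exception):
    """Hypothetical stand-in for httpcore's ProxyError."""


def check_connect_response(status_code: int, reason: bytes) -> None:
    """Raise unless the proxy answered the CONNECT request with a 2xx status."""
    if not 200 <= status_code <= 299:
        # Surface the proxy's status line, as the traceback above does.
        message = f"{status_code} {reason.decode('ascii', errors='replace')}"
        raise ProxyError(message)


try:
    check_connect_response(503, b"Service Unavailable")
except ProxyError as exc:
    print(f"tunnel setup failed: {exc}")
```

With a check like this, a 503 from Squid surfaces as an exception before any request is sent through the tunnel.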
Second pass, bunch of nits though, I think we're on the right track :-) I was able to make it work with mitmproxy too!
About proxy Host headers: actually, examples in RFC 7231 suggest that CONNECT should also contain a Host header, so I think this is a legit constraint to put on users, and definitely not an h11 edge case we should hide (because it's not an edge case).
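To make the Host point concrete, here is a minimal sketch of the raw bytes of a `CONNECT` request that carries a `Host` header, in the shape of the RFC's own CONNECT example. The helper name `build_connect_request` is hypothetical and not part of httpcore or h11.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build a minimal HTTP/1.1 CONNECT request for host:port.

    The request target uses authority-form ("host:port"), and the
    same value is repeated in the Host header.
    """
    authority = f"{host}:{port}"
    lines = [
        f"CONNECT {authority} HTTP/1.1",
        f"Host: {authority}",
        "",  # blank line terminating the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_connect_request("localhost", 8000))
```

A client library building on HTTPCore would be expected to supply the `Host` header itself rather than rely on it being filled in.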
@yeraydiazdiaz Sorry if this is taking longer to review than you expected. 😅 This is great stuff, and I think we're almost there. Yes, currently proxying is broken in HTTPX, and it's a regression, but that's alright. We haven't released anything yet, and previous versions are still mostly working, so no pressure on that side. This is why I think we should take the time to dissolve any fuzzy areas, make sure we understand everything that's going on (I'm discovering more and more of the new subtle aspects of HTTPCore as we're going through this), and that we enforce the "raw request in, raw response out"/tight-scope philosophy of HTTPCore, i.e. keeping the boundary between it and client libraries as clearly defined as possible. Hope you're okay with this. I know long reviews can be a bit draining, so feel free to push back when I'm being too nitpicky, as I know I can sometimes be. :-) Thanks again for tackling this 💖 |
Thank you for reviewing! 🌟 It's absolutely fine, I agree we should take time to get things right, we've only just started to make actual use of HTTPCore so I expected some "where should this live" types of discussions, we're bound to have more of these, so the sooner we can define these lines the better 🙂 |
httpcore/_async/http_proxy.py
         ssl_context: SSLContext = None,
         max_connections: int = None,
         max_keepalive: int = None,
         keepalive_expiry: float = None,
         http2: bool = False,
     ):
-        assert proxy_mode in ("DEFAULT", "FORWARD_ONLY", "TUNNEL_ONLY")
+        assert proxy_mode in ("FORWARD", "TUNNEL")
I removed the _ONLY suffix here, which does not match the values in HTTPX but makes more semantic sense.
We might want to leave these the same, at least within the scope of this pull request?
It'd be helpful if we're not introducing breaking API changes from httpx 0.12 -> 0.13 wherever possible (except, for example, the UDS support).
So this change stems from a longer conversation with @florimondmanca. TL;DR: where do we handle the logic behind the DEFAULT mode?
I opted for removing it completely, as suggested by Florimond, and having httpx call into httpcore with an explicit mode.
I think this way we reduce complexity in httpcore, but I don't mind keeping the three modes and the logic behind them here if you prefer. What do you think?
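As an illustration of what "call into httpcore with an explicit mode" could look like on the client side, here is a minimal sketch. It assumes that DEFAULT means forwarding plain HTTP requests and tunnelling HTTPS ones, which is not spelled out in this thread. The function name `resolve_proxy_mode` is hypothetical; the two output values are the ones from the diff above.

```python
def resolve_proxy_mode(url_scheme: bytes, requested_mode: str = "DEFAULT") -> str:
    """Map a three-valued client-side mode onto an explicit two-valued one.

    Assumption: DEFAULT forwards plain HTTP and tunnels HTTPS.
    """
    if requested_mode == "FORWARD_ONLY":
        return "FORWARD"
    if requested_mode == "TUNNEL_ONLY":
        return "TUNNEL"
    if requested_mode != "DEFAULT":
        raise ValueError(f"Unknown proxy mode: {requested_mode!r}")
    # DEFAULT: decide per request, based on the target URL scheme.
    return "TUNNEL" if url_scheme == b"https" else "FORWARD"


print(resolve_proxy_mode(b"http"))   # FORWARD
print(resolve_proxy_mode(b"https"))  # TUNNEL
```

Keeping this decision in the client means the lower-level library only ever sees one of two explicit modes.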
@florimondmanca could you have another look at this when you have a second? Thanks! |
Will try to do soon :) |
Co-Authored-By: Florimond Manca <florimond.manca@gmail.com>
7066bf0 to b9acb72
I reintroduced the At the moment httpx will fail to make a request via a proxy due to missing/conflicting headers. In #59 and here we discussed that httpcore should have as small an API as possible and that httpx should be in charge of setting the parameters that are required. We discussed in this PR that it doesn't make much sense to have a What are the next steps for this? @florimondmanca @tomchristie |
Okay, so I think I understand the sticking point better here now. In order to keep the conversation more tightly scoped, let me start by making one inline suggestion, and see if we can move forward from there... |
Looks great! Thanks so much for your time on it.
🌟👍
🎉 Thanks @tomchristie and @florimondmanca! 🚀 |
Fixes #54. Split from #55, excluding support for tunneling HTTPS requests. Changes: