Retrieving raw response (status + headers + body) data from ClientResponse? #3877

Open
synchronizing opened this issue Jun 29, 2019 · 8 comments

synchronizing commented Jun 29, 2019

I was wondering if there is a way to effectively get the entire raw response from a ClientResponse class. Take the following code for example:

async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        status = response.status
        header = response.raw_headers
        resp = await response.read()

What I am trying to do is get the complete, raw reply from the server: the status, headers, and body in one. I have checked the docs and see no mention of the complete raw data, only the separate portions: status, headers, and body.

Now, I did think of simply reformatting status and header, and then adding them to the resp, but this seems a bit backward as I would be doing something that aiohttp certainly does the opposite of in the background -- which is why I am coming here asking for guidance. Something like so would be a lifesaver:

async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        resp = await response.read_raw() # or response.raw() or response.all(), any variation really.

Thank you in advance, and any guidance would be appreciated!

synchronizing changed the title from "Retrieving raw response (headers + body) data from ClientResponse?" to "Retrieving raw response (status + headers + body) data from ClientResponse?" on Jun 29, 2019
Author

synchronizing commented Jun 29, 2019

Closing for now -- will just write a function to put together the original raw response and re-open this for opinion.

@synchronizing
Author

The following works:

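async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        status = response.status
        reason = response.reason
        headers = response.headers
        body = await response.read()

# Rebuild the status line; the HTTP version is hard-coded here.
resp = f"HTTP/1.1 {status} {reason}\r\n".encode("latin-1")

# items() yields every (name, value) pair, so repeated headers
# such as Set-Cookie keep their individual values.
for name, value in headers.items():
    resp += f"{name}: {value}\r\n".encode("latin-1")

# A blank line separates the headers from the body.
resp += b"\r\n" + body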

However, I do believe a .read_raw() would still be useful. Re-opening issue so devs can see -- if this is something you guys would rather not bother with due to ClientSession being intended as a higher abstraction, I understand -- feel free to close if that's the case. Thank you anyhow!
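For reuse, the workaround above could be wrapped in a small pure function. This is only a sketch, not an aiohttp API: `build_raw_response` and its parameters are hypothetical names. It takes the headers as an iterable of pairs so that repeated headers survive. As far as I know, `response.read()` returns the body after chunked transfer encoding and (by default) content compression have been undone, so the result is a reconstruction and not necessarily the exact bytes that came over the wire.

```python
from typing import Iterable, Tuple


def build_raw_response(
    status: int,
    reason: str,
    headers: Iterable[Tuple[str, str]],
    body: bytes,
    version: str = "HTTP/1.1",  # assumption: pass the real version if it differs
) -> bytes:
    """Reassemble a status line, headers, and body into one bytes object."""
    lines = [f"{version} {status} {reason}"]
    # Keep every (name, value) pair so repeated headers are not collapsed.
    lines.extend(f"{name}: {value}" for name, value in headers)
    head = "\r\n".join(lines).encode("latin-1")
    # A blank line separates the header section from the body.
    return head + b"\r\n\r\n" + body


raw = build_raw_response(
    200,
    "OK",
    [("Content-Type", "text/plain"), ("Set-Cookie", "a=1"), ("Set-Cookie", "b=2")],
    b"hello",
)
print(raw.decode("latin-1"))
```

With aiohttp, the call might look like `build_raw_response(response.status, response.reason, response.headers.items(), await response.read())`.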

marethyu commented Oct 7, 2020

@synchronizing please reopen this issue. I'm desperate.

Author

synchronizing commented Oct 8, 2020

> @synchronizing please reopen this issue. I'm desperate.

Re-opening so devs can take a look in the future. In the meantime, I wound up creating my own module, httpsuite, to take care of manipulating raw HTTP requests. It might be of use to you (?).

from httpsuite import Request
import json

request = Request(
    method="GET",
    target="/",
    protocol="HTTP/1.1",
    headers={"Host": "www.google.com", "Connection": "keep-alive",},
    body=json.dumps({"hello": "world"}),
)

I was thinking of adding conversion methods for the requests and aiohttp libraries, but haven't gotten around to it. The code here does do what I was originally looking for, however.

synchronizing reopened this Oct 8, 2020

marethyu commented Oct 8, 2020

Never mind lol. I already found a workaround for my problem. But don't close this issue; just leave it open. Nice library btw.

Member

webknjaz commented Oct 8, 2020

@synchronizing wow, nice lib! I may end up using it for the HTTP test suite in CherryPy... By the way, have you checked if https://github.com/python-hyper/h11 has the APIs you need?

Member

webknjaz commented Oct 8, 2020

As for exposing read_raw(), it's a complicated question. We have an async method for reading the body because it may be quite big, so the end user is given a chance to process the data in chunks. Imagine an HTTP server sending you over a 50GB big file. If there was no way to split it up, it'd probably eat up all the memory with no way for the caller to pause this process until the MemoryError bubbles up. This is not a decision that the framework can make on behalf of the caller.
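To illustrate the point about chunks, here is a minimal sketch of consuming a body piece by piece so that memory use stays bounded. `fake_body` and `count_bytes` are hypothetical names used only for illustration. With aiohttp, an iterator such as `response.content.iter_chunked(65536)` could be passed in place of the fake stream.

```python
import asyncio
from typing import AsyncIterator


async def fake_body(total: int, chunk_size: int) -> AsyncIterator[bytes]:
    """Stand-in for a network stream: yields `total` bytes in pieces."""
    sent = 0
    while sent < total:
        size = min(chunk_size, total - sent)
        yield b"x" * size
        sent += size
        await asyncio.sleep(0)  # hand control back to the event loop


async def count_bytes(chunks: AsyncIterator[bytes]) -> int:
    """Process each chunk as it arrives; only one chunk is held at a time."""
    seen = 0
    async for chunk in chunks:
        seen += len(chunk)  # replace with writing to disk, hashing, etc.
    return seen


total_seen = asyncio.run(count_bytes(fake_body(total=1_000_000, chunk_size=65_536)))
print(total_seen)
```

The caller decides what to do with each chunk and when to stop, which is the decision a single read-everything call would take away.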

@synchronizing
Author

> https://github.com/python-hyper/h11 has the APIs you need?

Oh man, I spent hours looking for a (as my Google query went) "manipulate raw HTTP python module," and at the time I needed it I couldn't find anything. httpsuite also has its own limitations and is definitely nowhere near h11 in terms of development and features. It was a quick weekend project for something I needed for a larger project that I haven't been able to work on for a little while. I might explore h11 further; thank you for sharing.

> Imagine an HTTP server sending you over a 50GB big file. If there was no way to split it up, it'd probably eat up all the memory with no way for the caller to pause this process until the MemoryError bubbles up. This is not a decision that the framework can make on behalf of the caller.

Makes complete sense, and it's an edge case I never thought of. The original purpose of my question (to avoid falling into the XY problem, and in case you are curious) was the creation of a man-in-the-middle service for automatic proxy rotation. I wrote a preliminary version of an mitm, and a shitty fix for an existing project called ProxyBroker that serves as a public proxy brokerage. Long story short, ProxyBroker does not support HTTPS, and it uses the requests/responses between the users and the destination services to check HTTP status codes (to see if the proxy replied accordingly). At the time, in 2009, I wrote httpsuite and mitm to try to mitigate both of these issues. If you have any ideas on how one might go about it with a bit more swagger, it would be a helpful suggestion from an online stranger.
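As a small illustration of the status-code check described above, this sketch pulls the status code out of the first line of a raw HTTP/1.x response. `parse_status_line` is a hypothetical helper, not taken from ProxyBroker, mitm, or httpsuite, and it assumes the full status line has already been received.

```python
from typing import Tuple


def parse_status_line(raw: bytes) -> Tuple[str, int, str]:
    """Return (version, status, reason) from the start of a raw HTTP/1.x response."""
    first_line, _, _ = raw.partition(b"\r\n")
    # The reason phrase may contain spaces or be empty, so split at most twice.
    parts = first_line.decode("latin-1").split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {first_line!r}")
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) == 3 else ""
    return version, status, reason


raw = b"HTTP/1.1 407 Proxy Authentication Required\r\nProxy-Authenticate: Basic\r\n\r\n"
version, status, reason = parse_status_line(raw)
print(version, status, reason)
```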
