Retrieving raw response (status + headers + body) data from ClientResponse? #3877

synchronizing · 2019-06-29T06:24:35Z

I was wondering if there is a way to effectively get the entire raw response from a ClientResponse class. Take the following code for example:

async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        status = response.status
        header = response.raw_headers
        resp = await response.read()

What I am trying to do is get the complete, raw reply from the server -- the status line, headers, and body in one piece. I have checked the docs, and I see no mention of complete raw data, only its parts: status, headers, and body.

Now, I did think of simply reformatting the status and headers and then prepending them to the body, but this seems a bit backward, since I would be undoing the parsing that aiohttp already does in the background -- which is why I am coming here asking for guidance. Something like so would be a lifesaver:

async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        resp = await response.read_raw() # or response.raw() or response.all(), any variation really.

Thank you in advance, and any guidance would be appreciated!

synchronizing · 2019-06-29T07:08:44Z

Closing for now -- will just write a function to put together the original raw response and re-open this for opinion.

synchronizing · 2019-06-29T07:16:45Z

The following works:

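# Runs inside a coroutine; assumes `import aiohttp` and a `url` string.
async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        status = response.status
        reason = response.reason
        headers = response.headers
        body = await response.read()  # separate name, so `response` is not shadowed

# Status line; the protocol version is hardcoded here.
resp = f"HTTP/1.1 {status} {reason}\r\n".encode("latin-1")

# items() yields every (name, value) pair, including repeated headers.
for name, value in headers.items():
    resp += f"{name}: {value}\r\n".encode("latin-1")

# A blank line separates the headers from the body.
resp += b"\r\n" + body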

However, I do believe a .read_raw() would still be useful. Re-opening issue so devs can see -- if this is something you guys would rather not bother with due to ClientSession being intended as a higher abstraction, I understand -- feel free to close if that's the case. Thank you anyhow!
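A variant of the workaround above, as a minimal sketch and not an aiohttp API: `build_raw_response` is a hypothetical helper name. It takes the pieces that `ClientResponse` exposes (`version`, `status`, `reason`, `raw_headers`) plus the bytes returned by `read()`, so header names and repeated headers are kept as received. The result is a reconstruction, not the bytes that came over the wire: `read()` returns the body after aiohttp has removed chunked framing and, by default, decompressed it, so headers such as `Transfer-Encoding`, `Content-Encoding`, or `Content-Length` may no longer describe the body.

```python
from typing import Iterable, Tuple


def build_raw_response(
    version: Tuple[int, int],
    status: int,
    reason: str,
    raw_headers: Iterable[Tuple[bytes, bytes]],
    body: bytes,
) -> bytes:
    """Reassemble an HTTP/1.x response message from its parsed parts."""
    major, minor = version
    # Status line, e.g. b"HTTP/1.1 200 OK"
    lines = [f"HTTP/{major}.{minor} {status} {reason}".encode("latin-1")]
    # raw_headers is a sequence of (name, value) byte pairs.
    lines.extend(name + b": " + value for name, value in raw_headers)
    # A blank line separates the header section from the body.
    return b"\r\n".join(lines) + b"\r\n\r\n" + body


# Demo with hand-written parts (no network needed).
example = build_raw_response(
    (1, 1),
    200,
    "OK",
    [
        (b"Content-Type", b"text/plain"),
        (b"Set-Cookie", b"a=1"),
        (b"Set-Cookie", b"b=2"),
    ],
    b"hello",
)
print(example)
```

With aiohttp, this could be called inside the `async with session.get(url) as response:` block as `build_raw_response(response.version, response.status, response.reason, response.raw_headers, await response.read())`.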

marethyu · 2020-10-07T01:47:26Z

@synchronizing please reopen this issue. I'm desperate.

synchronizing · 2020-10-08T00:07:48Z

> @synchronizing please reopen this issue. I'm desperate.

Re-opening so devs can take a look in the future. In the meantime, I wound up creating my own module, httpsuite, to take care of manipulating raw HTTP requests. It might be of use to you (?).

from httpsuite import Request
import json

request = Request(
    method="GET",
    target="/",
    protocol="HTTP/1.1",
    headers={"Host": "www.google.com", "Connection": "keep-alive",},
    body=json.dumps({"hello": "world"}),
)

I was thinking of adding conversion methods for the requests and aiohttp libraries, but haven't gotten around to it. The code here does do what I was originally looking for, however.

marethyu · 2020-10-08T01:46:18Z

Never mind lol. I already found a workaround for my problem. But don't close this issue, just leave it open. Nice library btw.

webknjaz · 2020-10-08T11:45:03Z

@synchronizing wow, nice lib! I may end up using it for the HTTP test suite in CherryPy... By the way, have you checked if https://github.com/python-hyper/h11 has the APIs you need?

webknjaz · 2020-10-08T11:54:39Z

As for exposing read_raw(), it's a complicated question. We have an async method for reading the body because it may be quite big, and so the end-user is given a chance to process the data in chunks. Imagine an HTTP server sending you over a 50GB big file. If there was no way to split it up, it'd probably eat up all the memory with no way for the caller to pause this process until the MemoryError bubbles up. This is not a decision that the framework can make on behalf of the caller.
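To illustrate the chunked processing described above, here is a minimal sketch. `copy_chunks` and `download` are hypothetical helper names; `response.content.iter_chunked()` is aiohttp's streaming reader API, and the 64 KiB chunk size is an arbitrary assumption. Only one chunk is held in memory at a time, so the caller decides how much to buffer.

```python
import asyncio
import io
from typing import AsyncIterable, BinaryIO


async def copy_chunks(chunks: AsyncIterable[bytes], sink: BinaryIO) -> int:
    """Write each chunk to sink as it arrives; return total bytes written."""
    total = 0
    async for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total


async def download(url: str, path: str, chunk_size: int = 64 * 1024) -> int:
    """Stream a response body to a file instead of calling read()."""
    import aiohttp  # third-party; imported here so copy_chunks has no dependency

    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            with open(path, "wb") as sink:
                return await copy_chunks(
                    response.content.iter_chunked(chunk_size), sink
                )


async def _fake_body():
    # Stand-in for a network stream, used only for the demo below.
    for chunk in (b"abc", b"de", b"f"):
        yield chunk


buffer = io.BytesIO()
written = asyncio.run(copy_chunks(_fake_body(), buffer))
print(written, buffer.getvalue())
```

The file write in `download` is a plain blocking call, which keeps the sketch short; a real service might hand it off to a thread or an async file library.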

synchronizing · 2020-10-09T21:58:08Z

> https://github.com/python-hyper/h11 has the APIs you need?

Oh man, I spent hours looking for a (as my Google query went) "manipulate raw HTTP python module," and at the time that I needed it I couldn't find anything. httpsuite also has its own limitations, and is definitely nowhere near h11 in terms of development and features. It was a quick weekend project for something I needed for a larger project I haven't been able to work on for a little while. I might explore h11 further, thank you for the share.

> Imagine an HTTP server sending you over a 50GB big file. If there was no way to split it up, it'd probably eat up all the memory with no way for the caller to pause this process until the MemoryError bubbles up. This is not a decision that the framework can make on behalf of the caller.

Makes complete sense, and it's an edge case I never thought of. The original purpose for my question (not to fall into the XY problem, and in case you are curious) was the creation of a man-in-the-middle service for auto-proxy rotation. I wrote a preliminary version of an mitm, and a shitty fix for an existing project that serves as a public proxy brokerage called ProxyBroker. Long story short, ProxyBroker does not support HTTPS and uses the requests/responses between the users and destination services to check HTTP status codes (to see if the proxy replied accordingly). At the time, in 2019, I wrote httpsuite and mitm to try to mitigate both of these issues. If you have any ideas how one might go about it with a bit more swagger, it would be a helpful suggestion from an online stranger.

synchronizing changed the title ~~Retrieving raw response (headers + body) data from ClientResponse?~~ Retrieving raw response (status + headers + body) data from ClientResponse? Jun 29, 2019

synchronizing closed this as completed Jun 29, 2019

synchronizing reopened this Jun 29, 2019

synchronizing closed this as completed Jul 3, 2019

synchronizing reopened this Oct 8, 2020
