response.json() call makes unnecessary memory allocation #5968
Comments
Hmm should it be simply changed to
@wRAR, I implemented the change you suggested and ran a test. It appears to be working now. For the test, I used the script that @GeorgeA92 shared:
I think that @jxlil We expect to see a prepared pull request and related new tests added, according to our contribution policy.
@GeorgeA92 I've just created a PR with the changes
Summary
scrapy/scrapy/http/response/text.py
Lines 74 to 82 in 52c0726
As a result of the response.text call inside response.json, in addition to the parsed object stored in response._cached_decoded_json, the application will hold in RAM both the bytes and the str (from response.text) representations of the response until the end of the parse method (and the related middleware methods). response.body is already enough to obtain the parsed Python dict object, so we don't need to call response.text (and we definitely don't need to hold the response as str in RAM).
scrapy/scrapy/http/response/text.py
Lines 84 to 93 in 52c0726
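To illustrate the point about holding both representations, here is a minimal sketch that uses only the standard library, not Scrapy itself. The payload, `via_text`, `via_body` and `retained_bytes` are hypothetical names introduced for this example. `via_text` keeps the decoded str alive to imitate a cached response.text, while `via_body` passes the bytes straight to `json.loads`.

```python
import json
import tracemalloc

# Hypothetical payload standing in for response.body (raw bytes off the wire).
body = json.dumps({"items": list(range(100_000))}).encode("utf-8")


def via_text(raw):
    # Decode first and keep the str alive, as a cached response.text would.
    text = raw.decode("utf-8")
    return json.loads(text), text


def via_body(raw):
    # Parse the bytes directly; any intermediate str is freed after parsing.
    return json.loads(raw), None


def retained_bytes(parse):
    """Return (result, bytes still allocated once parse(body) has returned)."""
    tracemalloc.start()
    result = parse(body)
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current


(parsed_a, _text), with_text = retained_bytes(via_text)
(parsed_b, _none), with_body = retained_bytes(via_body)

print(f"kept alive with cached str: {with_text} bytes")
print(f"kept alive with bytes only: {with_body} bytes")
```

Peak usage while parsing should be similar in both cases, because `json.loads` decodes bytes internally. The difference is in what stays alive afterwards, and it should be roughly the size of the decoded string.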
Also, we don't need to apply the encoding detection logic from response.text, because according to the JSON spec we expect only Unicode input.
reproducible code sample
log output
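As a small illustration of the encoding point, the sketch below relies only on the standard library's `json.loads`, which accepts bytes encoded as UTF-8, UTF-16 or UTF-32. The payload is made up for the example. It also shows the flip side, which is an assumption about edge cases rather than something reported in this issue: bytes in a non-Unicode encoding such as Latin-1 are rejected instead of being detected.

```python
import json

# Made-up payload with a non-ASCII character.
payload = {"name": "caf\u00e9", "price": 3.5}
text = json.dumps(payload, ensure_ascii=False)

# json.loads detects the Unicode encoding of bytes input on its own.
for encoding in ("utf-8", "utf-16", "utf-32"):
    raw = text.encode(encoding)
    assert json.loads(raw) == payload

# Bytes in a non-Unicode encoding are treated as UTF-8 and fail to decode.
latin1_failed = False
try:
    json.loads(text.encode("latin-1"))
except UnicodeDecodeError:
    latin1_failed = True

print("latin-1 input rejected:", latin1_failed)
```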
Motivation
Calling response.json() is mentioned in the docs and is often referred to as the preferred, easy, built-in Scrapy way to immediately get a Python object (dict) from a JSON response (without an external import of json, etc.). It is an easy-to-use method, but as I described here, it is not efficient.
Describe alternatives you've considered
Instead of response.json I use json.loads(response.body); it also requires an additional import of json.
If response.body or response.text is no longer needed, I can safely del response.body and del response._cached_ubody (if it exists).
Additional context
The influence on RAM allocations on a per-response basis (and the influence of the specific response.text bytes->str conversion call) is briefly mentioned in scrapy/parsel#210 (comment).
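The json.loads(response.body) alternative described above could be folded into a cached json() method along these lines. This is only a sketch under stated assumptions: `BodyJsonResponse` is a hypothetical stand-in class, not Scrapy's actual response class, and it borrows only the `_cached_decoded_json` attribute name mentioned in this issue.

```python
import json

# Sentinel meaning "not parsed yet" (None is itself a valid JSON value).
_NONE = object()


class BodyJsonResponse:
    """Hypothetical stand-in for a text response; not Scrapy's real class."""

    def __init__(self, body: bytes):
        self.body = body
        self._cached_decoded_json = _NONE

    def json(self):
        # Parse straight from the raw bytes and cache only the parsed object,
        # so no str copy of the body is kept alive.
        if self._cached_decoded_json is _NONE:
            self._cached_decoded_json = json.loads(self.body)
        return self._cached_decoded_json


response = BodyJsonResponse(b'{"items": [1, 2, 3]}')
data = response.json()
print(data["items"])
```

Repeated calls return the same cached object, so the bytes are parsed only once.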