
Improve memory footprint when reading large JSON requests. #5147

Merged
merged 3 commits into encode:master May 29, 2017

Conversation

@imdark (Contributor) commented May 17, 2017


Description

Large encoded strings take a very long time to release from memory, but if we just pass the stream directly into json.load we get much better memory performance. Refs #5146.

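For illustration, a minimal sketch of the idea behind the change (not the DRF code itself; the BytesIO payload is a stand-in for the request body):

```python
import io
import json

stream = io.BytesIO(b'{"items": [1, 2, 3]}')  # hypothetical request body

# Before: the whole payload is materialized as one large decoded str,
# which then lingers in memory until it is released.
data = stream.read().decode('utf-8')
parsed = json.loads(data)

# Proposed: hand the stream to json.load instead of building the
# intermediate string by hand (json.load accepts binary streams on
# Python 3.6+, where loads() handles bytes directly).
stream.seek(0)
parsed = json.load(stream)
```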
@@ -22,6 +22,7 @@
 
 from rest_framework import renderers
 from rest_framework.exceptions import ParseError
+import codecs
@rooterkyberian (Contributor) commented May 19, 2017

PEP8: Imports should be grouped in order: standard library, 3rd party, local.
https://www.python.org/dev/peps/pep-0008/#imports
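Applied to this diff, the grouping would look something like the following sketch (`from django.conf import settings` is assumed from the surrounding module):

```python
# Standard library
import codecs

# Third party
from django.conf import settings

# Local application
from rest_framework import renderers
from rest_framework.exceptions import ParseError
```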

@@ -61,8 +62,8 @@ def parse(self, stream, media_type=None, parser_context=None):
         encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
 
         try:
-            data = stream.read().decode(encoding)
-            return json.loads(data)
+            decoded_stream = codecs.decode(stream, encoding)
@rooterkyberian (Contributor) commented May 19, 2017

Shouldn't this be

decoded_stream = codecs.getreader(encoding)(stream)

?

I don't think codecs.decode deals with file-like objects.
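For reference, a minimal demo of the distinction (the BytesIO stream is a stand-in for the request body):

```python
import codecs
import io

stream = io.BytesIO('{"name": "café"}'.encode('utf-8'))

# codecs.decode expects a bytes-like object, so a file object fails:
#   codecs.decode(stream, 'utf-8')  ->  TypeError
# codecs.getreader(encoding) returns a StreamReader class; instantiating
# it wraps the stream in a file-like object that decodes as it reads.
reader = codecs.getreader('utf-8')(stream)
print(reader.read())  # {"name": "café"}
```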

@tomchristie changed the title from "in order to solve the memory leak at #5146" to "Improve memory footprint when reading large JSON requests." May 20, 2017
@tomchristie (Member) commented May 20, 2017

Retitled the issue slightly.

@tomchristie added this to the 3.6.4 Release milestone May 20, 2017
@tomchristie (Member) commented May 20, 2017

@rooterkyberian Thanks for the review! Much appreciated!

@imdark (Contributor, Author) commented May 25, 2017

Thanks for the review, I fixed the issues.

@imdark added 2 commits May 25, 2017
modified to use a reader since direct decoding is not supported
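Going by that commit message, the merged parse method presumably ends up along these lines (a sketch, not the verbatim DRF source; error handling is abbreviated):

```python
import codecs
import json

from django.conf import settings
from rest_framework.exceptions import ParseError

def parse(self, stream, media_type=None, parser_context=None):
    """Sketch of rest_framework.parsers.JSONParser.parse after the change."""
    parser_context = parser_context or {}
    encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
    try:
        # Wrap the byte stream in a decoding reader instead of decoding
        # the entire body into one string up front.
        decoded_stream = codecs.getreader(encoding)(stream)
        return json.load(decoded_stream)
    except ValueError as exc:
        raise ParseError('JSON parse error - %s' % str(exc))
```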
@tomchristie merged commit 823eea2 into encode:master May 29, 2017
1 check passed
@tomchristie (Member) commented May 29, 2017

Ace, thanks!

@vstoykov (Contributor) commented Aug 28, 2017

I'm not sure this improves the "memory performance" at all. In the past I also thought that using json.load with a stream is better for memory than json.loads with a text, but when I look at the code in Python (https://github.com/python/cpython/blob/3.6/Lib/json/__init__.py#L274-L299) it looks like it reads the whole file at once.
And basically what the codec will do if you call read() without parameters is decode the whole stream at once: https://github.com/python/cpython/blob/3.6/Lib/codecs.py#L450-L526

This is effectively the same thing as before. Did I miss something?
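One way to check this claim (LoggingStream is a hypothetical helper used only to observe read sizes):

```python
import codecs
import io
import json

class LoggingStream(io.BytesIO):
    """Hypothetical wrapper that logs how much is read at a time."""
    def read(self, size=-1):
        print('read(size=%r)' % (size,))
        return super().read(size)

stream = LoggingStream(b'{"items": [1, 2, 3]}')
reader = codecs.getreader('utf-8')(stream)
json.load(reader)
# Prints a single read(size=-1): json.load calls fp.read() with no
# argument, and StreamReader.read() then consumes the whole underlying
# stream before decoding, so the full decoded text exists at once anyway.
```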
