Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fixed #7581 -- Add explicit support for streaming responses. #407

Closed
wants to merge 6 commits into
from

Conversation

Projects
None yet
4 participants

Refactored HttpResponse into HttpResponseBase, HttpResponse and
HttpStreamingResponse.

HttpResponse exposes a content attribute, which is normalised to a
byte string.

HttpStreamingResponse exposes a streaming_content attribute, which
is normalised to an iterator that yields byte strings.

Bundled middleware that need to access a response's content now
examines the response (looks for a content or streaming_content
attribute) to determine it's capabilities.

The django.views.static.serve view now returns a streaming response,
which is MUCH faster.

The conditional_content_removal() function in django.http.utils,
which is applied as a "response fix" (compulsory middleware) now
understands streaming responses, and was missing tests that have now
been added.

@akaariai akaariai commented on an outdated diff Sep 29, 2012

django/views/static.py
@@ -62,8 +63,7 @@ def serve(request, path, document_root=None, show_indexes=False):
if not was_modified_since(request.META.get('HTTP_IF_MODIFIED_SINCE'),
statobj.st_mtime, statobj.st_size):
return HttpResponseNotModified()
- with open(fullpath, 'rb') as f:
- response = HttpResponse(f.read(), content_type=mimetype)
+ response = HttpStreamingResponse(open(fullpath, 'rb'), content_type=mimetype)
@akaariai

akaariai Sep 29, 2012

Member

I think we need to have some way to close the file - either wrap it into something explicitly closing the file when fully read, or check for hasattr(._content, "close") in stream_content finally block.

mrmachine and others added some commits Sep 22, 2012

Fixed #7581 -- Add explicit support for streaming responses.
Refactored `HttpResponse` into `HttpResponseBase`, `HttpResponse` and
`HttpStreamingResponse`.

`HttpResponse` exposes a `content` attribute, which is normalised to a
byte string.

`HttpStreamingResponse` exposes a `streaming_content` attribute, which
is normalised to an iterator that yields byte strings.

Bundled middleware that need to access a response's content now
examines the response (looks for a `content` or `streaming_content`
attribute) to determine it's capabilities.

The `django.views.static.serve` view now returns a streaming response,
which is MUCH faster.

The `conditional_content_removal()` function in `django.http.utils`,
which is applied as a "response fix" (compulsory middleware) now
understands streaming responses, and was missing tests that have now
been added.
Improve compatibility of `serve` view.
- Add `CompatibleHttpStreamingResponse` class, which still has a
  `content` attribute, but raises `PendingDeprecationWarning` when it
  is accessed.

- Streams responses from `serve` view only if no middleware accesses
  `content` attribute, by using `CompatibleHttpStreamingResponse`.

- Don't allow streaming responses to be iterated more than once when a
  non-iterable is given as content.

- Automatically call `close()` method when streaming response iterator
  is exhausted.

- Assume that responses have a `content` attribute if they don't have
  `streaming_content` attribute. By checking for `streaming_content`
  first, we prioritise streaming when applying middleware to
  `CompatibleHttpStreamingResponse` objects.
Remove unnecessary changes from core streaming response feature.
- Remove `CompatibleHttpStreamingResponse` class.
- Don't update `static` view to return streaming responses.
- Remove support for `HttpStreamingResponse.write()` method.
- Update cache tests to make it clear that you cannot cache streaming
  responses.
Review changes to pull/407
  - Some docs rewording
  - Python 3 compatibility
  - Improvement to streaming GZIP handling
Improve `close()` method. Add `__str__` and/or `__bytes__` methods.
Close file-like content when assigned to regular responses, instead
of waiting until the response is iterated. Only store references to
file-like objects (with a `close()` method) when assigning content to
a streaming response, to be closed when the response is iterated.

When streaming response is converted to bytes, return an HTTP message
(like regular response does), but without the body. E.g. headers only.
Without defining __bytes__ method, converting a streaming response
to bytes in Python 3.x raises an exception, and this behaviour is close
to regular responses.
Member

akaariai commented Oct 2, 2012

I have reviewed this patch and to me it seems this should be ready for commit. There are three issues worth mentioning:

  1. Content given to HttpResponse and StreamingResponse will be automatically closed after they are consumed. For HttpResponse this could be considered backwards breaking change.
  2. Content given to HttpResponse will be consumed immediately. Those using iterator as content to HttpResponse will need to update their code to use StreamingResponse instead. The streaming ability of HttpResponse with iterator wasn't officially documented, but it did work earlier.
  3. I wonder if we want to add a little snippet about how to disable middleware conditionally (by subclassing the middleware).

The problem I have with this patch is that I just don't feel comfortable pushing this patch in at last minute without other reviews. I don't know the HTTP protocol nearly well enough to understand every detail of the patch.

Make `HttpResponse` compatible with unofficially supported streaming …
…behaviour.

- Don't automatically close file-like objects assigned as content to
  `HttpResponse` objects.

- Don't consume iterators assigned as content to `HttpResponse` until `content`
  is accessed, `write() is called, or `tell()` is called.

I've updated the patch to maintain full compatibility with the existing unofficially supported streaming behaviour with HttpResponse (issues 1 and 2 mentioned by akaariai). This should also make issue 3 moot, at least in the context of streaming responses.

@claudep claudep commented on the diff Oct 19, 2012

django/http/__init__.py
+ self._container = iter(value)
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
+ else:
+ # Assign the original iterable directly to `self._container`
+ # instead of converting to a list, in order to preserve the
+ # unofficially supported streaming behaviour, which only works
+ # when `response.content` is not accessed by middleware before
+ # the response itself is iterated.
+ self._container = value
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
@claudep

claudep Oct 19, 2012

Member

These last four lines are duplicated in both conditions, should therefore come after the if block.

@claudep claudep commented on the diff Oct 19, 2012

django/http/__init__.py
else:
- self._container = [value]
- self._base_content_is_iter = False
+ # Make sure `self._container` is always in a consistent format, and
+ # that multi/single access to [streaming_]content is guaranteed.
+ if stream:
+ self._container = iter([value])
+ else:
+ self._container = [value]
@claudep

claudep Oct 19, 2012

Member

I'm not sure it's useful to implement _set_container in the base class, as the path seems different depending on stream.

Member

claudep commented Oct 19, 2012

I don't find those hasattr(response, 'streaming_content') very elegant. Shouldn't we define a streaming class attribute, then test if response.streaming?

@claudep claudep commented on the diff Oct 19, 2012

django/http/utils.py
if request.method == 'HEAD':
- response.content = ''
+ if hasattr(response, 'streaming_content'):
+ response.streaming_content = []
+ else:
+ response.content = ''
return response
@claudep

claudep Oct 19, 2012

Member

I think it would be cleaner to define an API to remove content on HttpResponse classes (like truncate(update_length=True))

@aaugustin aaugustin commented on the diff Oct 20, 2012

django/http/__init__.py
+ self._container = iter([value])
+ else:
+ self._container = [value]
+
+ def make_bytes(self, value):
+ # Convert integer values to text.
+ if isinstance(value, int):
+ value = six.text_type(value)
+ # If the value is text, it will need to be encoded.
+ if isinstance(value, six.text_type):
+ # If the response already has a content-encoding, encode as ascii.
+ if self.has_header('Content-Encoding'):
+ value = value.encode('ascii')
+ # Otherwise encode with the character set of the response.
+ else:
+ value = value.encode(self._charset)
@aaugustin

aaugustin Oct 20, 2012

Owner

This behavior doesn't exist in the current implementation of HttpResponse.

Owner

aaugustin commented Oct 20, 2012

I've prepared a simplified version of this patch and will attach it to the ticket shortly.

@aaugustin aaugustin closed this Oct 20, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment