Fixed #7581 -- Add explicit support for streaming responses. #407

Closed
wants to merge 6 commits into from

4 participants

@mrmachine

Refactored HttpResponse into HttpResponseBase, HttpResponse and
HttpStreamingResponse.

HttpResponse exposes a content attribute, which is normalised to a
byte string.

HttpStreamingResponse exposes a streaming_content attribute, which
is normalised to an iterator that yields byte strings.

Bundled middleware that needs to access a response's content now
examines the response (looks for a content or streaming_content
attribute) to determine its capabilities.

The django.views.static.serve view now returns a streaming response,
which is MUCH faster.

The conditional_content_removal() function in django.http.utils,
which is applied as a "response fix" (compulsory middleware) now
understands streaming responses, and was missing tests that have now
been added.
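
As a rough illustration of the capability check described above, the sketch below uses two stand-in classes (`PlainResponse` and `StreamingResponse`) and a made-up `append_footer` middleware function. All three names are assumptions for illustration and are not part of this patch; only the `hasattr(response, 'streaming_content')` test mirrors what the patch does.

```python
import itertools


class PlainResponse:
    """Stand-in for a regular response: the body is one byte string."""

    def __init__(self, content=b""):
        self.content = content


class StreamingResponse:
    """Stand-in for a streaming response: the body is a one-shot iterator."""

    def __init__(self, chunks=()):
        self.streaming_content = iter(chunks)


def append_footer(response, footer=b"\n<!-- served -->"):
    """Hypothetical middleware step that adds a footer to either kind of response."""
    if hasattr(response, "streaming_content"):
        # Chain lazily: nothing is read from the original iterator here.
        response.streaming_content = itertools.chain(
            response.streaming_content, [footer]
        )
    else:
        response.content = response.content + footer
    return response
```

Checking for `streaming_content` first means a streaming body is wrapped rather than read into memory.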

django/views/static.py
@@ -62,8 +63,7 @@ def serve(request, path, document_root=None, show_indexes=False):
if not was_modified_since(request.META.get('HTTP_IF_MODIFIED_SINCE'),
statobj.st_mtime, statobj.st_size):
return HttpResponseNotModified()
- with open(fullpath, 'rb') as f:
- response = HttpResponse(f.read(), content_type=mimetype)
+ response = HttpStreamingResponse(open(fullpath, 'rb'), content_type=mimetype)
@akaariai Collaborator

I think we need to have some way to close the file - either wrap it into something explicitly closing the file when fully read, or check for hasattr(._content, "close") in stream_content finally block.
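
The first option mentioned here (wrapping the file in something that closes it once fully read) could look roughly like the generator below. This is only a sketch: `iter_and_close` and its `chunk_size` parameter are hypothetical names, and an in-memory `io.BytesIO` stands in for the opened file.

```python
import io


def iter_and_close(fileobj, chunk_size=8192):
    """Yield chunks read from fileobj and close it once reading stops."""
    try:
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Runs when the file is exhausted, when reading raises, or when the
        # started generator is closed early.
        fileobj.close()


# Usage with an in-memory file standing in for open(fullpath, 'rb').
source = io.BytesIO(b"abcdefghij")
chunks = list(iter_and_close(source, chunk_size=4))
```

One caveat with this approach: the `finally` block only runs once the generator has been started, so a response that is never iterated would leave the file open.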

mrmachine and others added some commits
@mrmachine mrmachine Fixed #7581 -- Add explicit support for streaming responses.
Refactored `HttpResponse` into `HttpResponseBase`, `HttpResponse` and
`HttpStreamingResponse`.

`HttpResponse` exposes a `content` attribute, which is normalised to a
byte string.

`HttpStreamingResponse` exposes a `streaming_content` attribute, which
is normalised to an iterator that yields byte strings.

Bundled middleware that needs to access a response's content now
examines the response (looks for a `content` or `streaming_content`
attribute) to determine its capabilities.

The `django.views.static.serve` view now returns a streaming response,
which is MUCH faster.

The `conditional_content_removal()` function in `django.http.utils`,
which is applied as a "response fix" (compulsory middleware) now
understands streaming responses, and was missing tests that have now
been added.
0eeb4b0
@mrmachine mrmachine Improve compatibility of `serve` view.
- Add `CompatibleHttpStreamingResponse` class, which still has a
  `content` attribute, but raises `PendingDeprecationWarning` when it
  is accessed.

- Streams responses from `serve` view only if no middleware accesses
  `content` attribute, by using `CompatibleHttpStreamingResponse`.

- Don't allow streaming responses to be iterated more than once when a
  non-iterable is given as content.

- Automatically call `close()` method when streaming response iterator
  is exhausted.

- Assume that responses have a `content` attribute if they don't have
  `streaming_content` attribute. By checking for `streaming_content`
  first, we prioritise streaming when applying middleware to
  `CompatibleHttpStreamingResponse` objects.
c4eb7ae
@mrmachine mrmachine Remove unnecessary changes from core streaming response feature.
- Remove `CompatibleHttpStreamingResponse` class.
- Don't update `static` view to return streaming responses.
- Remove support for `HttpStreamingResponse.write()` method.
- Update cache tests to make it clear that you cannot cache streaming
  responses.
0e5d045
@akaariai akaariai Review changes to pull/407
  - Some docs rewording
  - Python 3 compatibility
  - Improvement to streaming GZIP handling
ec30636
@mrmachine mrmachine Improve `close()` method. Add `__str__` and/or `__bytes__` methods.
Close file-like content when assigned to regular responses, instead
of waiting until the response is iterated. Only store references to
file-like objects (with a `close()` method) when assigning content to
a streaming response, to be closed when the response is iterated.

When streaming response is converted to bytes, return an HTTP message
(like regular response does), but without the body. E.g. headers only.
Without defining __bytes__ method, converting a streaming response
to bytes in Python 3.x raises an exception, and this behaviour is close
to regular responses.
61c1ca0
@akaariai
Collaborator

I have reviewed this patch and to me it seems this should be ready for commit. There are three issues worth mentioning:

  1. Content given to HttpResponse and HttpStreamingResponse will be automatically closed after it is consumed. For HttpResponse this could be considered a backwards-incompatible change.
  2. Content given to HttpResponse will be consumed immediately. Those passing an iterator as content to HttpResponse will need to update their code to use HttpStreamingResponse instead. Streaming an iterator through HttpResponse was never officially documented, but it did work before.
  3. I wonder if we want to add a little snippet about how to disable middleware conditionally (by subclassing the middleware).

The problem I have with this patch is that I don't feel comfortable pushing it in at the last minute without other reviews. I don't know the HTTP protocol nearly well enough to understand every detail of the patch.
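
For issue 3, a snippet about disabling middleware conditionally by subclassing might look something like the sketch below. `ContentLengthMiddleware` and `FakeResponse` are hypothetical stand-ins written for this example, not classes from Django or from this patch.

```python
class FakeResponse:
    """Minimal stand-in response with either content or streaming_content."""

    def __init__(self, content=None, streaming_content=None):
        self.headers = {}
        if streaming_content is not None:
            self.streaming_content = iter(streaming_content)
        else:
            self.content = content or b""


class ContentLengthMiddleware:
    """Hypothetical middleware that needs the whole body in memory."""

    def process_response(self, request, response):
        response.headers["Content-Length"] = str(len(response.content))
        return response


class StreamingAwareContentLengthMiddleware(ContentLengthMiddleware):
    """Subclass that skips the parent's work for streaming responses."""

    def process_response(self, request, response):
        if hasattr(response, "streaming_content"):
            # Leave the response untouched so the body is not consumed.
            return response
        return super().process_response(request, response)
```

The subclass would then be listed in the middleware settings in place of the original class.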

Tai Lee Make `HttpResponse` compatible with unofficially supported streaming behaviour.

- Don't automatically close file-like objects assigned as content to
  `HttpResponse` objects.

- Don't consume iterators assigned as content to `HttpResponse` until `content`
  is accessed, `write()` is called, or `tell()` is called.
da7a8dc
@mrmachine

I've updated the patch to maintain full compatibility with the existing unofficially supported streaming behaviour with HttpResponse (issues 1 and 2 mentioned by akaariai). This should also make issue 3 moot, at least in the context of streaming responses.
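
The compatibility behaviour described here (an iterator is left alone until something needs the whole body) can be modelled in a few lines. The class below is a simplified sketch under that assumption; `LazyContentResponse` is a hypothetical name and it omits headers, encoding and everything else the real class handles.

```python
class LazyContentResponse:
    """Toy model of a response that defers consuming iterator content."""

    def __init__(self, content=b""):
        if isinstance(content, bytes):
            self._container = [content]
        else:
            # Keep the original iterable untouched until it is needed.
            self._container = content

    def _consume_content(self):
        # Convert to a list once, so the body can be re-read and appended to.
        if not isinstance(self._container, list):
            self._container = list(self._container)

    @property
    def content(self):
        self._consume_content()
        return b"".join(self._container)

    def write(self, chunk):
        self._consume_content()
        self._container.append(chunk)

    def __iter__(self):
        # Iterating without touching `content` streams the original iterable.
        return iter(self._container)
```

If no middleware touches `content`, iterating the response pulls chunks straight from the original iterator; once `content` or `write()` is used, the iterator is cached as a list.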

@claudep claudep commented on the diff
django/http/__init__.py
((31 lines not shown))
+ self._container = iter(value)
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
+ else:
+ # Assign the original iterable directly to `self._container`
+ # instead of converting to a list, in order to preserve the
+ # unofficially supported streaming behaviour, which only works
+ # when `response.content` is not accessed by middleware before
+ # the response itself is iterated.
+ self._container = value
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
@claudep Collaborator
claudep added a note

These last four lines are duplicated in both branches and should therefore come after the if block.
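
One way the suggested restructuring could look is sketched below. `ContainerSketch` is a hypothetical holder class, and the code is adapted to Python 3 (`collections.abc.Iterable` and `str` in place of `collections.Iterable` and `six.text_type`); it is not the code that was eventually committed.

```python
from collections.abc import Iterable


class ContainerSketch:
    """Hypothetical holder for the two attributes the method touches."""

    def __init__(self):
        self._container = []
        self._file_like_objects = []

    def _set_container(self, value, stream):
        if isinstance(value, Iterable) and not isinstance(value, (bytes, str)):
            # Streaming responses get a one-shot iterator; regular responses
            # keep the original iterable so it is not consumed early.
            self._container = iter(value) if stream else value
            # Shared by both branches, so it now appears only once.
            if callable(getattr(value, "close", None)):
                self._file_like_objects.append(value)
        else:
            self._container = iter([value]) if stream else [value]
```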

@claudep claudep commented on the diff
django/http/__init__.py
((47 lines not shown))
else:
- self._container = [value]
- self._base_content_is_iter = False
+ # Make sure `self._container` is always in a consistent format, and
+ # that multi/single access to [streaming_]content is guaranteed.
+ if stream:
+ self._container = iter([value])
+ else:
+ self._container = [value]
@claudep Collaborator
claudep added a note

I'm not sure it's useful to implement _set_container in the base class, as the path seems different depending on stream.

@claudep
Collaborator

I don't find those hasattr(response, 'streaming_content') checks very elegant. Shouldn't we define a streaming class attribute, then test response.streaming?
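
A minimal sketch of that suggestion, with stand-in classes (`ResponseBase`, `Response`, `StreamingResponse`) and a hypothetical helper `content_length_or_none`, none of which come from this patch:

```python
class ResponseBase:
    # Class-level flag that middleware can test instead of using hasattr().
    streaming = False


class Response(ResponseBase):
    def __init__(self, content=b""):
        self.content = content


class StreamingResponse(ResponseBase):
    streaming = True

    def __init__(self, chunks=()):
        self.streaming_content = iter(chunks)


def content_length_or_none(response):
    """Return the body length, or None when the body cannot be measured."""
    if response.streaming:
        return None
    return len(response.content)
```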

@claudep claudep commented on the diff
django/http/utils.py
((11 lines not shown))
if request.method == 'HEAD':
- response.content = ''
+ if hasattr(response, 'streaming_content'):
+ response.streaming_content = []
+ else:
+ response.content = ''
return response
@claudep Collaborator
claudep added a note

I think it would be cleaner to define an API to remove content on HttpResponse classes (like truncate(update_length=True))
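
To make the idea concrete, here is a rough sketch of what a `truncate(update_length=True)` method might look like on both kinds of response, and how `conditional_content_removal()` could use it. The classes, the `headers` dict and the `remove_content` function are all hypothetical simplifications; only the method name and argument come from the comment above.

```python
class ResponseSketch:
    def __init__(self, content=b"", status_code=200):
        self.headers = {}
        self.status_code = status_code
        self.content = content

    def truncate(self, update_length=True):
        self.content = b""
        if update_length:
            self.headers["Content-Length"] = "0"


class StreamingResponseSketch:
    def __init__(self, chunks=(), status_code=200):
        self.headers = {}
        self.status_code = status_code
        self.streaming_content = iter(chunks)

    def truncate(self, update_length=True):
        # Replace the iterator with an empty one; nothing is consumed.
        self.streaming_content = iter(())
        if update_length:
            self.headers["Content-Length"] = "0"


def remove_content(method, response):
    """Same branching as the patched function, without hasattr() checks."""
    status = response.status_code
    if 100 <= status < 200 or status in (204, 304):
        response.truncate(update_length=True)
    if method == "HEAD":
        # HEAD keeps the original Content-Length header.
        response.truncate(update_length=False)
    return response
```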

@aaugustin aaugustin commented on the diff
django/http/__init__.py
((53 lines not shown))
+ self._container = iter([value])
+ else:
+ self._container = [value]
+
+ def make_bytes(self, value):
+ # Convert integer values to text.
+ if isinstance(value, int):
+ value = six.text_type(value)
+ # If the value is text, it will need to be encoded.
+ if isinstance(value, six.text_type):
+ # If the response already has a content-encoding, encode as ascii.
+ if self.has_header('Content-Encoding'):
+ value = value.encode('ascii')
+ # Otherwise encode with the character set of the response.
+ else:
+ value = value.encode(self._charset)
@aaugustin Owner

This behavior doesn't exist in the current implementation of HttpResponse.

@aaugustin
Owner

I've prepared a simplified version of this patch and will attach it to the ticket shortly.

@aaugustin aaugustin closed this
Commits on Sep 30, 2012
  1. @mrmachine @akaariai

    Fixed #7581 -- Add explicit support for streaming responses.

    mrmachine authored akaariai committed
    Refactored `HttpResponse` into `HttpResponseBase`, `HttpResponse` and
    `HttpStreamingResponse`.
    
    `HttpResponse` exposes a `content` attribute, which is normalised to a
    byte string.
    
    `HttpStreamingResponse` exposes a `streaming_content` attribute, which
    is normalised to an iterator that yields byte strings.
    
    Bundled middleware that needs to access a response's content now
    examines the response (looks for a `content` or `streaming_content`
    attribute) to determine its capabilities.
    
    The `django.views.static.serve` view now returns a streaming response,
    which is MUCH faster.
    
    The `conditional_content_removal()` function in `django.http.utils`,
    which is applied as a "response fix" (compulsory middleware) now
    understands streaming responses, and was missing tests that have now
    been added.
  2. @mrmachine @akaariai

    Improve compatibility of `serve` view.

    mrmachine authored akaariai committed
    - Add `CompatibleHttpStreamingResponse` class, which still has a
      `content` attribute, but raises `PendingDeprecationWarning` when it
      is accessed.
    
    - Streams responses from `serve` view only if no middleware accesses
      `content` attribute, by using `CompatibleHttpStreamingResponse`.
    
    - Don't allow streaming responses to be iterated more than once when a
      non-iterable is given as content.
    
    - Automatically call `close()` method when streaming response iterator
      is exhausted.
    
    - Assume that responses have a `content` attribute if they don't have
      `streaming_content` attribute. By checking for `streaming_content`
      first, we prioritise streaming when applying middleware to
      `CompatibleHttpStreamingResponse` objects.
  3. @mrmachine @akaariai

    Remove unnecessary changes from core streaming response feature.

    mrmachine authored akaariai committed
    - Remove `CompatibleHttpStreamingResponse` class.
    - Don't update `static` view to return streaming responses.
    - Remove support for `HttpStreamingResponse.write()` method.
    - Update cache tests to make it clear that you cannot cache streaming
      responses.
  4. @akaariai

    Review changes to pull/407

    akaariai authored
      - Some docs rewording
      - Python 3 compatibility
      - Improvement to streaming GZIP handling
Commits on Oct 1, 2012
  1. @mrmachine

    Improve `close()` method. Add `__str__` and/or `__bytes__` methods.

    mrmachine authored
    Close file-like content when assigned to regular responses, instead
    of waiting until the response is iterated. Only store references to
    file-like objects (with a `close()` method) when assigning content to
    a streaming response, to be closed when the response is iterated.
    
    When streaming response is converted to bytes, return an HTTP message
    (like regular response does), but without the body. E.g. headers only.
    Without defining __bytes__ method, converting a streaming response
    to bytes in Python 3.x raises an exception, and this behaviour is close
    to regular responses.
Commits on Oct 5, 2012
  1. Make `HttpResponse` compatible with unofficially supported streaming behaviour.

    Tai Lee authored
    
    - Don't automatically close file-like objects assigned as content to
      `HttpResponse` objects.
    
    - Don't consume iterators assigned as content to `HttpResponse` until `content`
      is accessed, `write()` is called, or `tell()` is called.
212 django/http/__init__.py
@@ -1,5 +1,6 @@
from __future__ import absolute_import, unicode_literals
+import collections
import copy
import datetime
from email.header import Header
@@ -521,18 +522,23 @@ def parse_cookie(cookie):
class BadHeaderError(ValueError):
pass
-class HttpResponse(object):
- """A basic HTTP response, with content and dictionary-accessed headers."""
+class HttpResponseBase(object):
+ """
+ A base HTTP response class with dictionary-accessed headers.
+
+ This class should not be used directly. Use the HttpResponse and
+ HttpStreamingResponse subclasses, instead.
+ """
status_code = 200
- def __init__(self, content='', content_type=None, status=None,
- mimetype=None):
+ def __init__(self, content_type=None, status=None, mimetype=None):
# _headers is a mapping of the lower-case name to the original case of
# the header (required for working with legacy systems) and the header
# value. Both the name of the header and its value are ASCII strings.
self._headers = {}
self._charset = settings.DEFAULT_CHARSET
+ self._file_like_objects = []
if mimetype:
warnings.warn("Using mimetype keyword argument is deprecated, use"
" content_type instead", PendingDeprecationWarning)
@@ -540,26 +546,19 @@ def __init__(self, content='', content_type=None, status=None,
if not content_type:
content_type = "%s; charset=%s" % (settings.DEFAULT_CONTENT_TYPE,
self._charset)
- # content is a bytestring. See the content property methods.
- self.content = content
self.cookies = SimpleCookie()
if status:
self.status_code = status
self['Content-Type'] = content_type
- def serialize(self):
- """Full HTTP message, including headers, as a bytestring."""
+ def serialize_headers(self):
+ """HTTP headers as a bytestring."""
headers = [
('%s: %s' % (key, value)).encode('us-ascii')
for key, value in self._headers.values()
]
- return b'\r\n'.join(headers) + b'\r\n\r\n' + self.content
-
- if six.PY3:
- __bytes__ = serialize
- else:
- __str__ = serialize
+ return b'\r\n'.join(headers)
def _convert_to_charset(self, value, charset, mime_encode=False):
"""Converts headers key/value to ascii/latin1 native strings.
@@ -683,62 +682,179 @@ def delete_cookie(self, key, path='/', domain=None):
self.set_cookie(key, max_age=0, path=path, domain=domain,
expires='Thu, 01-Jan-1970 00:00:00 GMT')
- @property
- def content(self):
- if self.has_header('Content-Encoding'):
- def make_bytes(value):
- if isinstance(value, int):
- value = six.text_type(value)
- if isinstance(value, six.text_type):
- value = value.encode('ascii')
- # force conversion to bytes in case chunk is a subclass
- return bytes(value)
- return b''.join(make_bytes(e) for e in self._container)
- return b''.join(force_bytes(e, self._charset) for e in self._container)
-
- @content.setter
- def content(self, value):
- if hasattr(value, '__iter__') and not isinstance(value, (bytes, six.string_types)):
- self._container = value
- self._base_content_is_iter = True
+ def _set_container(self, value, stream):
+ # The `stream` argument should be `True` if the response must be
+ # streamed, in which case we make sure the assigned content can only be
+ # iterated once.
+ if isinstance(value, collections.Iterable) \
+ and not isinstance(value, (bytes, six.text_type)):
+ if stream:
+ # Ensure iterable values can only be iterated once by wrapping
+ # them in a new iterator.
+ self._container = iter(value)
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
+ else:
+ # Assign the original iterable directly to `self._container`
+ # instead of converting to a list, in order to preserve the
+ # unofficially supported streaming behaviour, which only works
+ # when `response.content` is not accessed by middleware before
+ # the response itself is iterated.
+ self._container = value
+ # Keep a reference to file-like objects, so we can close them
+ # later.
+ if hasattr(value, 'close') and callable(value.close):
+ self._file_like_objects.append(value)
else:
- self._container = [value]
- self._base_content_is_iter = False
+ # Make sure `self._container` is always in a consistent format, and
+ # that multi/single access to [streaming_]content is guaranteed.
+ if stream:
+ self._container = iter([value])
+ else:
+ self._container = [value]
+
+ def make_bytes(self, value):
+ # Convert integer values to text.
+ if isinstance(value, int):
+ value = six.text_type(value)
+ # If the value is text, it will need to be encoded.
+ if isinstance(value, six.text_type):
+ # If the response already has a content-encoding, encode as ascii.
+ if self.has_header('Content-Encoding'):
+ value = value.encode('ascii')
+ # Otherwise encode with the character set of the response.
+ else:
+ value = value.encode(self._charset)
+ # Finally, force conversion to bytes in case value is a subclass.
+ return bytes(value)
def __iter__(self):
self._iterator = iter(self._container)
return self
def __next__(self):
- chunk = next(self._iterator)
- if isinstance(chunk, int):
- chunk = six.text_type(chunk)
- if isinstance(chunk, six.text_type):
- chunk = chunk.encode(self._charset)
- # force conversion to bytes in case chunk is a subclass
- return bytes(chunk)
+ return self.make_bytes(next(self._iterator))
next = __next__ # Python 2 compatibility
def close(self):
- if hasattr(self._container, 'close'):
- self._container.close()
+ while self._file_like_objects:
+ self._file_like_objects.pop().close()
# The remaining methods partially implement the file-like object interface.
# See http://docs.python.org/lib/bltin-file-objects.html
+
def write(self, content):
- if self._base_content_is_iter:
- raise Exception("This %s instance is not writable" % self.__class__)
- self._container.append(content)
+ # This should be implemented in subclasses that support the feature.
+ raise Exception("This %s instance is not writable" % self.__class__)
def flush(self):
pass
def tell(self):
- if self._base_content_is_iter:
- raise Exception("This %s instance cannot tell its position" % self.__class__)
+ # This should be implemented in subclasses that support the feature.
+ raise Exception(
+ "This %s instance cannot tell its position" % self.__class__)
+
+class HttpResponse(HttpResponseBase):
+ """
+ An HTTP response class with content that can be read, appended to or
+ replaced. Converts iterator content to a list, so that it can be iterated
+ multiple times.
+ """
+
+ def __init__(self, content='', *args, **kwargs):
+ super(HttpResponse, self).__init__(*args, **kwargs)
+ # Content is a bytestring. See the `content` property methods.
+ self.content = content
+
+ def serialize(self):
+ """Full HTTP message, including headers, as a bytestring."""
+ return self.serialize_headers() + b'\r\n\r\n' + self.content
+
+ if six.PY3:
+ __bytes__ = serialize
+ else:
+ __str__ = serialize
+
+ def _consume_content(self):
+ # Convert `self._container` to a list, so we can append to it and
+ # iterate it multiple times. Call this first in any method that needs
+ # to iterate or append to the content container.
+ if not isinstance(self._container, list):
+ self._container = list(self._container)
+
+ @property
+ def content(self):
+ self._consume_content()
+ # Iterate over `self._container` and call `self.make_bytes()` on each
+ # chunk instead of just joining on `self`, because the new iterator
+ # that would be created can't be pickled by cache middleware.
+ return b''.join(self.make_bytes(chunk) for chunk in self._container)
+
+ @content.setter
+ def content(self, value):
+ self._set_container(value, stream=False)
+
+ def write(self, content):
+ self._consume_content()
+ self._container.append(content)
+
+ def tell(self):
+ self._consume_content()
return sum([len(chunk) for chunk in self])
+class HttpStreamingResponse(HttpResponseBase):
+ """
+ A streaming HTTP response class with an iterator as content that should
+ only be iterated once, when the response is streamed to the client.
+
+ However, the content can be appended to or replaced with a new iterator
+ that wraps the original content (or yields entirely new content).
+ """
+
+ def __init__(self, content='', *args, **kwargs):
+ super(HttpStreamingResponse, self).__init__(*args, **kwargs)
+ # Streaming_content is an iterator that yields bytestrings. See the
+ # `streaming_content` property methods.
+ self.streaming_content = content
+
+ if six.PY3:
+ __bytes__ = HttpResponseBase.serialize_headers
+ else:
+ __str__ = HttpResponseBase.serialize_headers
+
+ @property
+ def content(self):
+ raise AttributeError(
+ "This %s instance has no `content` attribute. Use "
+ "`streaming_content` instead." % self.__class__)
+
+ @property
+ def streaming_content(self):
+ # Iterate over `self._container` and call `self.make_bytes()` on each
+ # chunk instead of just returning `iter(self)`, so we can safely wrap
+ # `streaming_content` in a new generator without raising a
+ # "ValueError: generator already executing" exception.
+ return (self.make_bytes(chunk) for chunk in self._container)
+
+ @streaming_content.setter
+ def streaming_content(self, value):
+ self._set_container(value, stream=True)
+
+ def __next__(self):
+ # Close all file-like objects assigned as content, after the iterator
+ # has been exhausted.
+ try:
+ return self.make_bytes(next(self._iterator))
+ except StopIteration:
+ self.close()
+ raise
+
+ next = __next__ # Python 2 compatibility
+
class HttpResponseRedirectBase(HttpResponse):
allowed_schemes = ['http', 'https', 'ftp']
12 django/http/utils.py
@@ -26,10 +26,16 @@ def conditional_content_removal(request, response):
responses. Ensures compliance with RFC 2616, section 4.3.
"""
if 100 <= response.status_code < 200 or response.status_code in (204, 304):
- response.content = ''
- response['Content-Length'] = 0
+ if hasattr(response, 'streaming_content'):
+ response.streaming_content = []
+ else:
+ response.content = ''
+ response['Content-Length'] = 0
if request.method == 'HEAD':
- response.content = ''
+ if hasattr(response, 'streaming_content'):
+ response.streaming_content = []
+ else:
+ response.content = ''
return response
def fix_IE_for_attach(request, response):
15 django/middleware/common.py
@@ -113,14 +113,15 @@ def process_response(self, request, response):
if settings.USE_ETAGS:
if response.has_header('ETag'):
etag = response['ETag']
- else:
+ elif not hasattr(response, 'streaming_content'):
etag = '"%s"' % hashlib.md5(response.content).hexdigest()
- if response.status_code >= 200 and response.status_code < 300 and request.META.get('HTTP_IF_NONE_MATCH') == etag:
- cookies = response.cookies
- response = http.HttpResponseNotModified()
- response.cookies = cookies
- else:
- response['ETag'] = etag
+ if 'etag' in locals():
+ if 200 <= response.status_code < 300 and request.META.get('HTTP_IF_NONE_MATCH') == etag:
+ cookies = response.cookies
+ response = http.HttpResponseNotModified()
+ response.cookies = cookies
+ else:
+ response['ETag'] = etag
return response
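
The `'etag' in locals()` test in this hunk works, but the same decision can be expressed without inspecting local variables. The helper below is one possible alternative, written as a standalone sketch: `resolve_etag` is a hypothetical function, `headers` is a plain dict, and `content` is `None` when the body is not available (as with a streaming response).

```python
import hashlib


def resolve_etag(headers, content=None):
    """Return the ETag to use, or None when none exists and none can be computed."""
    if "ETag" in headers:
        return headers["ETag"]
    if content is not None:
        # Only hash the body when it is fully available in memory.
        return '"%s"' % hashlib.md5(content).hexdigest()
    return None
```

The caller would then branch on `etag is not None` before comparing against `HTTP_IF_NONE_MATCH` or setting the header.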
26 django/middleware/gzip.py
@@ -1,6 +1,6 @@
import re
-from django.utils.text import compress_string
+from django.utils.text import compress_sequence, compress_string
from django.utils.cache import patch_vary_headers
re_accepts_gzip = re.compile(r'\bgzip\b')
@@ -13,7 +13,8 @@ class GZipMiddleware(object):
"""
def process_response(self, request, response):
# It's not worth attempting to compress really short responses.
- if len(response.content) < 200:
+ if not hasattr(response, 'streaming_content') \
+ and len(response.content) < 200:
return response
patch_vary_headers(response, ('Accept-Encoding',))
@@ -32,15 +33,22 @@ def process_response(self, request, response):
if not re_accepts_gzip.search(ae):
return response
- # Return the compressed content only if it's actually shorter.
- compressed_content = compress_string(response.content)
- if len(compressed_content) >= len(response.content):
- return response
+ if hasattr(response, 'streaming_content'):
+ # Delete the `Content-Length` header for streaming content, because
+ # we won't know the compressed size until we stream it.
+ response.streaming_content = \
+ compress_sequence(response.streaming_content)
+ del response['Content-Length']
+ else:
+ # Return the compressed content only if it's actually shorter.
+ compressed_content = compress_string(response.content)
+ if len(compressed_content) >= len(response.content):
+ return response
+ response.content = compressed_content
+ response['Content-Length'] = str(len(response.content))
if response.has_header('ETag'):
response['ETag'] = re.sub('"$', ';gzip"', response['ETag'])
-
- response.content = compressed_content
response['Content-Encoding'] = 'gzip'
- response['Content-Length'] = str(len(response.content))
+
return response
3  django/middleware/http.py
@@ -10,7 +10,8 @@ class ConditionalGetMiddleware(object):
"""
def process_response(self, request, response):
response['Date'] = http_date()
- if not response.has_header('Content-Length'):
+ if not hasattr(response, 'streaming_content') \
+ and not response.has_header('Content-Length'):
response['Content-Length'] = str(len(response.content))
if response.has_header('ETag'):
3  django/utils/cache.py
@@ -95,7 +95,8 @@ def get_max_age(response):
pass
def _set_response_etag(response):
- response['ETag'] = '"%s"' % hashlib.md5(response.content).hexdigest()
+ if not hasattr(response, 'streaming_content'):
+ response['ETag'] = '"%s"' % hashlib.md5(response.content).hexdigest()
return response
def patch_response_headers(response, cache_timeout=None):
33 django/utils/text.py
@@ -288,6 +288,39 @@ def compress_string(s):
zfile.close()
return zbuf.getvalue()
+# WARNING - be aware that compress_sequence does not achieve the same
+# level of compression as compress_string.
+class StreamingBuffer(object):
+    def __init__(self):
+        self.vals = []
+
+    def write(self, val):
+        self.vals.append(val)
+
+    def read(self):
+        ret = b''.join(self.vals)
+        self.vals = []
+        return ret
+
+    def flush(self):
+        return
+
+    def close(self):
+        return
+
+def compress_sequence(sequence):
+    buf = StreamingBuffer()
+    zfile = GzipFile(mode='wb', compresslevel=6, fileobj=buf)
+    # Output headers...
+    yield buf.read()
+    for item in sequence:
+        zfile.write(item)
+        zfile.flush()
+        yield buf.read()
+    zfile.close()
+    val = buf.read()
+    yield val
+
ustring_re = re.compile("([\u0080-\uffff])")
def javascript_quote(s, quote_double_quotes=False):
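
To see how the streaming compression above behaves, the following standalone Python 3 sketch reimplements the same idea and round-trips a small sequence. It is an adaptation rather than the patch's code: the buffer here subclasses `io.BytesIO` (an assumption made so that it is a complete file object), while the generator follows the same header, per-item flush, trailer structure.

```python
import gzip
import io


class StreamingBuffer(io.BytesIO):
    """In-memory buffer whose read() drains everything written so far."""

    def read(self):
        data = self.getvalue()
        self.seek(0)
        self.truncate()
        return data


def compress_sequence(sequence):
    buf = StreamingBuffer()
    zfile = gzip.GzipFile(mode="wb", compresslevel=6, fileobj=buf)
    # The gzip header is written when the GzipFile is created.
    yield buf.read()
    for item in sequence:
        zfile.write(item)
        # Flush after each item so its compressed bytes can be sent now.
        zfile.flush()
        yield buf.read()
    # Closing writes the remaining data plus the CRC and size trailer.
    zfile.close()
    yield buf.read()


compressed_chunks = list(compress_sequence([b"hello ", b"world"]))
restored = gzip.decompress(b"".join(compressed_chunks))
```

Flushing after every item is what lets each chunk be sent immediately, and it is also why the result is usually larger than compressing the whole body at once, as the warning comment in the patch notes.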
78 docs/ref/request-response.txt
@@ -560,12 +560,23 @@ Passing iterators
~~~~~~~~~~~~~~~~~
Finally, you can pass ``HttpResponse`` an iterator rather than passing it
-hard-coded strings. If you use this technique, follow these guidelines:
+hard-coded strings. If you use this technique, the iterator should return
+strings.
-* The iterator should return strings.
-* If an :class:`HttpResponse` has been initialized with an iterator as its
- content, you can't use the :class:`HttpResponse` instance as a file-like
- object. Doing so will raise ``Exception``.
+.. versionchanged:: 1.5
+
+In previous versions, passing an iterator as content to :class:`HttpResponse`
+would create a streaming response if (and only if) no middleware accessed the
+:attr:`HttpResponse.content` attribute before the response could be returned.
+
+If you want to guarantee that your response will stream to the client, you
+should use the new :class:`HttpStreamingResponse` class, instead.
+
+.. versionchanged:: 1.5
+
+You can now use the :meth:`HttpResponse.write()` method even when passing an
+iterator as content to :class:`HttpResponse`, because Django will consume and
+cache the iterator on first access.
Setting headers
~~~~~~~~~~~~~~~
@@ -778,3 +789,60 @@ types of HTTP responses. Like ``HttpResponse``, these subclasses live in
method, Django will treat it as emulating a
:class:`~django.template.response.SimpleTemplateResponse`, and the
``render`` method must itself return a valid response object.
+
+HttpStreamingResponse objects
+=============================
+
+.. versionadded:: 1.5
+
+.. class:: HttpStreamingResponse
+
+The :class:`HttpStreamingResponse` class should be used when you need to stream
+a response. You might want to do this if generating the full response takes too
+much time so that you need to deliver it progressively, or to avoid bringing
+the entire content into memory.
+
+The :class:`HttpStreamingResponse` is not a subclass of :class:`HttpResponse`,
+because it features a very slightly different API. However, it is almost
+identical, with the following notable differences:
+
+* It should be given an iterator that yields strings as content. Any other
+  content will be converted to an iterator for consistent behaviour.
+
+* You cannot access its content, except by iterating the response. This
+  should only occur when the response is returned to the client.
+
+* It has no ``content`` attribute. Instead, it has a
+  :attr:`~HttpStreamingResponse.streaming_content` attribute.
+
+* You cannot use the file-like object ``tell()`` or ``write()`` methods.
+  Doing so will raise ``Exception``.
+
+* Any iterators that have a ``close()`` method and are assigned as content will
+  be closed automatically after the response has been iterated.
+
+Note that :class:`HttpStreamingResponse` should only be used in situations
+where it is absolutely required that the whole content isn't iterated before
+transferring the data to the client. Because the content can't be accessed,
+many middlewares can't function normally. For example the ``ETag`` and
+``Content-Length`` headers can't be generated for streaming responses. Caching
+done in middleware will be disabled.
+
+Attributes
+----------
+
+.. attribute:: HttpStreamingResponse.streaming_content
+
+ An iterator representing the content. It yields strings, encoded from
+ Unicode objects if necessary.
+
+ This attribute exists so that you can determine the capabilities of a
+ response object, and wrap or replace the content. For example::
+
+ if hasattr(response, 'streaming_content'):
+ # Note that the wrapper for streaming content should not consume
+ # the content prematurely - usually it should be an iterator
+ # itself.
+ response.streaming_content = wrap_streaming_content(response.streaming_content)
+ else:
+ response.content = wrap_content(response.content)
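The documentation example above leaves `wrap_streaming_content` and `wrap_content` undefined. A minimal sketch of what such wrappers might look like follows. `FakeResponse`, `FakeStreamingResponse`, `process_response` and the upper-casing transform are hypothetical stand-ins for illustration, not part of this patch or of Django. The point it tries to show is that the streaming wrapper is itself a generator, so wrapping reads nothing from the source.

```python
class FakeResponse:
    """Stand-in for a regular response; not Django's class."""

    def __init__(self, content):
        self.content = content


class FakeStreamingResponse:
    """Stand-in for a streaming response; not Django's class."""

    def __init__(self, chunks):
        self.streaming_content = iter(chunks)


def wrap_streaming_content(chunks):
    """Hypothetical wrapper: transform chunks lazily, one at a time."""
    for chunk in chunks:
        yield chunk.upper()


def wrap_content(content):
    """Hypothetical wrapper for content that is already fully in memory."""
    return content.upper()


def process_response(response):
    """Pick the wrapper that matches the response's capabilities."""
    if hasattr(response, 'streaming_content'):
        response.streaming_content = wrap_streaming_content(
            response.streaming_content)
    else:
        response.content = wrap_content(response.content)
    return response


pulled = []  # records which source chunks have actually been read


def source():
    for chunk in (b'abc', b'def'):
        pulled.append(chunk)
        yield chunk


streaming = process_response(FakeStreamingResponse(source()))
pulled_before_iteration = list(pulled)  # wrapping alone reads nothing
first_chunk = next(streaming.streaming_content)  # reads exactly one chunk
regular = process_response(FakeResponse(b'abc'))
```

A wrapper that called `list()` or `b''.join()` on the iterator would still produce the right bytes, but it would pull the whole body into memory before the first chunk is sent.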
32 docs/releases/1.5.txt
@@ -121,6 +121,38 @@ GeoDjango
* Support for GDAL < 1.5 has been dropped.
+Explicit support for streaming responses with ``HttpStreamingResponse`` class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Previously, you could generate a streaming response by passing an iterator as
+content to :class:`~django.http.HttpResponse`. This was not explicitly
+documented, and was not always guaranteed to generate a streaming response
+because any middleware that accessed the :attr:`~django.http.HttpResponse.content`
+attribute would consume the iterator prematurely.
+
+In Django 1.5 you can explicitly generate a streaming response with the new
+:class:`~django.http.HttpStreamingResponse` class. This new class exposes a
+:attr:`~django.http.HttpStreamingResponse.streaming_content` attribute. You
+can safely wrap or replace the ``streaming_content``.
+
+Unlike the :class:`~django.http.HttpResponse` class, the new response class
+does not have a ``content`` attribute. As a result, middleware should no longer
+assume that all responses will have a ``content`` attribute. Middleware should
+inspect the response and behave accordingly if they need access to content::
+
+ if hasattr(response, 'streaming_content'):
+ response.streaming_content = wrap_streaming_content(response.streaming_content)
+ else:
+ response.content = wrap_content(response.content)
+
+Note that ``streaming_content`` should be assumed to be too large to hold in
+memory, and thus middleware should not access it in ways that break streaming.
+
+When passing an iterator as content to the :class:`~django.http.HttpResponse`
+class, Django will consume the iterator immediately to prevent buggy behaviour
+when the :attr:`~django.http.HttpResponse.content` attribute is accessed
+multiple times.
+
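The "buggy behaviour" mentioned in the release note comes from a plain Python property of iterators: they can be read only once. The sketch below shows the problem and one possible way to make repeated reads safe. `EagerBody` is a hypothetical container written for this illustration and is not the implementation in this patch.

```python
chunks = iter([b'hello', b'world'])
first_read = b''.join(chunks)    # consumes the iterator
second_read = b''.join(chunks)   # nothing left to read


class EagerBody:
    """Hypothetical container; not Django's implementation."""

    def __init__(self, content):
        # Materialise the iterator once so later reads are repeatable.
        self._container = list(content)

    @property
    def content(self):
        return b''.join(self._container)


body = EagerBody(iter([b'hello', b'world']))
```

Here `body.content` returns the same bytes on every access, at the cost of holding the whole body in memory, which is why this approach suits regular responses and not streaming ones.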
Minor features
~~~~~~~~~~~~~~
26 tests/regressiontests/cache/tests.py
@@ -19,7 +19,8 @@
from django.core.cache.backends.base import (CacheKeyWarning,
InvalidCacheBackendError)
from django.db import router
-from django.http import HttpResponse, HttpRequest, QueryDict
+from django.http import (HttpResponse, HttpRequest, HttpStreamingResponse,
+ QueryDict)
from django.middleware.cache import (FetchFromCacheMiddleware,
UpdateCacheMiddleware, CacheMiddleware)
from django.template import Template
@@ -1416,6 +1417,29 @@ def set_cache(request, lang, msg):
# reset the language
translation.deactivate()
+ @override_settings(
+ CACHE_MIDDLEWARE_KEY_PREFIX="test",
+ CACHE_MIDDLEWARE_SECONDS=60,
+ USE_ETAGS=True,
+ )
+ def test_middleware_with_streaming_response(self):
+ # cache with non-empty request.GET
+ request = self._get_request_cache(query_string='foo=baz&other=true')
+
+ # first access, cache must return None
+ get_cache_data = FetchFromCacheMiddleware().process_request(request)
+ self.assertEqual(get_cache_data, None)
+
+ # pass streaming response through UpdateCacheMiddleware.
+ content = 'Check for cache with QUERY_STRING and streaming content'
+ response = HttpStreamingResponse(content)
+ UpdateCacheMiddleware().process_response(request, response)
+
+ # second access, cache must still return None, because we can't cache
+ # streaming response.
+ get_cache_data = FetchFromCacheMiddleware().process_request(request)
+ self.assertEqual(get_cache_data, None)
+
@override_settings(
CACHES={
0  tests/regressiontests/http_utils/__init__.py
No changes.
0  tests/regressiontests/http_utils/models.py
No changes.
50 tests/regressiontests/http_utils/tests.py
@@ -0,0 +1,50 @@
+from __future__ import unicode_literals
+from django.http import HttpRequest, HttpResponse, HttpStreamingResponse, utils
+from django.test import TestCase
+
+class HttpUtilTests(TestCase):
+ def test_conditional_content_removal(self):
+ """
+ Tests that content is removed from regular and streaming responses with
+ a status_code of 100-199, 204, 304 or a method of "HEAD".
+ """
+ req = HttpRequest()
+
+ for status_code in (100, 150, 199, 204, 304):
+ # regular response.
+ res = HttpResponse('abc')
+ self.assertEqual(res.content, b'abc')
+ res.status_code = status_code
+ utils.conditional_content_removal(req, res)
+ self.assertEqual(res.content, b'')
+
+ # streaming response.
+ res = HttpStreamingResponse(['abc'])
+ self.assertEqual(b''.join(res), b'abc')
+ res = HttpStreamingResponse(['abc'])
+ res.status_code = status_code
+ utils.conditional_content_removal(req, res)
+ self.assertEqual(b''.join(res), b'')
+
+ # do nothing for other status codes.
+ res = HttpResponse('abc')
+ self.assertEqual(res.content, b'abc')
+ res.status_code = 200
+ utils.conditional_content_removal(req, res)
+ self.assertEqual(res.content, b'abc')
+
+ # HEAD requests.
+ req.method = 'HEAD'
+
+ # regular response.
+ res = HttpResponse('abc')
+ self.assertEqual(res.content, b'abc')
+ utils.conditional_content_removal(req, res)
+ self.assertEqual(res.content, b'')
+
+ # streaming response.
+ res = HttpStreamingResponse(['abc'])
+ self.assertEqual(b''.join(res), b'abc')
+ res = HttpStreamingResponse(['abc'])
+ utils.conditional_content_removal(req, res)
+ self.assertEqual(b''.join(res), b'')
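For readers who want the behaviour under test spelled out, here is a rough sketch of content removal that handles both kinds of response. It is not the code in `django.http.utils`: `strip_body` is a hypothetical helper, it takes the request method directly instead of a request object, and `SimpleNamespace` objects stand in for real responses.

```python
from types import SimpleNamespace


def strip_body(method, response):
    """Hypothetical helper: drop the body where HTTP forbids or ignores one."""
    status = response.status_code
    bodyless = 100 <= status < 200 or status in (204, 304)
    if bodyless or method == 'HEAD':
        if hasattr(response, 'streaming_content'):
            # Replace the iterator instead of reading it.
            response.streaming_content = iter([])
        else:
            response.content = b''
    return response


regular = strip_body('GET', SimpleNamespace(status_code=204, content=b'abc'))
streaming = strip_body(
    'HEAD', SimpleNamespace(status_code=200, streaming_content=iter([b'abc'])))
untouched = strip_body('GET', SimpleNamespace(status_code=200, content=b'abc'))
```

The streaming branch swaps in an empty iterator, so the original content is never read.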
0  tests/regressiontests/httpwrappers/abc.txt
No changes.
161 tests/regressiontests/httpwrappers/tests.py
@@ -2,12 +2,14 @@
from __future__ import unicode_literals
import copy
+import os
import pickle
+import tempfile
from django.core.exceptions import SuspiciousOperation
from django.http import (QueryDict, HttpResponse, HttpResponseRedirect,
HttpResponsePermanentRedirect, HttpResponseNotAllowed,
- HttpResponseNotModified,
+ HttpResponseNotModified, HttpStreamingResponse,
SimpleCookie, BadHeaderError,
parse_cookie)
from django.test import TestCase
@@ -329,6 +331,15 @@ def test_iter_content(self):
self.assertRaises(UnicodeEncodeError,
getattr, r, 'content')
+ # content can safely be accessed multiple times.
+ r = HttpResponse(iter(['hello', 'world']))
+ self.assertEqual(r.content, r.content)
+ self.assertEqual(r.content, b'helloworld')
+
+ # additional content can be written to the response.
+ r.write('!')
+ self.assertEqual(r.content, b'helloworld!')
+
def test_file_interface(self):
r = HttpResponse()
r.write(b"hello")
@@ -337,7 +348,9 @@ def test_file_interface(self):
self.assertEqual(r.tell(), 17)
r = HttpResponse(['abc'])
- self.assertRaises(Exception, r.write, 'def')
+ r.write('def')
+ self.assertEqual(r.tell(), 6)
+ self.assertEqual(r.content, b'abcdef')
def test_unsafe_redirect(self):
bad_urls = [
@@ -351,7 +364,6 @@ def test_unsafe_redirect(self):
self.assertRaises(SuspiciousOperation,
HttpResponsePermanentRedirect, url)
-
class HttpResponseSubclassesTests(TestCase):
def test_redirect(self):
response = HttpResponseRedirect('/redirected/')
@@ -379,6 +391,149 @@ def test_not_allowed(self):
content_type='text/html')
self.assertContains(response, 'Only the GET method is allowed', status_code=405)
+class HttpStreamingResponseTests(TestCase):
+ def test_streaming_response(self):
+ r = HttpStreamingResponse(iter(['hello', 'world']))
+
+ # iterating over the response itself yields bytestring chunks.
+ chunks = list(r)
+ self.assertEqual(chunks, [b'hello', b'world'])
+ for chunk in chunks:
+ self.assertIsInstance(chunk, six.binary_type)
+
+ # and the response can only be iterated once.
+ self.assertEqual(list(r), [])
+
+ # even when a sequence that can be iterated many times, like a list,
+ # is given as content.
+ r = HttpStreamingResponse(['abc', 'def'])
+ self.assertEqual(list(r), [b'abc', b'def'])
+ self.assertEqual(list(r), [])
+
+ # and even when a non-iterable is given as content.
+ r = HttpStreamingResponse(123)
+ self.assertEqual(list(r), [b'123'])
+ self.assertEqual(list(r), [])
+
+ # streaming responses don't have a `content` attribute.
+ self.assertFalse(hasattr(r, 'content'))
+
+ # and you can't accidentally assign to a `content` attribute.
+ with self.assertRaises(AttributeError):
+ r.content = 'xyz'
+
+ # but they do have a `streaming_content` attribute.
+ self.assertTrue(hasattr(r, 'streaming_content'))
+
+ # that exists so we can check if a response is streaming, and wrap or
+ # replace the content iterator.
+ r.streaming_content = iter(['abc', 'def'])
+ r.streaming_content = (chunk.upper() for chunk in r.streaming_content)
+ self.assertEqual(list(r), [b'ABC', b'DEF'])
+
+ # coercing a streaming response to bytes doesn't return a complete HTTP
+ # message like a regular response does. it only gives us the headers.
+ r = HttpStreamingResponse(iter(['hello', 'world']))
+ self.assertEqual(
+ six.binary_type(r), b'Content-Type: text/html; charset=utf-8')
+
+ # and this won't consume its content.
+ self.assertEqual(list(r), [b'hello', b'world'])
+
+ # additional content cannot be written to the response.
+ r = HttpStreamingResponse(iter(['hello', 'world']))
+ with self.assertRaises(Exception):
+ r.write('!')
+
+ # and we can't tell the current position.
+ with self.assertRaises(Exception):
+ r.tell()
+
+ def test_response(self):
+ # provide compatibility for unofficially supported streaming behaviour
+ # by not consuming an iterator assigned to HttpResponse as content
+ # until first access of `content`, first call to `write()` or first
+ # call to `tell()`.
+
+ # access content.
+ r = HttpResponse(iter(['hello', 'world']))
+ self.assertNotIsInstance(r._container, list)
+ r.content
+ self.assertIsInstance(r._container, list)
+
+ # call write().
+ r = HttpResponse(iter(['hello', 'world']))
+ self.assertNotIsInstance(r._container, list)
+ r.write('!')
+ self.assertIsInstance(r._container, list)
+
+ # call tell().
+ r = HttpResponse(iter(['hello', 'world']))
+ self.assertNotIsInstance(r._container, list)
+ r.tell()
+ self.assertIsInstance(r._container, list)
+
+class FileCloseTests(TestCase):
+ def test_response(self):
+ filename = os.path.join(os.path.dirname(__file__), 'abc.txt')
+
+ # file isn't closed until we close the response.
+ file1 = open(filename)
+ r = HttpResponse(file1)
+ self.assertFalse(file1.closed)
+ r.close()
+ self.assertTrue(file1.closed)
+
+ # don't automatically close file when we finish iterating the response.
+ file1 = open(filename)
+ r = HttpResponse(file1)
+ self.assertFalse(file1.closed)
+ list(r)
+ self.assertFalse(file1.closed)
+ r.close()
+ self.assertTrue(file1.closed)
+
+ # when multiple files are assigned as content, make sure they are all
+ # closed with the response.
+ file1 = open(filename)
+ file2 = open(filename)
+ r = HttpResponse(file1)
+ r.content = file2
+ self.assertFalse(file1.closed)
+ self.assertFalse(file2.closed)
+ r.close()
+ self.assertTrue(file1.closed)
+ self.assertTrue(file2.closed)
+
+ def test_streaming_response(self):
+ filename = os.path.join(os.path.dirname(__file__), 'abc.txt')
+
+ # file isn't closed until we close the response.
+ file1 = open(filename)
+ r = HttpStreamingResponse(file1)
+ self.assertFalse(file1.closed)
+ r.close()
+ self.assertTrue(file1.closed)
+
+ # automatically close file when we finish iterating the response.
+ file1 = open(filename)
+ r = HttpStreamingResponse(file1)
+ self.assertFalse(file1.closed)
+ list(r)
+ self.assertTrue(file1.closed)
+
+ # when multiple files are assigned as content, make sure they are all
+ # closed with the response.
+ file1 = open(filename)
+ file2 = open(filename)
+ r = HttpStreamingResponse(file1)
+ r.streaming_content = file2
+ self.assertFalse(file1.closed)
+ self.assertFalse(file2.closed)
+ r.close()
+ self.assertTrue(file1.closed)
+ self.assertTrue(file2.closed)
+
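The close-after-iteration behaviour tested above can also be approximated outside the response class by wrapping the file in a generator. This is only a sketch of that idea. `close_when_done` and its `chunk_size` default are assumptions made for illustration, and `io.BytesIO` stands in for a real file.

```python
import io


def close_when_done(fileobj, chunk_size=8192):
    """Yield chunks from fileobj, closing it when iteration ends."""
    try:
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Runs on exhaustion, and also when a started generator is closed.
        fileobj.close()


buf = io.BytesIO(b'abcdef')
chunks = list(close_when_done(buf, chunk_size=4))

# Abandoning the stream part-way still closes the file.
partial_buf = io.BytesIO(b'abcdef')
partial = close_when_done(partial_buf, chunk_size=4)
first = next(partial)
partial.close()
```

One caveat: a generator that is never started does not run its `finally` block, so a wrapper like this does not replace an explicit `close()` on the response.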
class CookieTests(unittest.TestCase):
def test_encode(self):
"""
43 tests/regressiontests/middleware/tests.py
@@ -8,7 +8,7 @@
from django.conf import settings
from django.core import mail
from django.http import HttpRequest
-from django.http import HttpResponse
+from django.http import HttpResponse, HttpStreamingResponse
from django.middleware.clickjacking import XFrameOptionsMiddleware
from django.middleware.common import CommonMiddleware
from django.middleware.http import ConditionalGetMiddleware
@@ -322,6 +322,12 @@ def test_content_length_header_added(self):
self.assertTrue('Content-Length' in self.resp)
self.assertEqual(int(self.resp['Content-Length']), content_length)
+ def test_content_length_header_not_added(self):
+ resp = HttpStreamingResponse('content')
+ self.assertFalse('Content-Length' in resp)
+ resp = ConditionalGetMiddleware().process_response(self.req, resp)
+ self.assertFalse('Content-Length' in resp)
+
def test_content_length_header_not_changed(self):
bad_content_length = len(self.resp.content) + 10
self.resp['Content-Length'] = bad_content_length
@@ -351,6 +357,29 @@ def test_if_none_match_and_different_etag(self):
self.resp = ConditionalGetMiddleware().process_response(self.req, self.resp)
self.assertEqual(self.resp.status_code, 200)
+ @override_settings(USE_ETAGS=True)
+ def test_etag(self):
+ req = HttpRequest()
+ res = HttpResponse('content')
+ self.assertTrue(
+ CommonMiddleware().process_response(req, res).has_header('ETag'))
+
+ @override_settings(USE_ETAGS=True)
+ def test_etag_streaming_response(self):
+ req = HttpRequest()
+ res = HttpStreamingResponse(iter(['content']))
+ res['ETag'] = 'tomatoes'
+ self.assertEqual(
+ CommonMiddleware().process_response(req, res).get('ETag'),
+ 'tomatoes')
+
+ @override_settings(USE_ETAGS=True)
+ def test_no_etag_streaming_response(self):
+ req = HttpRequest()
+ res = HttpStreamingResponse(iter(['content']))
+ self.assertFalse(
+ CommonMiddleware().process_response(req, res).has_header('ETag'))
+
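The three ETag tests above pin down one rule: a digest can only be computed when the whole body is available. A hypothetical helper following that rule could look like the sketch below. `add_etag`, the `headers` dict and the MD5 digest are assumptions for illustration, not the `CommonMiddleware` code.

```python
import hashlib
from types import SimpleNamespace


def add_etag(response):
    """Hypothetical helper: set an ETag only when the body is in memory."""
    if hasattr(response, 'streaming_content'):
        # The body cannot be hashed without consuming it; leave headers alone.
        return response
    if 'ETag' not in response.headers:
        digest = hashlib.md5(response.content).hexdigest()
        response.headers['ETag'] = '"%s"' % digest
    return response


regular = add_etag(SimpleNamespace(headers={}, content=b'content'))
streaming = add_etag(
    SimpleNamespace(headers={}, streaming_content=iter([b'content'])))
preset = add_etag(SimpleNamespace(
    headers={'ETag': 'tomatoes'}, streaming_content=iter([b'content'])))
```

An ETag that the view set itself is left as it is, which matches the `tomatoes` case in the tests.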
# Tests for the Last-Modified header
def test_if_modified_since_and_no_last_modified(self):
@@ -511,6 +540,7 @@ class GZipMiddlewareTest(TestCase):
short_string = b"This string is too short to be worth compressing."
compressible_string = b'a' * 500
uncompressible_string = b''.join(six.int2byte(random.randint(0, 255)) for _ in xrange(500))
+ sequence = [b'a' * 500, b'b' * 200, b'a' * 300]
def setUp(self):
self.req = HttpRequest()
@@ -525,6 +555,8 @@ def setUp(self):
self.resp.status_code = 200
self.resp.content = self.compressible_string
self.resp['Content-Type'] = 'text/html; charset=UTF-8'
+ self.stream_resp = HttpStreamingResponse(self.sequence)
+ self.stream_resp['Content-Type'] = 'text/html; charset=UTF-8'
@staticmethod
def decompress(gzipped_string):
@@ -539,6 +571,15 @@ def test_compress_response(self):
self.assertEqual(r.get('Content-Encoding'), 'gzip')
self.assertEqual(r.get('Content-Length'), str(len(r.content)))
+ def test_compress_streaming_response(self):
+ """
+ Tests that compression is performed on responses with streaming content.
+ """
+ r = GZipMiddleware().process_response(self.req, self.stream_resp)
+ self.assertEqual(self.decompress(b''.join(r)), b''.join(self.sequence))
+ self.assertEqual(r.get('Content-Encoding'), 'gzip')
+ self.assertFalse(r.has_header('Content-Length'))
+
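To make it clearer why a compressed stream has no `Content-Length`, here is a small standalone sketch of chunk-by-chunk gzip compression using only the standard library. It is not the `GZipMiddleware` implementation from this patch. `compress_stream` is a hypothetical helper, and `sequence` mirrors the test fixture above.

```python
import gzip
import zlib


def compress_stream(chunks, level=6):
    """Lazily gzip an iterable of byte strings."""
    # wbits = 16 + MAX_WBITS asks zlib for a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = compressor.compress(chunk)
        if data:  # zlib may buffer input and return nothing yet
            yield data
    yield compressor.flush()


sequence = [b'a' * 500, b'b' * 200, b'a' * 300]
compressed = b''.join(compress_stream(iter(sequence)))
restored = gzip.decompress(compressed)
```

The total compressed size is only known after the last chunk has been flushed, by which point the headers have already been sent. Note that this sketch lets zlib buffer freely, so output chunks may lag behind input chunks.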
def test_compress_non_200_response(self):
"""
Tests that compression is performed on responses with a status other than 200.