Fixed #21179 -- Added a small section in the "Outputting CSV with Django... #2358

Closed
wants to merge 8 commits into
from

2 participants
Contributor

zedr commented Feb 22, 2014

This is my attempt to fix #21179.

The example shows how generators can be used with the csv.writer class to stream large CSV files.

I have added a few comments on this example on the ticket's page: https://code.djangoproject.com/ticket/21179
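
For readers following the thread, here is a minimal sketch of the idea in plain Python 3, without Django. The `Echo` class mirrors the one in the diff below; `iter_csv_lines` and `first_line` are hypothetical names introduced only for this illustration. It relies on `csv.writer.writerow()` returning whatever the underlying object's `write()` returns.

```python
import csv


class Echo:
    """Pseudo-buffer that implements only write(), as in the PR's example."""

    def write(self, value):
        # Hand the formatted CSV line back instead of storing it in a buffer.
        return value


def iter_csv_lines(rows):
    """Hypothetical helper: lazily yield one CSV-formatted line per row."""
    writer = csv.writer(Echo())
    for row in rows:
        # writerow() returns the result of Echo.write(), i.e. the line itself.
        yield writer.writerow(row)


# The rows are produced by a generator, so nothing is materialised up front.
rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
lines = iter_csv_lines(rows)
first_line = next(lines)
```

In a Django view, an iterator like `lines` is what would be handed to `StreamingHttpResponse` with a `text/csv` content type, which is the approach the proposed documentation section describes.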

zedr added some commits Feb 22, 2014

@zedr zedr Fixed #21179 -- Added a small section in the "Outputting CSV with Dja…
…ngo" page that suggests using the StreamingHttpResponse class

The example shows how generators can be used with the csv.writer class to stream large CSV files.
9473565
@zedr zedr Merge branch 'master' of git://github.com/django/django 3d26680

@bmispelon bmispelon and 1 other commented on an outdated diff Feb 22, 2014

docs/howto/outputting-csv.txt
+
+    import csv
+
+    from django.http import StreamingHttpResponse
+
+    class Echo(object):
+        """An object that implements just the write method of the file-like
+        interface.
+        """
+        def write(self, value):
+            """Write the value by returning it, instead of storing in a buffer."""
+            return value
+
+    def some_streaming_csv_view(request):
+        """A view that streams a large CSV file."""
+        rows = (["Row {0}".format(idx), str(idx)] for idx in xrange(100))
@bmispelon

bmispelon Feb 22, 2014

Member

xrange is Python2 only. I believe our documentation has started to transition to Python3 by default so this should be changed. You could also use django.utils.six which provides a compatibility layer.

@bmispelon

bmispelon Feb 22, 2014

Member

100 items is not very impressive. How about a billion instead?

@zedr

zedr Feb 22, 2014

Contributor

I changed this to 65536, which is the maximum number of rows for many popular spreadsheet programs on 32 bit systems.
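
The two review points above can be illustrated with a small sketch, assuming Python 3, where the built-in `range` is lazy in the way `xrange` was in Python 2. The names `big_rows` and `first_three` are hypothetical and exist only for this illustration.

```python
import itertools

# In Python 3, range() does not build a list, so even a billion indices
# cost nothing until they are consumed.
big_rows = (["Row {0}".format(idx), str(idx)] for idx in range(10 ** 9))

# Only the rows actually requested are ever generated.
first_three = list(itertools.islice(big_rows, 3))

# 65536 is 2 ** 16, the row count settled on in the reply above.
row_count = 2 ** 16
```

Because both the range and the generator expression are lazy, the size of the upper bound affects how long a full download takes, not how much memory the view needs.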

@bmispelon bmispelon commented on an outdated diff Feb 22, 2014

docs/howto/outputting-csv.txt
@@ -54,6 +54,36 @@ mention:
about escaping strings with quotes or commas in them. Just pass
``writerow()`` your raw strings, and it'll do the right thing.
+Streaming large files
+~~~~~~~~~~~~~~~~~~~~~
+If you need to work with very large files, you might want to consider using Django's
@bmispelon

bmispelon Feb 22, 2014

Member

When dealing with large static files, you should actually not be using Django in the first place.

I'd reword it to something like "When working with views that can generate big responses, ..."

@bmispelon bmispelon commented on an outdated diff Feb 22, 2014

docs/howto/outputting-csv.txt
@@ -54,6 +54,36 @@ mention:
about escaping strings with quotes or commas in them. Just pass
``writerow()`` your raw strings, and it'll do the right thing.
+Streaming large files
+~~~~~~~~~~~~~~~~~~~~~
+If you need to work with very large files, you might want to consider using Django's
+:class:`django.http.StreamingHttpResponse` objects instead.
+
+In this example, we want to make full use of Python generators to efficiently
+handle the assembly and transmission of a large CSV files::
@bmispelon

bmispelon Feb 22, 2014

Member

It should either be "a large CSV file" or "large CSV files"

@zedr zedr Updated and improved fix for #21179
I've updated the text following several suggestions by bmispelon, and also made the code Python 3 compatible.
642d9b3
Contributor

zedr commented Feb 22, 2014

I've updated the pull request.

@zedr zedr and 1 other commented on an outdated diff Feb 23, 2014

docs/howto/outputting-csv.txt
@@ -54,6 +54,37 @@ mention:
about escaping strings with quotes or commas in them. Just pass
``writerow()`` your raw strings, and it'll do the right thing.
+Streaming large files
+~~~~~~~~~~~~~~~~~~~~~
+When dealing with views that generate very big responses, you might want to consider using Django's
+:class:`django.http.StreamingHttpResponse` objects instead.
+
+In this example, we want to make full use of Python generators to efficiently
+handle the assembly and transmission of a large CSV file::
+
+ import csv
+
+ from django.utils.six.moves import xrange
@zedr

zedr Feb 23, 2014

Contributor

Python 3 compat

@bmispelon

bmispelon Feb 23, 2014

Member

Since Python3 is the default, using from django.utils.six.moves import range would be better.

@zedr zedr and 1 other commented on an outdated diff Feb 23, 2014

docs/howto/outputting-csv.txt
+    import csv
+
+    from django.utils.six.moves import xrange
+    from django.http import StreamingHttpResponse
+
+    class Echo(object):
+        """An object that implements just the write method of the file-like
+        interface.
+        """
+        def write(self, value):
+            """Write the value by returning it, instead of storing in a buffer."""
+            return value
+
+    def some_streaming_csv_view(request):
+        """A view that streams a large CSV file."""
+        rows = (["Row {0}".format(idx), str(idx)] for idx in xrange(65536))
@zedr

zedr Feb 23, 2014

Contributor

65536 is the maximum number of rows allowed for a sheet by most 32 bit spreadsheet applications.

@bmispelon

bmispelon Feb 23, 2014

Member

A comment as to why this number was chosen would be good to have.

@zedr

zedr Feb 23, 2014

Contributor

Good point. I'll add it. Thanks for suggesting it.

Rigel Di Scala added some commits Feb 23, 2014

Rigel Di Scala Update outputting-csv.txt
Further improved the fix for #21179, by adding a comment that explains why the number 65536 (the number of distinct values that can be represented by a 16-bit unsigned integer) was chosen for the example.
cd5c020
Rigel Di Scala Update outputting-csv.txt
Improved the fix for #21179, by switching from xrange() to range(), and restating the import as a Python 2 compatibility import.
88a9588
zedr Fixed #22085 - Add a feature for setting non-expiring keys as the def…
…ault.

This feature allows the default `TIMEOUT` Cache argument to be set to `None`,
so that Cache instances can set a non-expiring key as the default,
instead of using the default value of 5 minutes.

Previously, this was possible only by passing `None` as an argument to
the set() method of objects of type `BaseCache` (and subtypes).
9a00e4a
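
The behaviour described in the commit message above could be configured roughly as follows. This is a sketch, assuming the `TIMEOUT` key of Django's `CACHES` setting; the local-memory backend is an illustrative choice and is not taken from the commit.

```python
# settings.py fragment (illustrative): with TIMEOUT set to None, keys
# written without an explicit timeout never expire by default.
CACHES = {
    "default": {
        # Assumed backend for the example; any cache backend could be used.
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "TIMEOUT": None,
    }
}
```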
zedr Removed the import of `DEFAULT_CACHE_ALIAS` that redefines a previous…
… import

The GetCacheTests test case imports the constant `DEFAULT_CACHE_ALIAS`
inside one of its test methods, redefining a previous import. Removing
this second import does not break the test.
f4432b9
zedr Merge branch 'master' of https://github.com/zedr/django into t22117 c77f3eb
Member

bmispelon commented Mar 4, 2014

There are some commits in there that belong to another pull request of yours (#2365).

The easiest way to fix this would probably be to close this pull request and open a new one based off a clean branch (see https://docs.djangoproject.com/en/1.6/internals/contributing/writing-code/working-with-git/#working-on-a-ticket if you need some pointers).

Contributor

zedr commented Mar 4, 2014

Sorry about that. I'll cherry-pick the right commits and resubmit a pull request.

Member

bmispelon commented Mar 4, 2014

No worries. Now you know first-hand why we always recommend working on a branch :)

@bmispelon bmispelon closed this Mar 4, 2014

Contributor

zedr commented Mar 4, 2014

Resubmitted as a new pull request: #2397
