WIP: Use sparse_list to save memory (see #1117) #1451

acdha · 2016-11-09T00:51:38Z

The sparse list implementation avoids allocating the result cache elements
until something triggers loading the actual content.

This costs slightly more memory for very small querysets – roughly a custom
class + dict() vs. a list() – but the actual result data will usually be much
larger, and the difference is cancelled out by the time the result cache
reaches a thousand items.
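
As a rough illustration of the idea (not the sparse_list package's actual API, and not the code in this branch), a sparse result cache can be sketched as a small class wrapping a dict. The `SparseCache` name and its `populated()` helper are hypothetical and exist only for this example:

```python
class SparseCache:
    """Hypothetical stand-in for a sparse list: stores only populated slots."""

    def __init__(self, size, default=None):
        self.size = size          # logical length, e.g. the total hit count
        self.default = default    # value reported for slots never filled
        self._items = {}          # index -> value, only for populated slots

    def __len__(self):
        return self.size

    def _check(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)

    def __getitem__(self, index):
        self._check(index)
        return self._items.get(index, self.default)

    def __setitem__(self, index, value):
        self._check(index)
        self._items[index] = value

    def populated(self):
        """Number of slots that actually hold a value."""
        return len(self._items)


# Pretend the backend reported a million hits but only three were loaded.
cache = SparseCache(1000000)
for offset, result in enumerate(["a", "b", "c"], start=500):
    cache[offset] = result
```

The logical length stays at the full hit count, while memory is only spent on the three slots that were filled.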

This branch is currently broken with the released version of sparse_list – see
johnsyweb/python_sparse_list#5

speedplane · 2016-11-09T23:08:15Z

haystack/query.py

@@ -233,7 +234,7 @@ def _fill_cache(self, start, end, **kwargs):
        # an array of 100,000 ``None``s consumed less than .5 Mb, which ought


I would change this comment to explain that a sparse list is used because the total result set can be many orders of magnitude larger than the set of results that we pull at any given time.
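
To make that point concrete, here is a small sketch comparing the container overhead of a pre-allocated list against a dict holding only the loaded slice. The sizes are illustrative assumptions, and `sys.getsizeof` measures only the container itself, not the result objects:

```python
import sys

total_hits = 100000   # size of the full result set (illustrative)
page = range(0, 20)   # slice of results actually loaded (illustrative)

dense = [None] * total_hits        # one pointer slot per hit, loaded or not
sparse = {i: None for i in page}   # only the slots that were loaded

dense_bytes = sys.getsizeof(dense)
sparse_bytes = sys.getsizeof(sparse)
print(dense_bytes, sparse_bytes)
```

The dense list grows with the total hit count, whereas the dict grows only with the number of results actually fetched.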

acdha changed the title ~~Use sparse_list to save memory (see #1117)~~ WIP: Use sparse_list to save memory (see #1117) Nov 9, 2016

acdha added 2 commits November 8, 2016 19:52

Avoid infinite loops on Python 2

3aba1df

This bit of code really needs to be rewritten to avoid the loop, but this patch at least makes it testable on Python 2.

acdha force-pushed the queryset-cache-using-sparse_list branch from 6c87a62 to 3aba1df Compare November 9, 2016 00:52

acdha mentioned this pull request Nov 9, 2016

Bad Assumption in SearchQuerySet._fill_cache Leads to Performance Degradation #1117

Open

speedplane reviewed Nov 9, 2016

Require sparse_list 0.6 or greater

d8c47f1

Following a rapid upstream release, this should get the tests in django-haystack#1117 to pass against Python 3.

Phyks mentioned this pull request Apr 12, 2019

Result cache consumes silly amounts of memory+time for large result counts #1219

Open

wetneb mentioned this pull request Jul 29, 2019

Queryset cache using sparse list update #1682

Open
