New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix feed aggregator n+1 issue & cache the model data [#1261] #1274
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @marksweb. Sounds nice. Can I ask you to add some tests so we can avoid regressions in the future?
@carltongibson yeah very good point 👍 |
@carltongibson Ok I've added a test so now those 4 queries I mentioned are set in stone/tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @marksweb — Thanks for this, I've left some comments.
Q: is the select_related in ft.items() not enough to solve the N+1 issue?
Can we perhaps try a small commit adding that (with an appropriate assertNumQueries) and then look at caching separately? I think it might be better to cache the page — or the full template fragment, rather than each of the FeedItems fetches separately. 🤔
@staticmethod | ||
def cached_by_feed_type_id(feed_type_id): | ||
""" | ||
Return the feed items (from the cache if possible) for a given feed type ID | ||
|
||
:param feed_type_id: the FeedType ID of the items to retrieve | ||
:type feed_type_id: int | ||
:return: the FeedItems related to the FeedType, | ||
from the cache if possible, else from the database. | ||
Will return None if not found | ||
:rtype: Queryset or NoneType | ||
""" | ||
key = CACHED_FEEDITEMS_KEY.format(feed_type_id=feed_type_id) | ||
|
||
items = cache.get(key) | ||
if items: | ||
if isinstance(items, NotFound): | ||
return FeedItem.objects.none() | ||
return items | ||
|
||
items = FeedItem.objects.filter( | ||
feed__feed_type_id=feed_type_id | ||
).select_related('feed', 'feed__feed_type') | ||
|
||
if not items: | ||
cache.set(key, NotFound(), CACHED_FEEDITEMS_LENGTH) | ||
return FeedItem.objects.none() | ||
|
||
cache.set(key, items, CACHED_FEEDITEMS_LENGTH) | ||
return items |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a table-level method to me. It should likely live on the manager, rather than being a staticmethod on the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carltongibson Sure I can move to the manager. This was a matter of habit that I did it this way.
@@ -5,16 +5,18 @@ | |||
|
|||
from .forms import FeedModelForm | |||
from .models import APPROVED_FEED, Feed, FeedItem, FeedType | |||
from .utils import get_feed_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would inline the get_feed_data()
function next to where it is used, rather than hiding it away in the utils
module.
for ft in FeedType.objects.all(): | ||
feeds.append((ft, ft.items()[0:5])) | ||
ctx = {'feedtype_list': feeds} | ||
ctx = {'feed_data': get_feed_data()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: is the select_related in ft.items()
not enough to solve the N+1 issue?
(Then why not just cache_page
rather than the cached_by_feed_type_id
calls, which we're still iterating over. The cache will be a bit faster than the DB sure, but it doesn't seem a good ROI here. 🤔)
def test_community_index_number_of_queries(self): | ||
""" Intended to prevent an n+1 issue on the community index view """ | ||
url = reverse('community-index') | ||
with self.assertNumQueries(4): | ||
self.client.get(url) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd like to see the test closer on the new methods, rather than just round the view. It's hard to be sure exactly what's covered.
{% for feedtype, latest_feeds in feedtype_list %} | ||
{% for feedtype, latest_items in feed_data %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Equally, we could cache this template fragment. 🤔
@carltongibson I can certainly keep this simple for now and remove all the caching side of the changes. As I said, that drops the queries to 7 & I just saw an easy opportunity to save a few more. I'll remove the caching and that can be addressed separately. |
Cool. Thanks. I'm not against the caching... — I'm just not quite sure about the approach here (vs page or template cache, for example.) Better if we handle that separately. 🎁 |
@carltongibson Yeah that's no problem. I just don't do template caching outside of a cached loader so happy to adjust to suit the project. I get used to caching at that query level because data might get pulled out in various places. Shall I raise a new PR on a clean branch to make the |
OK, yeah, sure. Pull the smaller change to a fresh branch and we can rebase after that. |
This addresses the regression in #1261 and adds caching for feed items.
Items are cached for 24 hours and invalidated on saving or deleting
FeedType
andFeedItem
.Prior to this the community page had 10 queries. The initial
select_related
brought it down to 7 and then the caching brings it down to 4.