Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix feed aggregator n+1 issue & cache the model data [#1261] #1274

Closed
wants to merge 6 commits into from

Conversation

marksweb
Copy link
Contributor

This addresses the regression in #1261 and adds caching for feed items.

Items are cached for 24 hours and invalidated on saving or deleting FeedType and FeedItem.

Prior to this the community page had 10 queries. The initial select_related brought it down to 7 and then the caching brings it down to 4.

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @marksweb. Sounds nice. Can I ask you to add some tests so we can avoid regressions in the future?

@marksweb
Copy link
Contributor Author

@carltongibson yeah very good point 👍

@marksweb
Copy link
Contributor Author

@carltongibson Ok I've added a test so now those 4 queries I mentioned are set in stone/tests

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @marksweb — Thanks for this, I've left some comments.

Q: is the select_related in ft.items() not enough to solve the N+1 issue?

Can we perhaps try a small commit adding that (with an appropriate assertNumQueries) and then look at caching separately? I think it might be better to cache the page — or the full template fragment, rather than each of the FeedItems fetches separately. 🤔

Comment on lines +134 to +163
@staticmethod
def cached_by_feed_type_id(feed_type_id):
"""
Return the feed items (from the cache if possible) for a given feed type ID

:param feed_type_id: the FeedType ID of the items to retrieve
:type feed_type_id: int
:return: the FeedItems related to the FeedType,
from the cache if possible, else from the database.
Will return None if not found
:rtype: Queryset or NoneType
"""
key = CACHED_FEEDITEMS_KEY.format(feed_type_id=feed_type_id)

items = cache.get(key)
if items:
if isinstance(items, NotFound):
return FeedItem.objects.none()
return items

items = FeedItem.objects.filter(
feed__feed_type_id=feed_type_id
).select_related('feed', 'feed__feed_type')

if not items:
cache.set(key, NotFound(), CACHED_FEEDITEMS_LENGTH)
return FeedItem.objects.none()

cache.set(key, items, CACHED_FEEDITEMS_LENGTH)
return items
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a table-level method to me. It should likely live on the manager, rather than being a staticmethod on the model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carltongibson Sure I can move to the manager. This was a matter of habit that I did it this way.

@@ -5,16 +5,18 @@

from .forms import FeedModelForm
from .models import APPROVED_FEED, Feed, FeedItem, FeedType
from .utils import get_feed_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would inline the get_feed_data() function next to where it is used, rather than hiding it away in the utils module.

Comment on lines -15 to +19
for ft in FeedType.objects.all():
feeds.append((ft, ft.items()[0:5]))
ctx = {'feedtype_list': feeds}
ctx = {'feed_data': get_feed_data()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: is the select_related in ft.items() not enough to solve the N+1 issue?

(Then why not just cache_page rather than the cached_by_feed_type_id calls, which we're still iterating over. The cache will be a bit faster than the DB sure, but it doesn't seem a good ROI here. 🤔)

Comment on lines +92 to +97
def test_community_index_number_of_queries(self):
""" Intended to prevent an n+1 issue on the community index view """
url = reverse('community-index')
with self.assertNumQueries(4):
self.client.get(url)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like to see the test closer on the new methods, rather than just round the view. It's hard to be sure exactly what's covered.

{% for feedtype, latest_feeds in feedtype_list %}
{% for feedtype, latest_items in feed_data %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Equally, we could cache this template fragment. 🤔

@marksweb
Copy link
Contributor Author

Can we perhaps try a small commit adding that (with an appropriate assertNumQueries) and then look at caching separately? I think it might be better to cache the page — or the full template fragment, rather than each of the FeedItems fetches separately. 🤔

@carltongibson I can certainly keep this simple for now and remove all the caching side of the changes. As I said, that drops the queries to 7 & I just saw an easy opportunity to save a few more.

I'll remove the caching and that can be addressed separately.

@carltongibson
Copy link
Member

Cool. Thanks.

I'm not against the caching... — I'm just not quite sure about the approach here (vs page or template cache, for example.) Better if we handle that separately.

🎁

@marksweb
Copy link
Contributor Author

@carltongibson Yeah that's no problem. I just don't do template caching outside of a cached loader so happy to adjust to suit the project. I get used to caching at that query level because data might get pulled out in various places.

Shall I raise a new PR on a clean branch to make the select_related change and leave this here for discussion on caching?

@carltongibson
Copy link
Member

OK, yeah, sure. Pull the smaller change to a fresh branch and we can rebase after that.

@marksweb marksweb changed the title Fix feed aggregator n+1 issue [#1261] Fix feed aggregator n+1 issue & cache the model data [#1261] Nov 30, 2022
@marksweb marksweb closed this Sep 20, 2023
@marksweb marksweb deleted the fix/1261/feeds-n+1 branch September 20, 2023 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants