Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize and simplify filterable list preview rendering #8450

Merged
merged 14 commits into from
Jun 14, 2024

Conversation

chosak
Copy link
Member

@chosak chosak commented May 31, 2024

This PR rewrites the way that our filterable list results are rendered, in an attempt to simplify and optimize. The short summary of what this does is:

  • Optimizes the way we paginate results from Elasticsearch to avoid pulling back a huge list of page IDs
  • Implements a serializer for filterable list results so that the Jinja is dealing with basic data types
  • Removes the need for our custom "post preview cache"

Details

Filterable list queries in the existing implementation before this PR include the following steps:

  1. We do an Elasticsearch query to retrieve pages matching the user's filtering criteria.
  2. We read back the full list of matching page IDs from ES (this could be thousands of IDs for e.g. the activity log or blog).
  3. We then paginate this giant list and retrieve only a subset (one results page's worth) of Page objects from the Wagtail database.
  4. The results template iterates over each page and renders a post preview.
  5. For each post preview, our Jinja logic makes various additional DB queries to pull back page authors, categories, and tags.
  6. To mitigate the above, we have an old "post preview cache" built on custom code that caches the HTML of the post preview itself.

Because our Django caches are implemented in the database, this means that at best, even after retrieving the pages from the DB, each post preview render requires an additional DB query to pull out the cached HTML. On a page like the blog with 25 results, that means 25 DB queries just to pull those back.

If you start a fresh local Django development server with no caching (our default developer setup),using Django Debug Toolbar (ENABLE_DEBUG_TOOLBAR=1 ./runserver.sh), and visit http://localhost:8000/about-us/blog/, on the main branch (in an incognito window) we currently make 422 DB queries. If you enable local caching (ENABLE_DEBUG_TOOLBAR=1 ENABLE_DEFAULT_CACHE=1 ENABLE_POST_PREVIEW_CACHE=1 ./runserver.sh) and load that page twice (to fill the caches), we then make 105 DB queries for that page.

With these changes, we do the following instead:

  1. We do an Elasticsearch query to retrieve pages matching the user's filtering criteria.
  2. We paginate the results in Elasticsearch and pull back only the page IDs we need for a single page of results.
  3. We retrieve only these Page objects from the Wagtail database, and use prefetch_related to pull back all page authors, categories, and tags at once.
  4. We use a new DRF serializer class to serialize these Page objects into basic Python types.
  5. The results template then iterates over these basic objects and renders them without having to do any additional Python logic or DB queries. We no longer need a post preview cache.

On this branch, using ENABLE_DEBUG_TOOLBAR=1 ./runserver.sh, visiting http://localhost:8000/about-us/blog/ makes 36 queries. If you enable the default cache using ENABLE_DEBUG_TOOLBAR=1 ENABLE_DEFAULT_CACHE=1 ./runserver.sh and load the page twice, we make only 33 queries.

(I will note that our basic page rendering of something simple like http://localhost:8000/ takes ~17 queries or so. A bunch of these (~7) are unavoidable Wagtail queries for page lookup. We add lookups for our sharing site, page banners, feature flags, breadcrumbs, and translations, depending on page type. There might be more to do to squeeze these in future should we want to look more at performance, but we definitely have a few places like FLs where we routinely make 100+ queries per page.)

How to test this PR

The best way to test this is to compare various filterable list pages like the blog with production, as there should be no user-facing changes. A good test URL is the events archive page where we list events with both a map or an image.

Notes and todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code follows the standards laid out in the CFPB development guidelines
  • Future todos are captured in comments and/or tickets
  • Project documentation has been updated, potentially one or more of:

Comment on lines +214 to +222
if self._count is None:
self._count = self.search_obj.count()
return self._count
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes in this file are to cache the ES results count where possible. Currently we make multiple calls to retrieve the count but we don't need to do that unless the filtering criteria changes.

@@ -13,6 +13,19 @@
from v1.util.ref import get_category_children


class SearchResultsPaginator(Paginator):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic lets us move the pagination upstream into the Elasticsearch query instead of pulling back the full list of page IDs from ES and paginating that. I dug into the django-opensearch-dsl code and there doesn't seem to be anything built-in to support this.

Adding prefetch_related here also lets us prefetch the page related objects to avoid the N+1 queries problem.

cfgov/v1/serializers.py Show resolved Hide resolved
object_list = (
object_list.to_queryset()
.specific()
.live()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willbarton calling this out specifically as a last-minute way to filter out any not-live pages that may still be in the ES index for some reason.

Although the current code does try to filter out not-live pages as part of the ES query, it doesn't actually check the status at the place those results get converted to Page objects.

@csebianlander
Copy link
Contributor

Everything looks OK to me here but I'm definitely not the best one to do a more in-depth review of the code!

Copy link
Member

@wpears wpears left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make sense to me, not really steeped in the intricacies enough to have a specific opinion on some of the details. I did check that the prefetch_related bit is being used correctly everywhere (no surprises that it is)

@willbarton
Copy link
Member

willbarton commented Jun 5, 2024

We use a new DRF serializer class to serialize these Page objects into basic Python types.

I'm curious about this choice — after prefetching related, what does this step do for us?

My impression from reading the PR is that this moves the logic around page type out of the preview template, is that the intention?

@chosak
Copy link
Member Author

chosak commented Jun 5, 2024

We use a new DRF serializer class to serialize these Page objects into basic Python types.

I'm curious about this choice — after prefetching related, what does this step do for us?

My impression from reading the PR is that this moves the logic around page type out of the preview template, is that the intention?

Yes, that's correct. This work was motivated by my experience using DRF on TCCP.

There are benefits and drawbacks, but the main idea is to better separate the 3 pieces of code: fetching data from the DB, preparing that data for rendering, actually rendering.

Django's template language is powerful and lets you do things like {% for category in page.categories.all() %}{{ category.name }}{% endfor %} in the template; this reads simpler but lately I've been feeling like this actually hides complexity and introduces challenges with testing and optimizing. In the given example, the category objects aren't actually fetched from the DB until the template is rendered, and the actual thing we want to render (the category name) similarly isn't queried until that time. In order to test this code we have to put data in the DB and render the entire template, verifying that the names appear properly. There's a tight coupling between the backend database and the template code.

By pre-serializing the objects, we can pre-prepare the list of category names (if that's all we need) and the template can be {% for category_name in category_names %}{{ category_name }}{% endfor %}. Someone editing the template only needs to know that the template is receiving a list of category name strings, and doesn't have to worry about DB queries or backend code. This makes the template easier to test. We can test the serializer independently, verifying that it converts the DB objects to simple types (I still need to write those tests for this PR).

Conceptually this also makes it easier to migrate to or even just think about a more loosely coupled backend API + frontend rendering model, where the backend produces some data and the frontend renders it, and neither need to worry about each other.

@willbarton
Copy link
Member

Conceptually this also makes it easier to migrate to or even just think about a more loosely coupled backend API + frontend rendering model, where the backend produces some data and the frontend renders it, and neither need to worry about each other.

That's what I was wondering about, and I love this.

@willbarton
Copy link
Member

@willbarton if you think this is a promising approach I will finish it out but I wanted to run it by you before completing.

@chosak I definitely think so!

@chosak chosak force-pushed the optimize/filterable-results branch 3 times, most recently from 88dbd7f to 2ecccd3 Compare June 12, 2024 19:20
@chosak
Copy link
Member Author

chosak commented Jun 12, 2024

This PR is now ready for review. I've added full test coverage and done more regression testing against various pages to ensure that all functionality continues to work. The coverage failure is because the PR changed settings/local.py, but this isn't a file that we had coverage for before.

I want to restate known user-facing changes here:

  1. Currently filterable list pages link to their own filtered URL e.g. href="/about-us/blog/?topics=foo. With this change we only link to the relative filtered URL e.g. href="?topics=foo". This was already mentioned to @csebianlander above.
  2. Currently post preview tags render in a random order -- compare the order of tags as entered in the Wagtail editor and the order when rendered (example). This PR changes them to be always alphabetical (and does the same for categories as well). This also impacts the order of categories/tags when rendered in RSS feeds.

@chosak chosak marked this pull request as ready for review June 12, 2024 19:38
Copy link
Member

@willbarton willbarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the pattern of using DRF serializers to pull logic out of our templates. We should do more of this ❤️.

This works as advertised in all my testing!

cfgov/v1/feeds.py Show resolved Hide resolved
try:
return Menu.objects.get(language=default_language)
except Menu.DoesNotExist:
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes you have here are fine, and no action is necessary on this comment, but I love nothing about this function. I wonder if there's a way to avoid repeating the try/except.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, thanks to the code snippet you shared with me and I've pushed as a commit to this PR!

@chosak chosak force-pushed the optimize/filterable-results branch from c01dd46 to b68b70c Compare June 14, 2024 15:20
@chosak chosak added this pull request to the merge queue Jun 14, 2024
Merged via the queue into main with commit 942be70 Jun 14, 2024
16 of 17 checks passed
@chosak chosak deleted the optimize/filterable-results branch June 14, 2024 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants