Skip to content
This repository has been archived by the owner on Aug 26, 2022. It is now read-only.

bug 1269104: Refactor and optimize revision history #4297

Merged
merged 7 commits into from Jul 7, 2017

Conversation

jwhitlock
Copy link
Contributor

Optimize the database queries for the document's revision history page in a few ways:

  • Only request columns from the document, revision, and user tables used to display the revision history, reducing the data transferred and making the calls faster.
  • Iterate over pairs of revision IDs and the previous revision Ids, avoiding the revision.previous_revision work-around.
  • Add a load_previous=True parameter to the Jinja helper format_comment, which can be set False to avoid querying the database when previous_revision is None.

The effects of the changes are:

  • Reduces the number of DB queries from 15 to 10 (plus some waffle queries if cache is cold)
  • Speeds up the DB queries by 2.5x in the local environment,
  • Makes the UI more consistent by adding a "Previous" link for the first translation revision
  • Removes the need for the custom TransformManager and TransformQueryset for Revision, used to populate the previous_revision element.

@codecov-io
Copy link

codecov-io commented Jun 30, 2017

Codecov Report

Merging #4297 into master will increase coverage by 0.47%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #4297      +/-   ##
=========================================
+ Coverage   87.83%   88.3%   +0.47%     
=========================================
  Files         162     163       +1     
  Lines       10045   10205     +160     
  Branches     1371    1413      +42     
=========================================
+ Hits         8823    9012     +189     
+ Misses        993     967      -26     
+ Partials      229     226       -3
Impacted Files Coverage Δ
kuma/wiki/managers.py 77.92% <ø> (-1.1%) ⬇️
kuma/wiki/models.py 89.03% <ø> (-0.02%) ⬇️
kuma/wiki/templatetags/jinja_helpers.py 88.42% <100%> (-0.1%) ⬇️
kuma/wiki/views/list.py 76.84% <100%> (+6.13%) ⬆️
kuma/scrape/scraper.py 100% <0%> (ø) ⬆️
kuma/scrape/storage.py 100% <0%> (ø) ⬆️
kuma/scrape/sources/revision.py 100% <0%> (ø) ⬆️
kuma/scrape/sources/__init__.py 100% <0%> (ø) ⬆️
kuma/scrape/sources/document_history.py 100% <0%> (ø) ⬆️
kuma/scrape/sources/base.py 100% <0%> (ø) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6cb8620...d6730e8. Read the comment docs.

Minimize database queries for the document, revisions, and creators,
to only request data used in template rendering, and to reduce the total
number of DB queries from 15 to 10. This gives a 2.5x speed-up in the
local environment, and gets us a step closer to allowing scrapers.

New template context also adds "Previous" link for first translation
revision, when the English base revision is known, for a consistent UI.
The TransformManager was used for Revisions for the .transform method,
which supported adding .previous_revision to the objects in a queryset.
This is no longer used, so the standard Django manager can be used
again.
return revisions
# Load document with only fields for history display
try:
document = (Document.objects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use get_object_or_404 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I switched to get_object_or_404 with a longer queryset, and now Http404 is no longer imported.

@safwanrahman
Copy link
Contributor

@jwhitlock Thanks a lot for this PR. It surely needs a closer review!

@escattone escattone self-assigned this Jul 6, 2017
Copy link
Contributor

@escattone escattone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jwhitlock, the code is simpler, more readable, faster, and the new and improved tests provide 100% line coverage of the revisions view. What's not to like?!

I have two minor requests for you to consider.

# Load document with only fields for history display
doc_query = (Document.objects
.only('id', 'locale', 'slug', 'title',
'parent__slug', 'parent__locale')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding 'current_revision_id' to the list of arguments for only, it will remove the need for one additional query (according to my tests) due to this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, got too excited about the changes in 960d66f...

Keyword Arguments:
rev - The revision
previous_revision - The previous revision (default None)
load_previous - Try loading previous revision if None (default True)
"""
prev_rev = getattr(rev, 'previous_revision', previous_revision)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any other code within Kuma that adds a previous_revision attribute to a revision (and uses format_comment), so it seems safe to change this to prev_rev = previous_revision (or perhaps even better, remove this line and rename the previous_revision argument to prev_rev)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! More code to delete! 🎉

Pre-fetch document.current_revision_id to avoid loading it from the
database in wiki/list/revisions.html
Revision.previous_revision was optionally added by the TransformManager,
which has been removed.  Instead, use previous_revision directly.
Copy link
Contributor

@escattone escattone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jwhitlock!

@escattone escattone merged commit 127f92f into mdn:master Jul 7, 2017
@jwhitlock jwhitlock deleted the optimize_history_1269104 branch July 10, 2017 18:14
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants