Diff improvements (performance + included_fields) #776

Merged
merged 7 commits into from Sep 22, 2021

Conversation

RealOrangeOne
Member

@RealOrangeOne RealOrangeOne commented Jan 6, 2021

Description

Reduce the number of queries needed when calling diff_against (down to 0).

Also implement included_fields, so that only the listed fields are compared. In this new implementation of diffing, fields outside included_fields aren't compared at all, and can be deferred from querying if needed.
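To illustrate the idea of restricting a diff to a chosen set of fields, here is a minimal sketch in plain Python. It is not the code in this PR: records are modelled as dictionaries, and `Change` and `diff_records` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class Change:
    """One changed field between two records (hypothetical stand-in type)."""

    field: str
    old: Any
    new: Any


def diff_records(
    new: Dict[str, Any],
    old: Dict[str, Any],
    included_fields: Optional[Iterable[str]] = None,
    excluded_fields: Iterable[str] = (),
) -> List[Change]:
    """Compare two records field by field, using only values already in memory."""
    # With no explicit selection, compare every field of the newer record.
    fields = list(new) if included_fields is None else list(included_fields)
    excluded = set(excluded_fields)
    changes = []
    for name in fields:
        if name in excluded:
            continue
        # Plain dict lookups: an unknown field name raises KeyError.
        old_value, new_value = old[name], new[name]
        if old_value != new_value:
            changes.append(Change(name, old_value, new_value))
    return changes


old_record = {"question": "what's up?", "votes": 1, "pub_date": "2021-01-06"}
new_record = {"question": "what's up, man?", "votes": 2, "pub_date": "2021-01-06"}

# Only "question" is inspected; the change to "votes" is ignored.
changes = diff_records(new_record, old_record, included_fields=["question"])
```

Because fields outside the selection are never read, a caller is free not to load them in the first place, which is the property the description relies on when it mentions deferring.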

Related Issue

N/A for performance improvement.

included_fields was mentioned in #576 (comment), and since I was already working here, I added it. Can extract into a separate PR if needed.

Motivation and Context

model_to_dict will do queries for M2M fields against real tables. As the history tables don't use actual relationships, their values can be read directly without additional queries. Removing the model_to_dict call removes accidental queries on models with M2M fields, and stops loading them into memory unnecessarily.

Additionally, diffing involved getting real instances of the model using .instance, which was unnecessary. The diffing can use the history instances directly, saving a lot of computation. The list of fields is pulled off the real model, which means history_* fields are still excluded correctly. .instance will sometimes do queries, so removing the need for it can further improve performance.

There are also additional tests to check that diffing is pure and doesn't touch the DB, to help prevent regressions.
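As a rough illustration of what "pure" means here, the sketch below uses a fake database that counts lookups. All names (`FakeDB`, `HistoryRow`, `pure_diff`) are hypothetical stand-ins, not part of django-simple-history. In a real Django test suite, the usual tool for this kind of check is `assertNumQueries(0)` from `django.test.TestCase`.

```python
class FakeDB:
    """Stand-in for a database connection that counts every lookup."""

    def __init__(self):
        self.queries = 0

    def fetch(self, pk):
        self.queries += 1
        return {"pk": pk}


class HistoryRow:
    """Hypothetical history record: raw column values are already in memory."""

    def __init__(self, db, **values):
        self._db = db
        self.values = values

    @property
    def instance(self):
        # Simulates rebuilding the real model instance, which may hit the DB.
        self._db.fetch(self.values["id"])
        return dict(self.values)


def pure_diff(new, old, fields):
    """Diff two history rows using only their in-memory values."""
    return {
        name: (old.values[name], new.values[name])
        for name in fields
        if old.values[name] != new.values[name]
    }


db = FakeDB()
old_row = HistoryRow(db, id=1, question="what's up?", place_id=7)
new_row = HistoryRow(db, id=1, question="what's up, man?", place_id=7)

delta = pure_diff(new_row, old_row, ["question", "place_id"])
queries_after_diff = db.queries  # the diff itself performed no lookups

_ = new_row.instance  # going through .instance costs a lookup in this sketch
queries_after_instance = db.queries
```

Comparing the counter before and after the diff is the same shape of assertion a regression test can make against the real query log.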

How Has This Been Tested?

  • Tests have been added to cover changes

  • Has been run against live project unittests successfully

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have run the make format command to format my code
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have added my name and/or github handle to AUTHORS.rst
  • I have added my change to CHANGES.rst
  • All new and existing tests passed.

@codecov

codecov bot commented Jan 6, 2021

Codecov Report

Merging #776 (72199cb) into master (650213f) will increase coverage by 0.10%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #776      +/-   ##
==========================================
+ Coverage   97.69%   97.79%   +0.10%     
==========================================
  Files          19       19              
  Lines         999      999              
  Branches      151      151              
==========================================
+ Hits          976      977       +1     
  Misses         10       10              
+ Partials       13       12       -1     
Impacted Files              Coverage Δ
simple_history/models.py    98.17% <100.00%> (+0.30%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@RealOrangeOne RealOrangeOne marked this pull request as ready for review January 12, 2021 17:21
@RealOrangeOne
Member Author

Turns out the current code also causes additional queries for excluded fields, which isn't good. Would be great to get this merged and released soon, as the performance improvements are vast!

sebashwa
sebashwa previously approved these changes Sep 14, 2021
@jeking3
Contributor

jeking3 commented Sep 16, 2021

If you could rebase on master and make sure all the new branches are covered by tests, I will re-review.

Contributor

@jeking3 jeking3 left a comment

I would leave the KeyError handling in place for invalid fields and add this test to the HistoricalRecordsTest:

+        with self.assertRaises(KeyError):
+            delta = new_record.diff_against(old_record, included_fields=["this_field_does_not_exist"])

Also rebase on current master, then if we get a clean run we'll be all set.
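The behaviour requested in this review can be sketched outside Django as follows. This is only an illustration under stated assumptions: `diff_fields` is a hypothetical stand-in for diff_against that reads values from dictionaries, so an unknown field name surfaces as a KeyError. It says nothing about how the library itself raises the error.

```python
import unittest


def diff_fields(new, old, included_fields):
    """Hypothetical stand-in for diff_against over plain dictionaries."""
    return {
        name: (old[name], new[name])  # unknown names raise KeyError here
        for name in included_fields
        if old[name] != new[name]
    }


class DiffInvalidFieldTest(unittest.TestCase):
    def setUp(self):
        self.old_record = {"question": "what's up?"}
        self.new_record = {"question": "what's up, man?"}

    def test_unknown_included_field_raises(self):
        # Mirrors the suggested test: an invalid field must not pass silently.
        with self.assertRaises(KeyError):
            diff_fields(
                self.new_record,
                self.old_record,
                included_fields=["this_field_does_not_exist"],
            )

    def test_known_field_is_diffed(self):
        delta = diff_fields(self.new_record, self.old_record, ["question"])
        self.assertEqual(delta, {"question": ("what's up?", "what's up, man?")})
```

Failing loudly on a misspelled field name keeps a typo in included_fields from being mistaken for "no changes".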

This stops unnecessary queries from `model_to_dict` for M2M values
Queries are always run when using a base model for some reason.
This not only means comparison is done on primitive values, but as a side effect removes the extra query for base models
@jeking3 jeking3 merged commit 65c66b6 into jazzband:master Sep 22, 2021
jeking3 pushed a commit that referenced this pull request Sep 22, 2021
@RealOrangeOne RealOrangeOne deleted the diff-improvements branch September 22, 2021 18:44
3 participants