Diff improvements (performance + `included_fields`) #776

RealOrangeOne · 2021-01-06T17:06:35Z

Description

Reduce the number of queries needed when calling `diff_against` (down to 0).

Also implement `included_fields`, so that only the listed fields are compared. With this new implementation of diffing, the other fields are not compared at all, and can be deferred from querying if needed.
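
As a rough illustration of the `included_fields` behaviour described above, here is a minimal pure-Python sketch. It is not the code from this PR: `Change`, `diff_fields`, and the `SimpleNamespace` records are hypothetical stand-ins for historical records, and only the restriction to the requested field names is modelled.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Iterable, List, Optional


@dataclass
class Change:
    """One differing field (hypothetical stand-in, not the library's class)."""
    field: str
    old: Any
    new: Any


def diff_fields(new_record: Any, old_record: Any, all_fields: Iterable[str],
                included_fields: Optional[Iterable[str]] = None) -> List[Change]:
    """Compare two records attribute by attribute.

    Only the names in included_fields are read; every other field is
    ignored, so its value would never need to be loaded.
    """
    fields = list(all_fields) if included_fields is None else list(included_fields)
    changes = []
    for name in fields:
        old_value = getattr(old_record, name)
        new_value = getattr(new_record, name)
        if old_value != new_value:
            changes.append(Change(name, old_value, new_value))
    return changes


# Hypothetical records standing in for two historical versions of one row.
old = SimpleNamespace(question="What?", pub_date="2021-01-01", votes=1)
new = SimpleNamespace(question="What now?", pub_date="2021-01-01", votes=5)

all_changes = diff_fields(new, old, ["question", "pub_date", "votes"])
only_votes = diff_fields(new, old, ["question", "pub_date", "votes"],
                         included_fields=["votes"])
```

With `included_fields=["votes"]`, the `question` change is never looked at, which is what allows the unlisted columns to be deferred.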

Related Issue

N/A for performance improvement.

`included_fields` was mentioned in #576 (comment), and as I was here, I added it. I can extract it into a separate PR if needed.

Motivation and Context

`model_to_dict` runs queries for M2M fields against the real tables. As the history tables don't use actual relationships, it's safe to use those for the lookups without additional queries. Removing this eliminates accidental queries on models with M2M fields, and stops loading them into memory unnecessarily.

Additionally, diffing involved getting real instances of the model using `.instance`, which was unnecessary. The diffing can use the history instances directly, saving a lot of computation. The fields list is pulled off the real model, which means `history_*` fields are still excluded correctly. `.instance` will sometimes do queries, so removing the need for it can further improve performance.

There are also additional tests to check that diffing is pure and doesn't touch the DB, to help prevent regressions.
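
The commit list for this PR mentions `assertNumQueries` assertions for this check. The same idea can be sketched without Django: below, `FakeDatabase`, `LazyRecord`, `assert_num_lookups`, and `changed_fields` are all hypothetical helpers that count lookups, so a diff that only reads already-loaded values can be shown to trigger none. It is an analogy under those assumptions, not the PR's test code.

```python
from contextlib import contextmanager


class FakeDatabase:
    """Counts lookups; a hypothetical stand-in for a real database connection."""

    def __init__(self):
        self.lookups = 0

    def fetch_related(self, key):
        self.lookups += 1
        return [f"{key}-related"]


class LazyRecord:
    """Record with plain loaded values plus one lazily fetched relation."""

    def __init__(self, db, **values):
        self._db = db
        self.__dict__.update(values)

    @property
    def tags(self):
        # Reading this attribute costs a lookup, like an M2M access would.
        return self._db.fetch_related(self.id)


@contextmanager
def assert_num_lookups(db, expected):
    """Fail if the block performs a different number of lookups than expected."""
    before = db.lookups
    yield
    performed = db.lookups - before
    assert performed == expected, f"{performed} lookups, expected {expected}"


def changed_fields(new, old, fields):
    """Names of the fields whose already-loaded values differ."""
    return [name for name in fields if getattr(new, name) != getattr(old, name)]


db = FakeDatabase()
old = LazyRecord(db, id=1, name="draft")
new = LazyRecord(db, id=1, name="final")

# Comparing only loaded values must not touch the fake database.
with assert_num_lookups(db, 0):
    result = changed_fields(new, old, ["id", "name"])
```

Had `changed_fields` been asked to compare `tags`, the counter would rise and the context manager would fail, which is the kind of regression such a check is meant to catch.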

How Has This Been Tested?

Tests have been added to cover changes
Has been run successfully against a live project's unit tests

Screenshots (if appropriate):

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

I have run the make format command to format my code
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
I have added my name and/or github handle to AUTHORS.rst
I have added my change to CHANGES.rst
All new and existing tests passed.

codecov · 2021-01-06T17:10:35Z

Codecov Report

Merging #776 (72199cb) into master (650213f) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #776      +/-   ##
==========================================
+ Coverage   97.69%   97.79%   +0.10%     
==========================================
  Files          19       19              
  Lines         999      999              
  Branches      151      151              
==========================================
+ Hits          976      977       +1     
  Misses         10       10              
+ Partials       13       12       -1

| Impacted Files | Coverage Δ |
|---|---|
| simple_history/models.py | `98.17% <100.00%> (+0.30%)` ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

RealOrangeOne · 2021-02-04T16:02:42Z

Turns out the current code also causes additional queries for excluded fields, which isn't good. Would be great to get this merged and released soon, as the performance improvements are vast!

jeking3 · 2021-09-16T02:49:07Z

If you could rebase on master and make sure all the new branches are covered by tests, I will re-review.

jeking3

I would leave the KeyError handling in place for invalid fields and add this test to the HistoricalRecordsTest:

+        with self.assertRaises(KeyError):
+            delta = new_record.diff_against(old_record, included_fields=["this_field_does_not_exist"])

Also rebase on current master, then if we get a clean run we'll be all set.
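
One way to get the `KeyError` behaviour requested here is to resolve each requested name through a dictionary of known fields, so that an unknown name fails immediately. This is a hedged sketch, not the merged implementation: `resolve_fields` and the `known` mapping are hypothetical.

```python
def resolve_fields(known_fields, included_fields=None):
    """Return the field names to diff, raising KeyError for unknown names.

    known_fields maps field name -> field object (any value in this sketch).
    """
    if included_fields is None:
        return list(known_fields)
    # Indexing the mapping, rather than filtering, makes a typo fail loudly.
    for name in included_fields:
        known_fields[name]
    return list(included_fields)


known = {"question": object(), "pub_date": object()}

selected = resolve_fields(known, included_fields=["question"])

try:
    resolve_fields(known, included_fields=["this_field_does_not_exist"])
    raised = False
except KeyError:
    raised = True
```

Silently skipping unknown names would instead return an empty diff for a misspelled field, which is easy to mistake for "nothing changed".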

RealOrangeOne marked this pull request as ready for review January 12, 2021 17:21

sebashwa mentioned this pull request Sep 14, 2021

Avoid m2m selects when diffing #881

Closed

sebashwa previously approved these changes Sep 14, 2021

jeking3 requested changes Sep 20, 2021

RealOrangeOne added 7 commits September 22, 2021 12:40

Simply check the history attributes rather than serializing.

ccdd0ef

This stops unnecessary queries from `model_to_dict` for M2M values

Add support for included_fields when diffing

9b40852

Add assertNumQueries assertions to ensure no queries are run

2b76768

Queries are always run when using a base model for some reason.

Document new included_fields argument to diff_against

28bb44c

Update changelog with #776 changes

b108f34

Reuse model_to_dict

ed7dd7d

This not only means comparison is done on primitive values, but as a side effect removes the extra query for base models

Fail hard if an unknown field is provided

72199cb

RealOrangeOne dismissed sebashwa’s stale review via 72199cb September 22, 2021 11:42

RealOrangeOne requested a review from jeking3 September 22, 2021 11:54

jeking3 approved these changes Sep 22, 2021

jeking3 merged commit 65c66b6 into jazzband:master Sep 22, 2021

jeking3 pushed a commit that referenced this pull request Sep 22, 2021

Update changelog with #776 changes

afd572b

RealOrangeOne deleted the diff-improvements branch September 22, 2021 18:44

dracos mentioned this pull request Nov 25, 2021

Fix diff_against on model with non-editable fields #923

Merged

dave-v mentioned this pull request May 24, 2022

diff_against fails when HistoricalRecords.excluded_fields are set #992

Closed
