This repository has been archived by the owner on Apr 14, 2019. It is now read-only.

WIP: add generic_dedupe module for tracking dedupe changes. #211

Closed
wants to merge 3 commits into from

Conversation

bcipolli
Collaborator

To do:

  • Unit tests

This is an alternate approach to #201 - tracking deduping, allowing manual deduping through the admin interface, and making deduping reversible.

The approach here is:

  • Add a true_model_id field for any entry that can be deduped.
  • When true_model_id is set, all foreign keys onto that model are automagically migrated to the "true model", and a DedupeLogEntry is added for each foreign key that was changed.
  • If true_model_id is changed, then the old DedupeLogEntry values are used to revert to the original model, and the new true_model_id is applied.
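
As a rough illustration of the set / reset / unset cycle described above, here is a minimal in-memory sketch. It is not the PR's code: the PR works on Django models, while this uses plain dataclasses. `City`, `Venue`, `Store`, `set_true_model`, and the fields on this `DedupeLogEntry` stand-in are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class City:
    # Hypothetical stand-in for a dedupable model.
    id: int
    name: str
    true_model_id: Optional[int] = None


@dataclass
class Venue:
    # Hypothetical stand-in for any model with a foreign key onto City.
    id: int
    city_id: int


@dataclass
class DedupeLogEntry:
    # Records one migrated foreign key so the change can be reverted later.
    venue_id: int
    old_city_id: int
    new_city_id: int


@dataclass
class Store:
    cities: Dict[int, City] = field(default_factory=dict)
    venues: Dict[int, Venue] = field(default_factory=dict)
    log: List[DedupeLogEntry] = field(default_factory=list)

    def set_true_model(self, city_id: int, true_id: Optional[int]) -> None:
        city = self.cities[city_id]
        # If a true model was already set, undo its effects first.
        if city.true_model_id is not None:
            self._revert(city_id)
        city.true_model_id = true_id
        if true_id is None:
            return  # unset: nothing to migrate
        # Migrate every foreign key onto the "true model", logging each change.
        for venue in self.venues.values():
            if venue.city_id == city_id:
                self.log.append(DedupeLogEntry(venue.id, city_id, true_id))
                venue.city_id = true_id

    def _revert(self, city_id: int) -> None:
        # Restore foreign keys from the log, then drop the entries used.
        kept = []
        for entry in self.log:
            if entry.old_city_id == city_id:
                self.venues[entry.venue_id].city_id = entry.old_city_id
            else:
                kept.append(entry)
        self.log = kept
```

The log is what makes the operation reversible: a row that already pointed at the "true model" before deduping has no log entry, so a revert leaves it alone.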

I've tested this manually and it seems to work (set / reset / unset). I will work on unit tests, but I wanted to get a basic version out there for discussion first.

The code isn't huge, but it's complicated and likely incomplete. The goal here is to start with an MVP that works for our core use-cases.

The models used in this code are:

  • DedupeMixin - adds the true_model_id field and provides a filtered_objects manager that gives a list of rows that are core data / not unwanted variants.
  • DedupeLogEntry - extends LogEntry and keeps track of info to make dedupe reversible.
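
One way to read the `filtered_objects` behaviour is sketched below in plain Python. It assumes that "core data" means rows whose `true_model_id` is unset, which this description does not state outright. `Row` and the free function `filtered_objects` are hypothetical stand-ins for the Django model and manager.

```python
from typing import Iterable, List, NamedTuple, Optional


class Row(NamedTuple):
    # Hypothetical stand-in for a model that uses DedupeMixin.
    id: int
    name: str
    true_model_id: Optional[int]


def filtered_objects(rows: Iterable[Row]) -> List[Row]:
    # Assumption: a row counts as core data when it has no "true model" set,
    # i.e. it has not been marked as a duplicate of another row.
    return [row for row in rows if row.true_model_id is None]
```

Under that assumption, a row marked as a variant of another city would drop out of the filtered list while staying in the underlying table, so the dedupe can still be reversed.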

Finally, some screenshots:

Setting "true model" to an alternate city:
image

Dedupe log entries are auto-created, showing the foreign keys changed by the manual setting above:
image

Unset the "true model", and ...
image

The foreign keys are reset (not shown here), and the log entries are deleted:
image

@bcipolli bcipolli changed the title from "ENH: add generic_dedupe module for tracking dedupe changes." to "WIP: add generic_dedupe module for tracking dedupe changes." Feb 27, 2016
@bcipolli bcipolli added this to the v0.2-data milestone Feb 27, 2016
@bcipolli bcipolli self-assigned this Feb 27, 2016
@bcipolli
Collaborator Author

I have made a develop branch, where we can merge features like these for further use and testing.

@bcipolli bcipolli closed this Feb 28, 2016
@bcipolli bcipolli mentioned this pull request Feb 28, 2016