Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track history of M2M fields #399

Closed
rossmechanic opened this issue May 31, 2018 · 21 comments
Closed

Track history of M2M fields #399

rossmechanic opened this issue May 31, 2018 · 21 comments

Comments

@rossmechanic
Copy link
Collaborator

Initial work done in #370 by @avalanchy. Open to discussion here of how we might want to track history on M2M fields and see if that's a feature we want to accept.

@atodorov
Copy link
Contributor

atodorov commented Jul 8, 2018

@rossmechanic I like the m2m_fields parameter to HistoricalRecords from the pull request.

I am not certain I understand self.use_last_historical_record() though.

Re: help-wanted - I can cherry-pick this PR and work on additional enhancements if you and @avalanchy don't have the time for it. FYI this is a feature that I need implemented.

@avalanchy
Copy link

avalanchy commented Jul 9, 2018

@atodorov: I am very happy that you want to work on it and I will be glad to help you with code review and even testing.

The use_last_historical_record could be considered as a separate patch. It's a syntactic sugar for an existing feature self.skip_history_when_saving = True. With this function you don't have to set the attribute, and delete it after you are done.

When it comes to your lack of understanding: I recall the moment when I designed this feature, back then I assumed (but didn't implement yet) that modifying m2m relation will result in creating a new historical entry (Example 1).

Example 1: Single entry, then single attachment

Adding new Attachment to an Entry (HE stands for HistoricalEntry, E_A is a trough model).

Entry.objects.create()

In database we have:

E1
HE1

Adding an attachment should result with the new version because it's not anymore the same entry, but now it's entry with an attachment.

a = Attachments.objects.create()
entry.attachemnts.add(a)
E1 - E1_A1 - A1
HE1
HE2 - HE2_A1 - A1

That seems fine right? Adding an attachment created a full snapshot and 2 historical objects - regular and trough.

Example 2: Creating a new instance with m2m fields already

Suppose we want to create an entry with attachments from the very beginning

e = Entry.objects.create()
a = Attachment.objects.create()
e.attachments.add(a)
E1 - E1_A1 - A1
HE1
HE2 - HE2_A1 - A1

We ended up with two historical entries, but in fact, we did not want to have a historical entry without attachment.

To avoid that situation I introduced use_last_historical_record.

e = Entry.objects.create()
a = Attachment.objects.create()
with self.use_last_historical_record():
    e.attachments.add(a)
E1 - E1_A1 - A1
HE1 - HE1_A1 - A1

This way we can create m2m relation without adding an extra historical object and end up with a single snapshot.

I think that the operation is complicated and the interface does not explain it, so I think that figuring out something more obvious will be great.

@atodorov
Copy link
Contributor

atodorov commented Jul 9, 2018

Re: use_last_historical_record:

I think that the operation is complicated and the interface does not explain it, so I think that figuring out something more obvious will be great.

@avalanchy I think this should be the default behavior (not creating an intermediate history record) but how would we accomplish that without an extra function I have no idea. I agree this can be discussed separately though. For my use-case I won't care much if there was an extra record in the database or not.

@rossmechanic, @treyhunner - how should we proceed with the feature ? What sort of feedback and discussion would you like to see before this is merged ?

@rossmechanic
Copy link
Collaborator Author

I personally think that including the extra intermediary record makes the most sense, as that is actually what happens (The entry is saved, and then the attachment is added to the record).

So from what I can tell, this m2m functionality would only be for m2m_fields without a custom through table correct?

Also, @avalanchy in your diagrams, you have HE2 - HE2_A1 - A1, which has a historical record HE2_A1 pointing to another historical record. How would that be possible?

From a high level, I like @avalanchy 's approach, and I'm happy for @atodorov to put up a PR that we can discuss so that we get to functionality that makes the most sense. I would leave out the use_last_historical_record functionality from this PR and we can add it as a follow-on if it's something that people want. How does that sound?

@avalanchy
Copy link

@atodorov: How about to define custom trough table with addtional history_date field that has auto_now_add=True. This way we could update historical model and historical trough model independently.

The disadvantage of this solution is that querying for objects at a specific point in time will be more complicated. Also admin panel will require adjustments.

@atodorov
Copy link
Contributor

How about to define custom trough table with addtional history_date field that has auto_now_add=True. This way we could update historical model and historical trough model independently.

I am not sure I follow this. Please give an example.

OTOH I do have a current project where for every m2m relationship we have the through model defined manually. I don't really like that since Django is supposed to take care of it and not expose that to the developer.

@nessalc
Copy link

nessalc commented Jan 8, 2019

I agree with @atodorov that Django is supposed to take care of through models, but even if I were to create the model as @avalanchy suggests, I'm not sure how it would work, either. Could you explain this further?

@avalanchy, or anyone else: have you continued to work on the code, or is the "state-of-the-art" at this point still what's in 5436bac?

@vhelke
Copy link

vhelke commented Aug 8, 2019

What's the current suggested way of tracking history for M2M relationships?

@AdamBolfik
Copy link

I need this functionality and am going to have to implement it one way or another pretty soon. I'd like to make it as a contribution to this project, though...

Can we get a final decision on this?

@bmedx
Copy link
Contributor

bmedx commented Dec 18, 2019

@rossmechanic we are also in the situation where this functionality is suddenly critical in the near term. I know this feature has been batted around for a few years, we have the resources to implement it but want to make sure our changes are something you will take back upstream. We would like to build off of @avalanchy 's work, exclusively for models that aren't using through, and without the use_last_historical_record functionality.

Aside from the TODOs in that PR, tests, admin functionality, etc. and meeting the coding standards for the project is there anything else you would like to see in a PR to get this working?

@bmedx
Copy link
Contributor

bmedx commented Feb 5, 2020

@rossmechanic I have a working implementation in the above fork PR. ☝️ I'm writing up documentation for the current state of things, but would appreciate it if one of the maintainers could take a look. I'd especially be grateful for which edge cases would make sense to add further tests for.

Things to note:

  • Django many-to-many fields are hard coded to cascade deletes, there is no way to override them short of monkey patching, so I had to substantially deviate from @avalanchy 's PR.
  • The implementation keeps the m2m rows in lock step with history rows. If you create a history row, every m2m row gets saved off with it. If you change the m2m, you get a new history row. This obviously has some significant performance implications, especially in cases where there are multiple m2m's with history. The is done in the interest of following Django's decision to treat the m2m relations as a field and to centralize history access.
  • Some second order features aren't updated to include m2ms yet (bulk history create, revert to previous history state), some are (diff).
  • Through models aren't supported, since they have the ability to maintain their own history.

@Aameer
Copy link

Aameer commented Mar 10, 2020

Hey Guys, any ETA on when/if this could be merged into master? tracking M2M is one of the core requirements for me to use django-simple-history in my project and judging from this thread it seems to be the case for many others as well. Adding this feature would certainly boost the usage of the library. Thanks.

@loctimize
Copy link

Same here. Any update would be welcome. Thanks

@Aameer
Copy link

Aameer commented Mar 30, 2020

Hey Guys, any ETA on when/if this could be merged into master? tracking M2M is one of the core requirements for me to use django-simple-history in my project and judging from this thread it seems to be the case for many others as well. Adding this feature would certainly boost the usage of the library. Thanks.

for now I ended up using django-reversion

@vesnikos
Copy link

not be able to track M2M fields also breaks HistoricalChanges.diff_against function.

@jsnb
Copy link

jsnb commented Dec 9, 2020

Following up on a work around proposed here: #16 (comment)

Does this mean, that the suggested solution "custom through model", consists in creating a custom "through model" as explained here for instance https://docs.djangoproject.com/en/1.10/topics/db/models/#intermediary-manytomany and configure the historical records both on containing model (the model that declares the ManyToMany) and the custom through model ?

from django.db import models
from simple_history.models import HistoricalRecords

class Person(models.Model):
    name = models.CharField(max_length=128)

    def __str__(self):              # __unicode__ on Python 2
        return self.name

class Group(models.Model):
    name = models.CharField(max_length=128)
    members = models.ManyToManyField(Person, through='Membership')
    history = HistoricalRecords()

    def __str__(self):              # __unicode__ on Python 2
        return self.name

class Membership(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    group = models.ForeignKey(Group, on_delete=models.CASCADE)
    date_joined = models.DateField()
    invite_reason = models.CharField(max_length=64)
    history = HistoricalRecords()

How would be the manner to query the Group history and retreive the members ?

While not ideal I find this solution would be totally acceptable only it doesn't seem to record all the history on the through table. When calling group_instance.members.add(person_instance) the history table fails to record the "create" action where as group_instance.members.remove(person_instance) does record the deletes.
Wondering if there is a solution out there that doesn't involve creating the instance on the through table directly (eg Membership.objects.create(person=person_instance, group=group_instance)

@jeking3
Copy link
Contributor

jeking3 commented Sep 3, 2021

I need support for this so I will be evaluating the different solutions presented.

@jeking3
Copy link
Contributor

jeking3 commented Sep 7, 2021

After looking into solutions and doing some searching (found https://stackoverflow.com/questions/37650362/understanding-manytomany-fields-in-django-with-a-through-model), I decided to remove ManyToManyFields in my model, version the "through" table, and provide the necessary glue in either end to return QuerySets that make sense. Some syntactic sugar is lost this way but it was pretty easy to do in my models.

@joaquimds
Copy link

joaquimds commented Oct 26, 2021

I have kept the ManyToManyFields in my models, and created a Through model with a HistoricalRecords field for each one. This works when creating the Through instances explicitly, but not when using the add or set methods on the ManyToManyField. This is because internally django uses a bulk_create to perform these operations.

(See django.db.models.fields.related_descriptors, ManyRelatedManager, _add_items.)

This means the save() method is not called on the Through instance, so no record is added to the historical table.

I have fixed this by adding listeners to the m2m_changed signal:

m2m_changed.connect(save_m2m_history, sender=MyModel)

def save_m2m_history(my_model_instance, m2m_action, **kwargs):
    """
    Create historical records of m2m instances
    """
    # Only create records after the m2m field is updated (ignore "pre" signals)
    if m2m_action.startswith("post"):
        for m2m_instance in my_model_instance.m2m_field.all():
            m2m_instance.save()

The benefit of doing it this way (as opposed to @jeking3's approach) is that it's easier to use many-to-many form components, e.g. ModelMultipleChoiceField. I think!

ddabble added a commit to MAKENTNU/web that referenced this issue Apr 6, 2022
`django-simple-history` doesn't currently track changes to many-to-many relationships - which was the main point of this commit
(i.e. to track the changes to `own_permissions`) - but once jazzband/django-simple-history#399 is resolved
(likely through jazzband/django-simple-history#932), it will.
@ddabble
Copy link
Member

ddabble commented Sep 28, 2022

Was this resolved by #932? Or is merging #1042 necessary as well? In the former case, this issue can be closed 🙂

@ddabble
Copy link
Member

ddabble commented Nov 21, 2022

Closing, as the issue seems resolved after both the previously mentioned PRs have been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests