Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete all but last 3 auto-saves #2219

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

mael-fosso
Copy link
Contributor

Short description

Delete all but the last 3 autos-saves and renumber all affected versions to be continuous.

Proposed changes

  • we implemented a static method cleanup_auto_save(). This method clears all auto-save except the last three (per default).
  • Then we used the bulk_update() method to update the version of the pages.

Side effects

  • when we renumbering auto-save bulk_update() we previously got IntegrityErrors , probably because postgresql enforced unique constraints before the update statement was finished. We implemented the workaround using individual Data base statement in a For-loop, but revisiting this now we cannot reproduces this. We reverted to bulk_update() but please test this thoroughly to sure this doesn't break.

Resolved issues

Fixes: #2067


Pull Request Review Guidelines

@mael-fosso mael-fosso requested a review from a team as a code owner April 9, 2023 18:14
@codeclimate
Copy link

codeclimate bot commented Apr 9, 2023

Code Climate has analyzed commit db6622f and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 56.5% (50% is the threshold).

This pull request will bring the total coverage in the repository to 81.6% (-0.1% change).

View more on Code Climate.

@mael-fosso mael-fosso force-pushed the feature/delete-all-but-last-3-saves branch 2 times, most recently from f9bb858 to 3621193 Compare April 12, 2023 20:28
@MizukiTemma
Copy link
Member

Maybe a changelog entry? 🙂

@PeterNerlich
Copy link
Contributor

Right, that one thing that never comes to mind after getting everything to work...

@mael-fosso You can find all the information you need here in the docs, if you want to try it on your own. It's pretty straight forward!

@MizukiTemma
Copy link
Member

MizukiTemma commented Apr 14, 2023

A question for the issue itsself: max. 3 auto save versions in all over version history or between 2 nearest manual saves?

Example: publish - auto save - publish - auto save - publish - auto save - auto save - publish

Should all 4 auto saves remain there in this case because there is no more than 3 auto saves in each manual save interval?

The current code keeps up to 3 auto save versions in all over version history (the first auto save (version 2 in the example) will be deleted).
Though I don' think the current behaviour harms anything because all manual saves stay there, if max 3 in each interval is explicitly wished we have to change how to calculate the earlist and affected_versions.

@PeterNerlich
Copy link
Contributor

The current code keeps up to 3 auto save versions in all over version history

Yes, that is how we understood the issue description. In my opinion, this also makes most sense, since the motivation is to cut down on auto saves nobody cares about anymore that clutter the UI as well as DB space. I imagine auto saves to only be relevant when they are very recent, and mostly useless when sandwiched between old Draft or Published versions. I don't expect users to want to view or restore them in the latter case.

@membralala
Copy link
Contributor

Hey @mael-fosso and @PeterNerlich, thanks for tackling this issue! 🙂

Should all 4 auto saves remain there in this case because there is no more than 3 auto saves in each manual save interval?

I think one could argue, that an autosave which contains some relevant changes for a page, will be published or explicitly saved as draft at some point, therefore I guess it makes sense to leave only 3 auto-saves overall.

I know, that this was part of the issue requirements, but from a UX perspective I'm not sure about the change of version numbers as I can imagine, that users will refer to them and for a user it might not be very obvious now, when and why version numbers will be squashed. Not sure though, how much of an issue this is in reality.

Also I realized, that auto saves will not be reduced to a total of 3 when using the "Restore and Publish" button in the page revision view. I would have expected this behaviour here as well.

Copy link
Contributor

@membralala membralala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from my comment before, I tried my best to give some ideas and suggestions to improve readability of the code. However, I still stayed quite close to the implementation you provided and chance is high, that there are some django wizards in our team that have even a lot better ideas (which I'm then curious to learn from as well 👀 🧙 )

integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
Copy link
Member

@timobrembeck timobrembeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! 👍

In general, this works as expected, except I could reproduce these errors:

when we renumbering auto-save bulk_update() we previously got IntegrityErrors , probably because postgresql enforced unique constraints before the update statement was finished. We implemented the workaround using individual Data base statement in a For-loop, but revisiting this now we cannot reproduces this. We reverted to bulk_update() but please test this thoroughly to sure this doesn't break.

I also got the IntegrityError when there are gaps of non-deleted versions between a bunch of auto saves (e.g. [draft, autosave, draft, autosave, autosave, autosave, autosave]), I think this is due to the fact that the UniqueConstraint is not deferrable:

By default constraints are not deferred. A deferred constraint will not be enforced until the end of the transaction. An immediate constraint will be enforced immediately after every command.

https://docs.djangoproject.com/en/3.2/ref/models/constraints/#deferrable

We could solve this by setting deferrable=Deferrable.DEFERRED, but I think this can only be done globally (not only for this method), and comes with performance penalties, so I would suggest to leave it in a loop and don't use bulk_update in this case.

integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
@PeterNerlich
Copy link
Contributor

I also got the IntegrityError when there are gaps of non-deleted versions between a bunch of auto saves (e.g. [draft, autosave, draft, autosave, autosave, autosave, autosave])

Okay, so we still do have this issue.

We could solve this by setting deferrable=Deferrable.DEFERRED, but I think this can only be done globally (not only for this method), and comes with performance penalties, so I would suggest to leave it in a loop and don't use bulk_update in this case.

Well, it has to be set on the constraint, in the model. It would not affect any other UNIQUE constraints, only the one of the page-language-version combination. I'm unable to judge a) how great the performance impact on transactions involving a deferred constraint actually is and b) where in the cms / how often overall such a transaction occurs.

Assuming this is only happening every time someone drafts or publishes some page content, with every version renumbering bulk_action() call involving 4 DB entries on average (3 auto saves and one Draft/Published – all old autosaves have already been deleted at that point, and I think this is not including the newly generated one from clicking Draft/Publish), and if deferring the constraint would make all transactions take twice as long, then I think it might be worth it...? Obviously, I have no idea what any of those averages actually are for us. Also, I can't believe making 4 individual DB calls instead would be much more performant.

@timobrembeck
Copy link
Member

Well, it has to be set on the constraint, in the model. It would not affect any other UNIQUE constraints, only the one of the page-language-version combination. I'm unable to judge a) how great the performance impact on transactions involving a deferred constraint actually is and b) where in the cms / how often overall such a transaction occurs.

Assuming this is only happening every time someone drafts or publishes some page content, with every version renumbering bulk_action() call involving 4 DB entries on average (3 auto saves and one Draft/Published – all old autosaves have already been deleted at that point, and I think this is not including the newly generated one from clicking Draft/Publish), and if deferring the constraint would make all transactions take twice as long, then I think it might be worth it...? Obviously, I have no idea what any of those averages actually are for us. Also, I can't believe making 4 individual DB calls instead would be much more performant.

Yes I agree, for this specific use case the deferrable constraint would definitely be more performant than the loop, but I'm not completely sure about side effects on the other views.
I'd just argue that for the user, the difference between 4ms and 1ms is not really noticeable, so I think it's not worth it evaluating the performance issues with deferrable constraints at this point in time. At least with the loop, we know that the performance is not ideal, but also not a big problem.

@PeterNerlich PeterNerlich force-pushed the feature/delete-all-but-last-3-saves branch 2 times, most recently from 8512c2e to b94002b Compare May 14, 2023 10:42
@PeterNerlich PeterNerlich force-pushed the feature/delete-all-but-last-3-saves branch from 4d5025c to 1ca3751 Compare May 16, 2023 07:33
@timobrembeck
Copy link
Member

@mael-fosso @PeterNerlich what's the state of this PR? Please re-request review if all requested changes have been addressed.

@PeterNerlich
Copy link
Contributor

The state is: Waiting for Mael to finish his exams for this semester and to find time for continuing this. Mizuki said today their exam period was still going for two weeks, so I hope we'll be able to resume work here in 2–3 weeks.

@PeterNerlich PeterNerlich marked this pull request as draft July 19, 2023 10:16
@timobrembeck
Copy link
Member

@mael-fosso Do you want to continue working on this PR?

@MizukiTemma MizukiTemma force-pushed the feature/delete-all-but-last-3-saves branch from 1ca3751 to 990bdbb Compare November 24, 2023 15:35
@timobrembeck timobrembeck marked this pull request as ready for review February 1, 2024 13:26
Copy link
Member

@timobrembeck timobrembeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your changes! 👍
Even though this PR was stale for a long time, I think there isn't much work left to do to finish it now.
See my comments below for a few final touches.

integreat_cms/cms/models/abstract_content_translation.py Outdated Show resolved Hide resolved
integreat_cms/cms/models/abstract_content_translation.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
integreat_cms/cms/views/pages/page_form_view.py Outdated Show resolved Hide resolved
Copy link
Member

@MizukiTemma MizukiTemma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks generally good, I have only once concern: more than 3 auto save versions remain in the version history after a page was published by the bulk action, though I'm not sure whether to accept this or not.

How to reproduce:

  1. Trigger auto save n (>3 ) times in a brand new page
  2. Go back to the page tree without manual save
  3. Publish the page by bulk action
  4. See in the revision n auto save versions and the last public version

@JoeyStk
Copy link
Contributor

JoeyStk commented Feb 14, 2024

Oh, you're right. I think something similar also happens if you press the button "Publish auto save" in the revision view. I assume that also if you click on that button the saves should be squashed. Is my assumption correct or am I missing something?

@MizukiTemma
Copy link
Member

MizukiTemma commented Feb 14, 2024

@JoeyStk
Yes, the same behaviour in the revision.
For consistent behaviour it's better to adjust the bulk action and version restoration so there are max. auto save versions just like after normal manual save.

@MizukiTemma
Copy link
Member

MizukiTemma commented Feb 19, 2024

@JoeyStk
I tried cleanup_autosaves in the publishing bulk action and found the following problem:

  1. Create a brand new page and trigger more than 3 times auto save.
  2. See the translation status in the tree shows "up to date" (because the first auto save is not minor edit).
  3. Publish the page translation by bulk action
  4. See the auto save versions are squashed but the translation status in the tree shows "missing"

The cause is that the bulk action inherit "minor edit" status from the lastest auto save and the first auto save (not minor edit) is gone by squashing.

Possible solutions are:

  1. Leave it so (but it's confusing with the translation status in my opinion)
  2. No squashing for bulk action (not consistent with normal publishing)
  3. Change the behaviour of bulk action so it always publishes as non minor edit (Is change of existing behaviour acceptable?)

So far as I read in #2337 and #2341 it seems there was no specification that the status of minor edit must be inherited from the latest version, and the option 3 looks the easiest for me 🤔

I'm glad to hear your opinion :)

@PeterNerlich
Copy link
Contributor

PeterNerlich commented Feb 23, 2024

The cause is that the bulk action inherit "minor edit" status from the lastest auto save and the first auto save (not minor edit) is gone by squashing.

Possible solutions are:

  1. Leave it so (but it's confusing with the translation status in my opinion)
  2. No squashing for bulk action (not consistent with normal publishing)
  3. Change the behaviour of bulk action so it always publishes as non minor edit (Is change of existing behaviour acceptable?)

I think there is a fourth option, and it's the one that feels the most natural to me, because it keeps what I interpret to be the intention of the current system intact:

  1. Ensure in cleanup_autosaves that if any of the deleted versions were not a minor edit, the oldest kept version is set to be not a minor edit either.

This is not perfect, actually. We need to tweak this a bit, so cleanup_autosaves produces the proper results from any valid database state. We need to limit the search for a version not set as minor edit to the last chain of consecutive auto saves to delete that is also immediately adjacent to the oldest auto save to keep:

… → auto (major) → Manual → auto → auto → auto → auto

should, when publishing a new version, turn into

… → Manual → auto → auto → auto → Public

and not

… → Manual → auto (major) → auto → auto → Public

@MizukiTemma MizukiTemma force-pushed the feature/delete-all-but-last-3-saves branch from 7a8bafb to ec5b8ee Compare March 1, 2024 14:34
Co-authored-by: Mael Fosso <fosso@integreat-app.de>
Co-authored-by: Philip Popien <popien.philip@gmail.com>
Co-authored-by: Timo Ludwig <timo.ludwig@tuerantuer.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
❓ question Further information is requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clean-up version history
6 participants