Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend revision): add clean-up note revisions job #5349

Open
wants to merge 18 commits into
base: develop
Choose a base branch
from

Conversation

yamashush
Copy link
Contributor

@yamashush yamashush commented Jan 14, 2024

Component/Part

backend revision

Description

This PR adds note revisions clean-up job.

A new config option HD_REVISION_RETENTION_DAYS that sets the number of days a revision should be kept. A regular nest cronjob will clean all note revisions that are due for deletion.
If the config option is not set or set to 0, all revisions will be kept forever.

Steps

  • Added implementation
  • Added / updated tests
  • Added / updated documentation
  • I read the contribution documentation and
    made sure that:
    • My commits are signed-off to accept the DCO.
    • This PR targets the correct branch: master for 1.x & docs, develop for 2.x

Related Issue(s)

@yamashush yamashush changed the title [wip] feat(backend revision): clean-up note revisions [wip] feat(backend revision): add clean-up note revisions job Jan 16, 2024
@yamashush yamashush marked this pull request as ready for review January 16, 2024 13:48
@yamashush yamashush changed the title [wip] feat(backend revision): add clean-up note revisions job feat(backend revision): add clean-up note revisions job Jan 16, 2024
@ErikMichelson ErikMichelson added the type: feature Adds or requests new functionality label Jan 17, 2024
@ErikMichelson ErikMichelson added this to the Version 2.0 milestone Jan 17, 2024
@mrdrogdrog
Copy link
Member

Please rebase your branch to the latest version of develop to fix the CI.

@yamashush
Copy link
Contributor Author

Please rebase your branch to the latest version of develop to fix the CI.

Thank you for letting me know.
After rebase, the test seems to work now!

@yamashush yamashush force-pushed the remove-old-revision branch 2 times, most recently from 5219200 to 20dbecc Compare January 19, 2024 11:31
Copy link

codecov bot commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.82%. Comparing base (e7f33c9) to head (733afe0).
Report is 39 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5349      +/-   ##
===========================================
+ Coverage    57.66%   57.82%   +0.16%     
===========================================
  Files          418      419       +1     
  Lines        12058    12064       +6     
  Branches      1007      903     -104     
===========================================
+ Hits          6953     6976      +23     
- Misses        5048     5088      +40     
+ Partials        57        0      -57     
Flag Coverage Δ
backend 87.67% <100.00%> (+0.04%) ⬆️
backend-e2e-tests 74.25% <76.92%> (-0.10%) ⬇️
backend-unit-tests 85.85% <84.61%> (+0.12%) ⬆️
e2e-tests 74.25% <76.92%> (-0.10%) ⬇️
frontend 39.54% <ø> (+0.02%) ⬆️
frontend-unit-tests 39.54% <ø> (+0.02%) ⬆️
unit-tests 51.96% <84.61%> (+0.16%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@ErikMichelson ErikMichelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this is a good starting point, thanks for that!

However when deleting old revisions, we need to somehow compensate the missing information, don't we? A revision holds the diff (called patch in the entity) to its preceding revision. This is used by the revision viewer to show the differences. When deleting old revisions, this results in an invalid diff. So the oldest not removed revision needs its diff be updated.

Or should we do it differently, @hedgedoc/backend @hedgedoc/frontend ?

| ---------------------------- | ------- | ------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `HD_REVISION_RETENTION_DAYS` | 0 | | The number of days a revision should be kept. If the config option is not set or set to 0, all revisions will be kept forever. |
<!--
| `HD_IMAGE_PROXY` | - | `https://image-proxy.example.com` | **ToDo:** Add description |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this env variable in the revision section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because I thought it would be better to separate it by entity.
But maybe it should be in the notes section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ErikMichelson I moved it from revesion section to note section!

@yamashush
Copy link
Contributor Author

However when deleting old revisions, we need to somehow compensate the missing information, don't we? A revision holds the diff (called patch in the entity) to its preceding revision. This is used by the revision viewer to show the differences. When deleting old revisions, this results in an invalid diff. So the oldest not removed revision needs its diff be updated.

I see, revisions have not only content but also diff (patch).
Thank you for making me realize that!!

I was referring to the code below that is called from the revision modal.
Does it seem like the diff(patch) needs to be updated here as well?

return await this.revisionRepository.remove(oldRevisions);

@ErikMichelson ErikMichelson linked an issue Mar 25, 2024 that may be closed by this pull request
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
note.publicId,
revision2.content,
revision3.content
)

Check notice

Code scanning / CodeQL

Semicolon insertion Note

Avoid automated semicolon insertion (94% of all statements in
the enclosing function
have an explicit semicolon).
note.publicId,
revision1.content,
revision2.content
)

Check notice

Code scanning / CodeQL

Semicolon insertion Note

Avoid automated semicolon insertion (94% of all statements in
the enclosing function
have an explicit semicolon).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be something that can be fixed throughout the whole project by just running the lint-fix command (yarn lint:fix).

Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
Signed-off-by: yamashush <38120991+yamashush@users.noreply.github.com>
@yamashush
Copy link
Contributor Author

@ErikMichelson I have implemented updating patch (content diff), so could you please review it again?

Copy link
Member

@ErikMichelson ErikMichelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution and sorry again that this took a while.

I've annotated some minor things.

},
);
expect(() => noteConfig()).toThrow(
'"forbiddenNoteIds[0]" is not allowed to be empty',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we expect a different error in this case?

const restore = mockedEnv(
{
/* eslint-disable @typescript-eslint/naming-convention */
HD_FORBIDDEN_NOTE_IDS: invalidforbiddenNoteIds.join(' , '),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should use not the invalid mock data to only test one error case at once.

note.publicId,
revision1.content,
revision2.content
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be something that can be fixed throughout the whole project by just running the lint-fix command (yarn lint:fix).

Comment on lines +519 to +533
expect(revision3.patch).toMatchInlineSnapshot(`
"Index: test-note
===================================================================
--- test-note
+++ test-note
@@ -0,0 +1,6 @@
+---
+title: new title
+description: new description
+tags: [ "tag1" ]
+---
+new content
"
`);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like already stated in the other PR, I think extracting the embedded snapshots into their own "normal snapshot" file would be good.


// Delete all old revisions everyday on 0:00 AM
@Cron('0 0 * * *')
async handleCron(): Promise<void> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rename these methods (handleCron and handleTimeout) to indicate that they're used in the context of revision cleanup. Maybe something like handleRevisionCleanupTimeout etc?

return await this.removeOldRevisions();
}

async removeOldRevisions(): Promise<void> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a block of ESDoc here

if (!oldRevisions.length) {
continue;

} else if (oldRevisions.length == revisions.length - 1 ){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit comparisons always (===).

.filter(
(revision) =>
new Date(revision.createdAt).getTime() <=
currentTime - revisionRetentionDays * 24 * 60 * 60 * 1000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To improve performance, this second value could be calculated outside the loop, so before the .filter method call.

const beUpdatedRevision = revisions.slice(-1)[0]
beUpdatedRevision.patch = createPatch(
note.publicId,
'', // donnt exist older revision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'', // donnt exist older revision
'', // there is no older revision

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: feature Adds or requests new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configurable note revision clean-up
3 participants