Log settings - data not deleted on sites with large # of old logs #16971

Closed
ankush opened this issue May 24, 2022 · 9 comments · Fixed by #17159

Member

ankush commented May 24, 2022

Log Settings
image

Email Queue, records from eleven months ago
image

Originally posted by @muchai in #16118 (comment)


Email queue cleanup - #16973 should fix it for old sites
Activity log cleanup - drop on v14; v13 requires a separate fix (Should we do it or just stop deleting it?)


muchai commented May 24, 2022

In reply to: Can you check the "Scheduled Job Log" for job type `log_settings.run_log_clean_up`?

image

Member Author

ankush commented May 24, 2022

@muchai as mentioned in this comment: #16118 (comment)

On v13 it's only supposed to clear priority 0 items from the queue, but this issue exists on v14 too (just checked on a few old sites).

Basically, the background job starts, attempts to delete a huge amount of data, times out after 300 seconds, and nothing gets deleted. This repeats every day.
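One common way around this kind of timeout is to delete in small batches and commit after each one, so progress survives even if the job is killed partway through. The sketch below illustrates the idea against an in-memory SQLite table; it is not the Frappe implementation. The `email_queue` table, its columns, and the helper names are hypothetical stand-ins. On MariaDB the same idea can be written directly as `DELETE ... WHERE ... LIMIT n` in a loop.

```python
import sqlite3
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_demo_db(old_rows, recent_rows, now):
    """Build an in-memory stand-in for a large log table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE email_queue (name INTEGER PRIMARY KEY, modified TEXT)")
    old = (now - timedelta(days=330)).strftime(TS_FORMAT)
    recent = (now - timedelta(days=1)).strftime(TS_FORMAT)
    conn.executemany(
        "INSERT INTO email_queue (modified) VALUES (?)",
        [(old,)] * old_rows + [(recent,)] * recent_rows,
    )
    conn.commit()
    return conn


def delete_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff`, committing after every batch.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM email_queue WHERE name IN ("
            "SELECT name FROM email_queue WHERE modified < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each batch is durable on its own
        total += cur.rowcount
        if cur.rowcount < batch_size:  # a short batch means nothing old is left
            return total


if __name__ == "__main__":
    now = datetime(2022, 5, 24)
    conn = make_demo_db(old_rows=2500, recent_rows=10, now=now)
    cutoff = (now - timedelta(days=30)).strftime(TS_FORMAT)
    deleted = delete_in_batches(conn, cutoff)
    remaining = conn.execute("SELECT COUNT(*) FROM email_queue").fetchone()[0]
    print(deleted, remaining)
```

Because every batch is committed separately, a job that runs out of time still leaves the table smaller than it found it, and the next run picks up where the previous one stopped.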

ankush added the valid label May 24, 2022
ankush added this to the v14.0 milestone May 24, 2022

muchai commented May 24, 2022

Any harm in deleting non 0 priority items? Just wondering.

The problem is really the huge amount of space consumed by the email queue data.

Member Author

ankush commented May 24, 2022

Any harm in deleting non 0 priority items? Just wondering.

Not really, it's technically a "breaking change" as the behaviour was to only delete items that are priority=0, so pushing this in v13 might upset some users / break some workflows.

TBH I don't know any valid use of old email queues.

ankush changed the title from "email queue deletion might be broken for sites with huge old data" to "Log settings - data not deleted on sites with large # of old logs" May 24, 2022
Member Author

ankush commented May 24, 2022

@muchai you're free to delete items from your site with two SQL queries ;)

DELETE FROM `tabEmail Queue` WHERE `modified`<NOW()-INTERVAL '30' DAY;
DELETE FROM `tabEmail Queue Recipient` WHERE `modified`<NOW()-INTERVAL '30' DAY;


muchai commented May 24, 2022

Any harm in deleting non 0 priority items? Just wondering.

Not really, it's technically a "breaking change" as the behaviour was to only delete items that are priority=0, so pushing this in v13 might upset some users / break some workflows.

TBH I don't know any valid use of old email queues.

Especially old email queues, from workflows. Those are the most in our case.


muchai commented May 24, 2022

@muchai you're free to delete items from your site with two SQL queries ;)

DELETE FROM `tabEmail Queue` WHERE `modified`<NOW()-INTERVAL '30' DAY;
DELETE FROM `tabEmail Queue Recipient` WHERE `modified`<NOW()-INTERVAL '30' DAY;

Sure, thank you. Been using this approach. We've recently moved most of our sites/customers to FC though (could move most critical to private bench, for now they're on public bench on FC). So will wait till it's fixed for now. Hopefully v14 is around the corner :)

Member Author

ankush commented Jun 6, 2022

I think we will just have to disable auto-deletion on very large tables and write a util to manually clean up old data.

Tested this suggestion from the MariaDB docs: https://mariadb.com/kb/en/big-deletes/#deleting-more-than-half-a-table

Works perfectly and in reasonable time (even on YUGE databases).

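-- Create an empty table with the same columns and indexes.
CREATE TABLE `tabActivity Log clean` LIKE `tabActivity Log`;

-- Copy only the rows worth keeping (modified within the last 90 days).
INSERT INTO `tabActivity Log clean`
  SELECT * FROM `tabActivity Log`
     WHERE `tabActivity Log`.`modified` > NOW()-INTERVAL '90' DAY;

-- Swap the two tables in a single RENAME statement.
RENAME TABLE `tabActivity Log` TO `tabActivity Log backup`, `tabActivity Log clean` TO `tabActivity Log`;

-- Drop the old table instead of deleting its rows one by one.
DROP TABLE `tabActivity Log backup`;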
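For anyone who wants to try the copy-and-swap idea without touching a real site, here is a minimal sketch of the same steps against SQLite. It is an illustration under stated assumptions, not the util or patch discussed in this issue: `keep_recent_by_swap` and the `activity_log` table used to exercise it are hypothetical. SQLite has no `CREATE TABLE ... LIKE`, so the sketch uses `CREATE TABLE ... AS SELECT ... WHERE 0`, which copies column names but not indexes or constraints.

```python
import sqlite3


def keep_recent_by_swap(conn, table, cutoff):
    """Keep only rows with modified > cutoff by copying them to a fresh
    table, swapping names, and dropping the old table.

    Returns the number of rows kept.
    """
    clean, backup = f"{table}_clean", f"{table}_backup"

    # Step 1: empty copy of the table (columns only in SQLite; MariaDB's
    # CREATE TABLE ... LIKE would also copy the indexes).
    conn.execute(f'CREATE TABLE "{clean}" AS SELECT * FROM "{table}" WHERE 0')

    # Step 2: copy only the rows worth keeping.
    conn.execute(
        f'INSERT INTO "{clean}" SELECT * FROM "{table}" WHERE modified > ?',
        (cutoff,),
    )

    # Step 3: swap the names, then drop the old data in one statement.
    conn.execute(f'ALTER TABLE "{table}" RENAME TO "{backup}"')
    conn.execute(f'ALTER TABLE "{clean}" RENAME TO "{table}"')
    conn.execute(f'DROP TABLE "{backup}"')
    conn.commit()

    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
```

The cost of this approach scales with the number of rows kept rather than the number removed, which is why it suits tables where most of the data is stale. Rows written to the original table between the copy and the swap are not carried over, so it is best run while the site is not writing to that table.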

ankush self-assigned this Jun 6, 2022
ankush linked a pull request Jun 13, 2022 that will close this issue
Member Author

ankush commented Jun 14, 2022

Added a patch here: #17159

Whenever a site migrates to v14, the stuck deletion of old logs will be performed in one swoop during migration. Couldn't think of a better way to handle this gracefully on v13.

github-actions bot locked as resolved and limited conversation to collaborators Sep 5, 2022