Should have a scheduled task for pruning log_topics #5657
Comments
Sure, I'll be in favor of it. Seems simple enough to be incorporated. |
Makes sense to me. |
There's a script in SVN that does this that could be used as a start...
|
I wasn't aware of this... I was trying to make a fresh copy of my DB for testing, but had issues, in part, because this table had millions of rows in it... |
To be honest, if this is going to become a scheduled task I recommend:
Otherwise, large forums with millions of entries could be killed. |
Agreed. I'd go a step further & say that folks should have to specifically enable it for upgrades & installs. It should not "just run". But I do think it's important to have as an option. Though I do have a suspicion that most very large forums already know about it... |
I've been reading code & testing ideas here...

The utility shared online (link above): Note this is exactly what our 'Mark all messages read' button does at the bottom of our board index. I have 242 boards on my forum. Every time a user presses that button, 484 records (two per board) are stored/updated. The utility does this for all members who have ever viewed a topic. So, on my test forum with 2.5M log_topics records, this utility gets rid of all log_topics records. But it creates ~8M log_boards & log_mark_read records, making the problem 3x worse.

What's in SVN: The algorithm is simple and clean: for each unique member/board found in log_topics, it updates log_mark_read & clears out log_topics. I've confirmed this approach works and decreases the record count, since it replaces multiple topic records with one board record.

Proposal: I suggest a 2-tier approach:
I am testing a script that does this, and the 2.5M mostly useless records turn into about 250K real records.

MY BIGGEST QUESTION: In my test environments, if I truncate log_boards, all the "new" statuses still work perfectly. I haven't yet found a meaningful use of the log_boards data. Feels like I'm missing something... |
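For readers following along, the SVN-style collapse described above (one log_mark_read row per member/board replacing many log_topics rows) might look roughly like the sketch below. This is not the SVN script or the proof of concept from this thread: it uses SQLite in memory, a simplified three-table schema assumed for illustration, and a hypothetical function name `collapse_log_topics`. Taking the highest id_msg per member/board as the board marker is also an assumption; a real script would need to decide how to treat topics the member has not actually read.

```python
import sqlite3

# Simplified, assumed schema; the real SMF tables have more columns and a prefix.
SCHEMA = """
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_board INTEGER NOT NULL);
CREATE TABLE log_topics (
    id_member INTEGER NOT NULL,
    id_topic  INTEGER NOT NULL,
    id_msg    INTEGER NOT NULL,
    PRIMARY KEY (id_member, id_topic)
);
CREATE TABLE log_mark_read (
    id_member INTEGER NOT NULL,
    id_board  INTEGER NOT NULL,
    id_msg    INTEGER NOT NULL,
    PRIMARY KEY (id_member, id_board)
);
"""


def collapse_log_topics(conn):
    """Replace per-topic read markers with one per-board marker per member.

    Returns (member/board pairs written, log_topics rows deleted).
    """
    cur = conn.cursor()
    # One result row per unique member/board pair found in log_topics.
    pairs = cur.execute(
        """
        SELECT lt.id_member, t.id_board, MAX(lt.id_msg)
        FROM log_topics AS lt
        JOIN topics AS t ON t.id_topic = lt.id_topic
        GROUP BY lt.id_member, t.id_board
        """
    ).fetchall()

    for id_member, id_board, id_msg in pairs:
        # Never move an existing board marker backwards.
        row = cur.execute(
            "SELECT id_msg FROM log_mark_read WHERE id_member = ? AND id_board = ?",
            (id_member, id_board),
        ).fetchone()
        if row is not None:
            id_msg = max(id_msg, row[0])
        cur.execute(
            "INSERT OR REPLACE INTO log_mark_read (id_member, id_board, id_msg) "
            "VALUES (?, ?, ?)",
            (id_member, id_board, id_msg),
        )

    # Clear the topic-level rows that were just folded into board markers.
    cur.execute(
        "DELETE FROM log_topics WHERE id_topic IN (SELECT id_topic FROM topics)"
    )
    deleted = cur.rowcount
    conn.commit()
    return len(pairs), deleted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?)", [(1, 10), (2, 10), (3, 10), (4, 20)]
    )
    conn.executemany(
        "INSERT INTO log_topics VALUES (?, ?, ?)",
        [(1, 1, 100), (1, 2, 150), (1, 3, 120), (1, 4, 90), (2, 1, 80)],
    )
    # Five topic rows collapse into three member/board rows.
    print(collapse_log_topics(conn))  # (3, 5)
```

On a live forum this would presumably need to run in batches rather than in one pass, given the table sizes mentioned in this thread.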
I have a proof of concept that I've been testing out there: This does the full two-stage approach listed above. At the moment, it works for both 2.0 & 2.1. It cleans up millions of records in a way that is virtually unseen by the users. I've run it in my production forum. |
This won't piss anyone off at all. /s |
It's configurable. IF there's been any traffic on the forum at all, most posts will truly be unread anyway. |
Description
Question: Should we have a scheduled task for pruning log_topics?
I believe 2.1 has the same behavior as 2.0, in that log_topics can grow out of hand. For 2.0, there is a script that is shared (informally, in a post) to clean that up.
Additional information/references
https://www.simplemachines.org/community/index.php?topic=212330.msg1667071#msg1667071