-
Notifications
You must be signed in to change notification settings - Fork 669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prune old events from sqlite database #4737
Conversation
Adds a new config `dbHistoryDays` which defaults to undefined. At start up, each database handler reads the config. If the value is set, then a reoccuring task is scheduled to clean up old events. Events older than `dbHistoryDays` are targeted but only a few thousand events are cleaned up per iteration to avoid freezing the server. Adds unit tests to validate the cleanup logic as well. Fixes thelounge#2822
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for taking this on.
Now, I've some general comments as to how I think this could work, plus a few minor comments in the code directly.
I think this is a bit too limited as is.
For starters, I want the history retention to be configurable by the user, not just the admin.
While in many cases that's the same, it isn't in all cases and not having to edit the server config for something like that is better.
What that means is that we need this to be a hierarchical configuration. Admins sets global limits as they see fit, user can lower that limit as they feel like.
We should also put that configuration into its own top level config object so that we can add more related settings in the future, for instance I would like to add the ability to only delete unimportant status messages (leave, part, nick change, away, back etc) while retaining the actual useful stuff indefinitely (privmsg, notice).
Does this make sense?
Also note, you should probably rebase on top of #4718 as I'll merge that soon (after a bugfix release for 4.4.0)
defaults/config.js
Outdated
// Undefined/-1/0 is treated an unlimited. | ||
// The limit is seen as a soft target but not an exact goal. Only a few | ||
// thousand rows are pruned at a time to avoid slowing down the service. | ||
dbHistoryDays: undefined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be moved into its own top level object, else we end up with a bunch of similar names which are all very long for no real reason.
const deleteBefore = Date.now() - keepNdays * millisecondsInDay; | ||
await this.serialize_run( | ||
`DELETE FROM messages WHERE rowid in ( | ||
SELECT rowid FROM messages WHERE time < ? ORDER BY time ASC LIMIT ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you enforcing an ordering here?
While I get the intention behind it, usually the messages in question are not visible to the client.
You could just let the DB do it in whatever order it sees fit, generally you'd assume that you can keep up within a single invocation right?
We have an index on the time, so it's probably relatively easy to order, but still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we do delete in batches of only 1000 events, I felt wrong creating a "hole" in history. Since the index is already there, I think there is no harm in doing it this way right?
I'm not sure why a user would set it lower than the admin global limit. What is the usecase for this? Would any users want less history to search/scroll? If they want to remove history, they can do this per channel already? |
Admin says, yeah well, storage isn't an issue. Easy enough to imagine, no? Similarly, User A is unlimited as that's my main account, User B is a test account for dev purposes and I just want some history, without actually using much space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update.
In my opinion this should still be restructured a bit though, tell me if you disagree with my comments and I try to explain my reasoning a bit better.
this.mergedConfig = this.mergeClientConfig(clientStorageConfig); | ||
} | ||
|
||
mergeClientConfig(clientStorageConfig: SqliteConfig) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the right place for that kind of logic.
The config needs to be merged by the core code, not the sqlite specific storage backend.
Yes, currently sqlite is our only implementation of the interface, but that doesn't mean that we should violate the layers.
maxDaysHistory: number | undefined; | ||
} | ||
| undefined; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that type seems odd to me?
You shouldn't set that to an object or undefined, things that want it optional shall say so when they mean it.
Meaning, the user of the type should do
export type ConfigType = {
// ...
sqliteConfig?: SqliteConfig;
// ...
}
Or whatever, but not that funny type that is intrinsically this way.
That's a hassle to use
// | ||
// `maxDaysHistory`: | ||
// Defines the maximum number of days of history to keep in the database. | ||
// Undefined/-1/0 is treated an unlimited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly undefined
We should have one way and only one way of configuring something.
@@ -37,7 +37,7 @@ describe("SQLite Message Storage (stateful tests)", function () { | |||
// Delete database file from previous test run | |||
await cleanup(); | |||
|
|||
store = new MessageStorage("testUser"); | |||
store = new MessageStorage("testUser", undefined as any); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
such casts are very brittle... if you can avoid so please do and in this case you trivially could by providing a correct default value
I appreciate the feedback but this has slipped in priority for me. Happy for someone to take over the PR or to close the PR. |
thanks for your efforts |
Adds a new config
dbHistoryDays
which defaults to undefined.At start up, each database handler reads the config. If the value is set, then a reoccuring task is scheduled to clean up old events. Events older than
dbHistoryDays
are targeted but only a few thousand events are cleaned up per iteration to avoid freezing the server.Adds unit tests to validate the cleanup logic as well.
Fixes #2822