Reduce size of received_activity table #4110
Comments
The simplest is probably the best here, which would be to reduce the removal interval, maybe to a little over 3 days? Other than that, @dullbananas is doing some work to remove some of the pointless integer primary keys, but I really doubt those are taking up that much. The number of pointless rows is the issue.
Receiving double events "normally" shouldn't happen at all anymore in 0.19 with the persistent queue (normal server restarts will not resend any activities, only crashes will), and if it does it will only affect the most recent ~100 events max. For non-Lemmy AP instances, idk, but it's probably still in the range of minutes / hours. If there's a bug or more likely someone manually modifies / deletes the

The linked bloom PG index isn't really relevant for space saving since it's just an index on top of a table (so the data still needs to be in the table).
By storing only a partial hash of ap_id instead of the full URL, as well as dropping the id column, the size of each row is reduced by half. Also reduce the cleanup interval from 3 months to 1 month.
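To make the partial-hash idea concrete, here is a minimal sketch in Python. It is not the actual Lemmy change. The hash function (SHA-256), the 8-byte truncation, the column names `ap_id_hash` and `published`, and the use of an in-memory SQLite database as a stand-in for Postgres are all assumptions made for illustration.

```python
import hashlib
import sqlite3
import time

HASH_BYTES = 8  # assumption: keep only the first 64 bits of the digest
RETENTION_SECONDS = 30 * 24 * 3600  # roughly one month


def ap_id_hash(ap_id: str) -> bytes:
    """Return a truncated SHA-256 digest of the activity URL."""
    return hashlib.sha256(ap_id.encode("utf-8")).digest()[:HASH_BYTES]


def open_store() -> sqlite3.Connection:
    """Create an in-memory table with no id column (SQLite stand-in for Postgres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE received_activity ("
        " ap_id_hash BLOB PRIMARY KEY,"
        " published INTEGER NOT NULL)"  # unix timestamp in seconds
    )
    return conn


def is_new_activity(conn: sqlite3.Connection, ap_id: str, now=None) -> bool:
    """Insert the hash; return True only if it was not already present."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "INSERT OR IGNORE INTO received_activity (ap_id_hash, published)"
        " VALUES (?, ?)",
        (ap_id_hash(ap_id), now),
    )
    conn.commit()
    return cur.rowcount == 1


def cleanup(conn: sqlite3.Connection, now=None) -> int:
    """Delete rows older than the retention window; return how many were removed."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "DELETE FROM received_activity WHERE published < ?",
        (now - RETENTION_SECONDS,),
    )
    conn.commit()
    return cur.rowcount
```

The tradeoff in a sketch like this is that two different URLs with the same truncated hash would make a new activity look like a duplicate, so the truncation length has to be chosen with the expected row count in mind.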
* Also order reports by oldest first (ref LemmyNet#4123) (LemmyNet#4129)
* Support signed fetch for federation (fixes LemmyNet#868) (LemmyNet#4125)
* Support signed fetch for federation (fixes LemmyNet#868)
* taplo
* add federation queue state to get_federated_instances api (LemmyNet#4104)
* add federation queue state to get_federated_instances api
* feature gate
* move retry sleep function
* move stuff around
* Add UI setting for collapsing bot comments. Fixes LemmyNet#3838 (LemmyNet#4098)
* Add UI setting for collapsing bot comments. Fixes LemmyNet#3838
* Fixing clippy check.
* Only keep sent and received activities for 7 days (fixes LemmyNet#4113, fixes LemmyNet#4110) (LemmyNet#4131)
* Only check auth secure on release mode. (LemmyNet#4127)
* Only check auth secure on release mode.
* Fixing wrong js-client.
* Adding is_debug_mode var.
* Fixing the desktop image on the README. (LemmyNet#4135)
* Delete dupes and add possibly missing unique constraint on person_aggregates.
* Fixing clippy lints.
---------
Co-authored-by: Nutomic <me@nutomic.com>
Co-authored-by: phiresky <phireskyde+git@gmail.com>
* post_saved
* fmt
* remove unique and not null
* put person_id first in primary key and remove index
* use post_saved.find
* change captcha_answer
* remove removal of not null
* comment_aggregates
* comment_like
* comment_saved
* aggregates
* remove "\"
* deduplicate site_aggregates
* person_post_aggregates
* community_moderator
* community_block
* community_person_ban
* custom_emoji_keyword
* federation allow/block list
* federation_queue_state
* instance_block
* local_site_rate_limit, local_user_language, login_token
* person_ban, person_block, person_follower, post_like, post_read, received_activity
* community_follower, community_language, site_language
* fmt
* image_upload
* remove unused newtypes
* remove more indexes
* use .find
* merge
* fix site_aggregates_site function
* fmt
* Primary keys dess (#17)
* Also order reports by oldest first (ref #4123) (#4129)
* Support signed fetch for federation (fixes #868) (#4125)
* Support signed fetch for federation (fixes #868)
* taplo
* add federation queue state to get_federated_instances api (#4104)
* add federation queue state to get_federated_instances api
* feature gate
* move retry sleep function
* move stuff around
* Add UI setting for collapsing bot comments. Fixes #3838 (#4098)
* Add UI setting for collapsing bot comments. Fixes #3838
* Fixing clippy check.
* Only keep sent and received activities for 7 days (fixes #4113, fixes #4110) (#4131)
* Only check auth secure on release mode. (#4127)
* Only check auth secure on release mode.
* Fixing wrong js-client.
* Adding is_debug_mode var.
* Fixing the desktop image on the README. (#4135)
* Delete dupes and add possibly missing unique constraint on person_aggregates.
* Fixing clippy lints.
---------
Co-authored-by: Nutomic <me@nutomic.com>
Co-authored-by: phiresky <phireskyde+git@gmail.com>
* fmt
* Update community_block.rs
* Update instance_block.rs
* Update person_block.rs
* Update person_block.rs
---------
Co-authored-by: Dessalines <dessalines@users.noreply.github.com>
Co-authored-by: Nutomic <me@nutomic.com>
Co-authored-by: phiresky <phireskyde+git@gmail.com>
Requirements
Is your proposal related to a problem?
The received_activity table on lemmy.ml currently takes up 11 GB. The purpose of this table is to check, whenever a new activity is received, whether we have received the same activity before. In that case it is rejected. If it's a new activity, it is inserted into the table and processed. Rows are removed after three months. This is really basic functionality and shouldn't take so much space.

The table is currently defined like this:
Here are a few ideas we can use to save space:
* id column: it's unnecessary
* published: instead of a full datetime with millisecond precision, store only a date or Unix timestamp (integer)
* ap_id never needs to be read, only checked for whether a given value exists. This is an ideal use case for bloom filters, which are supported in Postgres

cc @dessalines @phiresky
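For readers unfamiliar with bloom filters, the membership-only check described for ap_id can be sketched as follows. This is a standalone in-process filter written for illustration, not the Postgres bloom extension. The class name, the bit-array size, and the number of hash functions are assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, but false positives are possible."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, value: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


seen = BloomFilter()
activity = "https://example.com/activities/1"  # hypothetical ap_id
if activity not in seen:
    seen.add(activity)  # first time: process the activity
```

Two caveats apply to a filter like this: a false positive would cause a genuinely new activity to be rejected, and entries cannot be removed individually, so time-based cleanup would need something like rotating filters.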