Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance of community followers inbox query #3482

Merged
merged 5 commits into from Jul 5, 2023

Conversation

phiresky
Copy link
Collaborator

@phiresky phiresky commented Jul 4, 2023

i found a huge perf issue. get_follower_inboxes fetches all followers of a community (including thousands of copies of the whole damn community object, person object, and community_follower object) for every single action (e.g. like). it transfers that list to rust (like 20000 elements with huge objects each, including private keys and whatever). rust then finds all the federated instances it has to send the like to from that list (like 100 strings).

so basically the query uselessly transfers huge amounts of data just to get a list of instance_count urls.

on lemmy.world we looked at pg_stats_statements (thanks jelloeater) and that community_follower query absolutely dominates the stats, eating up 8 days of query time per day. Since yesterday, it has fetched over 4 billion rows.
that query is also timing out a lot due to the statement_timeout=10s we set.

This PR improves on the situation by doing the filtering on the database side. It's not optimal since the result can't be perfectly indexed, but it seems the main issue is the huge amount of data transfer between rust and PG. after some quick testing: the new query takes around 100ms and returns only tiny amounts of data. it should basically fix a huge amount of lemmys current perf problems.

i also replaced the second query in followers which was doing a similar amount of unnecessary work (but much more seldom).

i did not extensively test this for correctness.

@db0
Copy link
Contributor

db0 commented Jul 4, 2023

This is probably THE most important thing to add in 0.18.1. My remote DB is hitting close to 1Gbs of data transfers.

@RocketDerp

This comment was marked as abuse.

@@ -11,19 +17,37 @@ use lemmy_db_schema::{

type CommunityFollowerViewTuple = (Community, Person);

sql_function!(fn coalesce(x: diesel::sql_types::Nullable<diesel::sql_types::Text>, y: diesel::sql_types::Text) -> diesel::sql_types::Text);
Copy link
Member

@Nutomic Nutomic Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great if this could be made generic to work for multiple types, or better yet built into diesel.

@Nutomic
Copy link
Member

Nutomic commented Jul 5, 2023

Great find! I think the apub crate should generally not use any views, so there might be more to optimize.

Edit: Federation tests recently keep failing in CI, but I confirmed that its passing locally for this PR.

@phiresky
Copy link
Collaborator Author

phiresky commented Jul 5, 2023

Just a few charts after deploying this + #3483:
2023-07-05-12-39-56

image

2023-07-05-12-36-59

Copy link
Member

@dessalines dessalines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, excellent find. These are the things we could never discover without more eyes on them than just us two. Thanks a ton!

We need to see what's going on with the federation tests failing.

@dessalines dessalines enabled auto-merge (squash) July 5, 2023 14:50
@dessalines dessalines merged commit 45b1a0d into LemmyNet:main Jul 5, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants