Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache DB queries by using Postgres materialized views #411

Closed
dessalines opened this issue Jan 12, 2020 · 2 comments · Fixed by #418
Closed

Cache DB queries by using Postgres materialized views #411

dessalines opened this issue Jan 12, 2020 · 2 comments · Fixed by #418
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@dessalines
Copy link
Member

dessalines commented Jan 12, 2020

#405

After doing a lot of DB testing (check the server/src/query_testing folder ) it seems the main slowdown for the views, especially the post_view, is that it needs to run the function `hot_rank(sum of a likes join), published_time)

That hot_rank is causing the front page fetch to be over 300 ms. The goal is to try to get every fetch, especially the front page under 15 ms.

Performance :

☭ ./generate_explain_reports.sh
comment_view.json:    "Execution Time": 130.076
community_view.json:    "Execution Time": 53.884
post.json:    "Execution Time": 0.602
post_view.json:    "Execution Time": 374.246
site_view.json:    "Execution Time": 1.597
user_view.json:    "Execution Time": 181.696

This should also create indexes on the views, and make sure there are triggers to make sure the materialized views are refreshed any time there are data updates to the underlying primary tables. Also be sure to use concurrently.

Guide:
https://stackoverflow.com/questions/23906977/refresh-a-materialized-view-automatically-using-a-rule-or-notify

@dessalines dessalines added enhancement New feature or request Component: Backend labels Jan 12, 2020
@dessalines dessalines self-assigned this Jan 12, 2020
@dessalines
Copy link
Member Author

dessalines commented Jan 14, 2020

Here's a performance comparison before and after the materialized views, for some common fetches:

☭ ./generate_explain_reports.sh
comment_view.json:    "Execution Time": 3.415
comment_view.old.json:    "Execution Time": 156.021
community_view.json:    "Execution Time": 0.501
community_view.old.json:    "Execution Time": 81.619
post_view.json:    "Execution Time": 16.005
post_view.old.json:    "Execution Time": 674.093
site_view.json:    "Execution Time": 2.062
site_view.old.json:    "Execution Time": 1.089
user_view.json:    "Execution Time": 0.961
user_view.old.json:    "Execution Time": 974.725

I deployed the materialized view caches to dev.lemmy.ml, and the performance gain is huge.

https://gtmetrix.com/reports/dev.lemmy.ml/gKs8Nzqd

ab -c 10 -t 10 "https://dev.lemmy.ml/feeds/all.xml"

before:

Server Hostname:        dev.lemmy.ml
Document Path:          /feeds/all.xml
Requests per second:    3.78 [#/sec] (mean)
Time per request:       264.326 [ms] (mean, across all concurrent requests)
Transfer rate:          32.38 [Kbytes/sec] received

after:

Server Hostname:        dev.lemmy.ml
Document Path:          /feeds/all.xml
Requests per second:    32.17 [#/sec] (mean)
Time per request:       31.087 [ms] (mean, across all concurrent requests)
Transfer rate:          275.28 [Kbytes/sec] received

One problem tho is that this PR adds a lot of synchronous triggers to update the materialized views, IE, if you like a post, or create a comment, it tells those views to refresh. This only takes ~100ms but it still slows down all actions. If there's a way to do async postgres triggers, that'd be better.

@dessalines
Copy link
Member Author

dessalines commented Jan 18, 2020

Some performance comparisons with reddit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant