Denormalize community_id into post_aggregates for a 1000x speed-up when loading posts #3653
After this one, much like the post table has indexes on community_id and creator_id, it would probably be a good idea to add these two indexes to post_aggregates now too.
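A minimal sketch of what those two indexes could look like, assuming both columns exist on post_aggregates after this change. The index names are hypothetical and not taken from the PR:

```sql
-- Hypothetical index names; the columns are the two named in the comment above.
CREATE INDEX idx_post_aggregates_community ON post_aggregates (community_id);
CREATE INDEX idx_post_aggregates_creator ON post_aggregates (creator_id);
```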
Sure, added these indexes as well
@dessalines I just discovered that adding these two indexes messes up the query plan again 😃
Edit: hmm, I just got some queries taking 3 seconds with the indexes. Then I dropped the indexes, and it was back to a few ms.
Then I re-added them, and it's still at a few ms. So I can't reproduce the problem at the moment, but I will investigate a bit more.
Ok, the query plans are definitely consistently worse with those indexes.
With indexes: https://explain.depesz.com/s/W7cF
Without indexes: https://explain.depesz.com/s/4qds
I reverted these indexes for now, let me know if you disagree with that
Wait, you didn't add any index to get the speedup? The "right" index should be post_aggregates(community_id, featured_local desc, hot_rank desc).
The planner should use that one when the user has only subscribed to a few communities, and idx_featured_local_hot_rank when the user is subscribed to many communities.
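As a sketch, the composite index described above could be created like this. The column list is copied from the comment; the index name is hypothetical:

```sql
-- Hypothetical index name; column order and sort directions follow the comment above.
CREATE INDEX idx_post_aggregates_community_featured_hot
    ON post_aggregates (community_id, featured_local DESC, hot_rank DESC);
```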
Discussed a bit on Matrix already, but commenting here as well for visibility:
This index also messes up the query plan. It indeed makes the queries super fast for users with few subscriptions, but it makes them slow again for others.
Without that index:
User with 86 subscriptions: 3 ms query execution
User with 0 subscriptions: 700 ms query execution
(I didn't measure exact time, but it's also very fast for logged out users without the index)
With that index:
User with 86 subscriptions: 2-3 second query execution
User with 0 subscriptions: 1 ms query execution
phiresky on Matrix said:
I will see tomorrow if playing around with statistics will help. I know there's also CREATE STATISTICS in postgres, but I've never used it so would need to investigate it a bit.
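For reference, a rough sketch of what a CREATE STATISTICS experiment might look like. The statistics name and the choice of columns are assumptions for illustration only, and whether this changes the plans discussed here is untested:

```sql
-- Hypothetical statistics object; the column pair is an assumption, not something from the PR.
CREATE STATISTICS post_aggregates_community_stats (ndistinct, dependencies)
    ON community_id, featured_local
    FROM post_aggregates;

-- Extended statistics are only populated once the table is analyzed.
ANALYZE post_aggregates;
```

If the experiment does not help, the object can be removed again with DROP STATISTICS.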
The limit and sort are the main culprit, but it's also weird to me that Postgres sometimes chooses a slower index.
We can tweak those indexes later; I'll try to tag the issues as DB.