Improve follow recommendations SQL query #25334

Closed

renchap
Member

@renchap renchap commented Jun 7, 2023

A server admin reported that this query took 29 hours to run on their instance, using PostgreSQL 14 with up-to-date vacuum & statistics.

After some investigation, we found that the query plan was very wrong and resulted in 300B+ rows being generated in a temporary table.

Switching from a JOIN to NOT EXISTS fixes the issue and the query is now behaving as expected.

This has been tested on PostgreSQL 12 on a large server, and performs the same as with the JOIN on this server.
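
For illustration, the kind of rewrite described above can be sketched as follows. The table and column names (`candidates`, `suppressions`, `account_id`) are hypothetical and this is not the actual view definition; it only shows the general shape of replacing an outer `JOIN` plus `IS NULL` filter with `NOT EXISTS`.

```ruby
# Hypothetical tables and columns, for illustration only: this is not the
# actual follow recommendations view definition.

# Anti-join written as an outer JOIN followed by an IS NULL filter.
# Assumes suppressions.id is never NULL (e.g. a primary key).
ANTI_JOIN_WITH_JOIN = <<~SQL
  SELECT candidates.account_id
  FROM candidates
  LEFT OUTER JOIN suppressions
    ON suppressions.account_id = candidates.account_id
  WHERE suppressions.id IS NULL
SQL

# The same filter written as NOT EXISTS, which states the anti-join directly
# instead of relying on the planner to recognise it.
ANTI_JOIN_WITH_NOT_EXISTS = <<~SQL
  SELECT candidates.account_id
  FROM candidates
  WHERE NOT EXISTS (
    SELECT 1
    FROM suppressions
    WHERE suppressions.account_id = candidates.account_id
  )
SQL
```

Both forms return the candidates that have no matching suppression row. Which one the planner handles better depends on the PostgreSQL version and the table statistics, so comparing `EXPLAIN` output for both is the safest check.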

Fixes #25191

@renchap renchap requested a review from ClearlyClaire June 7, 2023 14:54
@renchap renchap added the "to backport" (PR needed to be backported) label Jun 7, 2023
@renchap renchap force-pushed the improve-follow-recommendations-sql branch from 2c32479 to 8b6dcda Compare June 7, 2023 14:55
@renchap renchap changed the title from "Improve-follow-recommendations-sql" to "Improve follow recommendations SQL query" Jun 7, 2023
Contributor

@ClearlyClaire ClearlyClaire left a comment

follow_recommendations_v02.sql describes a materialized view, and updating it will require a database migration: see db/migrate/20210505174616_update_follow_recommendations_to_version_2.rb for an example.

@renchap
Member Author

renchap commented Jun 13, 2023

I am trying to get this through a migration, but as this is a materialized view it needs to be dropped and then re-created if we go with a scenic migration.
As computing the view takes about 10 minutes on mastodon.social and it locks the table, we are not sure this is the correct way to go.

An alternative way of updating materialized views has been proposed in scenic, which involves creating the new view under a temporary name, then dropping the old one and renaming the new one:
scenic-views/scenic#387
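
A minimal sketch of the create, drop, rename sequence described above, assuming PostgreSQL. The helper name `swap_materialized_view_statements` and the `_new` suffix are hypothetical; this is neither scenic's API nor the code from the linked proposal. Indexes on the old view would also need to be recreated, which is omitted here.

```ruby
# Hypothetical helper: returns the SQL statements that replace a materialized
# view by building the new definition under a temporary name first, so the
# old view stays available while the new one is being computed.
#
# Identifiers are interpolated unquoted, so only pass trusted names.
def swap_materialized_view_statements(name, definition_sql)
  temp_name = "#{name}_new" # the temporary name is an assumption
  [
    "CREATE MATERIALIZED VIEW #{temp_name} AS #{definition_sql}",
    "DROP MATERIALIZED VIEW #{name}",
    "ALTER MATERIALIZED VIEW #{temp_name} RENAME TO #{name}",
  ]
end

# Example: print the statements for a placeholder definition.
swap_materialized_view_statements("follow_recommendations", "SELECT 1 AS account_id").each do |sql|
  puts sql
end
```

In a migration, each statement could be passed to `execute` in order; whether to wrap them in a single transaction is a separate decision that depends on how long the initial computation takes.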

Unfortunately, you cannot easily replicate this behaviour from a migration, as it won't work with the version: parameter.

@renchap renchap removed the "to backport" (PR needed to be backported) label Jun 13, 2023
@renchap renchap added this to the 4.2.0 milestone Jul 21, 2023
@Gargron
Member

Gargron commented Jul 24, 2023

Should we rename this view? Would that unstall this PR? I wouldn't mind having the FollowRecommendation name be available for a different model.

@renchap
Member Author

renchap commented Jul 24, 2023

Yes we could rename it. But that would not solve the issue if one day we have another materialised view that needs to be updated.

The alternative I had in mind was to copy the way it is done in scenic-views/scenic#387 into a helper, and use this helper for the migration.

@renchap
Member Author

renchap commented Aug 18, 2023

Closing in favor of #26545

@renchap renchap closed this Aug 18, 2023
Development

Successfully merging this pull request may close these issues.

Scheduler::FollowRecommendationsScheduler takes days and reads over 24TB from db
3 participants