Fix slow query of federated timeline #12886

Merged (1 commit) on Jan 21, 2020

notozeki
Contributor

We found that the federated timeline became slow after upgrading our server from v2.7.3 to v3.0.1. We found an almost identical issue (#11643) and its fix (#11648), but that fix covers only the local timeline. This PR also fixes the federated timeline by creating a new index similar to the one in #11648.

EXPLAIN results of our DB:

Query

SELECT  "statuses"."id", "statuses"."updated_at"
FROM "statuses"
LEFT OUTER JOIN "accounts" ON "accounts"."id" = "statuses"."account_id"
WHERE "statuses"."visibility" = 0
  AND (statuses.reblog_of_id IS NULL)
  AND (statuses.reply = FALSE OR statuses.in_reply_to_account_id = statuses.account_id)
  AND "statuses"."account_id" NOT IN (1)
  AND "statuses"."deleted_at" IS NULL
  AND "accounts"."silenced_at" IS NULL
ORDER BY "statuses"."id" DESC, "statuses"."id" DESC
LIMIT 10

Before

Limit  (cost=0.72..100659.04 rows=10 width=16)
  ->  Nested Loop Left Join  (cost=0.72..1107242.22 rows=110 width=16)
        Filter: (accounts.silenced_at IS NULL)
        ->  Index Scan Backward using statuses_pkey on statuses  (cost=0.43..1051835.02 rows=22060 width=24)
              Filter: ((reblog_of_id IS NULL) AND (deleted_at IS NULL) AND ((NOT reply) OR (in_reply_to_account_id = account_id)) AND (account_id <> 1) AND (visibility = 0))
        ->  Index Scan using index_accounts_on_id on accounts  (cost=0.29..2.50 rows=1 width=16)
              Index Cond: (id = statuses.account_id)

After

Limit  (cost=0.72..9689.39 rows=10 width=16)
  ->  Nested Loop Left Join  (cost=0.72..106575.99 rows=110 width=16)
        Filter: (accounts.silenced_at IS NULL)
        ->  Index Scan using index_statuses_public_20200117 on statuses  (cost=0.43..51134.55 rows=22067 width=24)
              Filter: (account_id <> 1)
        ->  Index Scan using index_accounts_on_id on accounts  (cost=0.29..2.50 rows=1 width=16)
              Index Cond: (id = statuses.account_id)
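
For readers without the diff at hand, here is a rough sketch of the kind of partial index that would produce the "After" plan. It is inferred from the plan only: the remaining Filter is just `account_id <> 1`, so the other conditions are presumably part of the index predicate, and the forward Index Scan suggests the index is ordered by `id DESC`. The column list and the exact predicate are assumptions, and the migration in this PR may differ (for example, it may include more columns).

```sql
-- Hypothetical reconstruction from the EXPLAIN output above, not the PR's actual migration.
-- CONCURRENTLY avoids blocking writes on a large statuses table while the index is built.
CREATE INDEX CONCURRENTLY index_statuses_public_20200117
  ON statuses (id DESC)
  WHERE deleted_at IS NULL
    AND visibility = 0
    AND reblog_of_id IS NULL
    AND ((NOT reply) OR (in_reply_to_account_id = account_id));
```

With a predicate like this, the planner can walk the index in `id DESC` order and only check `account_id <> 1` on each row, instead of scanning `statuses_pkey` backward and filtering out every non-public status.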

CPU usage of our DB:

[Image: Screenshot 2020-01-17 20 12 07 (copy)]

We created the index at 17:00 and CPU usage decreased immediately.

I am not sure this is the best way to fix it, so I would appreciate a review. Thank you!

@Gargron merged commit e1c5f43 into mastodon:master on Jan 21, 2020
3 participants