PERF: Add index id DESC, baked_version ON posts.
A scheduled job runs `Post.rebake_old` with a limit of 80, which looks up
the latest posts that have not been baked to the latest version. Before
this commit, the query ran as a reverse scan of the primary key's index.
However, the query performance quickly degrades as more and more of the
latest posts have been baked to the latest version.
tgxworld committed Apr 8, 2019
1 parent ca57e18 commit 4791d99
Showing 1 changed file with 7 additions and 0 deletions.
@@ -0,0 +1,7 @@
class AddIndexIdBakedVersionOnPosts < ActiveRecord::Migration[5.2]
  def change
    add_index :posts, [:id, :baked_version],
      order: { id: :desc },
      where: "(deleted_at IS NULL)"
  end
end

4 comments on commit 4791d99

@vinothkannans
Member

This is breaking the parallel Ruby tests. I'm not exactly sure of the reason yet, but the topic.posts method now returns posts in reverse order. Checking...

@SamSaffron
Member

Oh ... I can explain why a filtered index is what we want here.

You are selecting more or less like so:

select id from posts
where baked_version is not null and
  baked_version <> 2 and
  deleted_at is null
order by id desc

The way to make this more efficient is by adding

create index idx on posts (
  id desc
) where deleted_at is null and baked_version is not null and baked_version <> 2

That way, as you rebake, the index keeps on getting smaller until it has 0 size.

To grab the next id you just have to go to the first entry in the index.

Once you rebake that post it no longer matches the filter, so it is no longer in the index.
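
A toy Ruby model of that shrinking behaviour may help. It is only a sketch: the `Post` struct, `stale?`, `build_partial_index`, and the sample rows are hypothetical, and the array merely stands in for an index that PostgreSQL would maintain on its own.

```ruby
# Toy model of a filtered (partial) index. All names here are hypothetical.
LATEST_BAKED_VERSION = 2

Post = Struct.new(:id, :baked_version, :deleted_at)

# True when a row satisfies the index predicate and so belongs in the index.
def stale?(post)
  post.deleted_at.nil? &&
    !post.baked_version.nil? &&
    post.baked_version != LATEST_BAKED_VERSION
end

# Build the "index": ids of matching rows only, highest id first (id DESC).
def build_partial_index(posts)
  posts.select { |post| stale?(post) }.map(&:id).sort.reverse
end

posts = [
  Post.new(1, 1, nil),
  Post.new(2, 2, nil),
  Post.new(3, 1, nil),
  Post.new(4, 1, Time.now), # deleted, so never in the index
  Post.new(5, 2, nil),
]

index = build_partial_index(posts)
sizes = [index.size]

# Rebake from the head of the index until nothing stale remains.
until index.empty?
  next_id = index.first # the first entry is always the next post to rebake
  post = posts.find { |candidate| candidate.id == next_id }
  post.baked_version = LATEST_BAKED_VERSION
  index = build_partial_index(posts) # a real database updates the index itself
  sizes << index.size
end

puts sizes.inspect # prints [2, 1, 0]
```

Each rebake removes exactly one entry, so the work left to do is always the size of the index.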

This index does not help...

create index idx on posts (
   id desc, baked_version
) where deleted_at is null

Imagine you have the records

id | baked_version
1  | 2
2  | 2
3  | 1
4  | 2
5  | 2

To find (3, 1) you still have to scan past (5, 2) and (4, 2); all this index does is replace a table scan with an index scan. We should be seeking here, which is way cheaper.
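
To make the difference concrete, here is a rough Ruby sketch that counts how many index entries get read before the first stale post turns up, using the five records above. The `first_stale` helper and the array representation are assumptions for illustration only; a real B-tree does not work on plain arrays.

```ruby
# Entries are [id, baked_version] pairs stored in id DESC order.
# Returns [id of the first stale entry, number of entries read to find it].
def first_stale(entries, latest_version)
  examined = 0
  entries.each do |id, baked_version|
    examined += 1
    return [id, examined] if baked_version != latest_version
  end
  [nil, examined]
end

# Index on (id desc, baked_version) where deleted_at is null:
# every live post is in the index, baked or not.
composite = [[5, 2], [4, 2], [3, 1], [2, 2], [1, 2]]

# Filtered index that also excludes posts already at version 2.
filtered = composite.reject { |_id, baked_version| baked_version == 2 }

composite_result = first_stale(composite, 2)
filtered_result = first_stale(filtered, 2)

puts composite_result.inspect # prints [3, 3]
puts filtered_result.inspect  # prints [3, 1]
```

With the composite index the number of entries read grows with the number of already-baked posts sitting above the first stale one; with the filtered index it stays at one.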

@tgxworld
Contributor Author

@tgxworld
Contributor Author

Followed up 🙂 Thanks for the tip 👍
