Optimize volunteer map query performance #31621

Merged: islemaster merged 1 commit into staging from volunteer-map-query on Nov 4, 2019

@islemaster islemaster commented Oct 31, 2019

Will wrote:

> The change in #31493 (adding `.order(Sequel.desc(:created_at))` to the end of the volunteer submission Sequel query) caused a performance regression: it confused the query optimizer into selecting the wrong (very inefficient) index.
>
> Before the change, the optimizer selected the `kind` index, which let the `WHERE kind = 'VolunteerEngineerSubmission2015'` clause skip all of the millions of forms except the ~20k volunteer submissions the query cares about.
>
> After the change, it selected `forms_created_at_index` to produce the ordered results, but that index can't filter by `kind`, so the query scans through all of the millions of forms.
>
> Just another casualty of the imperfect query optimizer choosing a non-optimal index in practice. To fix narrowly, you can either add an index hint (e.g., `USE INDEX(kind)`) to the query or, as another option I found works well enough, order by `:id` instead of by `:created_at`.

So I'm ordering by `id` (adding index hints seems a little messy with Sequel) and removing the special behavior that does no ordering on the initial request.
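
To make the regression above concrete, here is a toy Ruby model of the two access paths Will describes. It is only a sketch: the `Form` struct, the row counts, and the helper names are made up for illustration, and real MySQL planning is far more involved. It just counts how many rows each path has to examine.

```ruby
# Toy model of the two access paths; not real MySQL behavior.
Form = Struct.new(:id, :kind, :created_at)

VOLUNTEER_KIND = 'VolunteerEngineerSubmission2015'

# Build a small table in which every 50th row is a volunteer submission.
def build_forms(total)
  (1..total).map do |id|
    kind = (id % 50).zero? ? VOLUNTEER_KIND : 'OtherForm'
    Form.new(id, kind, 1_000_000 + id)
  end
end

# Path A: find matching rows through an index on kind, then sort only those.
def via_kind_index(forms, kind)
  kind_index = forms.group_by(&:kind) # stands in for the kind index
  matches = kind_index.fetch(kind, [])
  [matches.sort_by { |f| -f.created_at }, matches.size]
end

# Path B: walk an index on created_at in descending order and
# check the kind of every row along the way.
def via_created_at_index(forms, kind)
  examined = 0
  rows = forms.sort_by { |f| -f.created_at }.select do |f|
    examined += 1
    f.kind == kind
  end
  [rows, examined]
end

forms = build_forms(10_000)
rows_a, examined_a = via_kind_index(forms, VOLUNTEER_KIND)
rows_b, examined_b = via_created_at_index(forms, VOLUNTEER_KIND)

puts "kind index:       examined #{examined_a} rows"
puts "created_at index: examined #{examined_b} rows"
```

Both paths return the same rows in the same order, but the second one touches every row in the table, which is the behavior the PR is working around.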

Testing story

Unit tests still pass. I've removed the test for the special no-ordering behavior that I'm removing.

Reviewer Checklist:

  • Tests provide adequate coverage
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Pull Request is labeled appropriately
  • Follow-up work items (including potential tech debt) are tracked and linked

@hacodeorg hacodeorg left a comment

The fix is so simple! I wonder why we didn't think of it.

@islemaster islemaster merged commit e59ac7e into staging Nov 4, 2019
@islemaster islemaster deleted the volunteer-map-query branch November 4, 2019 18:04
@wjordan wjordan left a comment

👍

For a bit more detail on what's going on: every InnoDB secondary index is automatically extended with the primary key columns (see 'Use of Index Extensions' in the MySQL manual). That is why ordering by `id` can be served by the existing `kind` index, whereas ordering by `created_at` makes the optimizer choose the separate `forms_created_at_index` for sorting.
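
As a rough illustration of that point, the sketch below models a secondary index on `kind` as a sorted list of `(kind, id)` pairs. The rows and values are invented for the example, and this is only a mental model of index extension, not how InnoDB is implemented.

```ruby
# Toy model of an InnoDB secondary index on kind; data is made up.
rows = [
  { id: 5, kind: 'Volunteer', created_at: 30 },
  { id: 2, kind: 'Other',     created_at: 10 },
  { id: 9, kind: 'Volunteer', created_at: 20 },
  { id: 1, kind: 'Volunteer', created_at: 25 },
  { id: 7, kind: 'Other',     created_at: 50 }
]

# Each index entry is (kind, id): the indexed column plus the primary key.
# Entries are kept sorted by that full pair.
kind_index = rows.map { |r| [r[:kind], r[:id]] }.sort

# A range scan over one kind yields ids in ascending order with no extra sort.
ids_in_index_order = kind_index.select { |kind, _id| kind == 'Volunteer' }.map(&:last)

# Reading the same range backwards corresponds to ORDER BY id DESC.
ids_desc = ids_in_index_order.reverse

# created_at is not part of the index entry, so the same scan gives
# no ordering guarantee on created_at.
by_id = rows.to_h { |r| [r[:id], r] }
created_at_in_index_order = ids_in_index_order.map { |id| by_id[id][:created_at] }

p ids_in_index_order
p ids_desc
p created_at_in_index_order
```

In this model the `id` ordering comes for free from the index layout, while the `created_at` values come out in whatever order the ids happen to dictate, so a separate sort or a different index would be needed.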
