Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MM-23369: Allow mysql to choose a better index #14119

Merged
merged 4 commits into from Mar 25, 2020
Merged

MM-23369: Allow mysql to choose a better index #14119

merged 4 commits into from Mar 25, 2020

Conversation

agnivade
Copy link
Member

Summary

When the ORDER BY clause contains a column which is in the WHERE
clause and also part of an index, mysql tries to use that specific
index to avoid sorting. This is inspite of the fact that
there may be other indices which are better for scanning the table
and then doing a sort.

Essentially, mysql becomes dumb and scans a lot of rows to avoid
sorting. Whereas, it could have scanned a lot less rows and do the
sorting in no time.

To fix this, we use the other columns in the ORDER BY clause as well
which are part of the index. This causes no change in the results
because the other columns are an EQUAL condition check, but this
lets mysql use the right index. Because now mysql sees that it has
to order by other columns too, so it better use the other index
to scan and then do the sorting.

This does not affect tables of smaller size because the LIMIT of
rows is always 1. And mysql will stop sorting the moment it gets
the first row. So sorting is not the overhead at all.
Therefore, this seems like an optimal fix.

References:
https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html
https://code.openark.org/blog/mysql/7-ways-to-convince-mysql-to-use-the-right-index
https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html

Ticket Link

https://mattermost.atlassian.net/browse/MM-23369

When the ORDER BY clause contains a column which is in the WHERE
clause and also part of an index, mysql tries to use that specific
index to avoid sorting. This is inspite of the fact that
there may be other indices which are better for scanning the table
and then doing a sort.

Essentially, mysql becomes dumb and scans a lot of rows to avoid
sorting. Whereas, it could have scanned a lot less rows and do the
sorting in no time.

To fix this, we use the other columns in the ORDER BY clause as well
which are part of the index. This causes no change in the results
because the other columns are an EQUAL condition check, but this
lets mysql use the right index. Because now mysql sees that it has
to order by other columns too, so it better use the other index
to scan and then do the sorting.

This does not affect tables of smaller size because the LIMIT of
rows is always 1. And mysql will stop sorting the moment it gets
the first row. So sorting is not the overhead at all.
Therefore, this seems like an optimal fix.

References:
https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html
https://code.openark.org/blog/mysql/7-ways-to-convince-mysql-to-use-the-right-index
https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
@agnivade agnivade added the Work In Progress Not yet ready for review label Mar 21, 2020
@agnivade agnivade added 2: Dev Review Requires review by a developer 3: QA Review Requires review by a QA tester. May occur at the same time as Dev Review and removed Work In Progress Not yet ready for review labels Mar 21, 2020
@agnivade agnivade added this to the v5.23.0 milestone Mar 21, 2020
@agnivade
Copy link
Member Author

@lieut-data - Thought you would be interested in this one.

@agnivade agnivade added the Do Not Merge/Awaiting Loadtest Must be loadtested before it can be merged label Mar 21, 2020
Copy link
Member

@lieut-data lieut-data left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes a ton of sense to me -- great research!

// Adding ChannelId and DeleteAt order columns
// to let mysql choose the "idx_posts_channel_id_delete_at_create_at" index always.
// See MM-23369.
OrderBy("ChannelId, DeleteAt, CreateAt " + sort).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might we write this as follows to minimize the "manual SQL" and maximize Squirrel's flexibility?

Suggested change
OrderBy("ChannelId, DeleteAt, CreateAt " + sort).
OrderBy("ChannelId", "DeleteAt", "CreateAt " + sort).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@jespino jespino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jespino jespino removed the 2: Dev Review Requires review by a developer label Mar 23, 2020
@agnivade agnivade requested a review from lindy65 March 24, 2020 05:06
Copy link
Contributor

@lindy65 lindy65 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agnivade 👍 This will be tested post-merge...

@lindy65 lindy65 added QA/Review Done QA review is completed but other reviews are outstanding (exception to usual process) and removed 3: QA Review Requires review by a QA tester. May occur at the same time as Dev Review labels Mar 24, 2020
@jespino jespino added the 4: Reviews Complete All reviewers have approved the pull request label Mar 24, 2020
@agnivade
Copy link
Member Author

Load test done. No difference found:
mysqlorder

@agnivade agnivade removed the Do Not Merge/Awaiting Loadtest Must be loadtested before it can be merged label Mar 25, 2020
@agnivade agnivade merged commit bf3c2c0 into master Mar 25, 2020
@agnivade agnivade deleted the mysqlorderby branch March 25, 2020 07:09
@amyblais amyblais added the CherryPick/Approved Meant for the quality or patch release tracked in the milestone label Apr 15, 2020
@agnivade
Copy link
Member Author

/cherry-pick release-5.23

@mattermod mattermod added the CherryPick/Done Successfully cherry-picked to the quality or patch release tracked in the milestone label Apr 20, 2020
@mattermod mattermod removed the CherryPick/Approved Meant for the quality or patch release tracked in the milestone label Apr 20, 2020
agnivade added a commit that referenced this pull request Apr 20, 2020
* MM-23369: Allow mysql to choose a better index

When the ORDER BY clause contains a column which is in the WHERE
clause and also part of an index, mysql tries to use that specific
index to avoid sorting. This is inspite of the fact that
there may be other indices which are better for scanning the table
and then doing a sort.

Essentially, mysql becomes dumb and scans a lot of rows to avoid
sorting. Whereas, it could have scanned a lot less rows and do the
sorting in no time.

To fix this, we use the other columns in the ORDER BY clause as well
which are part of the index. This causes no change in the results
because the other columns are an EQUAL condition check, but this
lets mysql use the right index. Because now mysql sees that it has
to order by other columns too, so it better use the other index
to scan and then do the sorting.

This does not affect tables of smaller size because the LIMIT of
rows is always 1. And mysql will stop sorting the moment it gets
the first row. So sorting is not the overhead at all.
Therefore, this seems like an optimal fix.

References:
https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html
https://code.openark.org/blog/mysql/7-ways-to-convince-mysql-to-use-the-right-index
https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html

* Added a comment to clarify things in code

* Incorporating review comments

Co-authored-by: Agniva De Sarker <agnivade@yahoo.co.in>
@amyblais amyblais added Changelog/Not Needed Does not require a changelog entry Docs/Not Needed Does not require documentation labels Apr 21, 2020
@agnivade agnivade mentioned this pull request Apr 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4: Reviews Complete All reviewers have approved the pull request Changelog/Not Needed Does not require a changelog entry CherryPick/Done Successfully cherry-picked to the quality or patch release tracked in the milestone Docs/Not Needed Does not require documentation QA/Review Done QA review is completed but other reviews are outstanding (exception to usual process)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants