Optimize the MySQL indexes #6605

Merged: leofeyer merged 3 commits into contao:5.x from feature/order-by-indexes on Jan 2, 2024


leofeyer (Member) commented Dec 6, 2023

It is apparently recommended to use indexed columns in the ORDER BY clause to speed up sorting.

Edit: The PR now only adds two pid indexes and optimizes a tl_search index.

leofeyer added this to the 5.3 milestone Dec 6, 2023
leofeyer self-assigned this Dec 6, 2023
leofeyer changed the title from 'Add indexes to columns used in "ORDER BY" queries' to 'Add indexes to columns used in ORDER BY queries' Dec 6, 2023
m-vo (Member) commented Dec 6, 2023

If this sorting is only for the back end, this might be overkill. Indexes make lookup fast, but they also make updates/deletes/reorganisation slow. Might need some performance testing to be sure... 🙂

ausi (Member) commented Dec 6, 2023

Quoting leofeyer: "It is apparently recommended to use indexed columns in the ORDER BY clause to speed up sorting."

I’m not sure this is actually an improvement for us, as most of our queries do a SELECT * and, according to the docs, many requirements have to be met before an index can be used for sorting.

I’d only add those indexes for cases where we can measure an actual performance improvement.
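One way to check whether a sort is served by an index is to compare query plans before and after adding it. The following is a minimal sketch, not a measurement of Contao: it uses Python's built-in sqlite3 module as a stand-in because it runs without a server, and the reduced tl_page schema, the index name idx_sorting and the helper names are assumptions made for illustration. MySQL's optimizer can decide differently for the same query, so on MySQL the equivalent check is to run EXPLAIN and look for "Using filesort" in the Extra column.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def needs_sort_step(plan):
    """SQLite reports an explicit sort as 'USE TEMP B-TREE FOR ORDER BY'."""
    return any("TEMP B-TREE FOR ORDER BY" in step for step in plan)


conn = sqlite3.connect(":memory:")

# Hypothetical, heavily reduced stand-in for a Contao table (assumption).
conn.execute(
    "CREATE TABLE tl_page ("
    "id INTEGER PRIMARY KEY, pid INTEGER, sorting INTEGER, title TEXT)"
)

sql = "SELECT * FROM tl_page ORDER BY sorting"

# Plan without an index on the ORDER BY column.
plan_without_index = query_plan(conn, sql)

# Add the index (name is illustrative) and ask for the plan again.
conn.execute("CREATE INDEX idx_sorting ON tl_page (sorting)")
plan_with_index = query_plan(conn, sql)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

A changed plan only shows that the index can be used; whether it is faster on real data, and whether it is worth the extra write cost mentioned above, still has to be measured on the target MySQL version.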

leofeyer (Member, Author) commented Dec 6, 2023

It definitely makes a huge difference whether or not the ORDER BY column is indexed:

[Two screenshots in the original comment, labeled "Not indexed" and "Indexed"]

leofeyer (Member, Author) commented Dec 6, 2023

[Two screenshots in the original comment, labeled "Not indexed" and "Indexed"]

ausi (Member) commented Dec 6, 2023

I was not able to reproduce these results on my machine (MySQL 8.0.33).

Also, your example queries do not look like real queries used by Contao. We never use tl_search_index.relevance directly for sorting. And we usually have a WHERE clause such as pid=? or id=? when querying tl_page (except perhaps in the back end).

leofeyer (Member, Author) commented Dec 6, 2023

Yes, we do:

$strQuery .= " ORDER BY relevance DESC";

ausi (Member) commented Dec 8, 2023

In this case, relevance refers to the result of similarity / vectorLength AS relevance, where similarity is the output of the matching algorithm that uses the tl_search_index.relevance field. The query looks something like this:

SELECT 
	…, 
	similarity / vectorLength AS relevance 
FROM (
	SELECT 
		tl_search_index.pid AS sid,
		(0 + ((1 + LOG(SUM(match0 * tl_search_index.relevance))) * POW(MIN(match0 * matchedTerm.idf), 2) / 1)) / sqrt(0 + POW(MIN(match0 * matchedTerm.idf) / 1, 2)) AS similarity 
	FROM … 
)
ORDER BY relevance DESC

This query cannot make use of an index on tl_search_index.relevance for sorting the result.
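The point can be illustrated with a reduced version of that query shape. This is only a sketch under stated assumptions: it runs on SQLite through Python's built-in sqlite3 module rather than on MySQL, the three-column tl_search_index schema and the index name idx_relevance are made up for the example, and the constant 2.0 stands in for vectorLength. The outer relevance is an alias for a value computed from an aggregate, so the rows have to be sorted after they are computed, even though an index on the base column exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced stand-in for tl_search_index (assumption): only
# the columns needed for the illustration are kept.
conn.execute(
    "CREATE TABLE tl_search_index ("
    "pid INTEGER, termId INTEGER, relevance INTEGER)"
)
# An index on the base column, as originally proposed (name is illustrative).
conn.execute("CREATE INDEX idx_relevance ON tl_search_index (relevance)")

# Simplified shape of the search query: the outer "relevance" is computed
# from an aggregate over the base column; 2.0 stands in for vectorLength.
sql = """
SELECT sid, similarity / 2.0 AS relevance
FROM (
    SELECT pid AS sid, 1 + SUM(relevance) AS similarity
    FROM tl_search_index
    GROUP BY pid
)
ORDER BY relevance DESC
"""

# The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# SQLite reports an explicit sort as "USE TEMP B-TREE FOR ORDER BY".
sorts_explicitly = any("TEMP B-TREE FOR ORDER BY" in s for s in plan_steps)

for step in plan_steps:
    print(step)
print("explicit sort step:", sorts_explicitly)
```

On MySQL, the corresponding check would be to run EXPLAIN on the real search query and inspect the Extra column for "Using filesort" or "Using temporary".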

leofeyer force-pushed the feature/order-by-indexes branch 3 times, most recently from d237b81 to 025ec49, December 13, 2023 10:54
leofeyer changed the title from 'Add indexes to columns used in ORDER BY queries' to 'Optimize the MySQL indexes' Dec 13, 2023
leofeyer (Member, Author) commented

I have reduced the number of new indexes to 2 in 09721f0. Please review again.

leofeyer requested a review from a team December 13, 2023 10:59
leofeyer merged commit efb119a into contao:5.x Jan 2, 2024
16 checks passed
leofeyer deleted the feature/order-by-indexes branch January 2, 2024 17:04