Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the FULLTEXT index from the mysql and sqlsrv schemas. #14558

Closed
wants to merge 1 commit into from
Closed

Remove the FULLTEXT index from the mysql and sqlsrv schemas. #14558

wants to merge 1 commit into from

Conversation

sergant210
Copy link
Collaborator

Little SQL optimization. For discussion...

What does it do?

Removed FULLTEXT index from the mysql and sqlsrv schemas of the site_content table.

Why is it needed?

This is a multi-columns index ('pagetitle', 'longtitle', 'description', 'introtext' and 'content'). I think it was supposed for searching in the table of contents. But it's not used in the core.
Reason to remove:

  • it takes more disk space then other indexes;
  • it's very expensive for INSERT/UPDATE/DELETE operations;
  • it's never used in the core.

Let developers or DBAs create it themselves if it needed.

Related issue(s)/PR(s)

No.

@JoshuaLuckers JoshuaLuckers added proposal Proposal about improvement aka RFC. Need to be discussed before start implementation. state/being-discussed labels Apr 20, 2019
@JoshuaLuckers
Copy link
Contributor

To make the best decision this should be discussed. To open the discussion:

I'm not convinced that removing the index solves a problem.

Reason to remove:

  • it takes up a lot of disk space;

How much of a problem is this?

  • it's very expensive for INSERT/UPDATE/DELETE operations.

What about SELECT operations? Those are mostly done on the table.

Let developers or DBAs create it themselves if it needed.

The same can be said about removing the index.

@Jako
Copy link
Collaborator

Jako commented Apr 20, 2019

A fulltext index is only used by a MATCH AGAINST query in MySQL: https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html (the same in MariaDB).

I can’t search for a usage in the core at the moment to verify the PR comment. But there could be use cases for the use of this index this in search extras. Maybe they use the index already. A depreciation note is necessary.

The index uses DB space and slows down the import of large datasets. In that case the fulltext index should be removed before an import and added afterwards according to some tips i.e. on stackoverflow.

@sergant210
Copy link
Collaborator Author

@JoshuaLuckers

How much of a problem is this?

I don't have exact numbers. More then the other indexes together.

What about SELECT operations? Those are mostly done on the table.

SELECT operations are not affected.

The same can be said about removing the index.

Yes if the goal is to keep the legacy code.

@Jako
Copy link
Collaborator

Jako commented Apr 20, 2019

Yes if the goal is to keep the legacy code.

Using an index is not legacy. But I am not sure, if that index should be created by an extra or should be default.

If the index is created by an extra, import scripts/extras have to regard those fulltext indexes too, to avoid speed issues.

@sergant210
Copy link
Collaborator Author

Imho, this index is legacy. It's created for 5 fields - pagetitle, longtitle, description, introtext and content. And it wil be used only if these 5 fields exist in query - MATCH(pagetitle, longtitle, description, introtext, content). In other cases it doesn't work.
Most sites use third-party applications, such as ElasticSearch, Algolia and etc.

@sergant210
Copy link
Collaborator Author

sergant210 commented Apr 21, 2019

@Jako Even SimpleSearch uses variable number of fields. And by default it uses six fields. So this index will be ignored.
Besides, SimpleSearch searches in BOOLEAN MODE.

What is documentation saying?

InnoDB tables require a FULLTEXT index on all columns of the MATCH() expression to perform boolean queries. Boolean queries against a MyISAM search index can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.

@JoshuaLuckers
Copy link
Contributor

What if we optimise the index by reconsidering what columns it should use and leverage the index in the core (for example in the search function)?

@sergant210
Copy link
Collaborator Author

I doubt that we can guess what columns user wants to use. This Fulltext index isn't used in the backend. In the frontend let users use corresponding extras.

@sergant210
Copy link
Collaborator Author

sergant210 commented Apr 25, 2019

Do you know how many indices in the site_content table? 19.

How do you think all of these indices are needed? But they will all be rebuilt after each INSERT/UPDATE/DELETE operation.

@JoshuaLuckers
Copy link
Contributor

@opengeek what are your thoughts about this?

@Mark-H
Copy link
Collaborator

Mark-H commented Apr 25, 2019

I don't see any MATCH syntax in core that would use this index, so it does seem a little pointless to me and something worth cleaning up.

@Jako
Copy link
Collaborator

Jako commented Apr 25, 2019

But there has to be some deprecation note. Wherever we collect such things.

@Mark-H
Copy link
Collaborator

Mark-H commented Apr 25, 2019

Sure. I've added a page in the new documentation to collect changes like this: https://docs.modx.org/3.x/en/getting-started/maintenance/upgrading/3.0

I've also prepared a PR to that page for when this gets merged: modxorg/Docs#81

@sergant210 sergant210 marked this pull request as ready for review April 26, 2019 03:28
@opengeek
Copy link
Member

Why this isn't being used for the search in the manager is beyond me, but whatever.

@opengeek
Copy link
Member

In order to do any useful searching in MODX resources now, you will have to have an external search solution. This seems ludicrous to me.

@sergant210
Copy link
Collaborator Author

Why this isn't being used for the search in the manager is beyond me, but whatever.

Maybe nobody needs it. Laravel used Algolia and Taylor is quite happy.

@JoshuaLuckers
Copy link
Contributor

I'm in favour of improving the search functionality and make it use the index instead of removing it.

@sergant210 sergant210 closed this Sep 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal Proposal about improvement aka RFC. Need to be discussed before start implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants