Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the FULLTEXT index from the mysql and sqlsrv schemas. #14558

Open
wants to merge 1 commit into
base: 3.x
from

Conversation

Projects
None yet
5 participants
@sergant210
Copy link
Contributor

commented Apr 15, 2019

Little SQL optimization. For discussion...

What does it do?

Removed FULLTEXT index from the mysql and sqlsrv schemas of the site_content table.

Why is it needed?

This is a multi-columns index ('pagetitle', 'longtitle', 'description', 'introtext' and 'content'). I think it was supposed for searching in the table of contents. But it's not used in the core.
Reason to remove:

  • it takes more disk space then other indexes;
  • it's very expensive for INSERT/UPDATE/DELETE operations;
  • it's never used in the core.

Let developers or DBAs create it themselves if it needed.

Related issue(s)/PR(s)

No.

@JoshuaLuckers

This comment has been minimized.

Copy link
Collaborator

commented Apr 20, 2019

To make the best decision this should be discussed. To open the discussion:

I'm not convinced that removing the index solves a problem.

Reason to remove:

  • it takes up a lot of disk space;

How much of a problem is this?

  • it's very expensive for INSERT/UPDATE/DELETE operations.

What about SELECT operations? Those are mostly done on the table.

Let developers or DBAs create it themselves if it needed.

The same can be said about removing the index.

@Jako

This comment has been minimized.

Copy link
Collaborator

commented Apr 20, 2019

A fulltext index is only used by a MATCH AGAINST query in MySQL: https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html (the same in MariaDB).

I can’t search for a usage in the core at the moment to verify the PR comment. But there could be use cases for the use of this index this in search extras. Maybe they use the index already. A depreciation note is necessary.

The index uses DB space and slows down the import of large datasets. In that case the fulltext index should be removed before an import and added afterwards according to some tips i.e. on stackoverflow.

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

@JoshuaLuckers

How much of a problem is this?

I don't have exact numbers. More then the other indexes together.

What about SELECT operations? Those are mostly done on the table.

SELECT operations are not affected.

The same can be said about removing the index.

Yes if the goal is to keep the legacy code.

@Jako

This comment has been minimized.

Copy link
Collaborator

commented Apr 20, 2019

Yes if the goal is to keep the legacy code.

Using an index is not legacy. But I am not sure, if that index should be created by an extra or should be default.

If the index is created by an extra, import scripts/extras have to regard those fulltext indexes too, to avoid speed issues.

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

Imho, this index is legacy. It's created for 5 fields - pagetitle, longtitle, description, introtext and content. And it wil be used only if these 5 fields exist in query - MATCH(pagetitle, longtitle, description, introtext, content). In other cases it doesn't work.
Most sites use third-party applications, such as ElasticSearch, Algolia and etc.

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 21, 2019

@Jako Even SimpleSearch uses variable number of fields. And by default it uses six fields. So this index will be ignored.
Besides, SimpleSearch searches in BOOLEAN MODE.

What is documentation saying?

InnoDB tables require a FULLTEXT index on all columns of the MATCH() expression to perform boolean queries. Boolean queries against a MyISAM search index can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.

@JoshuaLuckers

This comment has been minimized.

Copy link
Collaborator

commented Apr 21, 2019

What if we optimise the index by reconsidering what columns it should use and leverage the index in the core (for example in the search function)?

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 21, 2019

I doubt that we can guess what columns user wants to use. This Fulltext index isn't used in the backend. In the frontend let users use corresponding extras.

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 25, 2019

Do you know how many indices in the site_content table? 19.

How do you think all of these indices are needed? But they will all be rebuilt after each INSERT/UPDATE/DELETE operation.

@JoshuaLuckers

This comment has been minimized.

Copy link
Collaborator

commented Apr 25, 2019

@opengeek what are your thoughts about this?

@Mark-H

This comment has been minimized.

Copy link
Collaborator

commented Apr 25, 2019

I don't see any MATCH syntax in core that would use this index, so it does seem a little pointless to me and something worth cleaning up.

@Jako

This comment has been minimized.

Copy link
Collaborator

commented Apr 25, 2019

But there has to be some deprecation note. Wherever we collect such things.

Mark-H added a commit to modxorg/Docs that referenced this pull request Apr 25, 2019

@Mark-H

This comment has been minimized.

Copy link
Collaborator

commented Apr 25, 2019

Sure. I've added a page in the new documentation to collect changes like this: https://docs.modx.org/3.x/en/getting-started/maintenance/upgrading/3.0

I've also prepared a PR to that page for when this gets merged: modxorg/Docs#81

@sergant210 sergant210 marked this pull request as ready for review Apr 26, 2019

@sergant210 sergant210 requested review from Mark-H and opengeek as code owners Apr 26, 2019

@opengeek

This comment has been minimized.

Copy link
Member

commented Apr 26, 2019

Why this isn't being used for the search in the manager is beyond me, but whatever.

@opengeek

This comment has been minimized.

Copy link
Member

commented Apr 26, 2019

In order to do any useful searching in MODX resources now, you will have to have an external search solution. This seems ludicrous to me.

@sergant210

This comment has been minimized.

Copy link
Contributor Author

commented Apr 26, 2019

Why this isn't being used for the search in the manager is beyond me, but whatever.

Maybe nobody needs it. Laravel used Algolia and Taylor is quite happy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.