[Maintenance] Optimize tables with full text indexes periodically #11817

Merged
merged 2 commits into from Apr 5, 2022


BlackbitDevs
Contributor

@BlackbitDevs BlackbitDevs commented Mar 31, 2022

Full-text indexes get slower over time because deleted entries leave gaps and because word counts become fragmented. In one concrete case, with ~600K entries in search_backend_data, the query

SELECT id FROM search_backend_data WHERE MATCH (`data`,`properties`) AGAINST ('123456' IN BOOLEAN MODE);

took 28 seconds.
This gets used in

$conditionFilters[] = 'oo_id IN (SELECT id FROM search_backend_data WHERE maintype = "object" AND MATCH (`data`,`properties`) AGAINST (' . $list->quote($query) . ' IN BOOLEAN MODE))';

and also in other places like object search.

After running
OPTIMIZE TABLE search_backend_data
this same query takes 0.1 seconds.
It is even documented in the MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/fulltext-fine-tuning.html#fulltext-optimize

This PR adds a maintenance task that optimizes the tables with full-text indexes once every 7 days. Currently those are search_backend_data and email_log. I also considered fetching all tables that have full-text indexes and optimizing those, but that would take control of third-party tables (for example from plugins or application-specific implementations) away from their owners.
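
The alternative mentioned above, discovering every table that carries a full-text index instead of hard-coding two names, could look roughly like the sketch below. It is not what this PR does. The function names `findFulltextTables()` and `buildOptimizeStatements()` are hypothetical, and a plain `PDO` connection is assumed rather than Pimcore's `Db` wrapper. The lookup relies on MySQL's `information_schema.STATISTICS` table and its `INDEX_TYPE` column.

```php
<?php

/**
 * Hypothetical helper: list tables of the current schema that have at least
 * one FULLTEXT index, using MySQL's information_schema.
 *
 * @return string[]
 */
function findFulltextTables(PDO $pdo): array
{
    $sql = "SELECT DISTINCT TABLE_NAME
              FROM information_schema.STATISTICS
             WHERE TABLE_SCHEMA = DATABASE()
               AND INDEX_TYPE = 'FULLTEXT'";

    return $pdo->query($sql)->fetchAll(PDO::FETCH_COLUMN);
}

/**
 * Hypothetical helper: turn table names into OPTIMIZE TABLE statements.
 * Identifiers are wrapped in backticks, with embedded backticks doubled.
 *
 * @param string[] $tables
 * @return string[]
 */
function buildOptimizeStatements(array $tables): array
{
    return array_map(
        static function (string $table): string {
            return 'OPTIMIZE TABLE `' . str_replace('`', '``', $table) . '`';
        },
        $tables
    );
}
```

Calling `buildOptimizeStatements(['search_backend_data', 'email_log'])` yields the two statements this PR hard-codes. Feeding it the result of `findFulltextTables()` would also touch third-party tables, which is the reason given above for not going that way.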

@brusch brusch added this to the 10.4.0 milestone Apr 4, 2022
@brusch
Member

brusch commented Apr 4, 2022

LGTM, but tasks need to be registered as services, see:

Pimcore\Maintenance\Tasks\VersionsCleanupTask:

Can you please update your PR accordingly?
Thanks in advance!

@brusch brusch self-assigned this Apr 4, 2022
@BlackbitDevs
Contributor Author

Sorry, I forgot the service. It is fixed now.

@brusch brusch merged commit 94868b5 into pimcore:10.x Apr 5, 2022
@brusch
Member

brusch commented Apr 5, 2022

👍 Thanks!

@BlackbitDevs
Contributor Author

Just as a note: I tried to migrate this maintenance task to a Pimcore 6.9 system (with PdoStore as lock backend) and stumbled upon the MySQL error

General error 2014 Cannot execute queries while other unbuffered queries are active. Consider using PDOStatement::fetchAll().  Alternatively, if your code is only ever going to run against mysql, you may enable query buffering by setting the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute.

There I changed the calls in

Db::get()->exec('OPTIMIZE TABLE search_backend_data');
Db::get()->exec('OPTIMIZE TABLE email_log');

to use fetchAll() instead of exec(), and this works. I do not know what caused the error, but if it ever occurs on Pimcore 10 as well, this may be useful.
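
A likely explanation, offered as an assumption rather than a confirmed diagnosis: `OPTIMIZE TABLE` returns a result set (columns `Table`, `Op`, `Msg_type`, `Msg_text`), and `exec()` does not read it, so the next statement on the same connection can hit error 2014. Fetching the rows drains that result set. The sketch below uses plain `PDO` instead of Pimcore's `Db` wrapper, and `optimizeTable()` and `optimizeReportedError()` are hypothetical names.

```php
<?php

/**
 * Hypothetical helper: run OPTIMIZE TABLE and read its whole result set so
 * that no unfetched rows stay pending on the connection.
 * $table must be a trusted identifier; it is not escaped here.
 *
 * @return array<int, array<string, string>> status rows reported by MySQL
 */
function optimizeTable(PDO $pdo, string $table): array
{
    $statement = $pdo->query('OPTIMIZE TABLE `' . $table . '`');
    $rows = $statement->fetchAll(PDO::FETCH_ASSOC);
    $statement->closeCursor();

    return $rows;
}

/**
 * Hypothetical helper: true if any status row has Msg_type "error".
 */
function optimizeReportedError(array $rows): bool
{
    foreach ($rows as $row) {
        if (strtolower($row['Msg_type'] ?? '') === 'error') {
            return true;
        }
    }

    return false;
}
```

For InnoDB tables the returned rows usually consist of a note saying the table is recreated and analyzed instead, followed by a status row with `OK`, so checking for an `error` row is a cheap way to log failures from a maintenance task.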

@DrLuke
Contributor

DrLuke commented May 13, 2022

to use fetchAll() instead of exec() and this works. Do not know what caused the error but if this ever occurs also for Pimcore 10, this may become useful...

We just upgraded to Pimcore 10 and had this issue.

Can be solved by overriding the class:
overrides/Pimcore/Maintenance/Tasks/FullTextIndexOptimizeTask.php

<?php

namespace Pimcore\Maintenance\Tasks;

use Pimcore\Db;
use Pimcore\Maintenance\TaskInterface;
use Symfony\Component\Lock\LockFactory;

/**
 * @internal
 */
class FullTextIndexOptimizeTask implements TaskInterface
{
    /** @var \Symfony\Component\Lock\LockInterface */
    private $lock;

    public function __construct(LockFactory $lockFactory)
    {
        $this->lock = $lockFactory->createLock(self::class, 86400 * 7, false);
    }

    /**
     * {@inheritdoc}
     */
    public function execute()
    {
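        // Override: use fetchAll() instead of exec() so the result set returned by
        // OPTIMIZE TABLE is read, which avoids MySQL "General error 2014" here.
        // See: https://github.com/pimcore/pimcore/pull/11817#issuecomment-1105026544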
        if ($this->lock->acquire(false)) {
            Db::get()->fetchAll('OPTIMIZE TABLE search_backend_data');
            Db::get()->fetchAll('OPTIMIZE TABLE email_log');
        }
    }
}

And then in the autoload section of your composer.json:

"psr-4": {
  "App\\": "src/",
  "Pimcore\\Maintenance\\Tasks\\": "overrides/Pimcore/Maintenance/Tasks"
}
