I'm working with a wiki that has a large data set, and I'm currently unable to edit a large number of pages because of a performance issue in SMWSQLStore3Readers::getInProperties.
includes/storage/SQLStore/SMW_SQLStore3_Readers.php:getInProperties results in a generated SQL statement that uses a very inefficient INNER JOIN.
The generated query is SELECT DISTINCT smw_title,smw_sortkey FROM "smw_object_ids" INNER JOIN "smw_di_wikipage" AS t1 ON t1.p_id=smw_id WHERE t1.o_id='19683';
The slowness here is a result of the large data set: I have a lot of articles referencing each other. This isn't a test set either; the data resulted from active use. select count(*) from mediawiki.smw_di_wikipage where o_id=19683; returns 101614 rows. So this INNER JOIN is an obvious killer.
Rewriting this to use a subquery instead of a join shows obvious performance improvements.
EXPLAIN ANALYZE VERBOSE SELECT DISTINCT smw_title,smw_sortkey FROM mediawiki."smw_object_ids" WHERE smw_id IN (SELECT p_id FROM mediawiki."smw_di_wikipage" WHERE o_id='19683');
This subquery saves ~76 seconds and is 2800% faster, an obvious win overall. I'm prepared to rewrite this entire function, but before doing so I'd like to know whether there is a reason I'm missing for using a join instead of a subquery.
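For anyone who wants to sanity-check that the two query forms return the same rows before the function is rewritten, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory toy schema as a stand-in for PostgreSQL. Only the table and column names that appear in the queries above come from the real schema; the s_id column, the sample titles, the row counts, and the helper function names are assumptions made for illustration. It checks result equivalence only and says nothing about performance, which depends on the PostgreSQL planner and the indexes in place.

```python
import sqlite3

# Join form, as generated by getInProperties (schema prefix dropped for SQLite).
JOIN_QUERY = """
    SELECT DISTINCT smw_title, smw_sortkey
    FROM smw_object_ids
    INNER JOIN smw_di_wikipage AS t1 ON t1.p_id = smw_id
    WHERE t1.o_id = ?
"""

# Proposed subquery form.
SUBQUERY_QUERY = """
    SELECT DISTINCT smw_title, smw_sortkey
    FROM smw_object_ids
    WHERE smw_id IN (SELECT p_id FROM smw_di_wikipage WHERE o_id = ?)
"""


def build_toy_store():
    """Create a small in-memory database (hypothetical data, not the real wiki)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE smw_object_ids (
            smw_id INTEGER PRIMARY KEY, smw_title TEXT, smw_sortkey TEXT
        );
        -- s_id is an assumed subject column; only p_id and o_id appear above.
        CREATE TABLE smw_di_wikipage (s_id INTEGER, p_id INTEGER, o_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO smw_object_ids VALUES (?, ?, ?)",
        [
            (1, "Has_author", "Has author"),
            (2, "Has_topic", "Has topic"),
            (3, "Has_date", "Has date"),
        ],
    )
    # 1000 rows point at object 19683, alternating between properties 1 and 2.
    rows = [(s, 1 if s % 2 == 0 else 2, 19683) for s in range(100, 1100)]
    rows.append((50, 3, 42))  # one unrelated row for a different object
    conn.executemany("INSERT INTO smw_di_wikipage VALUES (?, ?, ?)", rows)
    return conn


def in_properties_join(conn, o_id):
    """Run the join form and return sorted (title, sortkey) tuples."""
    return sorted(conn.execute(JOIN_QUERY, (o_id,)).fetchall())


def in_properties_subquery(conn, o_id):
    """Run the subquery form and return sorted (title, sortkey) tuples."""
    return sorted(conn.execute(SUBQUERY_QUERY, (o_id,)).fetchall())


if __name__ == "__main__":
    conn = build_toy_store()
    for o_id in (19683, 42, 999):
        joined = in_properties_join(conn, o_id)
        nested = in_properties_subquery(conn, o_id)
        print(o_id, joined == nested, joined)
```

On the real database, the same comparison could be made by running both statements against PostgreSQL and diffing the sorted output.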