Clean up wcf1_tag_to_object#3904

Merged
TimWolla merged 7 commits into master from tag-to-object-languageID on Feb 4, 2021

Conversation

@TimWolla
Member

see #3803

  • Stop accessing wcf1_tag_to_object.languageID in TagCloudCacheBuilder::getTags()
  • Stop accessing wcf1_tag_to_object.languageID during DELETE in TagEngine
  • Stop accessing wcf1_tag_to_object.languageID in TagEngine::getObjectsTags()
  • Remove the languageID column from keys in wcf1_tag_to_object
  • Remove redundant column from wcf1_tag_to_object.tagID key
  • Make wcf1_tag_to_object's PRIMARY KEY a UNIQUE KEY

@TimWolla TimWolla requested a review from dtdesign January 27, 2021 10:54
@TimWolla
Member Author

Update instructions are still missing.

@TimWolla TimWolla marked this pull request as draft January 27, 2021 10:54
Comment thread wcfsetup/install/files/lib/system/tagging/TagEngine.class.php Outdated
Comment thread wcfsetup/install/files/lib/system/tagging/TagEngine.class.php Outdated
@TimWolla TimWolla force-pushed the tag-to-object-languageID branch from c890f85 to aac29cd Compare January 27, 2021 11:39
@TimWolla TimWolla marked this pull request as ready for review January 27, 2021 11:44
@TimWolla TimWolla requested a review from a user January 27, 2021 12:01
@ghost ghost self-assigned this Jan 27, 2021
…:getTags()

This change comes with one primary benefit:

This stops accessing the redundant `languageID` column that is functionally
dependent on the tagID (see #3803).

On MariaDB 10.1 this results in a *much* better query plan at first glance
(note the lower row count for the first query):

    MariaDB [*snip*]> EXPLAIN
        -> SELECT tag.tagid,
        ->        Count(object.objectid) AS counter
        -> FROM   wcf1_tag_to_object object
        ->        INNER JOIN wcf1_tag tag
        ->                ON tag.tagid = object.tagid
        -> WHERE  object.objecttypeid IN ( 92 )
        ->        AND tag.languageid IN ( 1 )
        -> GROUP  BY tag.tagid
        -> ORDER  BY counter DESC,
        ->           tag.tagid DESC
        -> LIMIT  500;
    +------+-------------+--------+------+-----------------------------------+------------+---------+---------------------------+------+-----------------------------------------------------------+
    | id   | select_type | table  | type | possible_keys                     | key        | key_len | ref                       | rows | Extra                                                     |
    +------+-------------+--------+------+-----------------------------------+------------+---------+---------------------------+------+-----------------------------------------------------------+
    |    1 | SIMPLE      | tag    | ref  | PRIMARY,languageID                | languageID | 4       | const                     | 5299 | Using where; Using index; Using temporary; Using filesort |
    |    1 | SIMPLE      | object | ref  | objectTypeID,objectTypeID_2,tagID | tagID      | 8       | *snip*.tag.tagID,const    |    3 | Using index                                               |
    +------+-------------+--------+------+-----------------------------------+------------+---------+---------------------------+------+-----------------------------------------------------------+
    2 rows in set (0.00 sec)

    MariaDB [*snip*]>
    MariaDB [*snip*]> EXPLAIN
        -> SELECT object.tagid,
        ->        Count(*) AS counter
        -> FROM   wcf1_tag_to_object object
        -> WHERE  object.objecttypeid IN ( 92 )
        ->        AND object.languageid IN ( 1 )
        -> GROUP  BY object.tagid
        -> ORDER  BY counter DESC,
        ->           object.tagid DESC
        -> LIMIT  500;
    +------+-------------+--------+------+-----------------------------------------------------------------+--------------+---------+-------------+-------+-----------------------------------------------------------+
    | id   | select_type | table  | type | possible_keys                                                   | key          | key_len | ref         | rows  | Extra                                                     |
    +------+-------------+--------+------+-----------------------------------------------------------------+--------------+---------+-------------+-------+-----------------------------------------------------------+
    |    1 | SIMPLE      | object | ref  | objectTypeID,objectTypeID_2,cbbba36334575d806c002a8756c8a107_fk | objectTypeID | 8       | const,const | 56293 | Using where; Using index; Using temporary; Using filesort |
    +------+-------------+--------+------+-----------------------------------------------------------------+--------------+---------+-------------+-------+-----------------------------------------------------------+
    1 row in set (0.00 sec)

Unfortunately, the new query is a bit slower when actually run: 0.05 (new) vs
0.03 (old) seconds. This can also be confirmed on a (larger) MySQL 8
installation using EXPLAIN ANALYZE:

    > EXPLAIN ANALYZE
        -> SELECT tag.tagid,
        ->        Count(object.objectid) AS counter
        -> FROM   wcf1_tag_to_object object
        ->        INNER JOIN wcf1_tag tag
        ->                ON tag.tagid = object.tagid
        -> WHERE  object.objecttypeid IN ( 203, 337 )
        ->        AND tag.languageid IN ( 1 )
        -> GROUP  BY tag.tagid
        -> ORDER  BY counter DESC
        -> LIMIT  500 \G
    *************************** 1. row ***************************
    EXPLAIN: -> Limit: 500 row(s)  (actual time=166.506..166.540 rows=500 loops=1)
        -> Sort: counter DESC, limit input to 500 row(s) per chunk  (actual time=166.505..166.520 rows=500 loops=1)
            -> Table scan on <temporary>  (actual time=0.001..0.509 rows=16716 loops=1)
                -> Aggregate using temporary table  (actual time=163.863..165.054 rows=16716 loops=1)
                    -> Nested loop inner join  (cost=14797.71 rows=87401) (actual time=0.056..112.533 rows=201296 loops=1)
                        -> Index lookup on tag using languageID (languageID=1)  (cost=1132.88 rows=9229) (actual time=0.037..3.809 rows=18651 loops=1)
                        -> Filter: (object.objectTypeID in (203,337))  (cost=0.27 rows=9) (actual time=0.002..0.005 rows=11 loops=18651)
                            -> Index lookup on object using tagID (tagID=tag.tagID)  (cost=0.27 rows=12) (actual time=0.002..0.004 rows=11 loops=18651)

    1 row in set (0.16 sec)

    > EXPLAIN ANALYZE
        -> SELECT object.tagid,
        ->        Count(*) AS counter
        -> FROM   wcf1_tag_to_object object
        -> WHERE  object.objecttypeid IN ( 203, 337 )
        ->        AND object.languageid IN ( 1 )
        -> GROUP  BY object.tagid
        -> ORDER  BY counter DESC
        -> LIMIT  500 \G
    *************************** 1. row ***************************
    EXPLAIN: -> Limit: 500 row(s)  (actual time=135.420..135.461 rows=500 loops=1)
        -> Sort: counter DESC, limit input to 500 row(s) per chunk  (actual time=135.419..135.437 rows=500 loops=1)
            -> Table scan on <temporary>  (actual time=0.000..0.560 rows=16716 loops=1)
                -> Aggregate using temporary table  (actual time=132.599..133.936 rows=16716 loops=1)
                    -> Filter: ((object.languageID = 1) and (object.objectTypeID in (203,337)))  (cost=31642.06 rows=157965) (actual time=0.080..72.889 rows=201296 loops=1)
                        -> Index range scan on object using objectTypeID  (cost=31642.06 rows=157965) (actual time=0.076..53.904 rows=201296 loops=1)

    1 row in set (0.14 sec)

Nonetheless this appears to be worth it, especially if we can remove the
`languageID` column in the future.

I also attempted to get rid of the second query, by simply putting a `tag.*`
into the column list of the first query. Unfortunately MariaDB 10.1 (which is
our minimum requirement) is too dumb to determine that all the columns in the
`tag` table are functionally dependent on `tag.tagID`. MySQL 8 is able to handle
that correctly.

I have verified that both queries return the same results (except for the
undefined ordering when the counter is identical).
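
The equivalence check can be reproduced on a toy dataset. The following is a
minimal sketch only: it assumes an in-memory SQLite database instead of
MariaDB/MySQL, a schema reduced to the columns both queries touch, and made-up
rows. It says nothing about query plans or timings, only that the old and new
tag cloud queries agree when `languageID` is a faithful copy of the tag's
language.

```python
import sqlite3

# Assumption: toy schema with only the columns used by the two queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wcf1_tag (
        tagID INTEGER PRIMARY KEY,
        languageID INTEGER NOT NULL
    );
    CREATE TABLE wcf1_tag_to_object (
        objectID INTEGER NOT NULL,
        tagID INTEGER NOT NULL,
        objectTypeID INTEGER NOT NULL,
        languageID INTEGER NOT NULL  -- redundant copy of wcf1_tag.languageID
    );
""")

# Made-up data: tags 1 and 2 are in language 1, tag 3 is in language 2.
conn.executemany("INSERT INTO wcf1_tag VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

# (objectID, tagID, objectTypeID); languageID is copied from wcf1_tag so the
# functional dependency on tagID holds.
assignments = [(10, 1, 92), (11, 1, 92), (12, 2, 92), (13, 3, 92), (14, 1, 50)]
conn.executemany(
    """INSERT INTO wcf1_tag_to_object (objectID, tagID, objectTypeID, languageID)
       SELECT ?, tagID, ?, languageID FROM wcf1_tag WHERE tagID = ?""",
    [(object_id, type_id, tag_id) for object_id, tag_id, type_id in assignments],
)

# Old query: filters on the redundant wcf1_tag_to_object.languageID.
old_rows = conn.execute("""
    SELECT object.tagID, COUNT(*) AS counter
    FROM   wcf1_tag_to_object object
    WHERE  object.objectTypeID IN (92)
           AND object.languageID IN (1)
    GROUP  BY object.tagID
    ORDER  BY counter DESC, object.tagID DESC
    LIMIT  500
""").fetchall()

# New query: joins wcf1_tag and filters on tag.languageID instead.
new_rows = conn.execute("""
    SELECT tag.tagID, COUNT(object.objectID) AS counter
    FROM   wcf1_tag_to_object object
           INNER JOIN wcf1_tag tag ON tag.tagID = object.tagID
    WHERE  object.objectTypeID IN (92)
           AND tag.languageID IN (1)
    GROUP  BY tag.tagID
    ORDER  BY counter DESC, tag.tagID DESC
    LIMIT  500
""").fetchall()

print(old_rows == new_rows)
```
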
This stops accessing the redundant `languageID` column that is functionally
dependent on the tagID (see #3803).
…Tags()

This stops accessing the redundant `languageID` column that is functionally
dependent on the tagID (see #3803).

This change will make the query a little bit slower, but this will be
remediated by adjusting the indices on the wcf1_tag_to_object table, after
which the performance will be identical:

    MariaDB [*snip*]> EXPLAIN
        -> SELECT tag.*,
        ->        tag_to_object.objectid
        -> FROM   wcf1_tag_to_object tag_to_object
        ->        LEFT JOIN wcf1_tag tag
        ->               ON ( tag.tagid = tag_to_object.tagid )
        -> WHERE  tag_to_object.objecttypeid = 92
        ->        AND tag_to_object.objectid IN ( 3553, 7990 )
        ->        AND tag_to_object.languageid IN ( 1 );
    +------+-------------+---------------+--------+----------------------------------------------------------------------+-------------------------------------+---------+-------------------------------+------+--------------------------+
    | id   | select_type | table         | type   | possible_keys                                                        | key                                 | key_len | ref                           | rows | Extra                    |
    +------+-------------+---------------+--------+----------------------------------------------------------------------+-------------------------------------+---------+-------------------------------+------+--------------------------+
    |    1 | SIMPLE      | tag_to_object | range  | objectTypeID,objectTypeID_2,cbbba36334575d806c002a8756c8a107_fk,test | cbbba36334575d806c002a8756c8a107_fk | 12      | NULL                          |    8 | Using where; Using index |
    |    1 | SIMPLE      | tag           | eq_ref | PRIMARY                                                              | PRIMARY                             | 4       | *snip*.tag_to_object.tagID    |    1 |                          |
    +------+-------------+---------------+--------+----------------------------------------------------------------------+-------------------------------------+---------+-------------------------------+------+--------------------------+
    2 rows in set (0.00 sec)

    MariaDB [*snip*]>
    MariaDB [*snip*]> EXPLAIN
        -> SELECT tag.*,
        ->        tag_to_object.objectid
        -> FROM   wcf1_tag_to_object tag_to_object
        ->        LEFT JOIN wcf1_tag tag
        ->               ON ( tag.tagid = tag_to_object.tagid )
        -> WHERE  tag_to_object.objecttypeid = 92
        ->        AND tag_to_object.objectid IN ( 3553, 7990 )
        ->        AND tag.languageid IN ( 1 );
    +------+-------------+---------------+--------+----------------------------------------+---------+---------+-------------------------------+------+--------------------------+
    | id   | select_type | table         | type   | possible_keys                          | key     | key_len | ref                           | rows | Extra                    |
    +------+-------------+---------------+--------+----------------------------------------+---------+---------+-------------------------------+------+--------------------------+
    |    1 | SIMPLE      | tag_to_object | range  | objectTypeID,objectTypeID_2,tagID,test | test    | 8       | NULL                          |    8 | Using where; Using index |
    |    1 | SIMPLE      | tag           | eq_ref | PRIMARY,languageID                     | PRIMARY | 4       | *snip*.tag_to_object.tagID    |    1 | Using where              |
    +------+-------------+---------------+--------+----------------------------------------+---------+---------+-------------------------------+------+--------------------------+
    2 rows in set (0.00 sec)
This column is functionally dependent on tagID. As of the previous commits this
column is no longer read and is only filled for backwards compatibility.

See #3803
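
Whether the dependency actually holds in a given database can be checked by
looking for rows whose copied `languageID` disagrees with the tag's own. A
minimal sketch, again assuming SQLite and a reduced, made-up schema;
`find_language_mismatches` is a hypothetical helper for illustration and not
part of the code base:

```python
import sqlite3


def find_language_mismatches(conn):
    """Return (objectID, tagID, copied languageID, tag languageID) for rows
    where the redundant copy disagrees with wcf1_tag (hypothetical helper)."""
    return conn.execute("""
        SELECT tag_to_object.objectID,
               tag_to_object.tagID,
               tag_to_object.languageID,
               tag.languageID
        FROM   wcf1_tag_to_object tag_to_object
               INNER JOIN wcf1_tag tag ON tag.tagID = tag_to_object.tagID
        WHERE  tag_to_object.languageID <> tag.languageID
        ORDER  BY tag_to_object.objectID
    """).fetchall()


# Assumption: toy schema and data, with one deliberately inconsistent row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wcf1_tag (
        tagID INTEGER PRIMARY KEY,
        languageID INTEGER NOT NULL
    );
    CREATE TABLE wcf1_tag_to_object (
        objectID INTEGER NOT NULL,
        tagID INTEGER NOT NULL,
        objectTypeID INTEGER NOT NULL,
        languageID INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO wcf1_tag VALUES (?, ?)", [(1, 1), (2, 2)])
conn.executemany(
    "INSERT INTO wcf1_tag_to_object VALUES (?, ?, ?, ?)",
    [
        (10, 1, 92, 1),  # consistent with tag 1
        (11, 2, 92, 2),  # consistent with tag 2
        (12, 2, 92, 1),  # inconsistent: tag 2 belongs to language 2
    ],
)

mismatches = find_language_mismatches(conn)
print(mismatches)
```

An empty result means every row's `languageID` is determined by its `tagID`,
which is the property the rewritten queries rely on.
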
This key was identical to the `(objectTypeID, tagID)` key. We don't need the
objectTypeID here.
@ghost ghost force-pushed the tag-to-object-languageID branch from aac29cd to ec5e5fc Compare January 29, 2021 15:50
@TimWolla TimWolla added the Bug label Feb 2, 2021
2 participants