Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specify index hint for queryConversionsByPageView to optimize LogAggregator Performance #21701

Open
wants to merge 1 commit into
base: 4.x-dev
Choose a base branch
from

Conversation

Fabian123333
Copy link

@Fabian123333 Fabian123333 commented Dec 15, 2023

Description:

Currently the LogAggregator might be very slow in large environments of with millions of visitors each day. This mainly comes back to one sql query, which lacks of proper index hints. Settings this optimized the performance from over two weeks per execution, to like 0.9 seconds.

Review

…ance

Currently the LogAggregator might be *very* slow in large environments of with millions of visitors each day. This mainly comes back to one sql query, which lacks of proper index hints. Settings this optimized the performance from over two weeks per execution, to like 0.9 seconds.
@Fabian123333 Fabian123333 changed the title specify index hint for queryConversionsByPageView to optimize performance specify index hint for queryConversionsByPageView to optimize LogAggregator Performance Dec 15, 2023
@michalkleiner
Copy link
Contributor

@Fabian123333 thank you for opening the PR. As you created the PR from 4.x branch, can you please confirm that it is not an issue on 5.x or it doesn't apply there or anything like that?

@michalkleiner
Copy link
Contributor

Could you also provide more information on the system you experienced the speed up on, e.g. was it MariaDb or MySQL server? What versions? It would help us replicate/confirm the performance increase. Thanks!

@Fabian123333
Copy link
Author

Fabian123333 commented Dec 16, 2023

@Fabian123333 thank you for opening the PR. As you created the PR from 4.x branch, can you please confirm that it is not an issue on 5.x or it doesn't apply there or anything like that?

On the 5.x branch the DataAccess has been refactored, offering a separat addJoinPrefixHintToQuery method to support Index hints in the DbHelper Class:
https://github.com/matomo-org/matomo/blob/5.x-dev/core/DbHelper.php#L365

This uses JOIN_PREFIX, fully supported by MySQL 8, but breaks compatibility with MariaDB versions (as USE INDEX is the platform independent version for optimizer hints). So this possibly would need refactoring if the optimization should also work for MariaDB, but that's another topic I might take a look on the next days.

Could you also provide more information on the system you experienced the speed up on, e.g. was it MariaDb or MySQL server? What versions? It would help us replicate/confirm the performance increase. Thanks!

The performance benefits have been measured on MariaDB 10.5 (build version 10.5.21-MariaDB-0+deb11u1).

The produced queries looked as follows:

SELECT  /*+ MAX_EXECUTION_TIME(3600000) */ 
          yyy.idvisit AS idvisit,
          2 AS idgoal,
          1 AS `type`,
          yyy.idaction AS idaction,
          COUNT(*) AS `1`,     
          ROUND(SUM(yyy.revenue_total),2) AS `2`,          
          COUNT(yyy.idvisit) AS `3`,                   
          ROUND(SUM(yyy.revenue_subtotal),2) AS `4`,                 
          ROUND(SUM(yyy.revenue_tax),2) AS `5`,
          ROUND(SUM(yyy.revenue_shipping),2) AS `6`,
          ROUND(SUM(yyy.revenue_discount),2) AS `7`,
          SUM(yyy.items) AS `8`,
          yyy.pages_before AS `9`,                    
          SUM(yyy.attribution) AS `10`,                    
          COUNT(*) AS `12`,              
          ROUND(SUM(yyy.revenue),2) AS `15`         
        FROM (
          SELECT
            num_total AS pages_before,
            1 / num_total AS attribution,
            r.idvisit AS idvisit,
            lac.idaction AS idaction,
            lvcon.revenue AS revenue_total,
            1 / num_total * lvcon.revenue AS revenue,
            1 / num_total * lvcon.revenue_subtotal AS revenue_subtotal,
            1 / num_total * lvcon.revenue_tax AS revenue_tax,
            1 / num_total * lvcon.revenue_shipping AS revenue_shipping,
            1 / num_total * lvcon.revenue_discount AS revenue_discount,
            1 / num_total * lvcon.items AS items
          FROM (
            SELECT /* sites 9 */ /* 2023-12-14,2023-12-15 */ /* Actions */ /* trigger = CronArchive */
				log_conversion.idvisit, COUNT(*) AS num_total
			FROM
				matomo_log_conversion AS log_conversion RIGHT JOIN matomo_log_link_visit_action AS log_vpast ON log_conversion.idvisit = log_vpast.idvisit LEFT JOIN matomo_log_action AS lac_past ON log_vpast.idaction_url = lac_past.idaction
			WHERE
				log_conversion.server_time >= '2023-12-14 23:00:00'
				AND log_conversion.server_time <= '2023-12-15 22:59:59'
				AND log_conversion.idsite IN ('9')AND log_conversion.idgoal = 2 
                          AND log_vpast.server_time <= log_conversion.server_time
                          AND lac_past.type = 1
			GROUP BY
				log_conversion.idvisit
          ) AS r
          LEFT JOIN matomo_log_conversion lvcon ON lvcon.idgoal = 2 AND lvcon.idvisit = r.idvisit
          RIGHT JOIN matomo_log_link_visit_action logv USE INDEX (index_idvisit) ON logv.idvisit = r.idvisit
          LEFT JOIN matomo_log_action lac ON logv.idaction_url = lac.idaction
          WHERE logv.server_time >= '2023-12-14 23:00:00'
            AND logv.server_time <= '2023-12-15 22:59:59'
            AND logv.idsite IN (9) 
            AND lac.type = 1
            AND logv.server_time <= lvcon.server_time
          ) AS yyy
        GROUP BY yyy.idaction
        ORDER BY `9` DESC

Resulting in a runtime of 0.909sec, compared to like 14 days of runtime without the optimizer hints.

@sgiehl sgiehl added the Needs Review PRs that need a code review label Dec 18, 2023
Copy link
Contributor

This issue is in "needs review" but there has been no activity for 7 days. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale The label used by the Close Stale Issues action label Dec 26, 2023
@michalkleiner michalkleiner removed the Stale The label used by the Close Stale Issues action label Dec 27, 2023
Copy link
Contributor

github-actions bot commented Jan 4, 2024

This issue is in "needs review" but there has been no activity for 7 days. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale The label used by the Close Stale Issues action label Jan 4, 2024
@michalkleiner michalkleiner removed the Stale The label used by the Close Stale Issues action label Jan 21, 2024
Copy link
Contributor

This issue is in "needs review" but there has been no activity for 7 days. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale The label used by the Close Stale Issues action label Jan 29, 2024
Copy link
Contributor

This PR was last updated more than one month ago, maybe it's time to close it. Please check if there is anything we still can do or close this PR. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale for long The label used by the Close Stale Issues action label Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review Stale for long The label used by the Close Stale Issues action Stale The label used by the Close Stale Issues action
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants