Improve page conversion attribution performance with pre-calculated field #20375

Closed
bx80 opened this issue Feb 17, 2023 · 6 comments · Fixed by #20526
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.

Comments

@bx80
Contributor

bx80 commented Feb 17, 2023

Summary

Currently the archiving query that calculates conversion attribution for pages (implemented in #2030, revised in #19974) includes an expensive sub-query to calculate the number of pages viewed before conversion.

To improve performance and remove the need for this sub-query, we can instead calculate the 'number of pages viewed before' value for each conversion at the time the conversion record is created and store it in the log_conversion table in a new unsigned smallint column (max value 65,535).
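As a rough illustration of calculating the value when the conversion is recorded, here is a minimal Python sketch. The function name, the timestamp inputs, and the choice to count page views up to and including the conversion time are assumptions for illustration, not Matomo's tracker code. The cap reflects the unsigned smallint limit mentioned above.

```python
from typing import Sequence

# Upper bound of an unsigned smallint column, as stated in the proposal.
SMALLINT_UNSIGNED_MAX = 65_535


def pages_before_conversion(action_times: Sequence[float], conversion_time: float) -> int:
    """Count a visit's page views recorded up to the conversion time.

    Hypothetical helper: `action_times` holds the timestamps of the visit's
    page views. Whether the converting page itself is counted is an assumption.
    """
    count = sum(1 for t in action_times if t <= conversion_time)
    # Clamp so the value always fits in the new column.
    return min(count, SMALLINT_UNSIGNED_MAX)
```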

The query can then be adjusted to simply read the value from the log_conversion table instead of using a sub-query, which should have a positive effect on performance and temporary table usage.
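The difference between the two query shapes can be sketched with an in-memory SQLite database. The table layouts below are heavily simplified stand-ins for the real log tables, and the column name `pageviews_before` is an assumption; the actual archiving query is more involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the real log tables (assumed columns only).
CREATE TABLE log_link_visit_action (idvisit INTEGER, idaction INTEGER, server_time INTEGER);
CREATE TABLE log_conversion (idvisit INTEGER, idgoal INTEGER, server_time INTEGER,
                             pageviews_before INTEGER);
INSERT INTO log_link_visit_action VALUES (1, 10, 100), (1, 11, 110), (1, 12, 120), (1, 13, 200);
-- The value 3 is stored when the conversion row is written.
INSERT INTO log_conversion VALUES (1, 1, 150, 3);
""")

# Old shape: a correlated sub-query counts page views for every conversion row.
SUBQUERY_SQL = """
SELECT c.idvisit,
       (SELECT COUNT(*) FROM log_link_visit_action a
         WHERE a.idvisit = c.idvisit AND a.server_time <= c.server_time) AS pages_before
  FROM log_conversion c
"""

# New shape: the archiver just reads the pre-calculated column.
PRECALC_SQL = "SELECT idvisit, pageviews_before AS pages_before FROM log_conversion"

via_subquery = conn.execute(SUBQUERY_SQL).fetchall()
via_column = conn.execute(PRECALC_SQL).fetchall()
```

Both queries return the same rows here; the second avoids touching the action table at archiving time.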

Retrospectively populating the value of this new field for large existing datasets could be time-consuming, and it is unnecessary because archived data will already exist for historic time periods.

The migration that adds this field should calculate historic values only for the 'today' and 'yesterday' periods at the time of deployment, but only if there are fewer than 10,000 conversions in the 24hr period and only if the Matomo installation is not hosted on *.matomo.cloud.
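The two conditions on the migration backfill could be expressed as a small guard. This is only a sketch: the function name and the way the hostname is obtained are hypothetical, and the real migration would make this check inside Matomo's update code.

```python
# Threshold from the proposal: backfill only below this many conversions.
MAX_BACKFILL_CONVERSIONS = 10_000


def should_backfill(conversions_last_24h: int, hostname: str) -> bool:
    """Return True when the migration should populate recent historic values.

    Hypothetical helper: the caller is assumed to supply the conversion count
    for the 24hr period and the hostname of the installation.
    """
    on_cloud = hostname.lower().endswith(".matomo.cloud")
    return conversions_last_24h < MAX_BACKFILL_CONVERSIONS and not on_cloud
```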

To cover cases where historic archives are invalidated and goal page attribution prior to deployment needs to be recalculated, a new console command should be added to retrospectively calculate values, e.g.:
./console core:calc-conversion-pages --dates=2023-03-01,2023-04-01
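How the `--dates` value might be interpreted can be sketched as follows. The parsing rules are assumptions: the proposal does not say whether the end date is inclusive or which formats are accepted, so this sketch assumes two ISO dates separated by a comma and treats both ends as inclusive. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def parse_dates_option(value: str) -> Tuple[date, date]:
    """Split a 'YYYY-MM-DD,YYYY-MM-DD' string into a (start, end) pair."""
    start_text, end_text = value.split(",")
    start = date.fromisoformat(start_text.strip())
    end = date.fromisoformat(end_text.strip())
    if start > end:
        raise ValueError("start date must not be after end date")
    return start, end


def iter_days(start: date, end: date) -> Iterator[date]:
    """Yield each day from start to end (inclusive by assumption)."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)
```

Processing one day at a time would keep each recalculation batch bounded, which matters on instances with many conversions.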

These changes need to be released as part of Matomo 5.0.0.

Refs: L3-313 and L3-402

@bx80 bx80 added Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. 5.0.0 labels Feb 17, 2023
@bx80 bx80 added this to the 5.0.0 milestone Feb 17, 2023
@bx80 bx80 self-assigned this Feb 17, 2023
@sgiehl
Member

sgiehl commented Feb 17, 2023

Adding a new column to the log_conversion table could, though, take a long time on very big instances with millions of conversions being tracked.

@tsteur I guess we decided earlier to not include any possible bigger database changes in Matomo 5. Shall this be postponed to Matomo 6 then? Or maybe only be added for new instances and old instances not having the column would still use the old archiving?

@tsteur
Member

tsteur commented Feb 19, 2023

@sgiehl this one could be an exception, as it's only adding a new column on log_conversion, which is typically the smallest of all log tables, meaning it'll go faster. We could also roll this migration out before the Matomo 5 release, assuming Matomo 4 will still work when this column is already there but not used yet (which it usually would out of the box).

@tsteur
Member

tsteur commented Feb 19, 2023

Like, we could also already add it now in a Matomo 4.x release for new installs, which would also benefit some on-premise users and our trials, and would mean there would be far fewer accounts to migrate.

@sgiehl
Member

sgiehl commented Feb 20, 2023

Adding that with Matomo 4 for new installs would be possible, but would add a bit more complexity. If we do that, we should discuss internally how best to solve it before working on it, so we are able to easily change that in Matomo 5 again.

@tsteur
Member

tsteur commented Feb 20, 2023

I only meant adding the column itself for new installs in Matomo 4 already, but not using it. This would make it smoother for some people to upgrade to Matomo 5, and also on the Cloud.
Then in Matomo 5 there would be the actual usage of that field etc.

@sgiehl
Member

sgiehl commented Feb 20, 2023

@tsteur Ok. In that case we might need to reprioritize this part. We are currently planning to release a last minor release of Matomo 4 this week (but this could maybe be postponed to next week). If you think this should be included, please clarify this with @mattab, so it will be included in the current sprint.


3 participants