Custom date range reports are slow: how to make them archive faster? #4768

Closed
mattab opened this Issue Feb 27, 2014 · 6 comments

@mattab
Member

commented Feb 27, 2014

The goal of this ticket is to discuss how we could improve the speed and efficiency of the custom date range report aggregation. Currently, archiving custom date ranges is slow.

@mattab

Member Author

commented Sep 14, 2014

Example from an email sent today to feedback@piwik.org: on the All Websites dashboard, when the default URL is module=MultiSites&action=index&idSite=2&period=range&date=last30, the user experiences slowness because the data is not pre-processed and has to be processed when the page is requested. The user said it takes forever to load with only 10 websites, and that it is almost faster to open each one... and this is one of many such reports.

Update Dec 2014: this was fixed in #6672

sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014

Fixes matomo-org#4768 Implement performance improvement for period=range: do not archive sub-tables (only the parent table).

The sub-tables will be archived only when idSubtable is found, or flat=1, or expanded=1
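
The rule in this commit message can be sketched roughly as follows. This is only an illustration in Python, not the actual Piwik code: the function name `should_archive_subtables` and the `params` dictionary are hypothetical, and it assumes that periods other than range keep archiving sub-tables as before.

```python
def should_archive_subtables(period: str, params: dict) -> bool:
    """Decide whether sub-tables need to be archived for a request.

    Hypothetical helper: `params` stands in for the request parameters.
    """
    # Assumption: only period=range skips sub-tables by default.
    if period != "range":
        return True
    # For ranges, sub-tables are needed only when the request asks for them.
    return (
        "idSubtable" in params
        or str(params.get("flat", "0")) == "1"
        or str(params.get("expanded", "0")) == "1"
    )
```

With this sketch, a plain range request archives only the parent table, while a range request carrying idSubtable, flat=1 or expanded=1 still gets its sub-tables.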

@mattab mattab changed the title from "Custom date range reports should archive much faster" to "Custom date range reports are slow: how to make them archive faster?" Nov 11, 2014

@mattab mattab removed their assignment Nov 15, 2014

@mattab mattab modified the milestones: Mid term, Short term Dec 1, 2014

@mattab mattab modified the milestones: Short term, Mid term, Piwik 2.12.0 Feb 5, 2015

@mattab

Member Author

commented Mar 12, 2015

@tsteur this request for example is slow: http://demo.piwik.org/index.php?module=CoreHome&action=index&date=2013-03-05,2015-03-11&period=range&idSite=1#/module=Actions&action=menuGetPageUrls&date=2013-03-05,2015-03-11&period=range&idSite=1 - it took about 40 seconds to archive. Maybe we could make this kind of large date range much faster?

@tsteur

Member

commented Mar 12, 2015

This request takes 2.3 seconds when I request it... and it should be even faster once #7409 is merged

@tsteur

Member

commented Mar 16, 2015

One thing I noticed, and which took me a while to figure out, is that anyone who actually uses range dates should disable browser archiving. Otherwise it will always re-archive the last day, week, month or year, depending on the range. We might have to do this automatically (disable browser archiving for some subperiods if a range is used and an archive already exists).

Note: once we pre-archive range dates this can become a problem, as it would always pre-archive the last year / month / week / day because it will always be authorized to archive.

@tsteur

Member

commented Mar 19, 2015

A lot of improvements were made here. We have to decide next week how we want to continue with this problem. It might make sense to make further improvements when working on #7470 (refactoring the Archiver). One idea, for example, was to build only the requested record for the range. This is not easy to add to the current implementation of the archiver but would bring quite a bit of improvement.

Another idea could be to sometimes subtract periods when building a range. E.g. if today is 2015-03-19 and one fetches 2014-12-20,yesterday (yesterday will very often be the end date), we could fetch the yearly archive for 2015 and subtract the 2015-03-19 archive. The same applies if we have 10 months: instead of fetching 10 monthly archives, we could fetch 1 yearly archive and subtract 2 monthly archives. This is quite hard to implement though.
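
To make the 10-months case concrete, here is a minimal sketch in Python of how such a plan could be chosen and applied. The names `plan_month_range` and `combine` and the archive keys are hypothetical and not part of Piwik. It assumes whole months within a single year and metrics that can simply be added and subtracted (this would not hold for something like unique visitors).

```python
from collections import Counter


def plan_month_range(first_month: int, last_month: int) -> dict:
    """Pick the cheaper plan for a range of whole months in one year."""
    if not 1 <= first_month <= last_month <= 12:
        raise ValueError("expected 1 <= first_month <= last_month <= 12")
    wanted = list(range(first_month, last_month + 1))
    unwanted = [m for m in range(1, 13) if m not in wanted]
    # Cost = number of archives to read: N months vs. 1 year + the rest.
    if 1 + len(unwanted) < len(wanted):
        return {"add": ["year"],
                "subtract": [f"month-{m:02d}" for m in unwanted]}
    return {"add": [f"month-{m:02d}" for m in wanted], "subtract": []}


def combine(plan: dict, archives: dict) -> dict:
    """Apply a plan to archives holding additive metrics only."""
    total = Counter()
    for key in plan["add"]:
        total.update(archives[key])      # adds the counts
    for key in plan["subtract"]:
        total.subtract(archives[key])    # removes the counts
    return dict(total)
```

For January to October, `plan_month_range(1, 10)` reads three archives (the year, minus November and December) instead of ten. For a short range such as March to May, it falls back to adding the three monthly archives.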

The easiest solution to implement, and probably also the fastest, would be to fetch only the requested recordName, and only the requested first-level table or the requested subtable. I tested it and it is very fast and easy to implement. The problem is that it does not work with subtableIds. We would have to identify subtables by their labels, as we can generate the subtableIds only if we build the expanded table, and building the expanded table is again expensive in terms of time.
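
A rough sketch of that last idea, in Python rather than the archiver's own code: sum a single record across the days of the range, and address a subtable by its chain of labels instead of by a subtableId. The data layout, the record name `page_urls` and the function `aggregate_record` are all assumptions made for illustration.

```python
from collections import Counter


def aggregate_record(daily_archives, record_name, label_path=()):
    """Sum one record (or one of its subtables) across daily archives.

    Assumed layout per day:
      {record_name: {label: {"visits": int, "subtable": {label: {...}}}}}
    `label_path` walks down into subtables by label, not by id.
    """
    total = Counter()
    for archive in daily_archives:
        # Only the requested record is read; other records are ignored.
        table = archive.get(record_name, {})
        for label in label_path:
            table = table.get(label, {}).get("subtable", {})
        for label, row in table.items():
            total[label] += row["visits"]
    return dict(total)


day1 = {"page_urls": {
    "/blog": {"visits": 3, "subtable": {"/blog/a": {"visits": 2},
                                        "/blog/b": {"visits": 1}}},
    "/docs": {"visits": 1},
}}
day2 = {"page_urls": {
    "/blog": {"visits": 2, "subtable": {"/blog/a": {"visits": 2}}},
}}

top_level = aggregate_record([day1, day2], "page_urls")
blog_only = aggregate_record([day1, day2], "page_urls", ("/blog",))
```

Because labels are stable across days while ids are only known after the expanded table is built, the label path lets each day's subtable be found without that expensive step.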

@mattab

Member Author

commented Mar 30, 2015

Created follow up issues: (in order of importance)

  • When requesting Date Range or Custom Segment, only archive the requested record #7573
  • Date Range archiving: Only archive report for the requested idSubtable #7575
  • Make Date Range use the optimal number of periods by substracting periods #7574