Computation load after adding new segments on long existing Piwik instance #6638

Closed
mgazdzik opened this Issue Nov 11, 2014 · 4 comments

mgazdzik commented Nov 11, 2014

The current archiving flow can cause a number of problems when archiving segments on instances that are two to four years old. During the normal cron archiving flow, only the last 2 years are ever processed. Adding new segment(s) can cause the following two problems for the archiving process:

  • If archiving ever falls back to computing last3, last4 (anything bigger than last2) for the year period, it can trigger processing of the days and months of that third year. This causes a huge increase in archiving time. In addition, the more segments exist on the instance, the more additional computation is needed to complete the last3 archiving.
  • On high-traffic instances, adding new segments can also be troublesome, because Piwik will try to process the last 2 days, weeks, months and years. Given a batch of 50 segments, such 'catching up' takes significantly more time. A workaround is to add only a couple of segments at a time, but this can be troublesome when there are many Piwik admins.
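To give a rough feel for the scale of the two problems above, here is a back-of-the-envelope sketch in Python. It only counts how many day, week, month and year periods touch a date range and multiplies that by the number of segments. The function names are made up for illustration, and the model assumes one archive per period per segment, which is a simplification and not necessarily how Piwik schedules its work.

```python
from datetime import date, timedelta


def count_period_archives(start: date, end: date) -> dict:
    """Count the distinct day, week, month and year periods touching [start, end].

    Weeks are counted as ISO weeks (assumption made for this sketch).
    """
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return {
        "day": n_days,
        "week": len({tuple(d.isocalendar())[:2] for d in days}),
        "month": len({(d.year, d.month) for d in days}),
        "year": len({d.year for d in days}),
    }


def total_archives(counts: dict, n_segments: int) -> int:
    """Total archives if every period is computed once per segment (simplified model)."""
    return sum(counts.values()) * n_segments


if __name__ == "__main__":
    # One extra year pulled in by a last3 fallback, with a batch of 50 segments.
    extra_year = count_period_archives(date(2013, 1, 1), date(2013, 12, 31))
    print(extra_year, total_archives(extra_year, 50))
```

Even under this simplified model, the cost grows linearly with both the number of segments and the number of extra periods, which is why one additional year on a 50-segment instance is so expensive.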

The goal of this ticket is to decide on the best approach to this issue and, hopefully, plan the implementation of an improvement.

mgazdzik commented Nov 11, 2014

One idea to handle those issues is:

  • Define a single unit of new-segment processing (for example, one month back including its days and weeks, repeated all the way back to the ts_created of the website). With that in place, we can set a config variable to process only a certain number of units per segment in a single archiving run. That way we can limit the amount of computation in a single archiving run, split the whole process into several lighter runs, and keep providing current statistics at the same time.
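
The idea above could look roughly like the following Python sketch. It treats one calendar month as the unit, walks back from the current month to the month of ts_created, and hands out at most a configured number of pending units per run. The names `month_units`, `next_batch` and `max_units_per_run` are hypothetical and not existing Piwik settings or functions; how progress is persisted between runs is left out.

```python
from datetime import date
from typing import Iterator, List, Set, Tuple

Unit = Tuple[int, int]  # (year, month)


def month_units(ts_created: date, today: date) -> Iterator[Unit]:
    """Yield month units from the current month back to the site's creation month."""
    year, month = today.year, today.month
    while (year, month) >= (ts_created.year, ts_created.month):
        yield (year, month)
        month -= 1
        if month == 0:
            year, month = year - 1, 12


def next_batch(ts_created: date, today: date, done: Set[Unit],
               max_units_per_run: int) -> List[Unit]:
    """Return the most recent units not yet processed, capped per archiving run."""
    pending = [u for u in month_units(ts_created, today) if u not in done]
    return pending[:max_units_per_run]


if __name__ == "__main__":
    done: Set[Unit] = set()
    created, now = date(2012, 3, 20), date(2014, 11, 11)
    # Each loop iteration stands in for one cron archiving run of one segment.
    while batch := next_batch(created, now, done, max_units_per_run=6):
        print("processing", batch)
        done.update(batch)
```

Processing the newest units first means recent reports for the new segment become available early, while older history is filled in over later runs.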

mattab commented Nov 15, 2014

Thanks for the suggestion! We'll investigate this in the 2.11.0 sprint, once we have #5363.

RMastop commented Dec 1, 2014

A segment does not necessarily need to be archived since the beginning of time. I would suggest adding an extra set of properties: a start-date and an end-date for the segment. This way, the archiving would not need to calculate all historic data.
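
As a minimal sketch of that suggestion, assuming a segment simply carries two optional dates, the archiver could clamp the range it processes as below. The `Segment` class, its `start_date` and `end_date` fields and the `archive_range` helper are hypothetical and do not reflect Piwik's actual segment model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Segment:
    definition: str
    start_date: Optional[date] = None  # hypothetical: None means "since site creation"
    end_date: Optional[date] = None    # hypothetical: None means "still active"


def archive_range(segment: Segment, site_created: date,
                  today: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates to archive, or None if nothing needs archiving."""
    start = max(segment.start_date or site_created, site_created)
    end = min(segment.end_date or today, today)
    return (start, end) if start <= end else None


if __name__ == "__main__":
    campaign = Segment("referrerType==campaign", start_date=date(2014, 10, 1))
    print(archive_range(campaign, site_created=date(2011, 5, 1), today=date(2014, 12, 1)))
```

With a start date set, only the periods inside the returned range would need to be computed, instead of the full history of the website.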

@mattab mattab changed the title from "[RFC] Computation load after adding new segments on long existing Piwik instance" to "Computation load after adding new segments on long existing Piwik instance" Jan 6, 2015

@mattab mattab added the RFC label Jan 6, 2015

@mattab mattab modified the milestones: Piwik 2.11.0, Piwik 2.12.0 Feb 6, 2015

mattab commented Mar 9, 2015

The first step to solve this issue will be #7223.

If that issue does not solve the problem altogether, let's discuss again what solution we could put in place 👍

@mattab mattab closed this Mar 9, 2015
