Massive growth of archive data after update to 3.12.0 #15145
Comments
There is another data point. Just before the update, somebody might have set up a new custom date period (appearing as
Is it possible that the increase in archive size is related to the new custom date period?
Nope. But there seems to be a jump from
From the logs again: after the update to 3.12.0, it looks like old reports are invalidated and recreated on every cron run, using the default 52-day period.
It looks like the archived data is invalidated on every cron run starting with 3.12.0:
This did not happen before. Also, I do not quite understand why the invalidation triggers archival of
This might be related to #14639. @diosmosis ping: do you have an idea on how to proceed with troubleshooting?
Stepping through the debugger, I find the following in CronArchive.php#L870:

    // when some data was purged from this website
    // we make sure we query all previous days/weeks/months
    $processDaysSince = $lastTimestampWebsiteProcessedDay;
    if ($websiteInvalidatedShouldReprocess
        // when --force-all-websites option,
        // also forces to archive last52 days to be safe
        || $this->shouldArchiveAllSites) {
        $processDaysSince = false;
    }
    $date = $this->getApiDateParameter($idSite, "day", $processDaysSince);

Since

If I understand things correctly, then site invalidation always was inefficient, but #14639 just revealed that by making it the default behavior.
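To make the effect of the quoted branch concrete, here is a minimal, self-contained sketch of the decision it makes. This is not Matomo's code: `chooseProcessDaysSince()` and `describeDateParameter()` are hypothetical helpers, and the `last52` fallback is an assumption based on the "last52 days" comment in the quoted source and the 52-day period seen in the logs. How `getApiDateParameter()` really computes the date is not shown in this thread.

```php
<?php
// Hypothetical, simplified model of the quoted CronArchive branch.
// Not Matomo's actual implementation.

// Assumption: fallback window taken from the "last52 days" source comment.
const ASSUMED_DEFAULT_LAST_N = 52;

/**
 * Mirrors the quoted branch: returns false ("no known starting point")
 * when the site was invalidated or all sites are forced, otherwise the
 * timestamp of the last processed day.
 */
function chooseProcessDaysSince($lastTimestampProcessedDay, bool $invalidatedShouldReprocess, bool $shouldArchiveAllSites)
{
    if ($invalidatedShouldReprocess || $shouldArchiveAllSites) {
        return false;
    }
    return $lastTimestampProcessedDay;
}

/**
 * Hypothetical stand-in for getApiDateParameter(): without a starting
 * point it falls back to the full default window; otherwise it only
 * covers the days elapsed since the last run (capped at the default).
 */
function describeDateParameter($processDaysSince, int $now): string
{
    if ($processDaysSince === false) {
        return 'last' . ASSUMED_DEFAULT_LAST_N;
    }
    $elapsedDays = (int) ceil(($now - $processDaysSince) / 86400);
    return 'last' . max(1, min(ASSUMED_DEFAULT_LAST_N, $elapsedDays));
}

// Illustrative timestamps only: the previous run finished one hour ago.
$now = 1573600000;
$oneHourAgo = $now - 3600;

// Normal run: only the most recent day needs to be re-archived.
$normal = describeDateParameter(chooseProcessDaysSince($oneHourAgo, false, false), $now);

// Invalidated site: the whole default window is re-archived.
$invalidated = describeDateParameter(chooseProcessDaysSince($oneHourAgo, true, false), $now);
```

Under these assumptions, a site that is flagged as invalidated on every cron run would request the full window each time instead of a single day, which would match the repeated re-archiving visible in the logs.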
After the update from 3.11.0 to 3.12.0, the space requirements of archived data grew massively. In particular, the monthly numeric and blob archive rows multiplied.
Also, the number of rows generated for daily archives is skyrocketing:
The update was performed on November 7th in the evening.
From the changelog I gather that archiving was improved a lot since 3.11.0. Which aspect of the archiving process could possibly trigger such a massive increase in space requirements?