New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New diagnostic command 'diagnostics:analyze-archive-table' that analyzes archive tables #8625

Merged
merged 4 commits into from Aug 25, 2015

Conversation

Projects
None yet
3 participants
@diosmosis
Member

diosmosis commented Aug 22, 2015

I created a new command that will analyze a single archive tables to find the following information:

  • number of archives present
  • number of invalidated archives
  • number of temporary archives
  • number of error archives
  • number of segment archives
  • number of numeric rows
  • number of blob rows

This information can be used by users or by people maintaining other installs to see why some archive tables are bigger than others. Even if the table isn't bloated, sysadmins may not be aware of how Piwik is being used, and this command can give insights regarding, for example, if many segments are being used on certain months or if many archives result in an error, etc.

Sample output:

Statistics for the archive_numeric_2015_01 and archive_blob_2015_01 tables:

+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+
| Group                                     | # Archives | # Invalidated | # Temporary | # Error | # Segment | # Numeric Rows | # Blob Rows |
+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+
| day[2015-01-01 - 2015-01-01] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| month[2015-01-01 - 2015-01-31] idSite = 1 | 5          | 0             | 0           | 0       | 4         | 45             | 95          |
| day[2015-01-02 - 2015-01-02] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 27             | 63          |
| day[2015-01-03 - 2015-01-03] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 27             | 63          |
| day[2015-01-04 - 2015-01-04] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 43             | 79          |
| day[2015-01-05 - 2015-01-05] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 31             | 67          |
| week[2015-01-05 - 2015-01-11] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 45             | 95          |
| day[2015-01-06 - 2015-01-06] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 35             | 71          |
| day[2015-01-07 - 2015-01-07] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 31             | 67          |
| day[2015-01-08 - 2015-01-08] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 29             | 63          |
| day[2015-01-09 - 2015-01-09] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-10 - 2015-01-10] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-11 - 2015-01-11] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-12 - 2015-01-12] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| week[2015-01-12 - 2015-01-18] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-13 - 2015-01-13] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-14 - 2015-01-14] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-15 - 2015-01-15] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-16 - 2015-01-16] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-17 - 2015-01-17] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-18 - 2015-01-18] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-19 - 2015-01-19] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| week[2015-01-19 - 2015-01-25] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-20 - 2015-01-20] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-21 - 2015-01-21] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-22 - 2015-01-22] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-23 - 2015-01-23] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-24 - 2015-01-24] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-25 - 2015-01-25] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-26 - 2015-01-26] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| week[2015-01-26 - 2015-02-01] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-27 - 2015-01-27] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-28 - 2015-01-28] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-29 - 2015-01-29] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-30 - 2015-01-30] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-31 - 2015-01-31] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+

Total # Archives: 138
Total # Invalidated Archives: 0
Total # Temporary Archives: 0
Total # Error Archives: 0
Total # Segment Archives: 102

Possible improvements

  • Analyze only specific archives, eg, invalidated / temporary / segment archives.
  • Display number of rows per individual archive (would display one archive per row, should combine w/ above improvement).

TODO

  • Tests (1 integration for ArchiveTableDao + 1 system for command)
  • Switch from using tables_prefix to Db\Settings::getTablePrefix().
if (isset($result[$label])) {
$result[$label] = array_merge($result[$label], $blobStatsRow);
} else {
$result[$label] = $blobStatsRow + $numericQueryEmptyRow;

This comment has been minimized.

@tsteur

tsteur Aug 24, 2015

Member

Do you think it's worth adding a test for this edge case? or maybe there is already :) either way up to you...

@tsteur

tsteur Aug 24, 2015

Member

Do you think it's worth adding a test for this edge case? or maybe there is already :) either way up to you...

This comment has been minimized.

@tsteur

View changes

Show outdated Hide outdated core/DataAccess/ArchiveTableDao.php
@tsteur

View changes

Show outdated Hide outdated core/DataAccess/ArchiveTableDao.php

diosmosis added some commits Aug 22, 2015

Adding diagnostic command that allows you to see exactly what's insid…
…e an archive table. Useful when trying to find out why one table is using a lot of space or is using more space than others.
@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Aug 24, 2015

Member

Ready for review or merge.

Member

diosmosis commented Aug 24, 2015

Ready for review or merge.

tsteur added a commit that referenced this pull request Aug 25, 2015

Merge pull request #8625 from piwik/diagnostic_archive_cmd
Adding diagnostic command that analyzes archive tables

@tsteur tsteur merged commit dc681e6 into master Aug 25, 2015

1 of 3 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
continuous-integration/travis-ci/push The Travis CI build failed
Details
Scrutinizer 5 updated code elements
Details

@tsteur tsteur deleted the diagnostic_archive_cmd branch Aug 25, 2015

@mattab mattab changed the title from Adding diagnostic command that analyzes archive tables to New diagnostic command 'diagnostics:analyze-archive-table' that analyzes archive tables Oct 13, 2015

@mattab mattab added the Major label Oct 13, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment