Skip to content

Commit

Permalink
refs #1700 basic performance analytics: handle server generation time…
Browse files Browse the repository at this point in the history
… for each page and page title

CORE

 * formatting sub-second times
 * getColumn() method on data table array (in order to behave the same as the regular data table class)
 * data tables can store in their meta data, which columns are empty (this is used in order to dynamically hide the new "generation time" column)
 * ViewDataTable and Api.getProcessedReport act according to the empty column meta data


SCHEMA

 * new column custom_float_1 in log_link_visit_action
 * new version to apply the change


TRACKER

 * Piwik_Tracker::setGenerationTime
 * tracking parameter "generation_time_me"
 * value is stored in new custom_float_1 column
 * the log importer can handle a group "generation_time_micro" which can be used in a custom log format. _micro is used because apache can log the time in microseconds but piwik processes milliseconds.
 * note: extension of JS tracker still missing


ACTIONS PLUGIN

 * for pages and page titles, add new columns sum_time_generation and nb_hits_with_time_generation to the blob archives
 * if they are set, compute avg_time_generation on the API level. if not, remove the columns and mark them as empty in the data table meta data.
 * show new column "avg. generation time" in the pages and page titles reports


plus TESTS for everything
  • Loading branch information
timo-bes committed Mar 26, 2013
1 parent fc578d6 commit 65458d0
Show file tree
Hide file tree
Showing 83 changed files with 1,113 additions and 23 deletions.
6 changes: 6 additions & 0 deletions core/Archive.php
Expand Up @@ -77,6 +77,10 @@ abstract class Piwik_Archive
// Site Search
const INDEX_SITE_SEARCH_HAS_NO_RESULT = 28;
const INDEX_PAGE_IS_FOLLOWING_SITE_SEARCH_NB_HITS = 29;

// Performance Analytics
const INDEX_PAGE_SUM_TIME_GENERATION = 30;
const INDEX_PAGE_NB_HITS_WITH_TIME_GENERATION = 31;

// Goal reports
const INDEX_GOAL_NB_CONVERSIONS = 1;
Expand Down Expand Up @@ -105,6 +109,8 @@ abstract class Piwik_Archive
// Actions metrics
Piwik_Archive::INDEX_PAGE_NB_HITS => 'nb_hits',
Piwik_Archive::INDEX_PAGE_SUM_TIME_SPENT => 'sum_time_spent',
Piwik_Archive::INDEX_PAGE_SUM_TIME_GENERATION => 'sum_time_generation',
Piwik_Archive::INDEX_PAGE_NB_HITS_WITH_TIME_GENERATION => 'nb_hits_with_time_generation',

Piwik_Archive::INDEX_PAGE_EXIT_NB_UNIQ_VISITORS => 'exit_nb_uniq_visitors',
Piwik_Archive::INDEX_PAGE_EXIT_NB_VISITS => 'exit_nb_visits',
Expand Down
2 changes: 2 additions & 0 deletions core/DataTable.php
Expand Up @@ -140,6 +140,8 @@ class Piwik_DataTable
/** Name for metadata that describes when a report was archived. */
const ARCHIVED_DATE_METADATA_NAME = 'archived_date';
const MAX_DEPTH_DEFAULT = 15;
/** Name for metadata that describes which columns are empty and should not be shown. */
const EMPTY_COLUMNS_METADATA_NAME = 'empty_column';

/**
* Maximum nesting level.
Expand Down
19 changes: 19 additions & 0 deletions core/DataTable/Array.php
Expand Up @@ -230,6 +230,25 @@ public function deleteColumn($column)
$table->deleteColumn($column);
}
}

/**
* Returns the array containing all rows values in all data tables for the requested column
*
* @param string $name
* @return array
*/
public function getColumn( $name )
{
$values = array();
foreach($this->array as $table)
{
$moreValues = $table->getColumn($name);
foreach ($moreValues as &$value) {
$values[] = $value;
}
}
return $values;
}

/**
* Merges the rows of every child DataTable into a new DataTable and
Expand Down
1 change: 1 addition & 0 deletions core/Db/Schema/Myisam.php
Expand Up @@ -330,6 +330,7 @@ class_name VARCHAR(255) NULL,
custom_var_v4 VARCHAR(200) DEFAULT NULL,
custom_var_k5 VARCHAR(200) DEFAULT NULL,
custom_var_v5 VARCHAR(200) DEFAULT NULL,
custom_float_1 FLOAT NULL DEFAULT NULL,

This comment has been minimized.

Copy link
@halfdan

halfdan Mar 26, 2013

Member

That feels so dirty :(

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 27, 2013

Author Member

Matt and I discussed this for almost an hour... We can't use CustomVariables becuase ECommerce Tracking already uses four of them. So we had to add a new column. We picked a generic name so that we can reuse it for event tracking at a later point.
I know it's not ideal, but it's consistent with custom_var_xx and the way ECommerce Tracking and SiteSearch work.
Do you have a better idea?

This comment has been minimized.

Copy link
@halfdan

halfdan Mar 27, 2013

Member

I don't have a better idea, but:

  • In core/Tracker/Action you statically set the name of the column: const DB_COLUMN_TIME_GENERATION = 'custom_float_1'; - so the column is certainly not custom and should rather be named for what it is: generationTime.
  • The current implementation of custom variables is not flexible (adding more custom variables will require 2*(custom variables) new fields in the database - people are even asking for ~50 custom variables), we should not create more clutter with custom_float_n.

Don't take the criticism personally, it's more about how we handle these custom things in general that I dislike.

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 27, 2013

Author Member

To be honest, I dislike the way these things are handled, too. As I said, we aimed for a consistent solution and it's called custom_float because we will reuse the field for other stuff when the record is a different action type. For event tracking, the value will be set by the user, so custom is not too wrong...

This comment has been minimized.

Copy link
@halfdan

halfdan Mar 27, 2013

Member

Did someone test how much of a performance impact it would be to just create a new table and put custom data in there?

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 27, 2013

Author Member

I don't think so. This layout was added with custom variables and we kept it since.
The thing is that there's no point in doing it differently for performance tracking. We would have to refactor custom variables, site search and ecommerce as well to have a consistent code base.
So this feature probably is the wrong place to discuss it. It would be better to have a separate ticket to investigate and refactor the log layout.

This comment has been minimized.

Copy link
@halfdan

halfdan Mar 27, 2013

Member

I'll create a ticket once my trac login works again.

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 27, 2013

Author Member

Great, thanks. Please post the ticket number here.
It would be awesome if we could find a better solution...

This comment has been minimized.

Copy link
@mattab

mattab Mar 28, 2013

Member

custom_float_1 renamed to custom_float (we talked about it on the phone btw)

and the solution becomes beautiful hehe.

Note: it will be perfect to support Event tracking later, so we can have an event value (eg. "Time spent watching this video") and automatically summarize this numeric values (eg. build histogram, show min/max, avg, etc.)

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 28, 2013

Author Member

Didn't we talk about measuring multiple performance metrics in the future? What are we going to call the column for DOM render time? custom_float_2? Then we have custom_float and custom_float_2...

PRIMARY KEY(idlink_va),
INDEX index_idvisit(idvisit),
INDEX index_idsite_servertime ( idsite, server_time )
Expand Down
14 changes: 10 additions & 4 deletions core/Piwik.php
Expand Up @@ -1537,19 +1537,25 @@ static public function getPrettySizeFromBytes( $size, $unit = null, $precision =
* @param int $numberOfSeconds
* @param bool $displayTimeAsSentence If set to true, will output "5min 17s", if false "00:05:17"
* @param bool $isHtml
* @param bool $round to the full seconds
* @return string
*/
static public function getPrettyTimeFromSeconds($numberOfSeconds, $displayTimeAsSentence = true, $isHtml = true)
static public function getPrettyTimeFromSeconds($numberOfSeconds, $displayTimeAsSentence = true, $isHtml = true, $round = false)
{
$numberOfSeconds = (int)$numberOfSeconds;
$numberOfSeconds = $round ? (int)$numberOfSeconds : (float)$numberOfSeconds;

// Display 01:45:17 time format
if($displayTimeAsSentence === false)
{
$hours = floor( $numberOfSeconds / 3600);
$minutes = floor( ($reminder = ($numberOfSeconds - $hours * 3600)) / 60 );
$seconds = $reminder - $minutes * 60;
return sprintf("%02s", $hours) . ':' . sprintf("%02s", $minutes) .':'. sprintf("%02s", $seconds);
$seconds = floor( $reminder - $minutes * 60 );
$time = sprintf("%02s", $hours) . ':' . sprintf("%02s", $minutes) .':'. sprintf("%02s", $seconds);
$milliSeconds = ($numberOfSeconds * 1000) % 1000;
if ($milliSeconds) {
$time .= '.' . $milliSeconds;
}
return $time;
}
$secondsInYear = 86400 * 365.25;
$years = floor($numberOfSeconds / $secondsInYear);
Expand Down
26 changes: 24 additions & 2 deletions core/Tracker/Action.php
Expand Up @@ -61,6 +61,8 @@ class Piwik_Tracker_Action implements Piwik_Tracker_Action_Interface
private $searchCategory = false;
private $searchCount = false;

private $timeGeneration = false;

/**
* Encoding of HTML page being viewed. See reencodeParameters for more info.
*
Expand All @@ -80,6 +82,10 @@ class Piwik_Tracker_Action implements Piwik_Tracker_Action_Interface
const PARAMETER_NAME_SEARCH_COUNT = 'search_count';
const PARAMETER_NAME_SEARCH_CATEGORY = 'search_cat';
const PARAMETER_NAME_SEARCH_KEYWORD = 'search';

/* Custom Variables names & slots plus Tracking API Parameters for performance analytics */
const DB_COLUMN_TIME_GENERATION = 'custom_float_1';
const PARAMETER_NAME_TIME_GENERATION = 'generation_time_ms';

/**
* Map URL prefixes to integers.
Expand Down Expand Up @@ -408,7 +414,8 @@ public static function getQueryParametersToExclude($idSite)

$parametersToExclude = array_merge($excludedParameters,
self::$queryParametersToExclude,
$campaignTrackingParameters);
$campaignTrackingParameters,
array(self::PARAMETER_NAME_TIME_GENERATION));

$parametersToExclude = array_map('strtolower', $parametersToExclude);
return $parametersToExclude;
Expand Down Expand Up @@ -701,6 +708,11 @@ public function record( $idVisit, $visitorIdCookie, $idRefererActionUrl, $idRefe
'idaction_name_ref' => $idRefererActionName,
'time_spent_ref_action' => $timeSpentRefererAction
);

if (!empty($this->timeGeneration)) {
$insert[self::DB_COLUMN_TIME_GENERATION] = (int)$this->timeGeneration;
}

$customVariables = $this->getCustomVariables();

$insert = array_merge($insert, $customVariables);
Expand Down Expand Up @@ -744,7 +756,7 @@ public function getCustomVariables()
{
$customVariables = Piwik_Tracker_Visit::getCustomVariables($scope = 'page', $this->request);

// Enrich Site Search actions with Custom Variables, overwritting existing values
// Enrich Site Search actions with Custom Variables, overwriting existing values
if (!empty($this->searchCategory)) {
if (!empty($customVariables['custom_var_k' . self::CVAR_INDEX_SEARCH_CATEGORY])) {
printDebug("WARNING: Overwriting existing Custom Variable in slot " . self::CVAR_INDEX_SEARCH_CATEGORY . " for this page view");
Expand Down Expand Up @@ -851,6 +863,8 @@ protected function extractUrlAndActionNameFromRequest()
$actionType = self::TYPE_SITE_SEARCH;
list($actionName, $url) = $siteSearch;
}
// Look for performance analytics parameters
$this->detectPerformanceAnalyticsParameters();
}
$actionName = self::cleanupString($actionName);

Expand Down Expand Up @@ -1026,6 +1040,14 @@ protected function detectSiteSearchFromUrl($website, $parsedUrl)
$categoryName = trim(urldecode($categoryName));
return array($url, $actionName, $categoryName, $count);
}

protected function detectPerformanceAnalyticsParameters()
{
$generationTime = Piwik_Common::getRequestVar(self::PARAMETER_NAME_TIME_GENERATION, -1, 'int', $this->request);
if ($generationTime > 0) {
$this->timeGeneration = $generationTime;
}
}

/**
* Clean up string contents (filter, truncate, ...)
Expand Down
35 changes: 35 additions & 0 deletions core/Updates/1.12-b1.php
@@ -0,0 +1,35 @@
<?php
/**
* Piwik - Open source web analytics
*
* @link http://piwik.org
* @license http://www.gnu.org/licenses/gpl-3.0.html GPL v3 or later
*
* @category Piwik
* @package Updates
*/

/**
* @package Updates
*/
class Piwik_Updates_1_12_b1 extends Piwik_Updates
{
static function isMajorUpdate()
{
return true;
}

static function getSql($schema = 'Myisam')
{
return array(
'ALTER TABLE `'. Piwik_Common::prefixTable('log_link_visit_action') .'`
ADD `custom_float_1` FLOAT NULL DEFAULT NULL' => false
);
}

static function update()
{
Piwik_Updater::updateDatabase(__FILE__, self::getSql());
}

}
2 changes: 1 addition & 1 deletion core/Version.php
Expand Up @@ -20,5 +20,5 @@ final class Piwik_Version
* Current Piwik version
* @var string
*/
const VERSION = '1.11.1';
const VERSION = '1.12-b1';
}
16 changes: 16 additions & 0 deletions core/ViewDataTable.php
Expand Up @@ -1262,6 +1262,7 @@ public function setColumnsToDisplay( $columnsNames )
/**
* Returns columns names to display, in order.
* If no columns were specified to be displayed, return all columns found in the first row.
* If the data table has empty_columns meta data set, those columns will be removed.
* @param array PHP array conversion of the data table
* @return array
*/
Expand All @@ -1279,6 +1280,21 @@ public function getColumnsToDisplay()
}

$this->columnsToDisplay = array_filter($this->columnsToDisplay);

$emptyColumns = $this->dataTable->getMetadata(Piwik_DataTable::EMPTY_COLUMNS_METADATA_NAME);

This comment has been minimized.

Copy link
@peterbo

peterbo Mar 26, 2013

Contributor

Hi Timo, I get a "Fatal error: Call to undefined method Piwik_DataTable_Array::getMetadata() in /var/www/vhosts/media-proweb.de/httpdocs/statistik/core/ViewDataTable.php on line 1284" when generating the sparklines / graphs on this one.

This comment has been minimized.

Copy link
@timo-bes

timo-bes Mar 27, 2013

Author Member

Should be fixed now. Thanks, @peterbo.
@mattab Here's a problem with not having an interface for DataTable and DataTable_Array that the tests didn't catch... Are you sure you don't want to add to the 2.0 refactoring list to work with an interface for the two classes?

if (is_array($emptyColumns))
{
foreach ($emptyColumns as $emptyColumn)
{
$key = array_search($emptyColumn, $this->columnsToDisplay);
if ($key !== false)
{
unset($this->columnsToDisplay[$key]);
}
}
$this->columnsToDisplay = array_values($this->columnsToDisplay);
}

return $this->columnsToDisplay;
}

Expand Down
3 changes: 3 additions & 0 deletions lang/en.php
Expand Up @@ -180,6 +180,9 @@
'General_ColumnAverageTimeOnPage' => 'Avg. time on page',
'General_TimeOnPage' => 'Time on page',
'General_ColumnAverageTimeOnPageDocumentation' => 'The average amount of time visitors spent on this page (only the page, not the entire website).',
'General_ColumnGenerationTime' => 'Generation time',
'General_ColumnAverageGenerationTime' => 'Avg. generation time',
'General_ColumnAverageGenerationTimeDocumentation' => 'The average time it took to generate the page. The value has either been passed to the Java Script tracker by your website or imported from the logs.',
'General_ColumnValuePerVisit' => 'Revenue per Visit',
'General_ColumnVisitsWithConversions' => 'Visits with Conversions',
'General_VisitsWith' => 'Visits with %s',
Expand Down
16 changes: 14 additions & 2 deletions libs/PiwikTracker/PiwikTracker.php
Expand Up @@ -85,6 +85,7 @@ function __construct( $idSite, $apiUrl = '' )
$this->attributionInfo = false;
$this->ecommerceLastOrderTimestamp = false;
$this->ecommerceItems = array();
$this->generationTime = false;

$this->requestCookie = '';
$this->idSite = $idSite;
Expand Down Expand Up @@ -138,8 +139,18 @@ public function setUrlReferrer( $url )
{
$this->urlReferrer = $url;
}

/**

/**
* Sets the time that generating the document on the server side took.
*
* @param int $timeMs Generation time in ms
*/
public function setGenerationTime( $timeMs )
{
$this->generationTime = $timeMs;
}

/**
* @deprecated
* @ignore
*/
Expand Down Expand Up @@ -1095,6 +1106,7 @@ protected function getRequest( $idSite )
(!empty($this->customData) ? '&data=' . $this->customData : '') .
(!empty($this->visitorCustomVar) ? '&_cvar=' . urlencode(json_encode($this->visitorCustomVar)) : '') .
(!empty($this->pageCustomVar) ? '&cvar=' . urlencode(json_encode($this->pageCustomVar)) : '') .
(!empty($this->generationTime) ? '&generation_time_ms=' . ((int)$this->generationTime) : '') .

// URL parameters
'&url=' . urlencode($this->pageUrl) .
Expand Down
7 changes: 7 additions & 0 deletions misc/log-analytics/import_logs.py
Expand Up @@ -1153,6 +1153,8 @@ def _get_hit_args(self, hit):
urllib.quote(args['url'], ''),
("/From = %s" % urllib.quote(args['urlref'], '') if args['urlref'] != '' else '')
)
if hit.generation_time_milli > 0:
args['generation_time_ms'] = hit.generation_time_milli
return args

def _record_hits(self, hits):
Expand Down Expand Up @@ -1435,6 +1437,11 @@ def invalid_line(line, reason):
except (ValueError, IndexError):
# Some lines or formats don't have a length (e.g. 304 redirects, IIS logs)
hit.length = 0

try:
hit.generation_time_milli = int(match.group('generation_time_micro')) / 1000
except IndexError:
hit.generation_time_milli = 0

if config.options.log_hostname:
hit.host = config.options.log_hostname
Expand Down
46 changes: 44 additions & 2 deletions plugins/API/API.php
Expand Up @@ -870,6 +870,8 @@ private function handleTableReport($idSite, $dataTable, &$reportMetadata, $hasDi
// Process each Piwik_DataTable_Simple entry
foreach($dataTable->getArray() as $label => $simpleDataTable)
{
$this->removeEmptyColumns($columns, $reportMetadata, $simpleDataTable);

list($enhancedSimpleDataTable, $rowMetadata) = $this->handleSimpleDataTable($idSite, $simpleDataTable, $columns, $hasDimension, $showRawMetrics);
$enhancedSimpleDataTable->metadata = $simpleDataTable->metadata;

Expand All @@ -880,6 +882,7 @@ private function handleTableReport($idSite, $dataTable, &$reportMetadata, $hasDi
}
else
{
$this->removeEmptyColumns($columns, $reportMetadata, $dataTable);
list($newReport, $rowsMetadata) = $this->handleSimpleDataTable($idSite, $dataTable, $columns, $hasDimension, $showRawMetrics);
}

Expand All @@ -890,15 +893,42 @@ private function handleTableReport($idSite, $dataTable, &$reportMetadata, $hasDi
);
}

/**
/**
* Removes metrics from the list of columns and the report meta data if they are marked empty
* in the data table meta data.
*/
private function removeEmptyColumns( &$columns, &$reportMetadata, $dataTable )
{
$emptyColumns = $dataTable->getMetadata(Piwik_DataTable::EMPTY_COLUMNS_METADATA_NAME);

if (!is_array($emptyColumns))
{
return;
}

$columns = $this->hideShowMetrics($columns, $emptyColumns);

if (isset($reportMetadata['metrics']))
{
$reportMetadata['metrics'] = $this->hideShowMetrics($reportMetadata['metrics'], $emptyColumns);
}

if (isset($reportMetadata['metricsDocumentation']))
{
$reportMetadata['metricsDocumentation'] = $this->hideShowMetrics($reportMetadata['metricsDocumentation'], $emptyColumns);
}
}

/**
* Removes column names from an array based on the values in the hideColumns,
* showColumns query parameters. This is a hack that provides the ColumnDelete
* filter functionality in processed reports.
*
* @param array $columns List of metrics shown in a processed report.
* @param array $emptyColumns Empty columns from the data table meta data.
* @return array Filtered list of metrics.
*/
private function hideShowMetrics( $columns )
private function hideShowMetrics( $columns, $emptyColumns = array() )
{
if (!is_array($columns))
{
Expand Down Expand Up @@ -937,6 +967,18 @@ private function hideShowMetrics( $columns )
}
}
}

// remove empty columns
if (is_array($emptyColumns))
{
foreach ($emptyColumns as $column)
{
if (isset($columns[$column]))
{
unset($columns[$column]);
}
}
}

return $columns;
}
Expand Down

0 comments on commit 65458d0

Please sign in to comment.