Ondřej Košarko edited this page Oct 27, 2017 · 4 revisions

Configuration

See the piwik section in the configuration. There are three trackers (site_ids): oai, bitstreams and views (statistics.api). Bitstreams are the actual file downloads; they are logged by *BitstreamReader. Views are page views logged by the tracking code in the footer. These two are displayed to users either from the statistics menu or as a PDF report. OAI tracks access to the machine-to-machine OAI-PMH endpoint.

In piwik you have to create user(s) with an auth token that can read from and write to these sites, then paste the tokens into the configuration.

With 2017.04 there are two new configuration options:

access for dspace users

If you want to enable access to the piwik statistics, create a group called "statistics_viewers" and add the users that should have access (hint: you can add the authenticated/anonymous group).

Managing piwik database

Outlier in "Visits Over Time" (new bot or whatever)

While still on the web page, open the particular day. If you are lucky, the first page of the "Visitor Log" contains an entry with a lot of actions; open the visitor profile to get the ID and IP address. Otherwise use the SQL below.

details about piwik backend: https://developer.piwik.org/guides/persistence-and-the-mysql-backend

Connect with the credentials found in /var/www/config/config.ini.php:

mysql -u user -p'password'

mysql crash course:

show databases;
use piwik_db;
show tables;
show columns from piwik_log_link_visit_action;

Finding visitors with a lot of actions

-- hex(idvisitor) is what you see in the visitor profile; idvisitor itself is binary
-- get the top 15 ids from a site (4 is repository downloads in this case)
select hex(idvisitor),count(*) as count from piwik_log_link_visit_action where idsite=4 group by idvisitor order by count DESC LIMIT 15;
-- or, for a particular date or date range, use the server_time column, e.g.
select hex(idvisitor),count(*) as count from piwik_log_link_visit_action where idsite=4 and server_time between '2016-04-08 00:00:00' and '2016-04-08 23:59:59' group by idvisitor order by count DESC LIMIT 15;
-- or group by date
select hex(idvisitor),count(*) as count, year(server_time), month(server_time),day(server_time) from piwik_log_link_visit_action where idsite=4 group by idvisitor, year(server_time), month(server_time), day(server_time) order by count DESC LIMIT 15;
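If you run these lookups often, the query can be built from a script instead of being edited by hand. The following is a minimal sketch, not part of the original setup: the function name `top_visitors_query` is hypothetical, and the `%s` placeholders assume a Python DB-API MySQL driver that uses the "format" parameter style.

```python
from datetime import date, datetime, time


def top_visitors_query(idsite, day=None, limit=15):
    """Return (sql, params) listing the visitors with the most actions.

    Hypothetical helper; table and column names are taken from the
    queries above.
    """
    sql = (
        "SELECT HEX(idvisitor), COUNT(*) AS actions "
        "FROM piwik_log_link_visit_action WHERE idsite = %s"
    )
    params = [idsite]
    if day is not None:
        # Restrict to one calendar day using the server_time column.
        sql += " AND server_time BETWEEN %s AND %s"
        params += [
            datetime.combine(day, time.min),
            datetime.combine(day, time(23, 59, 59)),
        ]
    sql += " GROUP BY idvisitor ORDER BY actions DESC LIMIT %s"
    params.append(limit)
    return sql, tuple(params)


sql, params = top_visitors_query(4, date(2016, 4, 8))
```

The returned pair would then be passed to a cursor, for example `cursor.execute(sql, params)`, so that the values are never pasted into the SQL string by hand.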

deleting the records

delete from piwik_log_link_visit_action where idsite=4 and idvisitor=unhex('bd6611fbe712b84b') and server_time between '2016-04-08 00:00:00' and '2016-04-08 23:59:59';
-- also clean the log_visit table
delete from piwik_log_visit where idsite=4 and idvisitor=unhex('bd6611fbe712b84b');
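Because the delete cannot be undone, it may help to count the matching rows first. The sketch below only builds the statements; it does not connect to the database. The helper names are hypothetical, the 8-byte length check is an assumption based on the 16-character example ID above, and the `%s` placeholders again assume a DB-API MySQL driver.

```python
def visitor_id_bytes(hex_id):
    """Convert the hex ID shown in the visitor profile to its binary form."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 8:  # assumption: IDs are 8 bytes, as in the example above
        raise ValueError("expected a 16-character hex visitor ID")
    return raw


def delete_plan(idsite, hex_id, start, end):
    """Return [(sql, params), ...]: a count to review, then the delete."""
    where = "idsite = %s AND idvisitor = %s AND server_time BETWEEN %s AND %s"
    params = (idsite, visitor_id_bytes(hex_id), start, end)
    return [
        ("SELECT COUNT(*) FROM piwik_log_link_visit_action WHERE " + where, params),
        ("DELETE FROM piwik_log_link_visit_action WHERE " + where, params),
    ]


plan = delete_plan(4, "bd6611fbe712b84b",
                   "2016-04-08 00:00:00", "2016-04-08 23:59:59")
```

Running the first statement and checking that the count matches what the visitor log showed is a cheap sanity check before executing the second one.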

drop precomputed stats for the year_month you've touched

drop table piwik_archive_blob_2016_04;
drop table piwik_archive_numeric_2016_04;
Query OK, 0 rows affected (0.12 sec)
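When the deleted records span more than one month, every affected month has its own pair of archive tables. As a small sketch (the function name `archive_tables` and the default `piwik_` prefix are assumptions taken from the table names above), the list of tables for a date range can be derived like this:

```python
from datetime import date


def archive_tables(start, end, prefix="piwik_"):
    """List the archive tables for every month between start and end, inclusive."""
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        suffix = f"{year:04d}_{month:02d}"
        tables.append(f"{prefix}archive_blob_{suffix}")
        tables.append(f"{prefix}archive_numeric_{suffix}")
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return tables


for name in archive_tables(date(2016, 4, 8), date(2016, 4, 8)):
    print(f"drop table {name};")
```

For the single day used in the example this prints the same two drop statements shown above.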

recompute:

root@piwik:/var/www# ./console core:archive --force-all-periods=315576000 --force-date-last-n=20 --url=http://ufal.mff.cuni.cz/piwik/ --force-idsites=4
  1. --force-all-periods=[seconds] limits archiving to websites with some traffic in the last [seconds] seconds (roughly the last 10 years in the example).
  2. --force-date-last-n=[n] makes the script call the API with period=lastN (here the last 20 days/weeks/months/years).
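The value 315576000 is simply ten years expressed in seconds (10 × 365.25 × 86400). The sketch below reproduces that arithmetic and assembles the same command as an argument list; the function name `archive_command` is hypothetical, and only the options already shown above are used.

```python
SECONDS_PER_YEAR = int(365.25 * 24 * 60 * 60)  # 31557600


def archive_command(idsite, url, years=10, last_n=20):
    """Build the core:archive invocation shown above as an argv list."""
    return [
        "./console",
        "core:archive",
        f"--force-all-periods={years * SECONDS_PER_YEAR}",
        f"--force-date-last-n={last_n}",
        f"--url={url}",
        f"--force-idsites={idsite}",
    ]


cmd = archive_command(4, "http://ufal.mff.cuni.cz/piwik/")
print(" ".join(cmd))
```

The list could be handed to `subprocess.run(cmd, cwd="/var/www", check=True)` if you want to trigger the recomputation from a script; the working directory here is taken from the prompt in the example.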