sum_daily_nb_uniq_visitors calculations incorrect for some ranges in many API methods #4377

Closed
Gadget27 opened this Issue · 15 comments

4 participants

@Gadget27

The sum_daily_nb_uniq_visitors value is incorrect for certain date ranges when calling API methods with period=range. I've found this issue in the UserCountry, DeviceDetection, UserSettings, and Provider methods. I suspect it exists in more, but my tests have only covered those so far.

To reproduce from demo.piwik.org:

Referencing the results for Germany in the following UserCountry reports:

2013-11-01 to 2013-11-30:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-30&format=xml&token_auth=anonymous

returns nb_visits = 5380, sum_daily_nb_uniq_visitors = 4759

Add one day -> 2013-11-01 to 2013-12-01:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-12-01&format=xml&token_auth=anonymous

returns an empty result set!

Add another day -> 2013-11-01 to 2013-12-02:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-12-02&format=xml&token_auth=anonymous

nb_visits = 5696, sum_daily_nb_uniq_visitors = 289 (!?)

Clearly the 2nd API call returning nothing is a problem. With the 3rd, you can see that extending the range of the 1st call by 2 days increased the visit count by a believable number, but the unique visitors total dropped dramatically from 4759 to only 289. That is impossible, since a sum of daily unique visitors cannot shrink when more days are added to the range.
Keywords: sum_daily_nb_uniq_visitors API range
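
For anyone who wants to script the reproduction, here is a minimal sketch that builds the same request URLs as above. The function name `range_report_url` and its parameter names are my own, not part of Piwik; only the query parameters themselves are taken from the URLs in this report. It only builds the URL and does not send any request.

```python
from urllib.parse import urlencode


def range_report_url(base_url, method, id_site, start, end,
                     fmt="xml", token_auth="anonymous"):
    """Build a period=range API URL like the ones in this report.

    The helper name and signature are hypothetical; the query
    parameters mirror the URLs quoted above.
    """
    params = {
        "module": "API",
        "method": method,
        "idSite": id_site,
        "period": "range",
        "date": f"{start},{end}",
        "format": fmt,
        "token_auth": token_auth,
    }
    # safe="," keeps the comma in the date range unescaped,
    # matching the URLs as written in the report.
    return base_url + "?" + urlencode(params, safe=",")


# The three ranges tested above: same start date, growing end date.
for end in ("2013-11-30", "2013-12-01", "2013-12-02"):
    print(range_report_url("http://demo.piwik.org/",
                           "UserCountry.getCountry", 7,
                           "2013-11-01", end))
```

Keeping the start date fixed and only moving the end date makes it easy to compare results across ranges.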

@mattab
Owner

Thanks for the report!

Do you think this is a new bug (regression) in 2.0-beta, or was the bug already there in 1.12 ?

@Gadget27

I was experiencing similar results with 1.12 as well, however I had never before received an empty result set as in test #2 until recently with 2.0b-11.

@Gadget27

I've also found another issue that may be related which is detailed in the following post.

[http://forum.piwik.org/read.php?2,108244,108311#msg-108311]

They don't seem to follow the same pattern to generate the bad results. Still, they both involve irregular visit report numbers on API calls using period=range on some date ranges, so there may be a connection.

To reproduce the nb_visits error, look at the nb_visits value for Germany in the following links...

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-08&format=xml&token_auth=anonymous

From Nov 1 to Nov 8, it reports 1490 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-15&format=xml&token_auth=anonymous

From Nov 1 to Nov 15, it reports 1470 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-25&format=xml&token_auth=anonymous

From Nov 1 to Nov 25, it reports 543 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-30&format=xml&token_auth=anonymous

From Nov 1 to Nov 30, it reports 5380 visits.
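
The numbers above can be checked mechanically. Since all four ranges start on Nov 1 and daily visit counts are never negative, nb_visits should never go down as the end date moves later. A small sketch of that check, using the Germany values quoted above (the helper name `find_decreases` is made up for illustration):

```python
def find_decreases(series):
    """Return (prev_label, label, prev_value, value) for every step
    where the value drops compared with the previous, shorter range."""
    drops = []
    for (prev_label, prev_value), (label, value) in zip(series, series[1:]):
        if value < prev_value:
            drops.append((prev_label, label, prev_value, value))
    return drops


# nb_visits for Germany, keyed by the end date of a range
# that always starts on 2013-11-01 (values from the report above).
germany_nb_visits = [
    ("2013-11-08", 1490),
    ("2013-11-15", 1470),
    ("2013-11-25", 543),
    ("2013-11-30", 5380),
]

for prev_end, end, prev_value, value in find_decreases(germany_nb_visits):
    print(f"range ending {end} reports {value}, "
          f"fewer than {prev_value} for the range ending {prev_end}")
```

Two of the three steps are flagged (Nov 8 to Nov 15, and Nov 15 to Nov 25), which is what makes these results look like an aggregation problem rather than real traffic.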

@Gadget27

As of 2.0.2, the second example of the sum_daily_nb_uniq_visitors tests no longer returns an empty result set. Instead it returns a value which is certainly incorrect. In fact, for ranges like 2013-11-01 to 2013-12-01 and 2013-11-01 to 2013-12-02, the value returned looks like the sum for the dates in December only. It's as if it is ignoring the values from November entirely.

@anonymous-piwik-user

(Reposting here, with more details, my post from http://forum.piwik.org/read.php?2,110025,110045)

I sometimes experience the very same problem on several of my websites tracked using Piwik 1.12, and it happened again just now when testing 2.1-rc3.

The workaround I use to fix the problem when I see it on a specific period, is to run an "invalidateArchivedReports" operation:

http://.../index.php?token_auth=...&module=API&method=CoreAdminHome.invalidateArchivedReports&idSites=61&dates=2014-02-03,2014-02-04

and then re-launch archive.php:

sudo -u apache php /.../misc/cron/archive.php --url=http://... --force-idsites=61 --force-all-periods
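
If you need to repeat this for several sites or dates, the two steps can be generated from a script. This is only a sketch: the helper names are hypothetical, `http://piwik.example/` is a placeholder for your own install, and the API method and archive.php flags are simply the ones shown above. It builds the URL and the command line but does not run anything.

```python
from urllib.parse import urlencode


def invalidate_url(index_php_url, token_auth, id_sites, dates):
    """URL for CoreAdminHome.invalidateArchivedReports (step 1)."""
    params = {
        "token_auth": token_auth,
        "module": "API",
        "method": "CoreAdminHome.invalidateArchivedReports",
        "idSites": ",".join(str(site) for site in id_sites),
        "dates": ",".join(dates),
    }
    # Keep commas literal, as in the URL shown above.
    return index_php_url + "?" + urlencode(params, safe=",")


def archive_command(archive_php_path, piwik_url, id_site, run_as="apache"):
    """Argument list for re-running archive.php (step 2)."""
    return [
        "sudo", "-u", run_as, "php", archive_php_path,
        f"--url={piwik_url}",
        f"--force-idsites={id_site}",
        "--force-all-periods",
    ]


# Placeholder values; substitute your own URL, token and paths.
print(invalidate_url("http://piwik.example/index.php", "YOUR_TOKEN",
                     [61], ["2014-02-03", "2014-02-04"]))
print(" ".join(archive_command("misc/cron/archive.php",
                               "http://piwik.example/", 61)))
```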

Note: the erratic metrics I saw this time were not for sum_daily_nb_uniq_visitors but for visits and actions. The unique visitors metric was unchanged after the workaround. So in this case the number of visits was lower than the number of unique visitors because the number of visits was wrong.

@anonymous-piwik-user

This bug may be a consequence of #4532 so maybe it is fixed since the 2.1 release.

@Gadget27

It appears that the secondary issue I reported in comment 3 has been resolved as of 2.1. The original issue, however, is still open. The results are a bit different from what I described in the original post, but the sum_daily_nb_uniq_visitors values are still wrong.

@tsteur
Owner

I was able to reproduce this and found the actual issue. Will provide a fix but not sure whether it will be the best solution. There are many different ways to fix this issue...

@tsteur
Owner

In 9e86c79: refs #4377 make sure metrics like sum_daily_nb_uniq_visitors (which are renamed after aggregation) are summed correctly. If the period is, for instance, 2014-04-01,2014-05-01, we sum two periods: the month of April 2014 and the day May 1st. The DataTable of the month already contains the renamed column (as it was aggregated before), whereas the May 1st DataTable contains only the original column, not the renamed one. The two columns therefore cannot be summed, and the original column overwrites the value of the renamed column. This means sum_daily_nb_uniq_visitors is in this case always the value of May 1st.
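
To illustrate the mechanism described in this commit message, here is a simplified sketch using plain dicts. It is not Piwik's actual code: the function names are made up, the assumption that the original column is nb_uniq_visitors is mine (the commit message only calls it "the original"), and the numbers are invented. It only shows how renaming after summing can overwrite an already-renamed column, and how renaming before summing avoids that.

```python
# Assumed mapping: original column -> name after aggregation.
RENAMES = {"nb_uniq_visitors": "sum_daily_nb_uniq_visitors"}


def sum_rows(rows):
    """Sum values column by column, matching on column name only."""
    total = {}
    for row in rows:
        for column, value in row.items():
            total[column] = total.get(column, 0) + value
    return total


def aggregate_rename_after_sum(rows):
    """Problematic order: sum first, then rename.

    The renamed and original columns are summed separately, and the
    rename then overwrites the already-renamed column.
    """
    total = sum_rows(rows)
    for old, new in RENAMES.items():
        if old in total:
            total[new] = total.pop(old)
    return total


def aggregate_rename_before_sum(rows):
    """Safer order: bring every row to the renamed column, then sum."""
    normalized = [
        {RENAMES.get(column, column): value for column, value in row.items()}
        for row in rows
    ]
    return sum_rows(normalized)


# Illustrative values only. The month row was aggregated earlier, so
# it already carries the renamed column; the single day does not.
april = {"nb_visits": 120, "sum_daily_nb_uniq_visitors": 100}
may_1st = {"nb_visits": 9, "nb_uniq_visitors": 7}

print(aggregate_rename_after_sum([april, may_1st]))
print(aggregate_rename_before_sum([april, may_1st]))
```

With these made-up inputs the first variant reports only the May 1st value for sum_daily_nb_uniq_visitors, while nb_visits is summed correctly in both. That matches the symptom reported earlier in this issue, where visits looked believable but the unique visitor total collapsed.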

@tsteur
Owner

Note: To test this you have to invalidate all existing range archives (period=5)

@tsteur
Owner

In 7ca0a8c: refs #4377 added link to ticket

@tsteur
Owner

In 140562d: refs #4377 added test for this use case

@tsteur
Owner

In 12a1c2e: refs #4377 fix some tests

@tsteur
Owner

In b39aade: refs #4377 some more test fixes

@mattab
Owner

In e163969: fix tests refs #4377

@Gadget27 Gadget27 added this to the 2.3.0 - Piwik 2.3.0 milestone
This issue was closed.