Report "Event Categories" summarize "others" with only 1 unique Visitor #20397

Open
OlliWu opened this issue Feb 24, 2023 · 5 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc.

Comments

OlliWu commented Feb 24, 2023

I’m facing the following issue with one of our sites.
The report “Behavior -> Events -> Event Categories” always groups visits under “Others”, even with extremely high values for the “datatable_archiving_maximum_rows_*” settings.
According to the report, these “Others” visits come from only 1 unique visitor, which seems wrong to me.

[Screenshot: event_categories_others]

Expected Behavior

With datatable_archiving_maximum_rows_* = 100.000.000, the report "Event Categories" should not group visits under "Others" as described here: https://matomo.org/faq/how-to/faq_54/
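
To make the expected behavior concrete, here is a minimal Python sketch of the general idea behind a row limit: rows beyond the limit are folded into a single "Others" row, and a limit larger than the table leaves it untouched. This is an illustration only, not Matomo's actual archiving code. The function name `truncate_with_others`, the sample labels, and the choice to count "Others" as one of the kept rows are all assumptions made for the example.

```python
def truncate_with_others(rows, max_rows=None):
    """Keep the top rows by visit count and fold the rest into 'Others'.

    rows:     dict mapping a label (e.g. an event category) to its visits.
    max_rows: hypothetical row limit; None means "no limit" in this sketch.
    """
    # With no limit, or a limit the table already fits in, nothing is grouped.
    if max_rows is None or len(rows) <= max_rows:
        return dict(rows)

    # Rank by visits, highest first.
    ranked = sorted(rows.items(), key=lambda item: item[1], reverse=True)

    # Assumption: the 'Others' row itself occupies one of the max_rows slots.
    kept = ranked[: max_rows - 1]
    remainder = ranked[max_rows - 1:]

    result = dict(kept)
    result["Others"] = sum(visits for _, visits in remainder)
    return result


if __name__ == "__main__":
    sample = {"video": 50, "form": 30, "download": 15, "scroll": 5}
    print(truncate_with_others(sample, max_rows=3))
    print(truncate_with_others(sample, max_rows=100))
```

With a limit far above the number of distinct categories, as in the settings below, this kind of grouping would not be expected to produce an "Others" row at all.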

Current Behavior

The report “Behavior -> Events -> Event Categories” always groups visits under “Others”, apparently coming from only one unique visitor.

[Screenshot: event_categories_others_2]

Steps to Reproduce (for Bugs)

  1. Track lots of Event Actions
  2. Run Archiver
  3. View Report “Behavior -> Events -> Event Categories”

Context

As you can see in the following screenshot, the site is tracking lots of Event Names. The “Event Names” report itself can’t even be displayed (I guess because there’s no paging). I don’t know if this is part of the problem.

[Screenshot: event_actions]

I started by raising the datatable_archiving_maximum_rows_* settings from 100k to 1 million, and from there to 10 million and 100 million rows. After every step, I invalidated the report data and ran the archiver. The numbers (visits, events, etc.) changed each time, but “Others” only decreased by a couple of thousand visits.

These are the Settings I used to produce the above Report.
datatable_archiving_maximum_rows_custom_dimensions = 100000000
datatable_archiving_maximum_rows_subtable_custom_dimensions = 100000000
datatable_archiving_maximum_rows_actions = 100000000
datatable_archiving_maximum_rows_subtable_actions = 100000000
datatable_archiving_maximum_rows_events = 100000000
datatable_archiving_maximum_rows_subtable_events = 100000000
datatable_archiving_maximum_rows_custom_variables = 100000000
datatable_archiving_maximum_rows_subtable_custom_variables = 100000000
archiving_ranking_query_row_limit = 0

Your Environment

The Matomo installation handles multiple sites, tracking approximately 1 million visits, 4.5 million pageviews and 8 million actions per day.

  • Matomo Version: 4.13.1
  • PHP Version: 8.0.27
  • Server Operating System: Suse Linux
  • Additionally installed plugins: PremiumBundle
  • Browser: Firefox
  • Operating System: Windows 10

I can’t figure it out and really don’t know which metric causes the issue. I also don’t know whether the select statement simply fetches too many rows or whether it’s some kind of bug.
I hope you can give me a hint where to look or what to do. Just let me know if you need more info.
Thanks
Olli

OlliWu added the Potential Bug and To Triage labels Feb 24, 2023
Contributor

bx80 commented Feb 27, 2023

Thanks for the detailed report on this @OlliWu 👍 With maximum archiving rows settings of 100,000,000 you shouldn't be seeing any data grouped under "Others" so this looks like a bug. This doesn't seem to happen on smaller datasets, so it could be related to the number of rows.

I'll assign the issue for prioritization.

bx80 added the Bug label and removed the Potential Bug and To Triage labels Feb 27, 2023
bx80 added this to the For Prioritization milestone Feb 27, 2023
Member

mattab commented Mar 14, 2023

@OlliWu Q: we think this could be a bug caused by setting archiving_ranking_query_row_limit = 0. Could you instead try changing it to archiving_ranking_query_row_limit = 1000000, then invalidate again (or wait for new data to be processed with this setting) and check whether it works better?

Whether it works better or not, please let us know so we can confirm that we have found the problem. Thanks!

Author

OlliWu commented Mar 14, 2023

I did so many tests that I was pretty sure I had tested something like that too. But it turned out I had not. :-)
I invalidated two different days and re-ran the archiving for each of them with the following parameters:

datatable_archiving_maximum_rows_custom_dimensions = 100000
datatable_archiving_maximum_rows_subtable_custom_dimensions = 100000
datatable_archiving_maximum_rows_actions = 1000000
datatable_archiving_maximum_rows_subtable_actions = 1000000
datatable_archiving_maximum_rows_events = 1000000
datatable_archiving_maximum_rows_subtable_events = 1000000
archiving_ranking_query_row_limit = 1000000

The same result on both days: No Others. 👍
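
For anyone comparing their own setup against the values above, the following is a rough Python sketch that reads archiving limits from INI-style text and lists any that are set to 0, the value suspected in this thread. It assumes the settings live in a `[General]` section, as in the sample string; the helper names `archiving_limits` and `zero_valued` and the sample text itself are made up for illustration and are not part of Matomo.

```python
import configparser

# Hypothetical sample, shaped like a Matomo config file for this sketch.
SAMPLE = """
; <?php exit; ?> DO NOT REMOVE THIS LINE
[General]
datatable_archiving_maximum_rows_events = 1000000
datatable_archiving_maximum_rows_subtable_events = 1000000
archiving_ranking_query_row_limit = 1000000
"""

# Setting names taken from this thread.
PREFIXES = (
    "datatable_archiving_maximum_rows_",
    "archiving_ranking_query_row_limit",
)


def archiving_limits(ini_text):
    """Return the archiving row limits found in the [General] section."""
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("General"):
        return {}
    return {
        key: int(value.strip('"'))
        for key, value in parser["General"].items()
        if key.startswith(PREFIXES)
    }


def zero_valued(limits):
    """List the settings whose value is 0."""
    return [key for key, value in limits.items() if value == 0]


if __name__ == "__main__":
    limits = archiving_limits(SAMPLE)
    for key, value in limits.items():
        print(f"{key} = {value}")
    print("set to 0:", zero_valued(limits))
```

The sketch only reports values; whether 0 is actually a problem on a given Matomo version is exactly what this issue is trying to establish.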

Here are the Screenshots.

BEFORE
[Screenshot: 20230308_id35_before]

AFTER
[Screenshot: 20230308_id35_after]

As far as I can tell, the Numbers are looking valid too.

Member

mattab commented Mar 24, 2023

Thanks for the update @OlliWu. Does this mean that, from your perspective, there is a bug when archiving_ranking_query_row_limit = 0? If so, we would appreciate it if you created a new bug report for it (or we could do it once you confirm).

Author

OlliWu commented Apr 4, 2023

@mattab I really don't know for sure. Maybe I'm just misreading the documentation. First of all, if I don't want my data summarized as "Others", I change the values according to the following FAQ: https://matomo.org/faq/how-to/faq_54/
If there are still "Others" after that, I look at the descriptions of the various config options, and there is "archiving_ranking_query_row_limit", which is described as "maximum number of rows to fetch from the database when archiving. if set to 0, no limit is used."
Maybe it's just me, but I thought that setting it to 0 would fetch all the data needed to create a detailed report. That is clearly not the case.

So maybe the query_row_limit works as expected and only the description is misleading? For me, it's totally fine to use a high value, e.g. 1 million, instead of "0".
