duplicate entries in piwik_log_action leading to pages not visible in segments #6436
We are currently seeing a strange issue: some pages are listed in actions->pages, but they disappear when a segment that should include them is applied. Investigating the issue, I found duplicate entries in piwik_log_action.
In piwik_log_link_visit_action only idaction 4114715 is referenced, and it is displayed with views in actions->pages; a segment matching example.com/somepage, however, reports 0 views for this page.
Something to consider: we use two balanced Piwik installations for tracking. This setup should work fine, but it also has the potential to create this kind of issue, because the combination of type and hash is not unique.
Looking through ./core/Tracker/TableLogAction.php, it seems possible that two concurrent requests with the same URL could result in duplicate entries.
Consider loadIdsAction executed in parallel on two installations with the same input. There should be a small time window in which both read the current state and build their list of items to insert. Both then insert the same item and receive different idactions, so both inserts succeed and the table ends up with two rows for the same action.
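The interleaving described above can be reproduced in a few lines. This is only a sketch under assumptions: an in-memory SQLite database stands in for the shared MySQL database, the table is reduced to four columns, and `find_missing` / `insert_actions` are hypothetical helpers that mimic a read-then-insert pattern. They are not the actual code in TableLogAction.php.

```python
import sqlite3

# In-memory SQLite stands in for the shared MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE log_action ("
    " idaction INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT, hash INTEGER, type INTEGER)"
)


def find_missing(conn, actions):
    """Step 1 (read): return the actions that have no row yet."""
    missing = []
    for name, hash_, type_ in actions:
        row = conn.execute(
            "SELECT idaction FROM log_action"
            " WHERE hash = ? AND type = ? AND name = ?",
            (hash_, type_, name),
        ).fetchone()
        if row is None:
            missing.append((name, hash_, type_))
    return missing


def insert_actions(conn, missing):
    """Step 2 (write): insert whatever step 1 reported as missing."""
    conn.executemany(
        "INSERT INTO log_action (name, hash, type) VALUES (?, ?, ?)", missing
    )


# The hash value 12345 is made up for the example.
request = [("example.com/somepage", 12345, 1)]

# Both trackers read before either one writes: this is the race window.
missing_a = find_missing(db, request)
missing_b = find_missing(db, request)
insert_actions(db, missing_a)
insert_actions(db, missing_b)

rows = db.execute(
    "SELECT idaction, name FROM log_action ORDER BY idaction"
).fetchall()
print(rows)  # two rows with the same name but different idactions
```

Nothing in the table definition rejects the second insert, which is why both rows survive.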
If I understand correctly it doesn't make sense to have multiple rows in log_action with the same name/type combination, so adding a unique constraint to log_action for name/type or hash/type and changing
I'm not sure about cleaning up the existing data, though. If duplicates need to be deleted, we have to clean up log_link_visit_action, which has about 700,000,000 entries.
Another way to work around the issue of multiple Piwik instances writing data seems to be something like
This statement should avoid inserting an entry with the same name and hash as an existing entry without using a UNIQUE constraint.
Please note: I was not able to test this statement (syntax or if it would fix the problem) due to time constraints ;)
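For illustration, one common way to write such a conditional insert is INSERT ... SELECT ... WHERE NOT EXISTS. The sketch below is not the statement from the post above. It is a generic pattern, run against SQLite through Python so it can be tried without a Piwik database. The table layout and the `insert_if_absent` helper are assumptions. On MySQL the inner SELECT may need `FROM DUAL`, and whether a single statement fully closes the race there depends on storage engine and isolation level, which this sketch does not test.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); reduced table layout.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE log_action ("
    " idaction INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT, hash INTEGER, type INTEGER)"
)

# The existence check and the insert happen in one statement,
# so there is no separate read step between them in application code.
INSERT_IF_ABSENT = (
    "INSERT INTO log_action (name, hash, type) "
    "SELECT :name, :hash, :type "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM log_action"
    "  WHERE hash = :hash AND type = :type AND name = :name"
    ")"
)


def insert_if_absent(conn, name, hash_, type_):
    """Hypothetical helper: insert the action only if no matching row exists."""
    conn.execute(INSERT_IF_ABSENT, {"name": name, "hash": hash_, "type": type_})


# Two trackers submit the same action (the hash value is made up).
insert_if_absent(db, "example.com/somepage", 12345, 1)
insert_if_absent(db, "example.com/somepage", 12345, 1)
# A different action is still inserted normally.
insert_if_absent(db, "example.com/other", 67890, 1)

count_same = db.execute(
    "SELECT COUNT(*) FROM log_action WHERE name = 'example.com/somepage'"
).fetchone()[0]
count_all = db.execute("SELECT COUNT(*) FROM log_action").fetchone()[0]
print(count_same, count_all)
```

The second call finds the existing row and inserts nothing, so only one idaction exists for the page.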
Oct 28, 2014
for a cleanup script, you can't just delete the 'duplicate' entries, you have to merge them:
select idaction,name,count(*) from piwik_log_action as action, piwik_log_link_visit_action as linker where linker.idaction_url_ref=action.idaction group by action.idaction order by name;
for each of the names which have multiple idactions:
say i have multiple idactions: 63,90,134,213,180
update piwik_log_link_visit_action set idaction_url_ref=63 where idaction_url_ref in (90,134,213,180);
update piwik_log_link_visit_action set idaction_url=63 where idaction_url in (90,134,213,180);
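The manual steps above could be scripted roughly as follows. This is a sketch under assumptions: SQLite stands in for MySQL, the tables are reduced to the columns mentioned in this thread, the lowest idaction of each name/type group is kept (an arbitrary choice), and `merge_duplicate_actions` is a hypothetical helper. It only re-points idaction_url and idaction_url_ref. Any other column that stores an idaction would need the same treatment, and a table with hundreds of millions of rows would need batching that is not shown here. Deleting the redundant log_action rows after re-pointing is also an assumption, not something stated above.

```python
import sqlite3

# Tiny made-up data set: idactions 63, 90 and 134 are duplicates of one page.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE log_action (idaction INTEGER PRIMARY KEY, name TEXT, type INTEGER);
CREATE TABLE log_link_visit_action (
    idlink_va INTEGER PRIMARY KEY,
    idaction_url INTEGER,
    idaction_url_ref INTEGER);
INSERT INTO log_action VALUES
    (63, 'example.com/somepage', 1), (90, 'example.com/somepage', 1),
    (134, 'example.com/somepage', 1), (70, 'example.com/other', 1);
INSERT INTO log_link_visit_action VALUES
    (1, 63, 70), (2, 90, 63), (3, 134, 90), (4, 70, 134);
""")


def merge_duplicate_actions(conn):
    """Re-point references to the lowest idaction per (name, type), then
    delete the now-unreferenced duplicate rows."""
    groups = conn.execute(
        "SELECT MIN(idaction), GROUP_CONCAT(idaction) FROM log_action"
        " GROUP BY name, type HAVING COUNT(*) > 1"
    ).fetchall()
    for keep, all_ids in groups:
        dupes = [int(i) for i in all_ids.split(",") if int(i) != keep]
        marks = ",".join("?" * len(dupes))  # one placeholder per duplicate id
        for col in ("idaction_url", "idaction_url_ref"):
            conn.execute(
                f"UPDATE log_link_visit_action SET {col} = ?"
                f" WHERE {col} IN ({marks})",
                [keep, *dupes],
            )
        conn.execute(f"DELETE FROM log_action WHERE idaction IN ({marks})", dupes)
    conn.commit()


merge_duplicate_actions(db)

links = db.execute(
    "SELECT idlink_va, idaction_url, idaction_url_ref"
    " FROM log_link_visit_action ORDER BY idlink_va"
).fetchall()
actions = [r[0] for r in db.execute(
    "SELECT idaction FROM log_action ORDER BY idaction")]
print(links)    # every reference to 90 or 134 now points at 63
print(actions)  # [63, 70]
```

Archived reports would presumably still have to be re-processed afterwards, since they were built from the old ids.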