Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Make Visitor Log load very fast even with hundreds of thousands of visitors, or if a visitor views thousands of pages #3097

Closed
mattab opened this Issue · 5 comments

2 participants

@mattab
Owner

On a test Piwik, select year period, then visitor log, loading the page is really slow and I suppose could even crash Mysql in some cases.

When looking at year or month or day it should be as fast in all cases, since we select the last N visitors for the last day of the period.

I thought this was fixed already, but it isn't, we should run a clever SQL that is fast just like when looking at one day data.

@BeezyT
Collaborator

As far as I can see, the problem is as follows:

Take a look at Piwik_Live_API::loadLastVisitorDetailsFromDatabase().
It only does a LIMIT if $filter_limit is set, but there is none.

It loads all visits in the period and the data table widget does pagination.
It's not hard to see why this fails for many visits.

If you do echo $subQuery['sql'];, you get:
SELECT log_visit.* FROM piwik_log_visit AS log_visit WHERE log_visit.idsite = ? AND log_visit.visit_last_action_time >= ? AND log_visit.visit_last_action_time <= ? ORDER BY idsite, visit_last_action_time DESC

It selects everything in the range (which can be millions of rows), sorts it, ships it to PHP and builds a data table. The controller method getVisitorLog does $view->setSortedColumn('idVisit', 'ASC');, which might even sort it again in PHP before applying pagination. Obviously, that won't work for millions of rows in the data table.

What we would have to do is this:

  • Use LIMIT in the SQL query
  • Use calc_found_rows in SQL and add new methods to the data table to generate the pagination without having the actual data
  • Maybe we should avoid sorting on high traffic websites and rely on the ording of the table. Not sure whether this is possible because a GROUP BY is applied to the select query quoted above and AFAIK there is no specified ordering after a GROUP BY
@mattab
Owner

I would propose instead to keep this SQL and call it, for 1 day at a time until there are enough visitors for the display? that would be simpler I think...

@BeezyT
Collaborator

One day still has an unlimited number of rows. Doing it for one day at a time is an improvement but still not a fix for very high traffic websites.

@BeezyT
Collaborator

The problem was not as described here. Nevertheless, it should be fixed in [3ac47e8]

@mattab
Owner

Great work on this one!

@mattab mattab added this to the 1.12 - The Great 1.x Backlog milestone
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.