Improved archive cleanup with 60% in time! #11988
Changed only the function insertActionsToKeep
@tsteur Selecting min(id) by itself takes multiple minutes. I do not know why, but somehow it is hard to get the minimum value from a table. I thought the column was indexed, so that should be fast. I have not investigated yet, as I do not see this as an issue so far. What I have seen is that selecting the minimum value took 12 minutes on production, which is a long time. But still, when the minimal value is something like 1200000 it will save about
But let's take the following example. We have the table
So what I did was remove the loop over the fields
So I get back a result set that contains all the data that should be saved in the temporary table. To store it, the following query needs to be created: 'INSERT IGNORE INTO [TEMPTABLE] VALUES (idsite),(idvisitor),(idvisit),(idaction_url_ref)'. So I create a single query that inserts 1000 records at once. The 1000 is just a number; 10000 might also be possible, but that depends on the amount of available memory.
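To make the batching idea concrete, here is a minimal sketch in Python (not the PHP code from this PR). It flattens the field values of each result row into single-column rows and builds one multi-row `INSERT IGNORE` per batch. The function names, the `%s` placeholder style (as used by common MySQL drivers for Python) and skipping NULL values are assumptions made for illustration only.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Sequence, Tuple


def flatten_ids(rows: Iterable[Sequence[Optional[int]]]) -> Iterator[int]:
    """Yield every non-NULL field value of every result row, one at a time."""
    for row in rows:
        for value in row:
            if value is not None:  # assumption: NULL fields are not stored
                yield value


def build_insert_batches(
    temp_table: str,
    rows: Iterable[Sequence[Optional[int]]],
    batch_size: int = 1000,
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (sql, params) pairs, each inserting at most batch_size values.

    temp_table is interpolated directly into the SQL text, so it must be a
    trusted identifier; the values themselves are passed as parameters.
    """
    ids = flatten_ids(rows)
    while True:
        batch = list(islice(ids, batch_size))
        if not batch:
            return
        # One "(%s)" group per value: a multi-row insert into one column.
        placeholders = ",".join(["(%s)"] * len(batch))
        yield f"INSERT IGNORE INTO {temp_table} VALUES {placeholders}", batch
```

Each yielded pair could then be handed to a driver call such as `cursor.execute(sql, params)`. A larger `batch_size` means fewer round trips but a bigger query string and parameter list held in memory, which is the trade-off mentioned above.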
My insert query is not faster than
We have noticed that this fix puts a heavy load on the disk. Our SSD showed 70% utilization while running the new cleanup code. So the cleanup takes more resources, but it now takes only about 2 days instead of 2 weeks, so we are happy with it.
Interesting. As there is an index on this field (it is the first column, and on the log_visit table even the primary key), I would also have expected this query to be quite fast.