We're running Matomo in an environment with a rather high number of tracked actions. Recently we lost some data before we noticed that log_action.idaction had reached 2^31. In the log we saw this error:

```
Error in Matomo (tracker): Error query: Mysqli statement execute error : Out of range value for column 'idaction' at row 1 In query: INSERT INTO matomo_log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?)
```
We altered all relevant tables by hand and switched the columns to BIGINT. But from my point of view it would be better if Matomo didn't require such a "hack" and used BIGINT by default.
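For reference, the manual workaround looked roughly like this. This is only a sketch: the exact set of `idaction_*` columns (and whether they are UNSIGNED) depends on your Matomo version and schema, so the column names below besides `matomo_log_action.idaction` are assumptions that need to be checked against your installation:

```sql
-- Widen the primary key on the action table.
ALTER TABLE matomo_log_action
  MODIFY idaction BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;

-- Every column referencing idaction must be widened as well, e.g. (assumed names):
ALTER TABLE matomo_log_link_visit_action
  MODIFY idaction_url BIGINT UNSIGNED NULL DEFAULT NULL,
  MODIFY idaction_name BIGINT UNSIGNED NULL DEFAULT NULL;
```

On large tables these ALTERs rewrite the whole table, so they can take hours and should be run in a maintenance window.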
Do you still consider it better to save storage and force users to modify Matomo's tables by hand, or would it be better to revert #10569 and use BIGINT as the default?
Matomo should handle that number of actions without requiring manual changes to the database schema. Currently, those schema changes need to be done by hand on high-traffic sites.
BIGINT keys will be required to support distributed databases, especially if using random keys. Any future support for a single schema would also depend on having a large enough key space. We could have a different schema in that case, but it would be better for data migration and interoperability to standardize on BIGINT keys unless there is still a good reason not to.
I don't think the disk space saved on keys should be that much of an issue nowadays, but I just wanted to get confirmation from @tsteur or @mattab, as it seems we made the decision not to use BIGINT there on purpose in the past...
Also, changing those columns means a potentially big database migration, which IMHO we wanted to avoid for Matomo 5. If we decide to do it nevertheless, we could also consider merging #17466 into Matomo 5, as it was moved to Matomo 6 due to the required database migration.
We just realised that InnoDB reserves the full row space (i.e. 8 bytes per BIGINT) even for NULLable columns. Therefore we decided that making log_action.idaction BIGINT is not needed for now, as it would add a large overhead per action: 10 idaction* fields × 4 extra bytes each (BIGINT vs. INT) = 40 bytes per action. We're not willing to add such overhead for all users when fewer than 0.01% will ever have a log_action table with more than 4 billion entries. So I'll partially revert the changes and only make the primary/foreign keys log_visit.idvisit and log_link_visit_action.idlink_va BIGINT.
Maybe another solution could be a console command that lets people optionally trigger the column upgrade when they need it.
Out of interest I did some rough, partial estimates of the monetary cost of increasing the idaction key size.
Typical AWS RDS storage cost: ~$0.115 per GiB/month.
4 bytes extra per BIGINT × 10 idaction keys = 40 bytes extra per action.
That gives these cost increases in table storage only (excluding index sizes, memory use and backups):

| Actions | Extra storage | Increase per month | Increase per year |
| --- | --- | --- | --- |
| 1,000,000 | 38 MiB | $0.004 | $0.05 |
| 10,000,000 | 381 MiB | $0.043 | $0.51 |
| 100,000,000 | 3,814 MiB | $0.428 | $5.14 |
| 1,000,000,000 | 38,146 MiB | $4.28 | $51.40 |
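As a sanity check on the arithmetic above, here is the calculation in a few lines of Python (assuming the same 40 bytes extra per action and the ~$0.115 per GiB-month RDS price quoted earlier):

```python
# Rough extra storage cost of widening 10 idaction columns from INT (4 B) to BIGINT (8 B).
EXTRA_BYTES_PER_ACTION = 10 * (8 - 4)   # 40 extra bytes per action row
PRICE_PER_GIB_MONTH = 0.115             # assumed AWS RDS storage price

def extra_cost(actions: int) -> tuple[float, float, float]:
    """Return (extra MiB, extra $/month, extra $/year) for a given action count."""
    mib = actions * EXTRA_BYTES_PER_ACTION / 2**20
    per_month = mib / 1024 * PRICE_PER_GIB_MONTH
    return mib, per_month, per_month * 12

for n in (10**6, 10**7, 10**8, 10**9):
    mib, month, year = extra_cost(n)
    print(f"{n:>13,} actions: {mib:10,.0f} MiB  ${month:.3f}/month  ${year:.2f}/year")
```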
Since data was lost because this limit was hit, perhaps in addition to a console command to upgrade the columns we could also add a simple (weekly?) scheduled task that checks tables with INT primary keys and sends an admin warning once they have reached 90% (?) of the keyspace, along with a link to an FAQ explaining how to upgrade the columns?
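The core of such a scheduled check could look like this sketch. The helper name and the 90% threshold are illustrative; in practice the current maximum would come from `SELECT MAX(idaction)` or the table's AUTO_INCREMENT counter in `information_schema.TABLES`:

```python
# Warn when an INT primary key approaches exhaustion.
INT_SIGNED_MAX = 2**31 - 1      # 2,147,483,647
INT_UNSIGNED_MAX = 2**32 - 1    # 4,294,967,295

def keyspace_warning(current_max_id: int, unsigned: bool = False,
                     threshold: float = 0.90) -> bool:
    """Return True once the key column has used `threshold` of its keyspace."""
    limit = INT_UNSIGNED_MAX if unsigned else INT_SIGNED_MAX
    return current_max_id >= threshold * limit

# Example: a signed INT idaction at ~95% of its range should trigger a warning.
print(keyspace_warning(2_040_000_000))   # True
print(keyspace_warning(1_000_000_000))   # False
```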