Description
Is your feature request related to a problem?
We have a few problems with the persons table:
- ClickHouse collapsing merge trees: Make `person` and `person_distinct_id` deletion efficient #4242
- Problem with timestamp precision: Increase ClickHouse Person DateTime precision #4260
- Issues with `$set_once`: Looks like set_once is not working as expected #4082
- Adding `$increment` in addition to `$set` and `$set_once`: Add $increment for numerical user props plugin-server#326
These problems all require some changes to the person tables and thus it makes sense to tackle them all together.
Describe the solution you'd like
I'd like to denormalise the person properties JSON field into a separate `person_properties` table, with one row per property per user. This lets us update different properties independently of one another and avoids sending multi-kilobyte JSON blobs to and from Postgres. It also greatly improves data integrity, because it removes the following race: two plugin servers update (different!) properties on the same user at the same time, both read the same large JSON, and each modifies one field. The second server, not knowing that the first has already saved its changes, then overwrites them when it writes its modified JSON back to Postgres.
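To make the race concrete, here is a minimal sketch in Python that replays the two write orders deterministically. Everything in it is hypothetical: `blob_store` and `row_store` are in-memory stand-ins for the current JSON column and the proposed `person_properties` table, and the property names and values are made up. It is not how the plugin server is implemented.

```python
import copy

# Hypothetical in-memory stand-ins for the two storage layouts.
# blob_store: one JSON-like dict per person (the current layout).
# row_store: one entry per (person, property) pair (the proposed layout).
blob_store = {"user_1": {"plan": "free", "city": "Oslo"}}
row_store = {("user_1", "plan"): "free", ("user_1", "city"): "Oslo"}


def blob_read(person_id):
    """Return a private copy of the whole properties blob."""
    return copy.deepcopy(blob_store[person_id])


def blob_write(person_id, properties):
    """Replace the whole properties blob."""
    blob_store[person_id] = properties


def row_write(person_id, key, value):
    """Write a single property without touching the others."""
    row_store[(person_id, key)] = value


# Blob layout: both servers read before either of them writes.
snapshot_a = blob_read("user_1")
snapshot_b = blob_read("user_1")
snapshot_a["plan"] = "paid"       # server A changes one field
snapshot_b["city"] = "Bergen"     # server B changes a different field
blob_write("user_1", snapshot_a)
blob_write("user_1", snapshot_b)  # silently discards server A's change

# Row layout: the same two updates touch different rows.
row_write("user_1", "plan", "paid")
row_write("user_1", "city", "Bergen")

blob_result = blob_store["user_1"]
row_result = {key: value for (pid, key), value in row_store.items() if pid == "user_1"}

print(blob_result)  # "plan" is still "free": server A's update was lost
print(row_result)   # both updates survive
```

With one row per property, the two writes no longer share any state, so neither needs to know about the other.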
This person properties table could use a collapsing merge tree on ClickHouse from the start, which saves us a future refactoring.
Describe alternatives you've considered
In PostHog/plugin-server#326 @yakkomajuri went to some effort to make `$increment` update the field inside the JSON directly via Postgres. This avoids losing numbers when two `$increment` calls update the same field at the same time, but another server could still `$set` and override the entire properties JSON.
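The remaining gap can be sketched the same way. In this hypothetical Python simulation, `atomic_increment` stands in for an in-database update of a single JSON field, and `set_from_snapshot` stands in for a `$set` that reads the whole JSON, changes one key and writes it all back. The function names and the `logins` and `plan` properties are illustrative assumptions, not taken from plugin-server#326.

```python
# Hypothetical stand-in for the single JSON properties column of one person.
store = {"properties": {"logins": 10, "plan": "free"}}


def atomic_increment(key, amount):
    """Update one field in place, like an in-database JSON field update."""
    current = store["properties"].get(key, 0)
    store["properties"][key] = current + amount


def set_from_snapshot(snapshot, key, value):
    """Apply $set to a previously read copy and write the whole blob back."""
    snapshot[key] = value
    store["properties"] = snapshot


# Server B reads the blob first, planning to $set "plan".
stale_snapshot = dict(store["properties"])

# Two other servers increment the same field; neither increment is lost.
atomic_increment("logins", 1)
atomic_increment("logins", 1)
after_increments = dict(store["properties"])

# Server B now writes back its stale copy and wipes out both increments.
set_from_snapshot(stale_snapshot, "plan", "paid")
after_set = dict(store["properties"])

print(after_increments)  # logins is 12
print(after_set)         # logins is back to 10
```

The increments are safe against each other, but not against a whole-blob write that started from an older read.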
Additional context
:gottagofast: