is_deleted for ReplacingMergeTree should delete the row during automatic merge #75100
Description
Company or project name
No response
Use case
is_deleted rows should automatically be removed from the database for ReplacingMergeTree, to simplify setup.
Describe the solution you'd like
According to the docs, is_deleted rows do not get removed from ClickHouse. As it works today, the feature is not very helpful: the onus falls back on the developer, because an is_deleted row exists forever without manual intervention. Reclaiming that space takes either a manual ALTER TABLE ... DELETE query (easy to forget, and it needs a periodic async job or has to be attached to every delete operation) or an easy but expensive OPTIMIZE FINAL CLEANUP with experimental flags set.
I still don't fully understand why is_deleted rows can't be removed during the automatic merge process. I can see some complications around making sure all possible PKs are considered during the merge, but it still seems possible at a small extra coordination cost.
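To make the complication around considering every key concrete, here is a toy Python model. It is only a sketch and not ClickHouse code: `Row`, `merge`, and `visible_keys` are hypothetical names, and the model assumes only that a replacing merge keeps the highest version per key and that an automatic merge may cover just some of the parts.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: str
    version: int
    is_deleted: int


def merge(parts, drop_deleted):
    """Toy replacing merge: keep the highest-version row per key.

    If drop_deleted is True, rows whose winning version is a delete
    marker are removed from the merged result.
    """
    latest = {}
    for part in parts:
        for row in part:
            current = latest.get(row.key)
            if current is None or row.version >= current.version:
                latest[row.key] = row
    rows = sorted(latest.values())
    if drop_deleted:
        rows = [r for r in rows if not r.is_deleted]
    return rows


def visible_keys(parts):
    """Keys a reader sees after deduplicating across all parts."""
    return [r.key for r in merge(parts, drop_deleted=True)]


part_a = [Row("user-1", 1, 0)]  # original insert
part_b = [Row("user-1", 2, 1)]  # later delete marker for the same key
part_c = [Row("user-2", 1, 0)]  # unrelated row

# A partial merge picks part_b and part_c but not part_a.
eager = [part_a, merge([part_b, part_c], drop_deleted=True)]
lazy = [part_a, merge([part_b, part_c], drop_deleted=False)]

print(visible_keys(eager))  # delete marker was dropped too early
print(visible_keys(lazy))   # delete marker still shadows the old row
```

In this model, dropping the marker during the partial merge lets the older `user-1` row in `part_a` become visible again, while keeping the marker continues to hide it. Removing is_deleted rows safely during a merge would therefore need some guarantee that no older version of the key remains in a part outside that merge.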
I find that decisions like this make the concept leak everywhere on the developer side, when it belongs in the database, which is built by the pros and should know how to handle it. As such, is it possible to handle this in ClickHouse?
Describe alternatives you've considered
Using OPTIMIZE FINAL CLEANUP is not recommended by you, and manual deletion puts extra onus on the developer, as described above.
Additional context
No response