Skip to content

Exactly-once write semantics #28270

Discussion options

You must be logged in to vote

You can retry the whole INSERT to Distributed table.
The batch will be split between shards and data will be send to shards as usual.
If you have deterministic sharding key, then data will be split between shards deterministically.
Then, on every shard that already have this block of data, it will be skipped, and inserted on shards that don't have data yet.

Moreover, if you use asynchronous inserts to Distributed table (which is by default) and don't enable distributed_directory_monitor_batch_inserts, the Distributed table will perform retries automatically for you, until the data will be inserted exactly once.

Replies: 1 comment

Comment options

You must be logged in to vote
0 replies
Answer selected by leticiawebb
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants