multi thread write same batch data to clickhouse ReplicatedMergeTree, will data duplicate? #31245

Open
lifulong opened this issue Nov 10, 2021 · 2 comments
Labels
question Question?

Comments

@lifulong

We need to write data to ClickHouse exactly once and want to use ReplicatedMergeTree to achieve this, but we are worried about duplicate data when a write is retried.

@lifulong lifulong added the question Question? label Nov 10, 2021
@den-crane
Contributor

You mean parallel inserts into different CH servers?
If you insert data into the replicas of a shard, identical (binary identical) inserts will be deduplicated. https://clickhouse.com/docs/en/operations/settings/merge-tree-settings/#replicated-deduplication-window
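
To make the idea of a deduplication window concrete, here is a toy model in Python. It is only a sketch of the concept and not ClickHouse's implementation: the class name `DedupWindow`, the use of SHA-256, and the in-memory storage are all assumptions made for illustration. It shows two things: a byte-identical block is skipped while its hash is still inside the window, and the same block is accepted again once enough newer blocks have pushed its hash out.

```python
import hashlib
from collections import OrderedDict


class DedupWindow:
    """Toy model of a per-shard deduplication window (hypothetical, illustrative only)."""

    def __init__(self, window: int):
        self.window = window          # how many recent block hashes to remember
        self._seen = OrderedDict()    # hash -> None, oldest first
        self.blocks = []              # blocks that were actually written

    def insert(self, block: bytes) -> bool:
        """Return True if the block was written, False if it was deduplicated."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self._seen:
            return False              # identical block still inside the window
        self._seen[digest] = None
        if len(self._seen) > self.window:
            self._seen.popitem(last=False)  # forget the oldest hash
        self.blocks.append(block)
        return True


shard = DedupWindow(window=2)
first = shard.insert(b"batch-1")      # written
retry = shard.insert(b"batch-1")      # skipped as a duplicate
shard.insert(b"batch-2")
shard.insert(b"batch-3")              # pushes batch-1's hash out of the window
late_retry = shard.insert(b"batch-1") # written again: the window has moved on
```

In this model a retry is only safe while the original block's hash is still remembered, which is why the size of the window matters for retries that arrive late.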

@lifulong
Author

Parallel inserts into the same shard, possibly to different replicas.
We already use the replicated_deduplication_window setting. Consider this case: a write to one replica times out because the server is busy, so we then retry against another replica.
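
The retry scenario described here can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a tested client: `insert_with_failover` and the `send` callable are hypothetical names, and `send` stands in for whatever driver call actually performs the INSERT. The one point it tries to show is that the payload is serialized once and the very same bytes are resent to the next replica, since only byte-identical inserts are candidates for deduplication.

```python
from typing import Callable, Sequence


def insert_with_failover(
    payload: bytes,
    replicas: Sequence[str],
    send: Callable[[str, bytes], None],
) -> str:
    """Try each replica in order with the *same* payload; return the one that accepted it.

    `send` is a hypothetical callable supplied by the caller. It should raise
    TimeoutError or ConnectionError when the replica does not confirm the write.
    """
    last_error = None
    for replica in replicas:
        try:
            send(replica, payload)    # identical bytes on every attempt
            return replica
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc          # outcome on this replica is unknown; try the next
    raise RuntimeError("all replicas failed") from last_error
```

Note that a timeout does not tell the client whether the first replica eventually wrote the block, so this pattern relies on the server-side deduplication to absorb the second copy. It also assumes the retry arrives while the original block is still inside the deduplication window.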
