Some suggestions for production use #26
Comments
This is a good idea.
The sleep time shouldn't keep doubling without bound: the longer the outage lasts, the larger the doubled interval becomes, so after the cluster is healthy again the sinker's write delay could be very large.
The first version wrote to ClickHouse in a plain dead loop until success. But database failures are unavoidable, and if the failure never recovers, the dead loop has to be ended with `kill -9`, so data is lost anyway. When sinker was open-sourced, the business-specific handling wasn't carried over in a polished form, so I suggest users adapt the base code to their own needs. Our internal version has the following features:
If any of that code turns out not to be tightly coupled to our business, we can keep open-sourcing it.
Looking forward to seeing that open-sourced.
If item 1 is implemented, you essentially won't need to pay attention to this service once clickhouse_sinker is up. The sinker can run permanently on one machine, with no worry about floods of errors and retries during Kafka or ClickHouse maintenance.
I'm curious how this can achieve exactly-once ingestion when there are multiple Kafka partitions: a consumer in a group may receive messages from several partitions, or from a changing set of partitions when a Kafka rebalance happens. Tracking a batch's high offsets works for a single partition per batch if clickhouse_sinker crashes. But if the ClickHouse server crashes before you receive a positive response to the last insert, you can't tell whether that batch succeeded, right? To handle that with ClickHouse's batch idempotency (identical batches are deduplicated), we'd need to resend exactly the same batches for the unacked ones after a ClickHouse crash, which means tracking each batch's low offsets as well as high offsets (for the single partition, or for every partition involved in the batch). Right? And we couldn't use a consumer group if rebalancing can happen? Thanks in advance for sharing your insights on this.
Yes, so we must ensure each insert succeeds. In each batch insert we track the largest offset of every involved partition; only when the batch insert succeeds do we commit those partition offsets.
Thanks, Sundy, for sharing more insights.
So you are saying:
That works only if you never need to fetch from Kafka again for a pending batch (i.e. batch insertions always eventually succeed, per the assumption above), right? If I ever need to re-assemble a batch, I'd have to control the mix of data from all the partitions involved so as to generate exactly the same batch (to handle the hypothetical crash case above).
If it crashes, the offsets won't be committed either, so it's OK.
Yes.
Thanks again. That's what I'd like to confirm.
We'd like to run clickhouse_sinker as a system-level service: always on, automatically adapting to changes in the ClickHouse and Kafka connection state, with no need to stop it manually.
The current retryTimes setting, if used, will cause massive data loss during ClickHouse cluster maintenance, which makes it unsuitable for production.
Suggestion: a LoopWrite mode that drops the retryTimes setting. On write failure, automatically double the sleep time until the write succeeds. Then there's no need to shut down clickhouse_sinker before taking ClickHouse down for maintenance; the sinker can stay resident and resumes sinking automatically within at most one hour.