-
Hi! May I know the detailed exception message you get when the disk failure happens?
-
The scenario is as follows:
Our table A uses the ReplicatedMergeTree engine and is exposed through a distributed table spanning 5 shards with 2 replicas each. In the cluster's shard configuration, internal_replication is set to true, so data replication between replicas is handled by the ReplicatedMergeTree engine itself.
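For reference, one shard of the configuration described might look like the following fragment of a ClickHouse remote_servers definition; the cluster name, hostnames, and port are assumptions, not taken from our actual setup:

```xml
<remote_servers>
    <my_cluster> <!-- hypothetical cluster name -->
        <shard>
            <!-- replication is done by ReplicatedMergeTree, not by the
                 Distributed engine, so writes go to one replica only -->
            <internal_replication>true</internal_replication>
            <replica><host>host1</host><port>9000</port></replica>
            <replica><host>host2</host><port>9000</port></replica>
        </shard>
        <!-- ... 4 more shards with the same layout ... -->
    </my_cluster>
</remote_servers>
```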
Our workflow is that Spark writes through nginx to the local table A on a randomly chosen ClickHouse node, rather than writing into the distributed table directly.
We have observed the following when a disk failure happens on ClickHouse. Suppose host2 is the replica of host1 for a shard of table A. If a write to host1 fails because of a disk failure on host1, an exception is returned to the Spark client, which triggers a task retry that sends another write request through nginx.
However, even though the write to host1 reported failure, ReplicatedMergeTree still replicates that first insert to host2, so the first request ends up being written successfully. As a result, the same batch of data is written twice.
How can we avoid such a situation?
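One relevant mechanism here is ReplicatedMergeTree's built-in insert deduplication (controlled by the insert_deduplicate setting and the replicated_deduplication_window merge-tree setting, which defaults to remembering the last 100 blocks): an insert is deduplicated only if a retry resends a block with an identical checksum to the same shard. The toy model below is our own illustration, not ClickHouse code; the DedupWindow class is hypothetical. It shows why a retry that reorders or re-partitions the batch, or lands on a different shard, is not caught:

```python
from collections import deque
import hashlib

class DedupWindow:
    """Toy model (hypothetical) of ReplicatedMergeTree insert deduplication:
    checksums of the last N inserted blocks are remembered, and a retried
    block with an identical checksum is silently skipped."""

    def __init__(self, window=100):
        self.seen = deque(maxlen=window)  # rolling window of block checksums
        self.rows = []                    # the "table" contents

    @staticmethod
    def checksum(block):
        # The checksum depends on the exact content AND the row order.
        return hashlib.sha256("\n".join(block).encode()).hexdigest()

    def insert(self, block):
        h = self.checksum(block)
        if h in self.seen:
            return False                  # identical retry: deduplicated
        self.seen.append(h)
        self.rows.extend(block)
        return True

table = DedupWindow()
batch = ["row1", "row2", "row3"]

assert table.insert(batch) is True    # first attempt is stored
assert table.insert(batch) is False   # byte-identical retry is deduplicated

# A retry that reorders (or re-partitions) the rows has a different
# checksum, so deduplication does not catch it -> the batch lands twice.
assert table.insert(["row2", "row1", "row3"]) is True
print(len(table.rows))  # 6: three original rows plus the reordered retry
```

This suggests that for deduplication to protect against retries, the retried insert must be byte-identical to the original and routed to the same shard, which random routing through nginx does not guarantee.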