
[Bug] [StreamLoad] writing JSON has become exceptionally slow #35306

Closed
15767714253 opened this issue May 23, 2024 · 2 comments
Comments

@15767714253
Contributor

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.3.5

What's Wrong?

My Table

CREATE TABLE dwd_ess_big_cell_inc
(
    time                          DATETIME    NOT NULL COMMENT '',
    namespace_code                VARCHAR(64) NOT NULL COMMENT '',
    device_instance_property_code VARCHAR(64) NOT NULL COMMENT '',
    device_instance_code          VARCHAR(64) NOT NULL COMMENT '',
    value                         VARCHAR(64) NULL COMMENT '',
    kafka_time                    DATETIME    NOT NULL COMMENT 'creation time',
    create_time                   DATETIME    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time'
) ENGINE = OLAP
UNIQUE KEY(time, namespace_code, device_instance_property_code, device_instance_code)
COMMENT ''
PARTITION BY RANGE (time) ()
DISTRIBUTED BY HASH(time, namespace_code, device_instance_property_code, device_instance_code)
PROPERTIES
(
    "min_load_replica_num" = "1",
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "HOUR",
    "dynamic_partition.start" = "-24",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "24",
    "dynamic_partition.replication_num" = "3",
    "compaction_policy" = "time_series",
    "enable_unique_key_merge_on_write" = "false"
);

Flink Doris connector config
"properties": {
    "format": "json",
    "timezone": "Asia/Shanghai",
    "read_json_by_line": "true",
    "send_batch_parallelism": 10,
    "memtable_on_sink_node": "true",
    "columns": "time,time=from_unixtime(round(time/1000,0)),namespace_code,device_instance_property_code,device_instance_code,value,kafka_time,kafka_time=from_unixtime(round(kafka_time/1000,0))"
},
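For reference, the `columns` mapping above converts the epoch-millisecond `time` and `kafka_time` fields to DATETIME values during load. A minimal Python sketch of the same transform (field names come from the config; the conversion logic here is only an illustration of what `from_unixtime(round(x/1000,0))` computes, not the connector's actual code — also note that SQL `from_unixtime` evaluates in the session time zone, Asia/Shanghai per this config, while UTC is used below so the example is deterministic):

```python
from datetime import datetime, timezone

def from_unixtime_millis(ms: int) -> str:
    """Mirror from_unixtime(round(ms/1000, 0)): epoch millis -> 'YYYY-MM-DD HH:MM:SS'.

    Python's round() uses banker's rounding, so exact .5 cases may
    differ from SQL's round(); this is close enough for illustration.
    """
    seconds = round(ms / 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Hypothetical row as it would arrive from Kafka, with millisecond timestamps.
row = {"time": 1716422400123, "kafka_time": 1716422400789}
row["time"] = from_unixtime_millis(row["time"])
row["kafka_time"] = from_unixtime_millis(row["kafka_time"])
print(row)  # both fields are now DATETIME-formatted strings
```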

My FE Config
enable_single_replica_load = true
fetch_stream_load_record_interval_second = 30

My BE Config
number_tablet_writer_threads = 48
streaming_load_json_max_mb = 1024
enable_single_replica_load = true
jsonb_type_length_soft_limit_bytes = 2147483643
string_type_length_soft_limit_bytes = 2147483643
enable_stream_load_record = true
max_send_batch_parallelism_per_job = 20
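With `format=json` and `read_json_by_line=true`, each Stream Load batch is newline-delimited JSON: one object per line with keys matching the source fields. A small Python sketch of building such a payload (column names are taken from the table above; the values are made up for illustration):

```python
import json

# Hypothetical source rows; time/kafka_time are epoch milliseconds,
# which the connector's "columns" mapping converts to DATETIME on load.
rows = [
    {"time": 1716422400123, "namespace_code": "ns1",
     "device_instance_property_code": "p1", "device_instance_code": "d1",
     "value": "3.14", "kafka_time": 1716422400456},
    {"time": 1716422401123, "namespace_code": "ns1",
     "device_instance_property_code": "p2", "device_instance_code": "d1",
     "value": "2.71", "kafka_time": 1716422401456},
]

# One JSON object per line, as read_json_by_line expects.
payload = "\n".join(json.dumps(r, ensure_ascii=False) for r in rows)
print(payload)
```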

Cluster: 1 FE and 3 BE (4 nodes, each with 64 GB RAM and 32 vCPUs).
(screenshot of cluster resources)

StreamLoad Result

(screenshot of the Stream Load result)

Sometimes the result looks like this instead:
(screenshot of another Stream Load result)

What You Expected?

In my earlier tests, 10 concurrent processes each committing one million rows finished in under 10 seconds. I don't understand why writes to this new cluster have become so slow: it contains only this single table and has plenty of spare resources. I would like to find out what the problem is.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@JNSimba (Member) commented May 24, 2024

Can you upgrade to 1.6.1?

@15767714253 (Contributor, Author)

> Can you upgrade to 1.6.1?

OK!
