I searched the existing issues and found no similar ones.
What happened
Loading a local text file into Doris is very slow.
The job progress information printed in the log shows that the read rate and the write rate are identical, and both are low (between 16 and 50 rows/s). We suspect that the way the source reads and hands over the text file may not be as efficient as it should be.
SeaTunnel Version
seatunnel 2.3.3
Doris 2.0.2
Java 1.8.0_333
SeaTunnel Config
env {
  # You can set flink configuration here
  execution.parallelism = 10
  job.mode = "BATCH"
}
source {
  localFile {
    file_format_type = "text"
    path = "/home/hadoop/yangst/wudao_20240111.txt"
  }
}
sink {
  jdbc {
    url = "jdbc:mysql://xxx.xx.xx:port/wudao?rewriteBatchedStatements=true"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = "xxxx"
    query = "insert into wudao(line) values(?)"
  }
}
Running Command
nohup sh bin/seatunnel.sh -c config/wudao.config -e local > nohup.out 2>&1 &
Error Exception
Job Progress Information
***********************************************
Job Id : 799900012245942273
Read Count So Far : 35192
Write Count So Far : 26996
Average Read Count : 50/s
Average Write Count : 50/s
Last Statistic Time : 2024-01-17 15:35:05
Current Statistic Time : 2024-01-17 15:36:05
***********************************************
2024-01-17 15:37:02,846 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
CoordinatorService Thread Pool Status
***********************************************
activeCount : 1
corePoolSize : 0
maximumPoolSize : 2147483647
poolSize : 1
completedTaskCount : 67
taskCount : 68
***********************************************
2024-01-17 15:37:02,848 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
Job info detail
***********************************************
createdJobCount : 0
scheduledJobCount : 0
runningJobCount : 1
failingJobCount : 0
failedJobCount : 0
cancellingJobCount : 0
canceledJobCount : 0
finishedJobCount : 0
restartingJobCount : 0
suspendedJobCount : 0
reconcilingJobCount : 0
***********************************************
2024-01-17 15:37:05,614 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -
***********************************************
Job Progress Information
***********************************************
Job Id : 799900012245942273
Read Count So Far : 37192
Write Count So Far : 28996
Average Read Count : 33/s
Average Write Count : 33/s
Last Statistic Time : 2024-01-17 15:36:05
Current Statistic Time : 2024-01-17 15:37:05
***********************************************
2024-01-17 15:38:02,845 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
CoordinatorService Thread Pool Status
***********************************************
activeCount : 1
corePoolSize : 0
maximumPoolSize : 2147483647
poolSize : 1
completedTaskCount : 67
taskCount : 68
***********************************************
2024-01-17 15:38:02,849 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
Job info detail
***********************************************
createdJobCount : 0
scheduledJobCount : 0
runningJobCount : 1
failingJobCount : 0
failedJobCount : 0
cancellingJobCount : 0
canceledJobCount : 0
finishedJobCount : 0
restartingJobCount : 0
suspendedJobCount : 0
reconcilingJobCount : 0
***********************************************
2024-01-17 15:38:05,613 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -
***********************************************
Job Progress Information
***********************************************
Job Id : 799900012245942273
Read Count So Far : 38192
Write Count So Far : 29996
Average Read Count : 16/s
Average Write Count : 16/s
Last Statistic Time : 2024-01-17 15:37:05
Current Statistic Time : 2024-01-17 15:38:05
***********************************************
2024-01-17 15:39:02,846 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
CoordinatorService Thread Pool Status
***********************************************
activeCount : 1
corePoolSize : 0
maximumPoolSize : 2147483647
poolSize : 1
completedTaskCount : 67
taskCount : 68
***********************************************
2024-01-17 15:39:02,849 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
Job info detail
***********************************************
createdJobCount : 0
scheduledJobCount : 0
runningJobCount : 1
failingJobCount : 0
failedJobCount : 0
cancellingJobCount : 0
canceledJobCount : 0
finishedJobCount : 0
restartingJobCount : 0
suspendedJobCount : 0
reconcilingJobCount : 0
***********************************************
2024-01-17 15:39:05,613 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -
***********************************************
Job Progress Information
***********************************************
Job Id : 799900012245942273
Read Count So Far : 41192
Write Count So Far : 32996
Average Read Count : 50/s
Average Write Count : 50/s
Last Statistic Time : 2024-01-17 15:38:05
Current Statistic Time : 2024-01-17 15:39:05
***********************************************
2024-01-17 15:39:45,040 INFO org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator - wait checkpoint completed: 4
2024-01-17 15:40:02,845 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
CoordinatorService Thread Pool Status
***********************************************
activeCount : 1
corePoolSize : 0
maximumPoolSize : 2147483647
poolSize : 4
completedTaskCount : 72
taskCount : 73
***********************************************
2024-01-17 15:40:02,847 INFO org.apache.seatunnel.engine.server.CoordinatorService - [localhost]:5801 [seatunnel-478968] [5.1]
***********************************************
Job info detail
***********************************************
createdJobCount : 0
scheduledJobCount : 0
runningJobCount : 1
failingJobCount : 0
failedJobCount : 0
cancellingJobCount : 0
canceledJobCount : 0
finishedJobCount : 0
restartingJobCount : 0
suspendedJobCount : 0
reconcilingJobCount : 0
***********************************************
2024-01-17 15:40:05,609 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -
***********************************************
Job Progress Information
***********************************************
Job Id : 799900012245942273
Read Count So Far : 44192
Write Count So Far : 35996
Average Read Count : 50/s
Average Write Count : 50/s
Last Statistic Time : 2024-01-17 15:39:05
Current Statistic Time : 2024-01-17 15:40:05
***********************************************
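To make the numbers in the log above easier to compare, here is a minimal Python sketch that recomputes the per-minute throughput and the read/write gap from the "Job Progress Information" blocks. The counts and timestamps are copied from the log; the function names (`parse_snapshots`, `interval_rates`) and the parsing pattern are illustrative assumptions and not part of SeaTunnel.

```python
import re
from datetime import datetime

# Counts and timestamps copied from the JobMetricsRunner blocks in the log above.
LOG = """
Read Count So Far : 35192
Write Count So Far : 26996
Current Statistic Time : 2024-01-17 15:36:05
Read Count So Far : 37192
Write Count So Far : 28996
Current Statistic Time : 2024-01-17 15:37:05
Read Count So Far : 38192
Write Count So Far : 29996
Current Statistic Time : 2024-01-17 15:38:05
Read Count So Far : 41192
Write Count So Far : 32996
Current Statistic Time : 2024-01-17 15:39:05
Read Count So Far : 44192
Write Count So Far : 35996
Current Statistic Time : 2024-01-17 15:40:05
"""

# Hypothetical pattern: tolerates the extra "Average ..." and "Last Statistic Time"
# lines that appear between the counts and the current time in the full log.
SNAPSHOT = re.compile(
    r"Read Count So Far : (\d+)\s+"
    r"Write Count So Far : (\d+)"
    r".*?Current Statistic Time : (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})",
    re.S,
)


def parse_snapshots(text):
    """Return a list of (timestamp, rows_read, rows_written) tuples."""
    return [
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), int(read), int(written))
        for read, written, ts in SNAPSHOT.findall(text)
    ]


def interval_rates(snapshots):
    """Compute rows/s between consecutive snapshots and the read-write gap."""
    rates = []
    for (t0, r0, w0), (t1, r1, w1) in zip(snapshots, snapshots[1:]):
        seconds = (t1 - t0).total_seconds()
        rates.append(
            {
                "until": t1,
                "read_per_s": (r1 - r0) / seconds,
                "write_per_s": (w1 - w0) / seconds,
                "backlog": r1 - w1,  # rows read but not yet written
            }
        )
    return rates


if __name__ == "__main__":
    for row in interval_rates(parse_snapshots(LOG)):
        print(
            f"{row['until']:%H:%M:%S}  read {row['read_per_s']:.1f}/s  "
            f"write {row['write_per_s']:.1f}/s  backlog {row['backlog']}"
        )
```

On the figures quoted here, each interval works out to roughly 17 to 50 rows/s, and the gap between rows read and rows written stays at 8196 in every snapshot. Reads and writes therefore advance in lockstep, so these counters alone do not show which side of the pipeline is the limiting one.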
duanmuyh changed the title from "The collection rate of local files and HDFS files is very slow when doing" to "Attempted to collect local articles and HDFS text files into Doris. The working speed is very slow whether it is collecting locally or from HDFS" on Jan 30, 2024
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.