Can the /resync or /recovery endpoints sync in parallel batches by series? #60

Open
z-k-q opened this issue Mar 21, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

z-k-q commented Mar 21, 2023

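In my own testing, sync is inefficient once the data volume gets moderately large. Data is currently extracted with select * from limit, and the query plan shows a full table scan. A table is split into multiple batches, and every batch is also a full table scan, so there should be a lot of room for performance improvement here.
Here are my suggestions; could you see whether this can be optimized?
1. Ideally, extract data by scanning series in parallel batches. Querying by series fits InfluxDB's data structure and is more efficient, and a per-series query does not scan the whole table.
2. Building on the current approach, improve the batching mechanism: instead of batching by limit, could it query the earliest and latest timestamps and then fetch batches in parallel by timestamp range, avoiding full table scans?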
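
To make suggestion 2 concrete, here is a minimal sketch of time-window batching, not how influx-proxy works today. It assumes the earliest and latest timestamps (in nanoseconds) are already known. The names `time_windows`, `window_query`, `fetch_parallel`, and the `run_query` callback are hypothetical, for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Tuple


def time_windows(start_ns: int, end_ns: int, step_ns: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [lo, hi) windows that together cover start_ns..end_ns."""
    if step_ns <= 0:
        raise ValueError("step_ns must be positive")
    # end_ns is the timestamp of the newest point, so the last window
    # has to extend one nanosecond past it to include that point.
    stop = end_ns + 1
    lo = start_ns
    while lo < stop:
        hi = min(lo + step_ns, stop)
        yield lo, hi
        lo = hi


def window_query(measurement: str, lo: int, hi: int) -> str:
    """Build a time-bounded InfluxQL query (bare integers are nanoseconds)."""
    # Simplification: only double quotes in the measurement name are escaped.
    name = measurement.replace('"', '\\"')
    return f'SELECT * FROM "{name}" WHERE time >= {lo} AND time < {hi}'


def fetch_parallel(
    measurement: str,
    start_ns: int,
    end_ns: int,
    step_ns: int,
    run_query: Callable[[str], object],  # hypothetical: sends one query to InfluxDB
    workers: int = 4,
) -> List[object]:
    """Run one query per time window on a thread pool, keeping window order."""
    queries = [
        window_query(measurement, lo, hi)
        for lo, hi in time_windows(start_ns, end_ns, step_ns)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_query, queries))
```

Because every query carries a `time` range, each batch only needs to read the data in its own window instead of rescanning the table as the limit-based batches do. The window size and worker count would need tuning against the real data.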

@chengshiwen chengshiwen added the enhancement New feature or request label Mar 21, 2023
@chengshiwen
Owner

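1. I'll consider optimizing this in the next release, thanks.
2. Early on I found that syncing over HTTP took a very long time, so I later wrote influx-tool; its influx-tool transfer command transfers data much more efficiently.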
