https://forum.mirrorship.cn/t/topic/12192/5
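Problem description
Querying a large Paimon table is very slow. The table holds roughly 2.2 billion rows and the cluster has 100 backend nodes, yet the query profile shows only 3 scan instances running.
StarRocks version: community 3.2.4
Paimon table: 200 buckets
Query: select count(1) from paimon.xxx.xxxx;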
Root cause
com.starrocks.qe.HDFSBackendSelector.HdfsScanRangeHasher#acceptScanRangeLocations does not take Paimon into account.
Below are the details of one Paimon THdfsScanRange:
None of the conditions in com.starrocks.qe.HDFSBackendSelector.HdfsScanRangeHasher#acceptScanRangeLocations that feed the primitiveSink are hit, so the hash computed for every Paimon THdfsScanRange ends up being 0.
starrocks/fe/fe-core/src/main/java/com/starrocks/qe/HDFSBackendSelector.java
Lines 127 to 145 in 613f0b5
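To make the failure mode concrete, here is a minimal sketch of a hasher that only feeds fields that are set. It is not the StarRocks code: `ScanRange`, its three fields, and the use of `CRC32` as the sink are all assumptions made for illustration. The point is only that when no condition fires, nothing is hashed, so every such scan range gets the same value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class EmptyHashSketch {
    /** Hypothetical stand-in for THdfsScanRange; null means "field not set". */
    static final class ScanRange {
        final String fullPath;
        final String relativePath;
        final Long offset;

        ScanRange(String fullPath, String relativePath, Long offset) {
            this.fullPath = fullPath;
            this.relativePath = relativePath;
            this.offset = offset;
        }
    }

    /** Feeds only the fields that are set into the sink (CRC32 is a stand-in hash). */
    static long hash(ScanRange r) {
        CRC32 sink = new CRC32();
        if (r.fullPath != null) {
            sink.update(r.fullPath.getBytes(StandardCharsets.UTF_8));
        }
        if (r.relativePath != null) {
            sink.update(r.relativePath.getBytes(StandardCharsets.UTF_8));
        }
        if (r.offset != null) {
            sink.update(ByteBuffer.allocate(Long.BYTES).putLong(r.offset).array());
        }
        // If no branch above ran, the sink never saw any input.
        return sink.getValue();
    }

    public static void main(String[] args) {
        // File-based ranges carry a path, so their hashes differ.
        ScanRange fileA = new ScanRange("/warehouse/t/part-0.orc", null, 0L);
        ScanRange fileB = new ScanRange("/warehouse/t/part-1.orc", null, 0L);
        System.out.println("file hashes differ: " + (hash(fileA) != hash(fileB)));

        // Paimon-like ranges set none of the hashed fields, so they all collide.
        ScanRange paimonA = new ScanRange(null, null, null);
        ScanRange paimonB = new ScanRange(null, null, null);
        System.out.println("paimon hashes: " + hash(paimonA) + ", " + hash(paimonB));
    }
}
```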
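Because every hash value is 0, the hash ring in com.starrocks.qe.HDFSBackendSelector#computeScanRangeAssignment always returns the same group of backends.
starrocks/fe/fe-core/src/main/java/com/starrocks/qe/HDFSBackendSelector.java
Lines 278 to 279 in 613f0b5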
One possible fix is to have PaimonScanNode put the split's bucket information into THdfsScanRange, so that HDFSBackendSelector can hash on the bucket.
starrocks/fe/fe-core/src/main/java/com/starrocks/planner/PaimonScanNode.java
Lines 208 to 226 in 613f0b5
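The effect of the proposed fix can be sketched with a toy consistent-hash ring. Everything below is an assumption for illustration: the ring layout, the `VIRTUAL_NODES` count, the SplitMix64-style `mix` function, and picking a single backend per scan range (the real selector works with a group of candidates) are not taken from StarRocks. The sketch only compares a constant hash against a hash derived from the bucket id, using the 200 buckets and 100 backends from this issue.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class BucketHashSketch {
    private static final int VIRTUAL_NODES = 16; // arbitrary choice for the sketch

    /** SplitMix64 finalizer, used here only as a cheap, well-mixing 64-bit hash. */
    static long mix(long z) {
        z += 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Builds a ring that maps hash positions to backend ids 0..backends-1. */
    static TreeMap<Long, Integer> buildRing(int backends) {
        TreeMap<Long, Integer> ring = new TreeMap<>();
        for (int b = 0; b < backends; b++) {
            for (int v = 0; v < VIRTUAL_NODES; v++) {
                ring.put(mix(((long) b << 32) | v), b);
            }
        }
        return ring;
    }

    /** Returns the backend at the first ring position at or after the hash, wrapping around. */
    static int pick(TreeMap<Long, Integer> ring, long hash) {
        Map.Entry<Long, Integer> e = ring.ceilingEntry(hash);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Counts how many distinct backends receive at least one of the bucket scan ranges. */
    static int distinctBackends(int buckets, int backends, boolean hashBucket) {
        TreeMap<Long, Integer> ring = buildRing(backends);
        Set<Integer> used = new HashSet<>();
        for (int bucket = 0; bucket < buckets; bucket++) {
            // Without the fix every range hashes to the same constant.
            long hash = hashBucket ? mix(~(long) bucket) : 0L;
            used.add(pick(ring, hash));
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println("constant hash: " + distinctBackends(200, 100, false) + " backend(s)");
        System.out.println("bucket hash:   " + distinctBackends(200, 100, true) + " backend(s)");
    }
}
```

With a constant hash every scan range lands on the same ring position, so only one backend is chosen in this simplified model. Once the bucket id feeds the hash, the ranges spread over many backends.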
@yohengyang Thanks for the feedback. In this case, how many remoteScanRangeLocations (the return value of PaimonScanNode::getScanRangeLocations()) are there exactly?
This case has 200 TScanRangeLocations; our table has 200 buckets.