[GLUTEN-4480][CH] Decouple LocalFiles from plan to improve driver generating substrait plan #4481
Run Gluten Clickhouse CI |
Thank you for working on this @exmy. I noticed the PR doesn't yet change the following two parts of the code: Do you plan to clean up that code as well? Especially 2, since some of that code seems to be obsolete and can probably be removed. I am not sure, so would you mind helping to check? Thanks. After that, we may be able to remove a series of APIs on
@zzcclp Could you help check whether we need to clean up the two parts of the code?
change the code in the part of the |
@zhztheplayer @zzcclp Have cleaned up the code in the part of the |
LGTM for the changes to the common module. @zzcclp, please review the CH changes if you would like to. Thanks.
// TODO: We still maintain the old logic of parsing LocalFiles or ExtensionTable in ReadRel
// to be compatible with some suites about metrics.
// Remove this compatibility later, after which only the Java iterator will have local files in ReadRel.
if (read.has_local_files() || (!read.has_extension_table() && !isReadFromMergeTree(read)))
Don't add the isMergeTree parameter; we can ignore the metrics suites for now and raise an issue to fix them.
The isMergeTree parameter is confusing.
"Don't add the isMergeTree parameter; we can ignore the metrics suites for now and raise an issue to fix them."
We need a way to decide here whether to parse according to LocalFiles or ExtensionTable. Without the isMergeTree parameter, is there an alternative way to achieve the same thing?
read.has_extension_table() is for MergeTree, and read.has_local_files() is for the other file formats; Parquet and ORC don't need the extension table part.
ReadRel no longer includes LocalFiles or ExtensionTable, since they have been decoupled from ReadRel, which is exactly the goal this PR wants to achieve, right?
Got it. Can we determine whether the split_info is LocalFiles or ExtensionTable from the split_info string itself?
I tried before, but unfortunately, there wasn't a suitable way to judge it solely based on split_info string information.
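The condition debated in this thread can be sketched in isolation. This is a minimal illustration, not the PR's code: `ReadRelView`, `SplitKind`, and `chooseSplitKind` are hypothetical names, `ReadRelView` only mimics the `has_local_files()` and `has_extension_table()` accessors quoted above, and the MergeTree hint is passed in as a plain `bool` because the body of `isReadFromMergeTree` is not shown here.

```cpp
// Hypothetical stand-in for the two ReadRel accessors quoted in the review.
struct ReadRelView
{
    bool local_files;
    bool extension_table;
    bool has_local_files() const { return local_files; }
    bool has_extension_table() const { return extension_table; }
};

enum class SplitKind { LocalFiles, ExtensionTable };

// Mirrors the quoted condition: parse as LocalFiles when the ReadRel carries
// local files, or when it carries neither payload and the read is not known
// to come from MergeTree. Otherwise parse as ExtensionTable.
inline SplitKind chooseSplitKind(const ReadRelView & read, bool is_merge_tree)
{
    if (read.has_local_files() || (!read.has_extension_table() && !is_merge_tree))
        return SplitKind::LocalFiles;
    return SplitKind::ExtensionTable;
}
```

Once ReadRel carries neither payload, both accessors return false and the outcome depends only on the extra hint, which is why the thread asks whether that hint could come from somewhere other than a new parameter.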
…erating substrait plan
5d2bc25 to 3f24a8c
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
What changes were proposed in this pull request?
Applies to the CH backend the same change that was made for Velox in #4177.
The time taken to generate the Substrait plan decreased from 249938 ms to 4486 ms for 163180 partitions after this patch. (Fixes: #4480)
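A rough model of why decoupling can help, assuming the driver cost is dominated by rebuilding the plan once per partition when the file list is embedded in it. All names here (`BuildStats`, `buildPlan`, `coupled`, `decoupled`) are hypothetical and do not correspond to Gluten's actual classes; the sketch only counts how many times each kind of object is built.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counters for how much work each strategy performs.
struct BuildStats
{
    std::size_t plan_builds = 0;
    std::size_t split_builds = 0;
};

// Hypothetical plan builder: stands in for the expensive plan generation step.
inline std::string buildPlan(const std::string & embedded_files, BuildStats & stats)
{
    ++stats.plan_builds;
    return "plan(" + embedded_files + ")";
}

// Coupled: the file list lives inside the plan, so every partition needs
// its own full plan.
inline BuildStats coupled(const std::vector<std::string> & partitions)
{
    BuildStats stats;
    for (const auto & files : partitions)
        buildPlan(files, stats);
    return stats;
}

// Decoupled: the plan is built once without files, and each partition only
// contributes a small split-info payload.
inline BuildStats decoupled(const std::vector<std::string> & partitions)
{
    BuildStats stats;
    buildPlan("", stats);
    for (const auto & files : partitions)
    {
        std::string split_info = files; // cheap per-partition payload
        (void)split_info;
        ++stats.split_builds;
    }
    return stats;
}
```

Under this assumption the number of plan builds drops from one per partition to one in total, which is consistent in direction with the reported timings, though the model says nothing about the actual constants involved.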
How was this patch tested?
Pass CI