[GLUTEN-5414] [VL] Support arrow csv option and schema#5850
Run Gluten Clickhouse CI
Can you help review this PR and rerun the failed test? Thanks! @zhztheplayer
nit: Why quote the name? We don't do it for the other names.
Should we modify velox_docker_cache.yml as well? Did you check that file already?
nit: The folder could be renamed from resources/datasource/csv to resource/arrow-datasource/csv or something.
The CSV file is not specific to the Arrow datasource; it is only used to test Arrow.
It is just a normal CSV file, so I placed it here.
Can you help merge this one? Thanks! @zhztheplayer
This reverts commit be7710e88d00e0326ac5dc0cbe40cbb0e11f213a.
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
mvn clean install -am \
    -DskipTests -Drat.skip -Dmaven.gitcommitid.skip -Dcheckstyle.skip
Do we really need to compile the whole project? It's a time-consuming command...
I wonder which dependent modules cannot be compiled with the -am option?
The arrow-bom module includes many modules, so we need to compile the whole project.
Supports basic options for now; more options will be supported after the Arrow patch is merged:
apache/arrow#41646
Before this patch, if the required schema differed from the file schema, the CSV read would fall back.
The patch also switches to using the column index in the file instead of checking the file column name, to account for case sensitivity.
Adds a new common test function for the case where the rule applies to the logical plan.
Compiles Arrow as version 15.0.0-gluten and upgrades the arrow-dataset and arrow-c-data versions from 15.0.0 to 15.0.0-gluten.
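
To illustrate the index-based column matching mentioned in the description, here is a minimal Python sketch. It is not the code from this PR: the function name `resolve_column_indices`, its parameters, and the sample header are all hypothetical, and it only shows the general idea of mapping required columns to positions in the file header so that later reads do not depend on the exact case of the column names.

```python
from typing import Dict, List, Optional


def resolve_column_indices(
    file_columns: List[str],
    required_columns: List[str],
    case_sensitive: bool = False,
) -> List[Optional[int]]:
    """Map each required column to its position in the file header.

    Hypothetical helper for illustration only. A required column that
    the file does not contain is mapped to None.
    """

    def normalize(name: str) -> str:
        # Compare lower-cased names unless case-sensitive matching is requested.
        return name if case_sensitive else name.lower()

    # If two header names collide after normalization, the first one wins.
    index_by_name: Dict[str, int] = {}
    for position, name in enumerate(file_columns):
        index_by_name.setdefault(normalize(name), position)

    return [index_by_name.get(normalize(name)) for name in required_columns]


file_header = ["ID", "Name", "price"]   # made-up CSV header
required = ["name", "PRICE", "missing"]  # made-up required schema

print(resolve_column_indices(file_header, required))
print(resolve_column_indices(file_header, required, case_sensitive=True))
```

With case-insensitive matching the first call resolves `name` and `PRICE` to indices 1 and 2; with case-sensitive matching none of the required names match the header, which is the kind of mismatch that index-based lookup is meant to sidestep.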