feat: Improve Column Remapping Performance in COPY FROM #193
…ojectColumns

Introduce ProjectIntoDataSourceOptimizer that pushes Projection into LogicalTableFunctionCall when the pattern is Projection -> TableFunctionCall, enabling column remapping at the data source level. Also migrates the column pruning semantics from "columns to skip" to "columns to project" across the entire pipeline (reader, optimizer, planner, serialization).

Key changes:
- New ProjectIntoDataSourceOptimizer for projection pushdown and column reorder
- Rename ReadSharedState::skipColumns -> projectColumns with updated semantics
- Rename ArrowOptionsBuilder::skipColumns() -> projectColumns()
- Add projectColumns to TableFuncBindData replacing deprecated columnSkips
- Restore BATCH_READ=false safety in bind_load_from.cpp
- Remove stale commented-out code
- Update all tests to use new project-columns semantics

Made-with: Cursor
Committed-by: Xiaoli Zhou from Dev container
Review Summary by Qodo
Improve Column Remapping Performance via Projection Pushdown Optimizer
Walkthroughs
Description
- Introduce ProjectIntoDataSourceOptimizer for projection pushdown into data sources
- Migrate column semantics from "skip" to "project" across entire pipeline
- Rename skipColumns to projectColumns in ReadSharedState and related classes
- Update protobuf field skip_columns to project_columns for semantic clarity
- Restore BATCH_READ=false safety for non-COPY_FROM operations

Diagram
```mermaid
flowchart LR
  A["Projection Operator"] -->|"ProjectIntoDataSourceOptimizer"| B["TableFunctionCall"]
  B -->|"setProjectColumns"| C["BindData"]
  C -->|"getProjectColumns"| D["Arrow Projection"]
  E["skipColumns semantics"] -->|"migrate to"| F["projectColumns semantics"]
  F -->|"used in"| G["Reader/Optimizer/Planner"]
```
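To make the skip-to-project migration concrete, here is a minimal sketch of how the two representations relate. It is not taken from the PR: `skipsToProjection` and `projectRow` are hypothetical helpers, and representing the old skip information as a `std::vector<bool>` is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): convert "columns to skip" flags
// into a "columns to project" index list that keeps the original order.
std::vector<std::size_t> skipsToProjection(const std::vector<bool>& columnSkips) {
    std::vector<std::size_t> projectColumns;
    for (std::size_t i = 0; i < columnSkips.size(); ++i) {
        if (!columnSkips[i]) {
            projectColumns.push_back(i);
        }
    }
    return projectColumns;
}

// Hypothetical helper: apply a projection list to one row, so that
// output[i] == row[projectColumns[i]]. Unlike a skip mask, the list can
// also express a reordering of the source columns.
template <typename T>
std::vector<T> projectRow(const std::vector<T>& row,
                          const std::vector<std::size_t>& projectColumns) {
    std::vector<T> out;
    out.reserve(projectColumns.size());
    for (std::size_t idx : projectColumns) {
        assert(idx < row.size());  // every projected index must exist in the row
        out.push_back(row[idx]);
    }
    return out;
}
```

A skip mask can only drop columns, while an index list can also reorder them, which is presumably what lets the reader perform column remapping directly.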
File Changes
1. include/neug/compiler/optimizer/project_into_data_source_optimizer.h
Committed-by: Xiaoli Zhou from Dev container
This reverts commit 9e0abe2. Committed-by: Xiaoli Zhou from Dev container
Committed-by: Xiaoli Zhou from Dev container
What do these changes do?
Introduce ProjectIntoDataSourceOptimizer that pushes Projection into LogicalTableFunctionCall when the pattern is Projection -> TableFunctionCall, enabling column remapping at the data source level. Also migrates the column pruning semantics from "columns to skip" to "columns to project" across the entire pipeline (reader, optimizer, planner, serialization).
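The rewrite described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `PlanOp`, `OpType`, and `pushProjectionIntoSource` are hypothetical stand-ins for the real logical operators and optimizer, and projections are reduced to plain column positions.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-ins for the logical plan operators (hypothetical types).
enum class OpType { Projection, TableFunctionCall };

struct PlanOp {
    OpType type;
    // Projection: positions of the child's output columns, in output order.
    // TableFunctionCall: source columns the reader should produce, in order.
    std::vector<std::size_t> columns;
    std::unique_ptr<PlanOp> child;  // null for TableFunctionCall
};

// If the plan is Projection -> TableFunctionCall, fold the projection into
// the call and drop the Projection node; otherwise return the plan unchanged.
std::unique_ptr<PlanOp> pushProjectionIntoSource(std::unique_ptr<PlanOp> plan) {
    if (!plan || plan->type != OpType::Projection || !plan->child ||
        plan->child->type != OpType::TableFunctionCall) {
        return plan;
    }
    std::unique_ptr<PlanOp> call = std::move(plan->child);
    std::vector<std::size_t> remapped;
    remapped.reserve(plan->columns.size());
    for (std::size_t pos : plan->columns) {
        // Compose the projection with whatever the call already produces.
        // at() throws std::out_of_range on an invalid position.
        remapped.push_back(call->columns.at(pos));
    }
    call->columns = std::move(remapped);
    return call;
}
```

In this sketch the data source ends up producing exactly the requested columns in the requested order, so no separate Projection operator has to copy or reorder them afterwards.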
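Key changes:
- New ProjectIntoDataSourceOptimizer for projection pushdown and column reorder
- Rename ReadSharedState::skipColumns -> projectColumns with updated semantics
- Rename ArrowOptionsBuilder::skipColumns() -> projectColumns()
- Add projectColumns to TableFuncBindData replacing deprecated columnSkips
- Restore BATCH_READ=false safety in bind_load_from.cpp
- Remove stale commented-out code
- Update all tests to use new project-columns semantics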
Related issue number
Fixes #165