feat: add input ref resolver #158

a9QrX3Lu · 2021-11-21T09:44:26Z

No description provided.

skyzh

The current implementation looks good to me, and I'll approve it as long as it works. That said, the responsibilities of this InputRefResolver don't seem very clear to me.

After reviewing this PR, I came up with an idea. In fact, InputRefResolver is a phase in the logical planner that rewrites the BoundExpr to produce a logical plan. I think the process could be roughly split into two phases: walking through the expr tree, and producing a plan.

Basically, InputRefResolver should have its own internal state: scan_column_list and aggr_list. When walking through the expr tree, the InputRefResolver adds the columns that need to be scanned to scan_column_list, and produces each AggCall based on the position of its column in scan_column_list. That should be exactly what you did in the previous binder, except that in this phase we know exactly which column position in the DataChunk will be read.
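
A minimal sketch of such a stateful resolver, under stated assumptions: `Expr`, `AggCall`, `column_index`, `resolve`, and `add_agg_call` are hypothetical simplifications and not the types or methods in this PR. Only the field names `scan_column_list` and `aggr_list` come from the comment above.

```rust
/// Hypothetical, simplified expression tree; the real `BoundExpr` is richer.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(String),
    InputRef(usize),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical aggregate call whose argument has already been resolved.
#[derive(Debug, Clone, PartialEq)]
struct AggCall {
    func: &'static str,
    arg: Expr,
}

/// Resolver state as described in the comment (names taken from it).
#[derive(Debug, Default)]
struct InputRefResolver {
    scan_column_list: Vec<String>,
    aggr_list: Vec<AggCall>,
}

impl InputRefResolver {
    /// Return the position of `name` in the scan list, appending it if absent.
    fn column_index(&mut self, name: &str) -> usize {
        match self.scan_column_list.iter().position(|c| c == name) {
            Some(idx) => idx,
            None => {
                self.scan_column_list.push(name.to_string());
                self.scan_column_list.len() - 1
            }
        }
    }

    /// Walk the tree and replace every `ColumnRef` with an `InputRef`.
    fn resolve(&mut self, expr: &Expr) -> Expr {
        match expr {
            Expr::ColumnRef(name) => Expr::InputRef(self.column_index(name)),
            Expr::InputRef(idx) => Expr::InputRef(*idx),
            Expr::Add(l, r) => {
                let l = self.resolve(l);
                let r = self.resolve(r);
                Expr::Add(Box::new(l), Box::new(r))
            }
        }
    }

    /// Resolve the argument of an aggregate, record the call, and return its
    /// position in `aggr_list`.
    fn add_agg_call(&mut self, func: &'static str, arg: &Expr) -> usize {
        let arg = self.resolve(arg);
        self.aggr_list.push(AggCall { func, arg });
        self.aggr_list.len() - 1
    }
}
```

With this shape, resolving `sum(v1+v2)` and then `sum(v1+v3)` would share the slot for `v1`, so the scan list grows only when a new column is seen.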

Anyway, after fixing the _agg_calls comment, this PR can be merged. Let's discuss how to do this better later (or maybe just leave it here and refactor it when we need to).

Good work!

src/binder/expression/mod.rs

skyzh · 2021-11-21T09:55:02Z

src/binder/statement/select.rs

        for expr in group_by.iter_mut() {
            self.bind_column_idx_for_expr(&mut expr.kind);
        }

+        let mut has_agg = false;


This has_agg looks fine to me for now, but I think it's better to handle this in the logical plan. From my perspective, has_agg decides whether to use an AggExecutor in the logical plan or the physical plan. Let's leave it here for now and decide where to put it later.

skyzh · 2021-11-21T09:59:01Z

src/physical_planner/input_ref_resolver.rs

+use itertools::Itertools;
+
+/// Transform expr referring to input chunk into `BoundInputRef`
+fn transform_expr(


I have an idea for transform_expr. I think this function should directly output a plan, instead of simply rewriting the expr. The transform_expr function walks through the BoundExpr tree. When it finds an AggCall, it simply turns the child into an AggExecutor and resolves the InputRefs.

... and when resolving InputRefs, we definitely need your prior work: a select_list and an aggregation_list. When we find a column to scan, we add that column to select_list, which will be fed directly into the SeqScanExecutor. For example, if select_list is ["x", "y", "z"], the SeqScanExecutor will take these 3 columns as scan targets. And when we need to do an aggregation, we look up the position of the column. For example, for count(x), we find that "x" is the first element in select_list, so we use InputRef(0) for this agg call.

... and in this case, bindings is select_list, and agg_calls is my aggregation_list.
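
To make the `count(x)` example concrete, here is a rough sketch of producing a plan directly from a select_list. The `Plan` and `AggCall` types and the `plan_aggregation` function are hypothetical and far simpler than the executors in this PR. The sketch only shows the position lookup that yields `InputRef(0)` for `x`.

```rust
/// Hypothetical aggregate call that reads one input column by position.
#[derive(Debug, PartialEq)]
struct AggCall {
    func: String,
    input_ref: usize,
}

/// Hypothetical plan nodes standing in for the scan and aggregation executors.
#[derive(Debug, PartialEq)]
enum Plan {
    SeqScan { columns: Vec<String> },
    Aggregate { agg_calls: Vec<AggCall>, child: Box<Plan> },
}

/// Build `Aggregate <- SeqScan` for a list of `(func, column)` pairs.
/// Returns `None` if an aggregate refers to a column missing from `select_list`.
fn plan_aggregation(select_list: &[&str], aggs: &[(&str, &str)]) -> Option<Plan> {
    let mut agg_calls = Vec::with_capacity(aggs.len());
    for (func, column) in aggs {
        // The position in `select_list` becomes the input ref of the agg call.
        let input_ref = select_list.iter().position(|c| c == column)?;
        agg_calls.push(AggCall {
            func: func.to_string(),
            input_ref,
        });
    }
    Some(Plan::Aggregate {
        agg_calls,
        child: Box::new(Plan::SeqScan {
            columns: select_list.iter().map(|c| c.to_string()).collect(),
        }),
    })
}
```

Calling `plan_aggregation(&["x", "y", "z"], &[("count", "x")])` would give an aggregate whose single call has `input_ref` 0 on top of a scan of the three columns.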

The above might not be a complete idea. We may discuss this later.

src/physical_planner/input_ref_resolver.rs

skyzh · 2021-11-21T10:17:09Z

tests/sql/aggregation.slt

----
-24 25.5
+# query IR
+# select sum(v1+v2),sum(v1+v3) from t


Just curious why this statement would fail?

I see, it's a cast error. Will fix this later.

skyzh · 2021-11-21T10:24:46Z

@MingjiHan99 would you please take a closer look and see if this looks good to you?

wangrunji0408

LGTM!

Btw, as Mr. Chi mentioned, I also feel that the InputRefResolver can be refactored into a PlanRewriter, where the Vec<ColumnRefId> can be a state of the resolver. It seems that DuckDB is in this style.
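
As a rough illustration of the rewriter style, here is a sketch in which the bindings live in the resolver as state. Everything in it is an assumption: `ColumnRefId` is a stand-in struct, the `Rewriter` trait and its hooks are invented for illustration, and it only rewrites expressions rather than whole plans. It is not DuckDB's design or this project's API.

```rust
/// Hypothetical stand-in for the `ColumnRefId` mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ColumnRefId {
    table_id: u32,
    column_id: u32,
}

/// Hypothetical, simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(ColumnRefId),
    InputRef(usize),
    Not(Box<Expr>),
}

/// A rewriter visits every node; implementors override only the hooks they need.
trait Rewriter {
    /// Hook for column references; the default leaves them unchanged.
    fn rewrite_column_ref(&mut self, id: ColumnRefId) -> Expr {
        Expr::ColumnRef(id)
    }

    /// Default traversal: recurse into children and dispatch to the hooks.
    fn rewrite(&mut self, expr: Expr) -> Expr {
        match expr {
            Expr::ColumnRef(id) => self.rewrite_column_ref(id),
            Expr::Not(inner) => Expr::Not(Box::new(self.rewrite(*inner))),
            other => other,
        }
    }
}

/// The resolver keeps the bindings as state instead of taking them as arguments.
struct InputRefResolver {
    bindings: Vec<ColumnRefId>,
}

impl Rewriter for InputRefResolver {
    fn rewrite_column_ref(&mut self, id: ColumnRefId) -> Expr {
        match self.bindings.iter().position(|b| *b == id) {
            Some(idx) => Expr::InputRef(idx),
            // Unbound columns are left untouched in this sketch.
            None => Expr::ColumnRef(id),
        }
    }
}
```

The appeal of this split is that the traversal is written once in the trait, while each pass supplies only its own state and hooks.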

src/logical_planner/select.rs

src/physical_planner/input_ref_resolver.rs

MingjiHan99 · 2021-11-21T19:35:07Z

LGTM.
With the new input ref resolver, we can support more complicated cases including subqueries in the future. @skyzh @wangrunji0408 @pleiadesian

feat: add input ref resolver

7d6507d

a9QrX3Lu requested review from skyzh, MingjiHan99 and wangrunji0408 November 21, 2021 09:44

a9QrX3Lu linked an issue Nov 21, 2021 that may be closed by this pull request

planner: convert ColumnRef to InputRef in physical planner #119

Closed

skyzh approved these changes Nov 21, 2021

minor change

109db86

wangrunji0408 approved these changes Nov 21, 2021

skyzh merged commit 87d3512 into main Nov 22, 2021

skyzh deleted the wzl-inputref branch November 22, 2021 00:51
