Move error check from pipeline fixer to pipeline checker #5938

mustafasrepo · 2023-04-10T07:01:49Z

Which issue does this PR close?

Closes #.

Rationale for this change

Sometimes PipelineFixer receives executors that cannot run on unbounded data. Currently, we generate an error in these cases. However, during later optimization stages these pipeline-breaking executors may be removed or changed. Hence, PipelineFixer shouldn't make premature decisions.

For instance, if a window expression is in the form
SUM(inc_col) OVER(ORDER BY ts DESC ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING) as sum1
it would be calculated with WindowAggExec. However, it can be converted to the equivalent form
SUM(inc_col) OVER(ORDER BY ts ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) as sum1
The second form can be calculated with BoundedWindowAggExec without breaking the pipeline (given that its input is already ordered by ts ASC).
If the source is unbounded, we may convert the expression from the first form to the second during optimization, and the second form does not break the pipeline.
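
To illustrate why the two forms are equivalent, here is a minimal sketch in plain Rust that evaluates both frames over a small in-memory column. It does not use DataFusion; the helpers `window_sum`, `form1_desc`, and `form2_asc` are hypothetical names made up for this example, and it assumes a single partition with unique ts values.

```rust
/// Sums `values` over a ROWS frame around each row, clamped to the partition.
/// `preceding` / `following` are row offsets; `None` means UNBOUNDED.
fn window_sum(values: &[i64], preceding: Option<usize>, following: Option<usize>) -> Vec<i64> {
    let n = values.len();
    (0..n)
        .map(|i| {
            let lo = match preceding {
                Some(p) => i.saturating_sub(p),
                None => 0,
            };
            let hi = match following {
                Some(f) => (i + f).min(n - 1),
                None => n - 1,
            };
            values[lo..=hi].iter().sum()
        })
        .collect()
}

/// Form 1: ORDER BY ts DESC ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING.
/// Input is inc_col in ascending ts order; the result is returned in that same order.
fn form1_desc(inc_col_by_ts_asc: &[i64]) -> Vec<i64> {
    let desc: Vec<i64> = inc_col_by_ts_asc.iter().rev().copied().collect();
    let mut out = window_sum(&desc, Some(1), None);
    out.reverse(); // realign with ascending ts order for comparison
    out
}

/// Form 2: ORDER BY ts ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING.
fn form2_asc(inc_col_by_ts_asc: &[i64]) -> Vec<i64> {
    window_sum(inc_col_by_ts_asc, None, Some(1))
}

fn main() {
    let inc_col = vec![1, 2, 3, 4];
    let form1 = form1_desc(&inc_col);
    let form2 = form2_asc(&inc_col);
    assert_eq!(form1, form2);
    println!("{:?}", form2);
}
```

Form 2 only ever looks one row ahead in ascending ts order, which is why it can be evaluated incrementally on an input that is already sorted by ts ASC, whereas form 1 needs the end of the input before it can emit its first row.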

What changes are included in this PR?

With this PR, PipelineFixer no longer generates an error when some executor cannot run on unbounded data (those cases may be fixed during optimization). If those executors cannot be fixed, we generate the error in PipelineChecker anyway.
Utility code for constructing test tables is moved from window.rs to test_util/mod.rs so that it can be used in other files.
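
The division of responsibility described in the first point can be sketched with a toy model. This is not DataFusion's code: the `Op` enum, the `reversible` flag, and the functions `pipeline_fixer`, `pipeline_checker`, and `optimize` are hypothetical stand-ins, and a plan is simplified to a flat list of operators.

```rust
// Toy model only: these types and functions are hypothetical, not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    UnboundedSource,
    // Window whose frame needs all input before it can emit anything.
    WindowAgg { reversible: bool },
    // Streaming-friendly window operator.
    BoundedWindowAgg,
}

// Rewrites what it can and leaves the rest untouched; it never returns an error.
fn pipeline_fixer(plan: Vec<Op>) -> Vec<Op> {
    plan.into_iter()
        .map(|op| match op {
            Op::WindowAgg { reversible: true } => Op::BoundedWindowAgg,
            other => other,
        })
        .collect()
}

// Final rule: rejects the plan if a pipeline-breaking operator still sits on unbounded input.
fn pipeline_checker(plan: &[Op]) -> Result<(), String> {
    let unbounded = plan.contains(&Op::UnboundedSource);
    let breaking = plan.iter().any(|op| matches!(op, Op::WindowAgg { .. }));
    if unbounded && breaking {
        Err("WindowAggExec cannot run on unbounded input".to_string())
    } else {
        Ok(())
    }
}

// Runs the fixer first and the checker last; only the checker may fail.
fn optimize(plan: Vec<Op>) -> Result<Vec<Op>, String> {
    let fixed = pipeline_fixer(plan);
    pipeline_checker(&fixed)?;
    Ok(fixed)
}

fn main() {
    let fixable = vec![Op::UnboundedSource, Op::WindowAgg { reversible: true }];
    let stuck = vec![Op::UnboundedSource, Op::WindowAgg { reversible: false }];
    println!("{:?}", optimize(fixable));
    println!("{:?}", optimize(stuck));
}
```

In this sketch a plan that the fixer can repair passes the checker, while a plan it cannot repair is still rejected, only later.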

Are these changes tested?

Yes, test_source_sorted_unbounded_source is added to check this behaviour.

Are there any user-facing changes?


comphead · 2023-04-10T16:37:03Z

datafusion/core/src/physical_optimizer/pipeline_checker.rs

                  LIMIT 5".to_string(),
            cases: vec![Arc::new(test1), Arc::new(test2)],
-            error_operator: "Window Error".to_string()
+            error_operator: "Sort Error".to_string()


LGTM, thanks for taking this. Why did the error message change to Sort Error? I don't see the respective logic change.

ozankabak · 2023-04-10

I think this happens because, when the task of erroring out is left to PipelineChecker, which is the last rule, sorts are already in place. While checking, the algorithm first encounters the sort (before the window), which is pipeline-breaking. Hence the error message.
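
A rough sketch of that explanation, under stated assumptions: the plan is modelled as a list running from the source up to the root, a hypothetical `add_required_sorts` step stands in for whichever rule puts sorts in place, and the hypothetical `check` function reports the first pipeline-breaking operator it meets walking up from the source. None of these names come from DataFusion, and the actual traversal order there is not confirmed by this thread.

```rust
// Toy model only: names are hypothetical and do not mirror DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    UnboundedSource,
    Sort,
    Window,
}

impl Op {
    // Both operators need their whole input before producing output.
    fn breaks_pipeline(self) -> bool {
        matches!(self, Op::Sort | Op::Window)
    }

    fn label(self) -> &'static str {
        match self {
            Op::UnboundedSource => "Source",
            Op::Sort => "Sort",
            Op::Window => "Window",
        }
    }
}

// Stand-in for the rule that puts sorts in place: adds a Sort directly below each Window.
fn add_required_sorts(plan: &[Op]) -> Vec<Op> {
    let mut out = Vec::new();
    for &op in plan {
        if op == Op::Window {
            out.push(Op::Sort);
        }
        out.push(op);
    }
    out
}

// Reports the first pipeline-breaking operator met while walking up from the source.
fn check(plan: &[Op]) -> Result<(), String> {
    match plan.iter().find(|op| op.breaks_pipeline()) {
        Some(op) => Err(format!("{} Error", op.label())),
        None => Ok(()),
    }
}

fn main() {
    let plan = [Op::UnboundedSource, Op::Window];
    // Checking early, before sorts exist, blames the window.
    println!("{:?}", check(&plan));
    // Checking last, after sorts are in place, blames the sort.
    println!("{:?}", check(&add_required_sorts(&plan)));
}
```

Under these assumptions the same query yields "Window Error" when checked before the sort is added and "Sort Error" when checked afterwards, which matches the change to the expected error_operator in the test.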

alamb · 2023-04-11T20:23:21Z

Thank you @mustafasrepo @ozankabak and @comphead for the review

* Add window reversal sub rule to pipeline fixer.
* update test
* Remove an unnecessary clone, avoid object construction by using mutable binding
* Propagate error to pipeline checker

---------

Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>

mustafasrepo and others added 3 commits April 7, 2023 11:36

Add window reversal sub rule to pipeline fixer.

7df979a

update test

de94064

Remove an unnecessary clone, avoid object construction by using mutable binding

1a921be

github-actions bot added the core Core DataFusion crate label Apr 10, 2023

mustafasrepo marked this pull request as draft April 10, 2023 13:38

Propagate error to pipeline checker

7ec0bbd

mustafasrepo changed the title ~~Add window reversal sub rule to pipeline fixer.~~ Move error check from pipeline fixer to pipeline checker Apr 10, 2023

mustafasrepo marked this pull request as ready for review April 10, 2023 14:54

comphead approved these changes Apr 10, 2023


alamb merged commit a6dcc2d into apache:main Apr 11, 2023

mustafasrepo deleted the feature/window_swap_rule branch April 26, 2023 07:12
