-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Consolidate EliminateNestedUnion and EliminateOneUnion optimizer rules'
#18678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2dda8cd to
23ec0b6
Compare
| @@ -420,4 +423,28 @@ mod tests { | |||
| TableScan: table_1 | |||
| ") | |||
| } | |||
|
|
|||
| #[test] | |||
| fn eliminate_one_union() -> Result<()> { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
single input unions can't be created easily, so this test looks a little different
|
|
||
| #[test] | ||
| fn eliminate_nothing() -> Result<()> { | ||
| let plan_builder = table_scan(Some("table"), &schema(), None)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is already the same test in datafusion/optimizer/src/eliminate_nested_union.rs
| } | ||
|
|
||
| #[test] | ||
| fn eliminate_one_union() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ported this test over
23ec0b6 to
70b34d9
Compare
| @@ -55,6 +55,9 @@ impl OptimizerRule for EliminateNestedUnion { | |||
| _config: &dyn OptimizerConfig, | |||
| ) -> Result<Transformed<LogicalPlan>> { | |||
| match plan { | |||
| LogicalPlan::Union(Union { mut inputs, .. }) if inputs.len() == 1 => Ok( | |||
| Transformed::yes(Arc::unwrap_or_clone(inputs.pop().unwrap())), | |||
| ), | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's it? Nice find, will be curious if this shows up in planning benchmark
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
| Arc::new(EliminateCrossJoin::new()), | ||
| Arc::new(EliminateLimit::new()), | ||
| Arc::new(PropagateEmptyRelation::new()), | ||
| // Must be after PropagateEmptyRelation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this comment was added with EliminateOneUnion so need to remove it, reference: 704e034#diff-42d66905c3fa6b245c3493fb4df4816e0fde3036941315ab42198fb16f3907df
(Any impact that the logic is now before PropagateEmptyRelation? Or running multiple passes of optimizer makes order redundant anyway?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Must be after PropagateEmptyRelation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all tests pass with OptimizeUnions in its current location I think we can just remove this comment as stale. I have done so in 4ed0786
| @@ -55,6 +55,9 @@ impl OptimizerRule for EliminateNestedUnion { | |||
| _config: &dyn OptimizerConfig, | |||
| ) -> Result<Transformed<LogicalPlan>> { | |||
| match plan { | |||
| LogicalPlan::Union(Union { mut inputs, .. }) if inputs.len() == 1 => Ok( | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a minor comment here, considering its not exactly related to nested unions?
(Or could rename away from EliminateNestedUnions to something like SimplifyUnions to be all encompassing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Or could rename away from EliminateNestedUnions to something like SimplifyUnions to be all encompassing)
I like this idea -- done in 9041a17
|
I find it odd that the perf slightly decreased in some benchmarks. I'll try and reproduce it today locally. |
i think looking at the two runs (of the planning benchmark) I would say that there is no noticable performance difference (kind of as expected). I will rerun to see if we can see a repeating pattern emerge |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
I'm going to go with the benchmarks being within a range of noise. The first 10 or so do seem to be slightly higher on average though that could just be because they were the first to run. |
| // Backwards compatibility name | ||
| #[deprecated(since = "52.0.0", note = "Please use OptimizeUnions instead")] | ||
| pub type EliminateNestedUnion = OptimizeUnions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to consider the mod too? Since it'll now be optimize_unions::EliminateNestedUnion which is breaking anyway 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point -- I will update
|
I got ambitious (it happens once every few moons 🤣) and ran the benchmark locally just to see if I saw the same uptick in the first few benchmarks as @alamb's results seem to have. Looks like my machine has about +/- 3-5% overall which is good to know. Details
|
Which issue does this PR close?
Rationale for this change
Each time a LogicalPlan is rewritten to eliminate a Union, we traverse the entire plan tree and copy some non trivial parts of it
Thus it is faster to plan when we have fewer passes over the plan tree
the EliminateNestedUnion and EliminateOneUnion rules both do similar things, and
the EliminateNestedUnion rule is very simple. So let's combine them into a
single rule that does both things in one pass over the plan tree.
What changes are included in this PR?
Consolidate
EliminateNestedUnionandEliminateOneUnionoptimizer rules into a single passAre these changes tested?
Yes with existing tests
I will also run planning benchmarks
Are there any user-facing changes?
No