Refactor `Optimizer` to use owned plans and `TreeNode` API (10% faster planning) #9948

alamb · 2024-04-04T13:06:46Z

Note: this looks like a large PR, but many of the changed lines simply remove a `&`

Which issue does this PR close?

Part of #9637 (stop copying LogicalPlans in the Optimizer) and #8913 (unified TreeNode rewrite)

Rationale for this change

The current structure of the Optimizer copies LogicalPlans a large number of times. This is slow and requires a large number of allocations.

After #9999, the TreeNode API can rewrite a LogicalPlan efficiently without cloning it.

Thus it makes sense to use the TreeNode API in the optimizer, both because I think the code is simpler and because it takes advantage of the performance improvements in the TreeNode API.

What changes are included in this PR?

Refactor Optimizer to use TreeNode API
Change Optimizer::optimize to take an owned LogicalPlan rather than force a copy
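
To illustrate why taking an owned plan avoids copies, here is a minimal sketch on a toy tree. The `Plan` enum and both `remove_limits_*` functions are hypothetical stand-ins invented for this example; they are not DataFusion's `LogicalPlan` or any real optimizer rule. The borrowed version has to clone every string it keeps, while the owned version moves them.

```rust
// Toy stand-in for a logical plan; NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: String, input: Box<Plan> },
    Limit { n: usize, input: Box<Plan> },
}

/// Borrowed style: building a new owned tree from `&Plan` forces a copy
/// of every node (and every string) that survives the rewrite.
fn remove_limits_borrowed(plan: &Plan) -> Plan {
    match plan {
        Plan::Scan(name) => Plan::Scan(name.clone()),
        Plan::Filter { predicate, input } => Plan::Filter {
            predicate: predicate.clone(),
            input: Box::new(remove_limits_borrowed(input)),
        },
        // Drop the limit node and keep rewriting below it.
        Plan::Limit { input, .. } => remove_limits_borrowed(input),
    }
}

/// Owned style: the predicate string and the subtrees are moved into the
/// result instead of being cloned.
fn remove_limits_owned(plan: Plan) -> Plan {
    match plan {
        // `_` does not bind, so `plan` is still whole and can be returned as is.
        Plan::Scan(_) => plan,
        Plan::Filter { predicate, input } => Plan::Filter {
            predicate,
            input: Box::new(remove_limits_owned(*input)),
        },
        Plan::Limit { input, .. } => remove_limits_owned(*input),
    }
}
```

Both functions produce the same tree; only the owned one does so without calling `clone`.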

Are these changes tested?

By existing CI

Performance benchmarks: Planning is 10% faster for TPCH, 13% faster for TPCDS

Details

++ critcmp main optimizer_tree_node2
group                                         main                                   optimizer_tree_node2
-----                                         ----                                   --------------------
logical_aggregate_with_join                   1.00  1177.8±21.23µs        ? ?/sec    1.00  1176.3±12.15µs        ? ?/sec
logical_plan_tpcds_all                        1.01    154.2±0.85ms        ? ?/sec    1.00    153.1±0.77ms        ? ?/sec
logical_plan_tpch_all                         1.01     16.5±0.14ms        ? ?/sec    1.00     16.4±0.16ms        ? ?/sec
logical_select_all_from_1000                  1.06     19.3±0.15ms        ? ?/sec    1.00     18.2±0.18ms        ? ?/sec
logical_select_one_from_700                   1.00   779.6±21.54µs        ? ?/sec    1.01   785.8±15.37µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   726.7±12.27µs        ? ?/sec    1.00   728.4±12.68µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00    710.9±6.84µs        ? ?/sec    1.00    714.1±6.52µs        ? ?/sec
physical_plan_tpcds_all                       1.13   1834.3±2.48ms        ? ?/sec    1.00   1624.6±2.37ms        ? ?/sec
physical_plan_tpch_all                        1.10    119.2±0.60ms        ? ?/sec    1.00    107.9±0.56ms        ? ?/sec
physical_plan_tpch_q1                         1.19      7.3±0.06ms        ? ?/sec    1.00      6.1±0.04ms        ? ?/sec
physical_plan_tpch_q10                        1.11      5.5±0.05ms        ? ?/sec    1.00      5.0±0.03ms        ? ?/sec
physical_plan_tpch_q11                        1.10      4.8±0.02ms        ? ?/sec    1.00      4.4±0.02ms        ? ?/sec
physical_plan_tpch_q12                        1.09      3.9±0.02ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
physical_plan_tpch_q13                        1.09      2.6±0.02ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
physical_plan_tpch_q14                        1.09      3.3±0.03ms        ? ?/sec    1.00      3.1±0.02ms        ? ?/sec
physical_plan_tpch_q16                        1.11      4.9±0.05ms        ? ?/sec    1.00      4.4±0.02ms        ? ?/sec
physical_plan_tpch_q17                        1.11      4.6±0.03ms        ? ?/sec    1.00      4.2±0.03ms        ? ?/sec
physical_plan_tpch_q18                        1.10      5.0±0.03ms        ? ?/sec    1.00      4.5±0.02ms        ? ?/sec
physical_plan_tpch_q19                        1.06      9.4±0.07ms        ? ?/sec    1.00      8.8±0.04ms        ? ?/sec
physical_plan_tpch_q2                         1.11     10.5±0.04ms        ? ?/sec    1.00      9.4±0.03ms        ? ?/sec
physical_plan_tpch_q20                        1.12      6.1±0.05ms        ? ?/sec    1.00      5.4±0.03ms        ? ?/sec
physical_plan_tpch_q21                        1.12      8.3±0.07ms        ? ?/sec    1.00      7.4±0.03ms        ? ?/sec
physical_plan_tpch_q22                        1.12      4.4±0.02ms        ? ?/sec    1.00      3.9±0.06ms        ? ?/sec
physical_plan_tpch_q3                         1.09      3.9±0.02ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
physical_plan_tpch_q4                         1.12      2.9±0.01ms        ? ?/sec    1.00      2.6±0.01ms        ? ?/sec
physical_plan_tpch_q5                         1.08      5.6±0.02ms        ? ?/sec    1.00      5.2±0.05ms        ? ?/sec
physical_plan_tpch_q6                         1.07  1981.4±12.85µs        ? ?/sec    1.00  1858.7±85.62µs        ? ?/sec
physical_plan_tpch_q7                         1.10      7.5±0.05ms        ? ?/sec    1.00      6.8±0.05ms        ? ?/sec
physical_plan_tpch_q8                         1.10      9.5±0.08ms        ? ?/sec    1.00      8.7±0.04ms        ? ?/sec
physical_plan_tpch_q9                         1.10      7.2±0.04ms        ? ?/sec    1.00      6.5±0.04ms        ? ?/sec
physical_select_all_from_1000                 1.19    128.1±0.29ms        ? ?/sec    1.00    107.4±0.41ms        ? ?/sec
physical_select_one_from_700                  1.02      4.0±0.03ms        ? ?/sec    1.00      3.9±0.05ms        ? ?/sec

Are there any user-facing changes?

There is a small API change: Optimizer::optimize now takes an owned LogicalPlan rather than a reference (which forced a copy)
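
For callers, the practical effect is that a copy becomes an explicit choice. The sketch below uses a hypothetical, heavily simplified `Optimizer` and `Plan` (the real `optimize` also takes a config and an observer, as the diffs in this PR show); it only illustrates that a caller who still needs the input plan afterwards now writes the `clone()` themselves.

```rust
// Hypothetical toy types, assumed for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    steps: Vec<String>,
}

struct Optimizer;

impl Optimizer {
    /// Simplified signature: takes the plan by value and returns the
    /// optimized plan. Here "optimizing" just drops no-op steps in place.
    fn optimize(&self, mut plan: Plan) -> Result<Plan, String> {
        plan.steps.retain(|s| s != "noop");
        Ok(plan)
    }
}

/// A caller that wants to keep the un-optimized plan clones it explicitly.
/// Callers that do not need it can pass the plan directly and pay nothing.
fn optimize_keeping_original() -> Result<(Plan, Plan), String> {
    let optimizer = Optimizer;
    let analyzed = Plan {
        steps: vec!["scan".to_string(), "noop".to_string(), "filter".to_string()],
    };
    let optimized = optimizer.optimize(analyzed.clone())?;
    Ok((analyzed, optimized))
}
```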

Planned follow on task

Add special cases / rewrite other optimizer passes to reduce copies

alamb · 2024-04-04T13:19:45Z

datafusion-examples/examples/rewrite_expr.rs

@@ -59,7 +59,7 @@ pub fn main() -> Result<()> {

    // then run the optimizer with our custom rule
    let optimizer = Optimizer::with_rules(vec![Arc::new(MyOptimizerRule {})]);
-    let optimized_plan = optimizer.optimize(&analyzed_plan, &config, observe)?;
+    let optimized_plan = optimizer.optimize(analyzed_plan, &config, observe)?;


This illustrates the API change -- the optimizer now takes an owned plan rather than a reference

Great progress!

alamb · 2024-04-04T13:20:05Z

datafusion/core/tests/optimizer_integration.rs

@@ -110,7 +110,7 @@ fn test_sql(sql: &str) -> Result<LogicalPlan> {
    let optimizer = Optimizer::new();
    // analyze and optimize the logical plan
    let plan = analyzer.execute_and_check(&plan, config.options(), |_, _| {})?;
-    optimizer.optimize(&plan, &config, |_, _| {})
+    optimizer.optimize(plan, &config, |_, _| {})


A large part of this PR is changes to tests so that they pass in an owned plan

alamb · 2024-04-04T13:20:55Z

datafusion/optimizer/src/eliminate_limit.rs


        let formatted_plan = format!("{optimized_plan:?}");
        assert_eq!(formatted_plan, expected);
-        assert_eq!(plan.schema(), optimized_plan.schema());


I changed the tests to call Optimizer::optimize directly, which already checks that the schema doesn't change, so this assertion is redundant

This applies to several other changes in this PR
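
As a rough sketch of what such a schema check looks like, assuming a toy `Plan` with a list of column names as its schema: the `apply_checked` helper and the error text are hypothetical and are not the optimizer's actual implementation or message format.

```rust
// Hypothetical toy plan: the schema is just a list of column names.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    schema: Vec<String>,
    description: String,
}

/// Apply `rule` to an owned plan and fail if the rule changed the schema.
/// The original schema is captured before the plan is moved into the rule.
fn apply_checked(
    rule_name: &str,
    rule: impl Fn(Plan) -> Plan,
    plan: Plan,
) -> Result<Plan, String> {
    let original_schema = plan.schema.clone();
    let new_plan = rule(plan);
    if new_plan.schema != original_schema {
        return Err(format!(
            "Optimizer rule '{rule_name}' failed: original schema {original_schema:?}, new schema {:?}",
            new_plan.schema
        ));
    }
    Ok(new_plan)
}
```

With a check like this inside the optimizer entry point, a per-test `assert_eq!` on the schemas adds nothing.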

alamb · 2024-04-04T13:35:41Z

datafusion/optimizer/src/optimizer.rs

 ///
-/// Notice: **sometime** result after optimize still can be optimized, we need apply again.


I do not think this comment is applicable anymore -- the optimizer handles the recursion internally as well as applying multiple optimizer passes

alamb · 2024-04-04T13:37:25Z

datafusion/optimizer/src/optimizer.rs

@@ -356,97 +423,22 @@ impl Optimizer {
        debug!("Optimizer took {} ms", start_time.elapsed().as_millis());
        Ok(new_plan)
    }
-
-    fn optimize_node(


This code implemented plan recursion within the optimizer and is (now) redundant with the TreeNode API

alamb · 2024-04-04T13:38:07Z

datafusion/optimizer/src/optimizer.rs

-                Field { name: \"c\", data_type: UInt32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, \
-                field_qualifiers: [Some(Bare { table: \"test\" }), Some(Bare { table: \"test\" }), Some(Bare { table: \"test\" })], \
-                functional_dependencies: FunctionalDependencies { deps: [] } }, \
+            "Optimizer rule 'get table_scan rule' failed\n\


The original error was actually incorrect: it reported the schemas reversed (the "new schema" was actually the original schema)

alamb · 2024-04-09T12:42:08Z

datafusion/optimizer/src/optimizer.rs

+}
+
+/// Recursively rewrites LogicalPlans
+struct Rewriter<'a> {


datafusion/optimizer/src/optimizer.rs has all the important changes (to use the TreeNode API and stop copying)

alamb · 2024-04-09T12:43:36Z

datafusion/optimizer/src/push_down_projection.rs

-                &OptimizerContext::new(),
-            )?
-            .unwrap_or_else(|| plan.clone());
+        let optimized_plan =


Since the optimizer correctly applies rules recursively now, there is no need to explicitly call optimize recursively

alamb · 2024-04-09T12:44:38Z

datafusion/optimizer/src/test/mod.rs

-        )?
-        .unwrap_or_else(|| plan.clone());
+    // Apply the rule once
+    let opt_context = OptimizerContext::new().with_max_passes(1);


Without this some of the tests don't pass. By default the optimizer runs a few times until no changes are detected. Limiting to 1 pass mimics the previous test behavior

I'm not sure about this. It seems that the previous code did not limit the number of passes to 1. Why do we need to limit it now to get the same behavior as before? 😕

My explanation is this: the loop that applies the rule more than once calls optimize_recursively each time

https://github.com/apache/arrow-datafusion/blob/75c399ce7d4d5360140c64089dd7b05ffd7c49ef/datafusion/optimizer/src/optimizer.rs#L298-L303

This test only called optimize_recursively once (directly), and thus the OptimizerRule was only applied once

When I rewrote the test to use Optimizer::optimize, the loop now kicks in, so the OptimizerRule will be run several times unless we set with_max_passes

This same reasoning applies to the other tests, but apparently they get the same answer when applied more than once
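
The difference between one pass and "run until nothing changes" can be sketched with a toy rule that only makes one step of progress per application. Everything here (`Plan`, `merge_one_projection`, and this `optimize` function) is hypothetical and only mimics the shape of the pass loop; the clone used to detect a change is a shortcut for the toy, not a claim about how the real optimizer detects it.

```rust
// Hypothetical toy plan: a stack of redundant projections over a scan.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    stacked_projections: usize,
}

/// Toy rule: merges at most one pair of adjacent projections per application.
fn merge_one_projection(plan: Plan) -> Plan {
    if plan.stacked_projections > 1 {
        Plan {
            stacked_projections: plan.stacked_projections - 1,
        }
    } else {
        plan
    }
}

/// Apply the rule repeatedly, stopping after `max_passes` passes or as soon
/// as a pass leaves the plan unchanged.
fn optimize(mut plan: Plan, max_passes: usize) -> Plan {
    for _ in 0..max_passes {
        let before = plan.clone();
        plan = merge_one_projection(plan);
        if plan == before {
            break;
        }
    }
    plan
}
```

With `max_passes` set to 1 the rule is applied exactly once, which is what a test asserting on the output of a single application expects; with a larger limit the same input converges further and such a test would see a different plan.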

alamb · 2024-04-09T14:11:28Z

@jackwener since you implemented some of the original optimizer recursion I wonder if you would have some time to review this PR

jayzhan211

It looks pretty nice now!! 🚀

jayzhan211 · 2024-04-10T00:52:03Z

datafusion/optimizer/src/optimizer.rs

+
+                let result = match rule.apply_order() {
+                    // optimizer handles recursion
+                    Some(apply_order) => new_plan.rewrite(&mut Rewriter::new(


Would renaming it to rewrite_recursively be more straightforward?

It's a generic API (so using rewrite makes sense to me), and it's not introduced by this PR

I have a question here.
In the previous code, optimize_inputs got plan.inputs(). How does rewrite get plan.inputs() here? How do the child nodes in rewrite correspond to plan.inputs()?

Oh, I see the map_children

Yes, exactly

rewrite : https://github.com/apache/arrow-datafusion/blob/cb21404bd3736ff9a6d8d443a67c64ece4c551a9/datafusion/common/src/tree_node.rs#L117-L124

(eventually) calls map_children: https://github.com/apache/arrow-datafusion/blob/cb21404bd3736ff9a6d8d443a67c64ece4c551a9/datafusion/common/src/tree_node.rs#L27-L31

This is the beauty of these TreeNode APIs and @peter-toth's plan to make them all consistent.
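
The relationship discussed above, where a generic `rewrite` reaches the inputs of a plan through `map_children`, can be sketched with a toy trait. `ToyTreeNode`, `Plan`, and `uppercase_scans` are hypothetical names made up for this example; the real TreeNode API is richer (this sketch leaves out error handling and any tracking of whether a node changed).

```rust
// Toy stand-in for a logical plan; NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: String, input: Box<Plan> },
    Join { left: Box<Plan>, right: Box<Plan> },
}

trait ToyTreeNode: Sized {
    /// Apply `f` to each direct child, consuming this node and rebuilding it.
    /// This is the only method that knows what a node's children are.
    fn map_children<F: FnMut(Self) -> Self>(self, f: F) -> Self;

    /// Generic bottom-up rewrite: recurse into the children via
    /// `map_children`, then apply `f` to the rebuilt node.
    fn rewrite<F: FnMut(Self) -> Self>(self, f: &mut F) -> Self {
        let with_new_children = self.map_children(|child| child.rewrite(&mut *f));
        f(with_new_children)
    }
}

impl ToyTreeNode for Plan {
    fn map_children<F: FnMut(Self) -> Self>(self, mut f: F) -> Self {
        match self {
            // Leaf: no children, return the node unchanged.
            Plan::Scan(_) => self,
            Plan::Filter { predicate, input } => Plan::Filter {
                predicate,
                input: Box::new(f(*input)),
            },
            Plan::Join { left, right } => Plan::Join {
                left: Box::new(f(*left)),
                right: Box::new(f(*right)),
            },
        }
    }
}

/// Example rule: upper-case every table name, wherever it is in the tree.
fn uppercase_scans(plan: Plan) -> Plan {
    plan.rewrite(&mut |node: Plan| match node {
        Plan::Scan(name) => Plan::Scan(name.to_uppercase()),
        other => other,
    })
}
```

The rule itself never mentions children; the recursion lives entirely in `rewrite` and `map_children`, which is why a hand-written recursion helper in the optimizer becomes redundant.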

jackwener

Nice job, thanks @alamb!

alamb · 2024-04-10T10:24:16Z

Thank you very much @jackwener and @jayzhan211 for the reviews 🙏

mustafasrepo · 2024-04-15T06:05:38Z

Thanks @alamb for this work.

alamb added the api change (Changes the API exposed to users of the crate) label Apr 4, 2024

github-actions bot added the optimizer (Optimizer rules), core (Core datafusion crate), and sqllogictest labels Apr 4, 2024

alamb force-pushed the alamb/optimizer_tree_node2 branch from 613fdae to b950a9e Compare April 4, 2024 13:09

alamb changed the title ~~Rewrite Optimizer to use TreeNode API~~ Refactor Optimizer to use TreeNode API Apr 4, 2024

alamb force-pushed the alamb/optimizer_tree_node2 branch from b950a9e to ef189fb Compare April 4, 2024 13:34

alamb commented Apr 4, 2024

This was referenced Apr 4, 2024

DEMO: Introduce TreeNodeMutator for rewriting TreeNodes in place, change optimizer to rewrite LogicalPlan in place - 10% faster planning time #9780

Closed

Avoid copying (so much) for LogicalPlan::map_children #9946

Closed

alamb force-pushed the alamb/optimizer_tree_node2 branch from ef189fb to d951289 Compare April 4, 2024 19:25

github-actions bot added the logical-expr (Logical plan and expressions) label Apr 4, 2024

alamb mentioned this pull request Apr 4, 2024

Introduce OptimizerRule::rewrite to rewrite in place, rewrite ExprSimplifier (20% faster planning) #9954

Merged

alamb force-pushed the alamb/optimizer_tree_node2 branch from d951289 to a755338 Compare April 4, 2024 20:19

alamb changed the title ~~Refactor Optimizer to use TreeNode API~~ Refactor Optimizer to use TreeNode API (10% faster planning) Apr 5, 2024

alamb force-pushed the alamb/optimizer_tree_node2 branch 2 times, most recently from 52f3a54 to 2d5e154 Compare April 7, 2024 17:50

alamb mentioned this pull request Apr 8, 2024

Avoid LogicalPlan::clone() in LogicalPlan::map_children when possible #9999

Merged

alamb force-pushed the alamb/optimizer_tree_node2 branch from 2d5e154 to a282d6a Compare April 9, 2024 10:00

Rewrite Optimizer to use TreeNode API

a4cb731

alamb force-pushed the alamb/optimizer_tree_node2 branch from a282d6a to a4cb731 Compare April 9, 2024 12:31

github-actions bot removed the logical-expr (Logical plan and expressions) label Apr 9, 2024

alamb changed the title ~~Refactor Optimizer to use TreeNode API (10% faster planning)~~ Refactor Optimizer to use owned plans and TreeNode API (10% faster planning) Apr 9, 2024

alamb commented Apr 9, 2024

alamb marked this pull request as ready for review April 9, 2024 12:45

alamb requested review from jackwener and mustafasrepo April 9, 2024 14:11

alamb mentioned this pull request Apr 9, 2024

[EPIC] Stop copying LogicalPlan during OptimizerPasses #9637

Closed

31 tasks

jayzhan211 approved these changes Apr 10, 2024

jayzhan211 reviewed Apr 10, 2024

jackwener approved these changes Apr 10, 2024

Merge remote-tracking branch 'apache/main' into alamb/optimizer_tree_…

453bfcb

…node2

alamb added 2 commits April 10, 2024 06:26

Merge remote-tracking branch 'apache/main' into alamb/optimizer_tree_…

5846ac6

…node2

fmt

7392b94

alamb merged commit 03d8ba1 into apache:main Apr 10, 2024
24 checks passed

alamb deleted the alamb/optimizer_tree_node2 branch April 10, 2024 13:27
