Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(planner): Refine code #10488

Merged
merged 2 commits into from
Mar 11, 2023
Merged

chore(planner): Refine code #10488

merged 2 commits into from
Mar 11, 2023

Conversation

leiysky
Copy link
Collaborator

@leiysky leiysky commented Mar 10, 2023

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

@vercel
Copy link

vercel bot commented Mar 10, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated
databend ⬜️ Ignored (Inspect) Mar 10, 2023 at 8:13AM (UTC)

@mergify mergify bot added the pr-chore this PR only has small changes that no need to record, like coding styles. label Mar 10, 2023
@leiysky leiysky marked this pull request as ready for review March 11, 2023 04:09
@BohuTANG BohuTANG merged commit 5a7820d into databendlabs:main Mar 11, 2023
@dusx1981
Copy link
Contributor

I did a test, and the rules can be classified and managed by modifying Rule.rs and heuristic.rs, so that all rule applications can be correctly mapped to each operator's own unique rule set. I originally submitted a PR today , but the logic of seeing all rule-related bitmaps was abandoned, so I closed the PR. As there are more and more rules, the impact on efficiency of each full traversal rule will become greater and greater. If necessary, I can resubmit a PR.
Here is the modified code snippet:

pub enum RuleID {
    // Rewrite rules
    // Filter
    NormalizeScalarFilter,
    NormalizeDisjunctiveFilter,
    PushDownFilterAggregate,
    PushDownFilterEvalScalar,
    PushDownFilterUnion,
    PushDownFilterJoin,
    PushDownFilterScan,
    PushDownFilterSort,
    EliminateFilter,
    MergeFilter,

    // EvalScalar
    EliminateEvalScalar,
    MergeEvalScalar,

    // Limit
    PushDownLimitUnion,
    PushDownLimitOuterJoin,
    RulePushDownLimitExpression,
    PushDownLimitSort,
    PushDownLimitAggregate,
    PushDownLimitScan,

    // Agg
    SplitAggregate,
    FoldCountAggregate,

    PushDownPrewhere,

    // Sort
    PushDownSortScan,
pub static DEFAULT_REWRITE_RULES: Lazy<Vec<RuleID>> = Lazy::new(|| {
    vec![
        // Filter
        RuleID::NormalizeDisjunctiveFilter,
        RuleID::NormalizeScalarFilter,
        RuleID::EliminateFilter,
        RuleID::MergeFilter,
        RuleID::PushDownFilterUnion,
        RuleID::PushDownFilterAggregate,
        RuleID::PushDownFilterSort,
        RuleID::PushDownFilterEvalScalar,
        RuleID::PushDownFilterJoin,
        RuleID::PushDownFilterScan,
        // EvalScalar
        RuleID::EliminateEvalScalar,
        RuleID::MergeEvalScalar,
        // Limit
        RuleID::PushDownLimitUnion,
        RuleID::RulePushDownLimitExpression,
        RuleID::PushDownLimitSort,
        RuleID::PushDownLimitAggregate,
        RuleID::PushDownLimitOuterJoin,
        RuleID::PushDownLimitScan,
        // Agg
        RuleID::FoldCountAggregate,
        RuleID::SplitAggregate,
        RuleID::PushDownPrewhere, /* PushDownPrwhere should be after all rules except PushDownFilterScan */
        // Sort
        RuleID::PushDownSortScan, // PushDownFilterScan should be after PushDownPrewhere
    ]
});

@leiysky leiysky deleted the fix-ruleset branch March 13, 2023 06:16
@leiysky
Copy link
Collaborator Author

leiysky commented Mar 13, 2023

@dusx1981 Thanks for your concern.

I removed the logic about operator-specific rule sets because in the current heuristic optimizer, the order of rule application matters(and that's why we use a RuleList instead of RuleSet to store the rewrite rules). RuleSet cannot represent this semantic well and will bring confusion.

@dusx1981
Copy link
Contributor

dusx1981 commented Mar 13, 2023 via email

@dusx1981
Copy link
Contributor

dusx1981 commented Mar 13, 2023

@leiysky

@dusx1981 Thanks for your concern.

I removed the logic about operator-specific rule sets because in the current heuristic optimizer, the order of rule application matters(and that's why we use a RuleList instead of RuleSet to store the rewrite rules). RuleSet cannot represent this semantic well and will bring confusion.

After the rules are classified and managed, it is easier to add new rules later, coupled with the impact of efficiency, isn’t it necessary to optimize the application of its own rules for each operator?

optimized operator

  1. filter
  2. limit
  3. eval scala
  4. Agg
  5. Sort
  6. Scan
  7. ...

@dusx1981
Copy link
Contributor

dusx1981 commented Mar 14, 2023

@leiysky

@dusx1981 Thanks for your concern.

I removed the logic about operator-specific rule sets because in the current heuristic optimizer, the order of rule application matters(and that's why we use a RuleList instead of RuleSet to store the rewrite rules). RuleSet cannot represent this semantic well and will bring confusion.

Optimization of the heuristic optimizer

  1. Filter: Reduce the number of loops from 22 to 10
  2. EvalScalar: Reduce the number of loops from 22 to 2
  3. Limit: Reduce the number of loops from 22 to 6
  4. Agg: Reduce the number of loops from 22 to 2
  5. Sort: Reduce the number of loops from 22 to 1
  6. Scan: Reduce the number of loops from 22 to 0
  7. DumpTableScan: Reduce the number of loops from 22 to 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-chore this PR only has small changes that no need to record, like coding styles.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants