Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: do a better job of optimizing recursive ctes #89954

Open
1 of 6 tasks
DrewKimball opened this issue Oct 13, 2022 · 0 comments
Open
1 of 6 tasks

opt: do a better job of optimizing recursive ctes #89954

DrewKimball opened this issue Oct 13, 2022 · 0 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@DrewKimball
Copy link
Collaborator

DrewKimball commented Oct 13, 2022

#85597 added support for variable-inequality lookup joins in an attempt to improve performance of recursive CTE expressions that are used to model loose-index-scans. However, the optimizer still isn't very smart about taking advantage of the execution changes. For example, the recursive CTE from this blog post executes with a cross join when it should use an ordered lookup join with a limit. This issue tracks items that should be improved in order to ensure good recursive CTE performance without the need for query rewrites.

  • Currently, we only plan a variable-inequality lookup join in the case when the input has one row or there is a join hint. This is because the joinreader logic does not fully de-duplicate lookup spans for variable inequalities, and so performance can be worse than performing a cross join in some cases. We should improve the de-duplication and/or update the cost model so that the 1-row input restriction can be safely removed.
  • opt: improve WithScan properties for recursive CTEs #90725 We do a poor job of calculating recursive CTE properties. We should be able to tell that each iteration only produces one row if both the recursive and anchor branches have a LIMIT 1.
  • We should be able to infer useful properties other than limits for recursive CTEs, like keys and orderings.
  • Our ability to predict statistics for recursive CTEs is very poor. We should be able to recognize common patterns (e.g. loose-index-scans) and make predictions based off of that.
  • We already have a rule that allows planning separate joins for each disjunction in a matched join's ON condition SplitDisjunctionOfJoinTerms. However, this rule does not match when one of the inputs is a WithScan. We should be able to fire this rule with a WithScan as long as one of the inputs has a canonical scan that can potentially be used for lookups.
  • We should review all rules that match Scan operators and consider whether they should be extended to work with WithScan operators as well.

Jira issue: CRDB-20511

@DrewKimball DrewKimball added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Oct 13, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 13, 2022
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Oct 26, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Oct 26, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 9, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 9, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 9, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 9, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 10, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 10, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 10, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 10, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 16, 2022
This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

Informs cockroachdb#89954

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 16, 2022
…e cte

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs cockroachdb#89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.
craig bot pushed a commit that referenced this issue Nov 17, 2022
90725: opt: improve WithScan properties for recursive CTEs r=DrewKimball a=DrewKimball

**opt: update recursive cte scan with limit that applies to both branches**

This commit adds an optimization rule `ApplyLimitToRecursiveCTEScan` that
matches recursive CTE expressions that have bounded cardinality in both the
anchor and recursive branches, then updates the properties of the recursive
scan to reflect this limit. This can allow other rules to fire in the
recursive branch.

**opt: infer a limit that applies to the recursive branch of a recursive cte**

It is common to see cases where we are unable to infer that the recursive
branch of a recursive CTE returns a limited number of rows because the
properties of the recursive branch depend on the properties of the
recursive `WithScan`. This commit adds a new rule called
`TryAddLimitToRecursiveBranch` that matches cases where the anchor branch
has bounded cardinality but not the recursive branch. It then traverses
the recursive branch until it finds a matching `WithScan`, and checks
whether applying the limit from the anchor branch to the `WithScan` would
cause the recursive branch to have the same upper cardinality bound. If
this is the case, `TryAddLimitToRecursiveBranch` re-optimizes the
recursive branch with a limit added on top of it. Doing so can allow
other rules to fire - for example, `ApplyLimitToRecursiveCTEScan`.

Informs #89954

Release note (performance improvement): The optimizer can now better
calculate the properties of recursive CTE expressions in the presence
of a `LIMIT`.

91789: pkg/util/cgroups: fix cgroup memory lookup when using systemd r=abarganier a=srosenberg

As described in the linked issue, most linux distros enable both v1 and v2 cgroups. This means that /proc/[pid]/mountinfo may contain mounts from both versions.

Previously, the detected version was based on the first match found by parsing /proc/[pid]/mountinfo. Thus, if v2 is found first and systemd is using v1, the memory limit will be looked up against v2, thereby missing the limit specified via systemd.

This change returns both versions and their corresponding mounts. The lookups proceed against v2 and fallback to v1, and only report an error then.
We also add a special case for unlimited cgroup memory in order to formulate a more accurate message lest we mislead the user in thinking they have 8 exabytes of RAM.

Fixes: #59236

Release note (bug fix): Fixes a bug pre-v21.1 wherein cgroup memory limit was undetected when using systemd.

91952: startupmigrations: remove dead code r=andreimatei a=andreimatei

A callback was never run since #84881.

Release note: None
Epic: None

92018: colflow: fix EXPLAIN (VEC) in the bundle r=yuzefovich a=yuzefovich

This commit makes it so that all operators in the vectorized tree are considered when printing out `EXPLAIN (VEC)` when the execution stats collection is enabled. Previously, due to how we structured the vectorized stats collector (namely, `batchInfoCollector` simply wrapped the input `Operator`) we would omit the name of that operator in the EXPLAIN output (because `batchInfoCollector.Child` was the `Child` method of the wrapped operator - i.e. we'd effectively skip the wrapped operator). This commit fixes that issue. I believe the impact is very minor, and probably only the stmt bundles are affected, but it still could cause the confusion.

There is no unit test since it's quite annoying to write (we need to inspect the contents of a file in a stmt bundle), and since this is mostly a debugging tool for our team, the manual QA seems sufficient.

Epic: CRDB-14510

Release note: None

Co-authored-by: DrewKimball <drewk@cockroachlabs.com>
Co-authored-by: Stan Rosenberg <stan@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@DrewKimball DrewKimball removed their assignment Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

1 participant