Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove EnforceSingleRowNode support from PlanNodeDecorrelator #19932

Merged
merged 2 commits into from
Dec 5, 2023

Conversation

sopel39
Copy link
Member

@sopel39 sopel39 commented Nov 28, 2023

EnforceSingleRowNode guarantees that exactly single row is produced. Therefore, PlanNodeDecorrelator was incorrectly removing EnforceSingleRowNode even if decorrelated subplan could produce 0 rows.
Additionally, TransformCorrelatedJoinToJoin was completing with TransformCorrelatedScalarSubquery to decorrelate EnforceSingleRowNode.
This changes removes support for EnforceSingleRowNode from PlanNodeDecorrelator and makes TransformCorrelatedScalarSubquery the only rule that (correctly) decorrelates that node.

@sopel39 sopel39 added the no-release-notes This pull request does not require release notes entry label Nov 28, 2023
@cla-bot cla-bot bot added the cla-signed label Nov 28, 2023
@sopel39
Copy link
Member Author

sopel39 commented Nov 28, 2023

This is mostly cleanup, I don't think it's possible to produce wrong plan with this issue.

@sopel39
Copy link
Member Author

sopel39 commented Nov 28, 2023

Fixes #19002

@findepi
Copy link
Member

findepi commented Nov 28, 2023

Fixes #19002

This issue is closed.
Should it be reopened?

@findepi
Copy link
Member

findepi commented Nov 28, 2023

cc @kasiafi @kokosing for decorrelation

@sopel39
Copy link
Member Author

sopel39 commented Nov 28, 2023

This issue is closed.
Should it be reopened?

@findepi this PR is the "proper" fix to the underlying issue (#19002 was not addressing the issue of wrong planning covered by this PR).

However, regression introduced by #19002 was covered in #19922

@@ -124,7 +126,10 @@ public Result apply(CorrelatedJoinNode correlatedJoinNode, Captures captures, Co
correlatedJoinNode.getInput(),
rewrittenSubquery,
correlatedJoinNode.getCorrelation(),
producesSingleRow ? INNER : correlatedJoinNode.getType(),
// EnforceSingleRowNode guarantees that exactly single matching row is produced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that there's an EnforceSingleRowNode, is there? All this rule cares about is whether the subplan is known to produce 0 or 1 row.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is no EnforceSingleRowNode, we exit earlier:

        if (!searchFrom(subquery, context.getLookup())
                .where(EnforceSingleRowNode.class::isInstance)
                .recurseOnlyWhen(ProjectNode.class::isInstance)
                .matches()) {
            return Result.empty();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that. It wasn't immediately visible in the diff. However, that should not be necessary for this rule to work in general, as far as I can tell. We omit that check and rely on the cardinality property of the subplan.

I guess my comment is that in order to future-proof this comment from non-local changes, we could just generalize it to say that the subplan is guaranteed to produce one row (regardless of how that's achieved), so this can be an inner join instead of a left join.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the rule insists on finding EnforceSingleRowNode because it also wants (needs?) to remove it.

Copy link
Member Author

@sopel39 sopel39 Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally for this rule we have two sub-rules like ...WithProjection and WithoutProjection as for example for TransformCorrelatedGlobalAggregationWithoutProjection. However, I don't want to refactor this rule as part of this PR.

I guess my comment is that in order to future-proof this comment from non-local changes, we could just generalize it to say that the subplan is guaranteed to produce one row (regardless of how that's achieved), so this can be an inner join instead of a left join.

I don't think it can be generalized like that in a sense that EnforceSingleRowNode is kind of special node, because:

  1. subplan might produce more than one row and EnforceSingleRowOperator will fail then. That check is preserved in decorrelated plan after TransformCorrelatedScalarSubquery. Such check is not needed while decorrelating via PlanNodeDecorrelator.
  2. subplan might produce 0 rows and EnforceSingleRowOperator will produce null row then. This semantics is also preserved by TransformCorrelatedScalarSubquery (and it's only possible to decorrelate such plan when join itself has no filter).

Overall, single-row cases are already handled by TransformCorrelatedJoinToJoin (and PlanNodeDecorrelator) while this rule is dedicated for decorrelation of EnforceSingleRowNode

EnforceSingleRowNode guarantees that exactly single row
is produced. Therefore, PlanNodeDecorrelator was incorrectly
removing EnforceSingleRowNode even if decorrelated subplan
could produce 0 rows.
Additionally, TransformCorrelatedJoinToJoin was competing
with TransformCorrelatedScalarSubquery to decorrelate
EnforceSingleRowNode.
This changes removes support for EnforceSingleRowNode
from PlanNodeDecorrelator and makes TransformCorrelatedScalarSubquery
the only rule that (correctly) decorrelates that node.
@sopel39 sopel39 merged commit 2ac1996 into trinodb:master Dec 5, 2023
87 of 88 checks passed
@github-actions github-actions bot added this to the 435 milestone Dec 5, 2023
@sopel39 sopel39 deleted the ks/fix_decorrelation branch December 5, 2023 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed no-release-notes This pull request does not require release notes entry
Development

Successfully merging this pull request may close these issues.

None yet

3 participants