[SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations #43727

Closed
wants to merge 1 commit into from

@beliefer commented Nov 9, 2023

What changes were proposed in this pull request?

#43614 made checkAnalysis0 check unreferenced CTE relations.
This PR follows up on #43614 and simplifies the code that checks unreferenced CTE relations.

Why are the changes needed?

Simplify the code that checks unreferenced CTE relations.

Does this PR introduce any user-facing change?

'No'.

How was this patch tested?

Existing test cases.

Was this patch authored or co-authored using generative AI tooling?

'No'.

if (refCount == 0) {
  checkUnreferencedCTERelations(cteMap, visited, cteId)
}
if (refCount == 0) {

@amaliujia commented Nov 10, 2023

I think this misses some corner cases. For example:

WITH
a as (select * from table_exists),
b as (select * from a),
c as (select * from table_non_exists),
d as (select * from c)
SELECT 1

So the code may check only a and b and skip the checks for c and d.
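
To make the concern concrete, here is a minimal, self-contained sketch of the kind of traversal being discussed. It is not the Spark implementation: CteDef, checkUnreferenced, and checkAll are hypothetical names, the check callback merely stands in for checkAnalysis0, and the sketch assumes a relation's ref count includes references from other CTE definitions, which may differ from how Spark counts them.

```scala
import scala.collection.mutable

object UnreferencedCteSketch {
  // Hypothetical stand-in for a CTE definition: a name plus the ids of the
  // CTE relations its body references. This is not Spark's CTERelationDef.
  final case class CteDef(name: String, references: Seq[Long])

  // Checks `cteId` after everything it references, at most once per id.
  // `check` stands in for the real analysis check (checkAnalysis0 in the PR).
  def checkUnreferenced(
      cteMap: Map[Long, CteDef],
      visited: mutable.Set[Long],
      cteId: Long,
      check: CteDef => Unit): Unit = {
    // `add` returns true only the first time an id is seen.
    if (visited.add(cteId)) {
      val cteDef = cteMap(cteId)
      cteDef.references.foreach(id => checkUnreferenced(cteMap, visited, id, check))
      check(cteDef)
    }
  }

  // Single pass over the map: start a traversal from every relation whose
  // ref count is 0 and return the names in the order they were checked.
  def checkAll(cteMap: Map[Long, CteDef], refCounts: Map[Long, Int]): Seq[String] = {
    val visited = mutable.Set.empty[Long]
    val checked = mutable.ArrayBuffer.empty[String]
    cteMap.keys.toSeq.sorted.foreach { id =>
      if (refCounts.getOrElse(id, 0) == 0) {
        checkUnreferenced(cteMap, visited, id, d => checked += d.name)
      }
    }
    checked.toList
  }
}
```

With the four relations from the query above modeled as ids 1 to 4 (b referencing a, d referencing c) and ref counts of 1 for a and c, checkAll starts from b and d and returns a, b, c, d, so both chains are covered and each dependency is checked before the relation that uses it. A test for the real change would still need to confirm that the error for table_non_exists is actually raised.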

cc @cloud-fan

Contributor Author

I will add them.

Contributor

I mean your current change may not be able to handle the proposed case.

Contributor Author

It seems the test case passed.

@beliefer changed the title from "[WIP][SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations" to "[SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations" on Nov 10, 2023

// If a CTE relation is never used, it will disappear after inline. Here we explicitly check
// analysis for it, to make sure the entire query plan is valid.
try {
  // If a CTE relation ref count is 0, the other CTE relations that reference it
  // should also be checked by checkAnalysis0. This code will also guarantee the leaf
  // relations that do not reference any others are checked first.
  val visited: mutable.Map[Long, Boolean] = mutable.Map.empty.withDefaultValue(false)
  cteMap.foreach { case (cteId, _) =>

Contributor

Oh, I didn't realize that we were doing a nested loop before. Good catch!

Contributor

Right, nice catch!

@beliefer closed this in 6851cb9 on Nov 10, 2023

@beliefer
Contributor Author

Merged to master
@cloud-fan @amaliujia Thank you!

@amaliujia
Contributor

Late LGTM. Thank you, @beliefer.
