[SPARK-38285][SQL] Avoid generator pruning for invalid extractor #35749
withTempView("v1") {
  val sqlText =
    """
      |create or replace temp view v1 as
nit. Shall we capitalize the SQL keywords? :)
yea
      |""".stripMargin
  sql(sqlText)

  val df = sql("select eo.b.e from (select explode(o) as eo from v1)")
ditto.
@@ -372,6 +372,13 @@ object GeneratorNestedColumnAliasing {
          e.withNewChildren(Seq(extractor))
        }

        val invalidExtractor = rewrittenG.generator.children.head.collect {
Could you add some comments for this logic?
Sure. I will add some.
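For readers following this thread, here is a minimal sketch of the kind of check a `collect` over the generator's child expression could perform. The diff excerpt shows only the opening of the `collect` call, so the pattern below is an assumption based on the PR description, not the merged code. `Expr`, `Attr`, and the two extractor case classes are toy stand-ins defined here, not Spark's Catalyst classes, and `invalidExtractors` is a hypothetical helper name.

```scala
// Toy stand-ins for Catalyst expressions; these are NOT Spark's real classes.
object InvalidExtractorSketch {
  sealed trait Expr {
    def children: Seq[Expr]

    // Pre-order traversal that gathers every node the partial function matches,
    // loosely mirroring what `collect` does on an expression tree.
    def collect[A](pf: PartialFunction[Expr, A]): Seq[A] =
      pf.lift(this).toList ++ children.flatMap(_.collect(pf))
  }

  final case class Attr(name: String) extends Expr {
    val children: Seq[Expr] = Nil
  }

  final case class GetStructField(child: Expr, field: String) extends Expr {
    val children: Seq[Expr] = Seq(child)
  }

  final case class GetArrayStructFields(child: Expr, field: String) extends Expr {
    val children: Seq[Expr] = Seq(child)
  }

  // Assumed pattern: an array-struct extractor stacked directly on another one,
  // which the PR description calls out as not valid.
  def invalidExtractors(e: Expr): Seq[Expr] = e.collect {
    case g @ GetArrayStructFields(_: GetArrayStructFields, _) => g
  }
}
```

If a check like this finds any match in the rewritten generator, the optimizer can skip pruning for that plan and keep the original generator.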
+1, LGTM. Thank you, @viirya!
Thank you @dongjoon-hyun!
Merging to master/3.2.
### What changes were proposed in this pull request?

This fixes a bug in generator nested column pruning. The bug happens when the extractor pattern is like `GetArrayStructFields(GetStructField(...), ...)` on the generator output. Once the input to the generator is an array, after replacing with the extractor based on pruning logic, it becomes an extractor of `GetArrayStructFields(GetArrayStructFields(...), ...)` which is not valid.

### Why are the changes needed?

To fix a bug in generator nested column pruning.

### Does this PR introduce _any_ user-facing change?

Yes, fixing a user-facing bug.

### How was this patch tested?

Added unit test.

Closes #35749 from viirya/SPARK-38285.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 71991f7)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
Closes apache#35749 from viirya/SPARK-38285.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 71991f7)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 8cee32d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?

This fixes a bug in generator nested column pruning. The bug occurs when the extractor pattern on the generator output has the form `GetArrayStructFields(GetStructField(...), ...)`. When the input to the generator is an array, the pruning logic replaces the inner extractor, and the result is `GetArrayStructFields(GetArrayStructFields(...), ...)`, which is not valid.

### Why are the changes needed?

To fix a bug in generator nested column pruning.

### Does this PR introduce _any_ user-facing change?

Yes, fixing a user-facing bug.

### How was this patch tested?

Added unit test.
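To make the invalid nesting concrete, the sketch below models only the result types of the two extractors and shows one way the stacked form can fail to type-check. Everything in it is an assumption for illustration: the schema `o: array<struct<b: array<struct<e: string>>>>` is hypothetical (the view definition is not shown in this PR excerpt), and `DataType`, `getStructField`, and `getArrayStructFields` are toy definitions, not Spark's API.

```scala
// Toy type model; none of these definitions are Spark's real classes.
object ExtractorTypeSketch {
  sealed trait DataType
  case object StringType extends DataType
  final case class ArrayType(element: DataType) extends DataType
  final case class StructType(fields: Map[String, DataType]) extends DataType

  // GetStructField: struct<..., f: T, ...> => T
  def getStructField(input: DataType, f: String): Option[DataType] = input match {
    case StructType(fields) => fields.get(f)
    case _ => None
  }

  // GetArrayStructFields: array<struct<..., f: T, ...>> => array<T>
  def getArrayStructFields(input: DataType, f: String): Option[DataType] = input match {
    case ArrayType(StructType(fields)) => fields.get(f).map(ArrayType(_))
    case _ => None
  }

  // Hypothetical schema: o: array<struct<b: array<struct<e: string>>>>
  val inner: DataType = StructType(Map("e" -> StringType))
  val eo: DataType = StructType(Map("b" -> ArrayType(inner))) // one exploded element of o
  val o: DataType = ArrayType(eo)

  // On the generator output: GetArrayStructFields(GetStructField(eo, b), e)
  val beforePruning: Option[DataType] =
    getStructField(eo, "b").flatMap(getArrayStructFields(_, "e"))

  // After pushing the extractor onto the array input o:
  // GetArrayStructFields(GetArrayStructFields(o, b), e)
  val afterPruning: Option[DataType] =
    getArrayStructFields(o, "b").flatMap(getArrayStructFields(_, "e"))
}
```

Under this assumed schema, `beforePruning` resolves to an array of strings, while `afterPruning` has no valid type: the inner extractor yields an array of arrays, and the outer one expects an array of structs.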