[SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name #33685

AngersZhuuuu · 2021-08-09T06:35:05Z

What changes were proposed in this pull request?

For added UT, without this patch will failed as below

[info] - SHOW TABLES V2: SPARK-36086: CollapseProject project replace alias should use origin column name *** FAILED *** (4 seconds, 935 milliseconds)
[info]   java.lang.RuntimeException: After applying rule org.apache.spark.sql.catalyst.optimizer.CollapseProject in batch Operator Optimization before Inferring Filters, the structural integrity of the plan is broken.
[info]   at org.apache.spark.sql.errors.QueryExecutionErrors$.structuralIntegrityIsBrokenAfterApplyingRuleError(QueryExecutionErrors.scala:1217)
[info]   at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:229)
[info]   at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
[info]   at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
[info]   at scala.collection.immutable.List.foldLeft(List.scala:91)
[info]   at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
[info]   at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
[info]   at scala.collection.immutable.List.foreach(List.scala:431)
[info]   at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
[info]   at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
[info]   at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)

CollapseProject project replace alias should use origin column name

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added UT

…rigin column name

AngersZhuuuu · 2021-08-09T06:42:44Z

FYI @cloud-fan

SparkQA · 2021-08-09T07:05:45Z

Test build #142212 has started for PR 33685 at commit e06f441.

SparkQA · 2021-08-09T07:56:11Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46725/

SparkQA · 2021-08-09T08:33:08Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46725/

AmplabJenkins · 2021-08-09T08:33:13Z

Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46725/

cloud-fan · 2021-08-09T08:42:06Z

thanks, merging to 3.1!

…use origin column name ### What changes were proposed in this pull request? For added UT, without this patch will failed as below ``` [info] - SHOW TABLES V2: SPARK-36086: CollapseProject project replace alias should use origin column name *** FAILED *** (4 seconds, 935 milliseconds) [info] java.lang.RuntimeException: After applying rule org.apache.spark.sql.catalyst.optimizer.CollapseProject in batch Operator Optimization before Inferring Filters, the structural integrity of the plan is broken. [info] at org.apache.spark.sql.errors.QueryExecutionErrors$.structuralIntegrityIsBrokenAfterApplyingRuleError(QueryExecutionErrors.scala:1217) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:229) [info] at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) [info] at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) [info] at scala.collection.immutable.List.foldLeft(List.scala:91) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200) [info] at scala.collection.immutable.List.foreach(List.scala:431) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179) [info] at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88) ``` CollapseProject project replace alias should use origin column name ### Why are the changes needed? Fix bug ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #33685 from AngersZhuuuu/SPARK-36086-3.1. Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

AmplabJenkins · 2021-08-09T10:04:02Z

Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142212/

…use origin column name ### What changes were proposed in this pull request? For added UT, without this patch will failed as below ``` [info] - SHOW TABLES V2: SPARK-36086: CollapseProject project replace alias should use origin column name *** FAILED *** (4 seconds, 935 milliseconds) [info] java.lang.RuntimeException: After applying rule org.apache.spark.sql.catalyst.optimizer.CollapseProject in batch Operator Optimization before Inferring Filters, the structural integrity of the plan is broken. [info] at org.apache.spark.sql.errors.QueryExecutionErrors$.structuralIntegrityIsBrokenAfterApplyingRuleError(QueryExecutionErrors.scala:1217) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:229) [info] at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) [info] at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) [info] at scala.collection.immutable.List.foldLeft(List.scala:91) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200) [info] at scala.collection.immutable.List.foreach(List.scala:431) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179) [info] at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88) ``` CollapseProject project replace alias should use origin column name ### Why are the changes needed? Fix bug ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes apache#33685 from AngersZhuuuu/SPARK-36086-3.1. Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 2061103) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

…use origin column name ### What changes were proposed in this pull request? For added UT, without this patch will failed as below ``` [info] - SHOW TABLES V2: SPARK-36086: CollapseProject project replace alias should use origin column name *** FAILED *** (4 seconds, 935 milliseconds) [info] java.lang.RuntimeException: After applying rule org.apache.spark.sql.catalyst.optimizer.CollapseProject in batch Operator Optimization before Inferring Filters, the structural integrity of the plan is broken. [info] at org.apache.spark.sql.errors.QueryExecutionErrors$.structuralIntegrityIsBrokenAfterApplyingRuleError(QueryExecutionErrors.scala:1217) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:229) [info] at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) [info] at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) [info] at scala.collection.immutable.List.foldLeft(List.scala:91) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200) [info] at scala.collection.immutable.List.foreach(List.scala:431) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200) [info] at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179) [info] at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88) ``` CollapseProject project replace alias should use origin column name ### Why are the changes needed? Fix bug ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes apache#33685 from AngersZhuuuu/SPARK-36086-3.1. Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

[SPARK-36086][SQL] CollapseProject project replace alias should use o…

e06f441

…rigin column name

github-actions bot added the SQL label Aug 9, 2021

AngersZhuuuu changed the title ~~[SPARK-36086][SQL] CollapseProject project replace alias should use origin column name~~ [SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name Aug 9, 2021

cloud-fan closed this Aug 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name #33685

[SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name #33685

AngersZhuuuu commented Aug 9, 2021

AngersZhuuuu commented Aug 9, 2021

SparkQA commented Aug 9, 2021

SparkQA commented Aug 9, 2021

SparkQA commented Aug 9, 2021

AmplabJenkins commented Aug 9, 2021

cloud-fan commented Aug 9, 2021

AmplabJenkins commented Aug 9, 2021

[SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name #33685

[SPARK-36086][SQL][3.1] CollapseProject project replace alias should use origin column name #33685

Conversation

AngersZhuuuu commented Aug 9, 2021

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

AngersZhuuuu commented Aug 9, 2021

SparkQA commented Aug 9, 2021

SparkQA commented Aug 9, 2021

SparkQA commented Aug 9, 2021

AmplabJenkins commented Aug 9, 2021

cloud-fan commented Aug 9, 2021

AmplabJenkins commented Aug 9, 2021