[SPARK-39112][SQL] UnsupportedOperationException if spark.sql.ui.explainMode is set to cost#36488
Closed
ulysses-you wants to merge 2 commits into apache:master
cloud-fan reviewed May 10, 2022

    /**
     * A resolved leaf like node that its statistics is no meaning.
     */
    trait ResolvedLeafObject extends LogicalPlan with LeafLike[LogicalPlan] {

Contributor
nit: LeafNodeWithoutStats seems better
cloud-fan reviewed May 10, 2022

    }

    test("SPARK-39112: UnsupportedOperationException if spark.sql.ui.explainMode is set to cost") {
      withSQLConf(SQLConf.UI_EXPLAIN_MODE.key -> "cost") {

Contributor
Can we simply test with an `EXPLAIN COST ...` command?
cloud-fan reviewed May 11, 2022
Contributor
Can we make it an abstract class that extends `LeafNode`? I'm a bit worried about changing the class hierarchy unnecessarily in this PR.
Contributor
Author
Changed to `trait LeafNodeWithoutStats extends LeafNode`.
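
The hierarchy concern raised in this thread can be illustrated with a minimal sketch. The types below are simplified stand-ins written for this example, not the real Catalyst classes, and `FirstRevisionNode`, `FinalRevisionNode`, and `isLeafNode` are hypothetical names. The sketch assumes that some existing code matches on `LeafNode`; under that assumption, a node that mixes in `LeafLike` directly stops matching, while a node that extends `LeafNode` keeps matching.

```scala
// Simplified stand-ins for illustration only; NOT the real Catalyst classes.
trait LogicalPlan
trait LeafLike { def children: Seq[LogicalPlan] = Nil }
trait LeafNode extends LogicalPlan with LeafLike

// Shape of the first revision: a sibling of LeafNode, not a subtype of it.
trait ResolvedLeafObject extends LogicalPlan with LeafLike

// Shape of the final revision: a subtype of LeafNode.
trait LeafNodeWithoutStats extends LeafNode

// Hypothetical concrete nodes, one per revision.
final case class FirstRevisionNode(name: String) extends ResolvedLeafObject
final case class FinalRevisionNode(name: String) extends LeafNodeWithoutStats

object HierarchyDemo {
  // Hypothetical helper standing in for any code that pattern matches on LeafNode.
  def isLeafNode(plan: LogicalPlan): Boolean = plan match {
    case _: LeafNode => true
    case _           => false
  }

  def main(args: Array[String]): Unit = {
    println(isLeafNode(FirstRevisionNode("ns"))) // false: no longer a LeafNode
    println(isLeafNode(FinalRevisionNode("ns"))) // true: hierarchy is preserved
  }
}
```

Extending `LeafNode` keeps the change local to the statistics behavior, which appears to be the motivation for the request above.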
cloud-fan reviewed May 11, 2022
Contributor
Suggested change
- * A resolved leaf like node that its statistics is no meaning.
+ * A resolved leaf node whose statistics has no meaning.
cloud-fan reviewed May 11, 2022
Contributor
The test name needs an update.
cloud-fan approved these changes May 11, 2022
Contributor
Thanks, merging to master/3.3!
cloud-fan pushed a commit that referenced this pull request May 11, 2022

…ainMode is set to cost

### What changes were proposed in this pull request?

Add a new leaf-like node `LeafNodeWithoutStats` and apply it to the following nodes:

- ResolvedDBObjectName
- ResolvedNamespace
- ResolvedTable
- ResolvedView
- ResolvedNonPersistentFunc
- ResolvedPersistentFunc

### Why are the changes needed?

The 3.3.0 branch enables v2 commands by default (see `spark.sql.legacy.useV1Command`). However, this introduces a behavior difference between v1 and v2 commands:

- v1 command: the logical plan is resolved to a command in the analyzer phase by `ResolveSessionCatalog`.
- v2 command: the logical plan is resolved to a v2 command in the physical planning phase by `DataSourceV2Strategy`.

For the cost explain mode, we call `LogicalPlanStats.stats` on the optimized plan, so there is a gap between v1 and v2 commands. Unfortunately, the logical plan of a v2 command contains a `LeafNode` that does not override `computeStats`. As a result, running the following SQL fails:

```sql
set spark.sql.ui.explainMode=cost;
show tables;
```

```
java.lang.UnsupportedOperationException:
	at org.apache.spark.sql.catalyst.plans.logical.LeafNode.computeStats(LogicalPlan.scala:171)
	at org.apache.spark.sql.catalyst.plans.logical.LeafNode.computeStats$(LogicalPlan.scala:171)
	at org.apache.spark.sql.catalyst.analysis.ResolvedNamespace.computeStats(v2ResolutionPlans.scala:155)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.default(SizeInBytesOnlyStatsPlanVisitor.scala:55)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.default(SizeInBytesOnlyStatsPlanVisitor.scala:27)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlanVisitor.visit(LogicalPlanVisitor.scala:49)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlanVisitor.visit$(LogicalPlanVisitor.scala:25)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.visit(SizeInBytesOnlyStatsPlanVisitor.scala:27)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.$anonfun$stats$1(LogicalPlanStats.scala:37)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.stats(LogicalPlanStats.scala:33)
	at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.stats$(LogicalPlanStats.scala:33)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.stats(LogicalPlan.scala:30)
```

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

Added a test.

Closes #36488 from ulysses-you/SPARK-39112.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 06fd340)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
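
The failure and the fix described in the commit message can be modeled with a small self-contained sketch. Everything here is a simplified stand-in rather than the real Spark code: `Statistics` is reduced to one field, `stats` simply delegates to `computeStats`, and `NamespaceBefore` and `NamespaceAfter` are hypothetical node names. The placeholder value returned by `LeafNodeWithoutStats` is an assumption made for this example; the PR discussion does not state what the real trait returns.

```scala
import scala.util.Try

// Simplified stand-ins for illustration only; NOT the real Spark classes.
final case class Statistics(sizeInBytes: BigInt)

trait LogicalPlan {
  def computeStats(): Statistics
  // In this sketch, stats delegates straight to computeStats.
  def stats: Statistics = computeStats()
}

trait LeafNode extends LogicalPlan {
  // Mirrors the top frame of the stack trace: a leaf node that does not
  // override computeStats fails as soon as its statistics are requested.
  def computeStats(): Statistics = throw new UnsupportedOperationException
}

trait LeafNodeWithoutStats extends LeafNode {
  // Assumption for this sketch: return a placeholder instead of throwing.
  override def stats: Statistics = Statistics(BigInt(Long.MaxValue))
}

// Hypothetical nodes: the same leaf before and after adopting the new trait.
final case class NamespaceBefore(name: String) extends LeafNode
final case class NamespaceAfter(name: String) extends LeafNodeWithoutStats

object StatsDemo {
  def main(args: Array[String]): Unit = {
    // Before: asking for stats throws UnsupportedOperationException.
    println(Try(NamespaceBefore("default").stats).isFailure) // true
    // After: asking for stats returns the placeholder and does not throw.
    println(Try(NamespaceAfter("default").stats).isSuccess) // true
  }
}
```

Overriding `stats` in one shared trait means each resolved node only needs to change its parent type, rather than implement `computeStats` individually.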
### What changes were proposed in this pull request?

Add a new leaf-like node `LeafNodeWithoutStats` and apply it to the following nodes:

- ResolvedDBObjectName
- ResolvedNamespace
- ResolvedTable
- ResolvedView
- ResolvedNonPersistentFunc
- ResolvedPersistentFunc

### Why are the changes needed?

The 3.3.0 branch enables v2 commands by default (see `spark.sql.legacy.useV1Command`). However, this introduces a behavior difference between v1 and v2 commands:

- v1 command: the logical plan is resolved to a command in the analyzer phase by `ResolveSessionCatalog`.
- v2 command: the logical plan is resolved to a v2 command in the physical planning phase by `DataSourceV2Strategy`.

For the cost explain mode, we call `LogicalPlanStats.stats` on the optimized plan, so there is a gap between v1 and v2 commands. Unfortunately, the logical plan of a v2 command contains a `LeafNode` that does not override `computeStats`. As a result, running SQL such as `set spark.sql.ui.explainMode=cost; show tables;` fails with `java.lang.UnsupportedOperationException`.

### Does this PR introduce any user-facing change?

Yes, bug fix.

### How was this patch tested?

Added a test.