[SPARK-15734][SQL] Avoids printing internal row in explain output #13471

Closed
clockfly wants to merge 2 commits into apache:master from clockfly:verbose_breakdown_5

Conversation

@clockfly (Contributor) commented on Jun 2, 2016

## What changes were proposed in this pull request?

This PR avoids printing internal rows in explain output for some operators.

**Before change:**

```
scala> (1 to 10).toSeq.map(_ => (1,2,3)).toDF().createTempView("df3")
scala> spark.sql("select * from df3 where 1=2").explain(true)
...
== Analyzed Logical Plan ==
_1: int, _2: int, _3: int
Project [_1#37,_2#38,_3#39]
+- Filter (1 = 2)
   +- SubqueryAlias df3
      +- LocalRelation [_1#37,_2#38,_3#39], [[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]]
...
== Physical Plan ==
LocalTableScan [_1#37,_2#38,_3#39]
```

**After change:**

```
scala> spark.sql("select * from df3 where 1=2").explain(true)
...
== Analyzed Logical Plan ==
_1: int, _2: int, _3: int
Project [_1#58,_2#59,_3#60]
+- Filter (1 = 2)
   +- SubqueryAlias df3
      +- LocalRelation [_1#58,_2#59,_3#60]
...
== Physical Plan ==
LocalTableScan <empty>, [_1#58,_2#59,_3#60]
```

## How was this patch tested?

Manual test.
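
As a supplement, here is a hedged sketch (not part of the PR) of how the same manual check could be scripted against the `df3` temp view created above. It relies on `queryExecution.toString`, which carries the same multi-section output as `explain(true)`; the asserted substrings are illustrative:

```scala
// Sketch only: assumes an active SparkSession and the "df3" temp view from the example above.
val plan = spark.sql("select * from df3 where 1=2").queryExecution.toString

// The analyzed plan should no longer dump serialized internal rows...
assert(!plan.contains("[[0,1,2,3]"))
// ...and the empty local scan should be annotated instead.
assert(plan.contains("LocalTableScan <empty>"))
```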

@clockfly (Contributor, Author) commented on Jun 2, 2016

cc @cloud-fan

@cloud-fan (Contributor) commented:
LGTM

The review thread below concerns this hunk:

```diff
-  override protected def stringArgs = Iterator(output)
+  override protected def stringArgs: Iterator[Any] = {
+    if (data.isEmpty) {
+      Iterator("Empty", output)
```
A contributor commented:

Maybe "EmptyRelation"?

@clockfly (Contributor, Author) replied on Jun 2, 2016:

Just to make sure, do you prefer the following output?

LocalRelation EmptyRelation, [_1#58,_2#59,_3#60]

Instead of

LocalRelation Empty, [_1#58,_2#59,_3#60]

@liancheng (Contributor) replied on Jun 2, 2016:

Good point... How about just "<empty>"? Both "Empty" and "EmptyRelation" look like some class name, while "<empty>" reads more like an annotation. (This is pretty subjective though.)
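
For readers less familiar with how plan nodes render themselves, here is a small, self-contained sketch of the pattern under discussion (the class `LocalRelationDemo` is hypothetical, not Spark source): the node overrides a `stringArgs`-style method so that explain-style output shows a marker plus the schema instead of raw rows.

```scala
// Hypothetical stand-in for a plan node; not a Spark class.
case class LocalRelationDemo(output: Seq[String], data: Seq[Seq[Int]]) {
  // Mirrors the PR's approach: when there is no data, print a marker next to the
  // schema instead of serializing every internal row.
  protected def stringArgs: Iterator[Any] =
    if (data.isEmpty) Iterator("<empty>", output) else Iterator(output)

  // Roughly what a one-line plan description would look like for this node.
  def simpleString: String = s"LocalRelationDemo ${stringArgs.mkString(", ")}"
}

// Prints: LocalRelationDemo <empty>, List(_1#58, _2#59, _3#60)
println(LocalRelationDemo(Seq("_1#58", "_2#59", "_3#60"), Nil).simpleString)
```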

@liancheng (Contributor) commented:
LGTM except for minor naming issue.

@SparkQA commented on Jun 2, 2016

Test build #59867 has finished for PR 13471 at commit 2e679f7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented on Jun 2, 2016

Test build #59874 has finished for PR 13471 at commit f94b909.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng (Contributor) commented:
Merging to master and branch-2.0.

asfgit pushed a commit that referenced this pull request Jun 2, 2016

Author: Sean Zhong <seanzhong@databricks.com>

Closes #13471 from clockfly/verbose_breakdown_5.

(cherry picked from commit 985d532)
Signed-off-by: Cheng Lian <lian@databricks.com>
@asfgit closed this in 985d532 on Jun 2, 2016