Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10176][SQL] Show partially analyzed plans when checkAnswer fails to analyze #8389

Closed
wants to merge 8 commits into from

Conversation

marmbrus
Copy link
Contributor

This PR improves checkAnswer to print the partially analyzed plan in addition to the user friendly error message, in order to aid debugging failing tests.

In doing so, I ran into a conflict with the various ways that we bring a SQLContext into the tests. Depending on the trait we refer to the current context as sqlContext, _sqlContext, ctx or hiveContext with access modifiers public, protected and private depending on the defining class.

I propose we refactor as follows:

  • All tests should only refer to a protected sqlContext when testing general features, and protected hiveContext when it is a method that only exists on a HiveContext.
  • All tests should only import testImplicits._ (i.e., don't import TestHive.implicits._)

I made a pretty good crack at implementing the above, though there may be other outstanding locations. I suggest we address these in a follow up due to the likely hood of conflicts.

@marmbrus marmbrus changed the title [SPARK-10176][SQL] Show partially analyzed plans when checkAnswer fails to analyze [SPARK-10176][SQL][WIP] Show partially analyzed plans when checkAnswer fails to analyze Aug 24, 2015
@SparkQA
Copy link

SparkQA commented Aug 24, 2015

Test build #41441 has finished for PR 8389 at commit f3b5d3a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Conflicts:
	sql/hive/src/test/scala/org/apache/spark/sql/hive/ParquetHiveCompatibilitySuite.scala
@SparkQA
Copy link

SparkQA commented Aug 24, 2015

Test build #41477 has finished for PR 8389 at commit 71fe053.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCompatibilityTest.scala
	sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala
	sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala
	sql/hive/src/test/scala/org/apache/spark/sql/sources/hadoopFsRelationSuites.scala
@SparkQA
Copy link

SparkQA commented Aug 25, 2015

Test build #41562 has finished for PR 8389 at commit e0d8485.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class LogisticRegressionModel @Since("1.3.0") (
    • class SVMModel @Since("1.1.0") (

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCompatibilityTest.scala
@SparkQA
Copy link

SparkQA commented Aug 29, 2015

Test build #41787 has finished for PR 8389 at commit 3cc6342.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus marmbrus changed the title [SPARK-10176][SQL][WIP] Show partially analyzed plans when checkAnswer fails to analyze [SPARK-10176][SQL] Show partially analyzed plans when checkAnswer fails to analyze Aug 29, 2015
@marmbrus
Copy link
Contributor Author

@andrewor14 this is ready to review

@@ -17,11 +17,12 @@

package org.apache.spark.sql.hive

import org.apache.spark.sql.hive.test.TestHiveSingleton
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit import order, should move down

@andrewor14
Copy link
Contributor

Hi @marmbrus @cloud-fan all my comments are minor and this is mergeable as is, but I have two high level comments:

(1) I think we should always put with TestHiveSingleton at the end to make it really explicit that this is where sqlContext comes from. Otherwise it's less clear who overrides whom.

(2) Could we make TestHiveSingleton extend SQLTestUtils? Then all subclasses can just import testImplicits._ if needed, which is more consistent with the existing test suites. In general it might simplify things if all SQL tests extend SQLTestUtils if they use a SQLContext.

@marmbrus
Copy link
Contributor Author

marmbrus commented Sep 4, 2015

Deferring to #8584

@marmbrus marmbrus closed this Sep 4, 2015
asfgit pushed a commit that referenced this pull request Sep 4, 2015
…ils to analyze

This PR takes over #8389.

This PR improves `checkAnswer` to print the partially analyzed plan in addition to the user friendly error message, in order to aid debugging failing tests.

In doing so, I ran into a conflict with the various ways that we bring a SQLContext into the tests. Depending on the trait we refer to the current context as `sqlContext`, `_sqlContext`, `ctx` or `hiveContext` with access modifiers `public`, `protected` and `private` depending on the defining class.

I propose we refactor as follows:

1. All tests should only refer to a `protected sqlContext` when testing general features, and `protected hiveContext` when it is a method that only exists on a `HiveContext`.
2. All tests should only import `testImplicits._` (i.e., don't import `TestHive.implicits._`)

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #8584 from cloud-fan/cleanupTests.
kiszk pushed a commit to kiszk/spark-gpu that referenced this pull request Dec 26, 2015
…ils to analyze

This PR takes over apache/spark#8389.

This PR improves `checkAnswer` to print the partially analyzed plan in addition to the user friendly error message, in order to aid debugging failing tests.

In doing so, I ran into a conflict with the various ways that we bring a SQLContext into the tests. Depending on the trait we refer to the current context as `sqlContext`, `_sqlContext`, `ctx` or `hiveContext` with access modifiers `public`, `protected` and `private` depending on the defining class.

I propose we refactor as follows:

1. All tests should only refer to a `protected sqlContext` when testing general features, and `protected hiveContext` when it is a method that only exists on a `HiveContext`.
2. All tests should only import `testImplicits._` (i.e., don't import `TestHive.implicits._`)

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #8584 from cloud-fan/cleanupTests.
@marmbrus marmbrus deleted the cleanupTests branch March 8, 2016 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants