
[SQL] SPARK-6981: Factor out SparkPlanner and QueryExecution from SQLContext #6122

Closed · wants to merge 12 commits

Conversation

@evacchi (Contributor) commented May 13, 2015

Cleaned-up version of PR #5556

* access to the intermediate phases of query execution for developers.
*/
@DeveloperApi
protected[sql] class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan) {
Contributor:

Shouldn't this be private[sql] now?

Contributor Author:

It is protected[sql] in master.

BTW, this PR is geared towards making it easier for third parties (and HiveContext) to add new processing rules without requiring subclassing (see PR #5556 and SPARK-6320).

I would actually advise making these classes public (or at least protected, without the [sql] qualifier).
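For readers less familiar with Scala's qualified access modifiers, here is a minimal illustration of the distinction under discussion. This is not Spark code; the class and member names are invented for the example:

```scala
package org.apache.spark.sql

// Illustration only: qualified access modifiers, not Spark source.
class Example {
  // Accessible from anywhere inside package org.apache.spark.sql,
  // and additionally from subclasses of Example outside that package.
  protected[sql] def plannerHook: Int = 1

  // Accessible from anywhere inside package org.apache.spark.sql only;
  // subclasses outside the package cannot see it.
  private[sql] def internalHook: Int = 2
}
```

Hence the reviewer's question: once QueryExecution is no longer an inner member of SQLContext, the subclass-access part of protected[sql] may no longer be the intended contract.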

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
…refactoring

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@evacchi (Contributor Author) commented May 18, 2015

If everybody agrees, I think we can restart the Jenkins build (this is just the same as the other PR, after all)

@evacchi (Contributor Author) commented May 21, 2015

@rxin may I ask if you can trigger a test build? The code is the same as PR #5556.

@rxin (Contributor) commented May 21, 2015

Jenkins, test this please.

@SparkQA commented May 21, 2015

Test build #33237 has finished for PR 6122 at commit ac03efe.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait QueryPlanner[PhysicalPlan <: TreeNode[PhysicalPlan]]
    • protected[sql] class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • protected[sql] class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies
    • protected[sql] class HiveQueryExecution(hiveContext: HiveContext, logicalPlan: LogicalPlan)

@evacchi (Contributor Author) commented May 21, 2015

Build fails because of this:

[info] spark-sql: found 3 potential binary incompatibilities (filtered 328)
[error]  * class org.apache.spark.sql.SQLContext#SparkPlanner does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$SparkPlanner")
[error]  * method prepareForExecution()org.apache.spark.sql.catalyst.rules.RuleExecutor in class org.apache.spark.sql.SQLContext does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.sql.SQLContext.prepareForExecution")
[error]  * class org.apache.spark.sql.SQLContext#QueryExecution does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$QueryExecution")

This is expected, since moving these classes out of SQLContext is the purpose of the PR. How should we proceed?
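One conventional way to acknowledge an intentional break is to add the exclusion filters that MiMa itself suggests in the output above. A sketch of what those entries would look like, assuming they go into Spark's MiMa exclusion list (the surrounding build layout is an assumption, not a verified patch):

```scala
// Sketch: exclusion filters taken verbatim from the MiMa output above.
// Where exactly they are registered depends on the build setup
// (e.g. a MimaExcludes-style file); treat that as an assumption.
import com.typesafe.tools.mima.core._

Seq(
  ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$SparkPlanner"),
  ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.sql.SQLContext.prepareForExecution"),
  ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$QueryExecution")
)
```

This silences the checker but does not restore binary compatibility for existing callers; the shim approach discussed below is the alternative that does.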

@evacchi (Contributor Author) commented May 21, 2015

Possible solution: add shims to preserve binary compatibility as follows:

class SQLContext {
  ...
  @deprecated
  class SparkPlanner extends org.apache.spark.sql.SparkPlanner
  @deprecated
  class QueryExecution extends org.apache.spark.sql.QueryExecution
  @deprecated
  lazy val prepareForExecution = ...
  ...
}

class QueryExecution(sqlContext: SQLContext) {
  ...
  lazy val prepareForExecution = sqlContext.prepareForExecution
  ...
}

etc.

However, I wouldn't do that unless it is really necessary, because it makes extending the non-deprecated classes a bit awkward (one must be careful with imports).

…refactoring

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@evacchi (Contributor Author) commented May 22, 2015

Addressed in PR #6356.

@AmplabJenkins commented:
Can one of the admins verify this patch?

@asfgit asfgit closed this in cdc36ee Jul 18, 2015
asfgit pushed a commit that referenced this pull request Sep 14, 2015
…LContext

Alternative to PR #6122; in this case the refactored out classes are replaced by inner classes with the same name for backwards binary compatibility

   * process in a lighter-weight, backwards-compatible way

Author: Edoardo Vacchi <uncommonnonsense@gmail.com>

Closes #6356 from evacchi/sqlctx-refactoring-lite.