Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SQL] SPARK-6981: Factor out SparkPlanner and QueryExecution from SQLContext #6356

Closed
wants to merge 9 commits into from

Conversation

evacchi
Copy link
Contributor

@evacchi evacchi commented May 22, 2015

Alternative to PR #6122; in this case the refactored out classes are replaced by inner classes with the same name for backwards binary compatibility

  • process in a lighter-weight, backwards-compatible way

   * process in a lighter-weight, backwards-compatible way
@rxin
Copy link
Contributor

rxin commented May 22, 2015

I don't think we need to care about the binary compatibility for this case. Maybe just add the exclude rules to MimaExcludes file.

}


class SparkPlanner(val sqlContext: SQLContext) extends org.apache.spark.sql.execution.SparkStrategies {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this into the execution package?

@chenghao-intel
Copy link
Contributor

In order to make a minimal change (To keep the HiveContext / TestHive unchanged), keeping the nested classes QueryExcution and SparkPlanner seems a good idea for me, Or as @rxin suggested, we should do all of change in a single PR.
Thoughts?

@evacchi
Copy link
Contributor Author

evacchi commented May 25, 2015

@chenghao-intel to me both solutions are fine, see PR #6122; however, this solution would retain binary compatibility

@rxin
Copy link
Contributor

rxin commented Jul 18, 2015

@evacchi mind bringing this up to date so we can merge it for 1.5?

@evacchi
Copy link
Contributor Author

evacchi commented Jul 20, 2015

ok, will check today

@evacchi
Copy link
Contributor Author

evacchi commented Jul 20, 2015

@rxin merge conflicts fixed; should pass tests as well

@SparkQA
Copy link

SparkQA commented Jul 20, 2015

Test build #1123 has finished for PR 6356 at commit b01244b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan) extends sparkexecution.QueryExecution(this, logical)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class ShuffledRowRDD(
    • class SparkPlanner(val sqlContext: SQLContext) extends org.apache.spark.sql.execution.SparkStrategies

Project(projectList, filterCondition.map(Filter(_, scan)).getOrElse(scan))
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think u need to add a new line here to not fail scalastyle

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there might be more so u should check the output from jenkins

@evacchi
Copy link
Contributor Author

evacchi commented Jul 21, 2015

@rxin weird, didn't get those during local testing. BTW, they should pass now

@rxin
Copy link
Contributor

rxin commented Jul 21, 2015

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Jul 21, 2015

Test build #1146 has finished for PR 6356 at commit 253c15e.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies

@evacchi
Copy link
Contributor Author

evacchi commented Jul 23, 2015

@rxin I fixed today's merge conflicts; I would advise doing a merged build once again and then (if, of course, the patch is good enough) to go ahead and merge it ASAP. Since this is a refactoring it will obviously break every day until it will be merged ! :)

@rxin
Copy link
Contributor

rxin commented Jul 23, 2015

Jenkins, test this please.

@evacchi
Copy link
Contributor Author

evacchi commented Jul 24, 2015

test build has not started

@SparkQA
Copy link

SparkQA commented Jul 24, 2015

Test build #1194 has finished for PR 6356 at commit 0089435.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies

@evacchi
Copy link
Contributor Author

evacchi commented Jul 24, 2015

tests should pass now

@marmbrus
Copy link
Contributor

marmbrus commented Sep 3, 2015

@evacchi, sorry for letting this go stale again. Can we close this issue until you have time to bring it up to date?

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
	sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@evacchi
Copy link
Contributor Author

evacchi commented Sep 4, 2015

@marmbrus I have brought this up to date. Might need fixes in order to merge cleanly, though

@marmbrus
Copy link
Contributor

marmbrus commented Sep 8, 2015

ok to test

}
}
}
@deprecated("use org.apache.spark.sql.SparkPlanner", "1.5.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.6.0

@marmbrus
Copy link
Contributor

marmbrus commented Sep 8, 2015

Minor comments, otherwise LGTM

@SparkQA
Copy link

SparkQA commented Sep 9, 2015

Test build #42156 has finished for PR 6356 at commit 3bf710b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies

@evacchi
Copy link
Contributor Author

evacchi commented Sep 10, 2015

Hope this still merges cleanly, I'm writing this from my phone. Otherwise it'll have to wait till I'm back from holidays

@SparkQA
Copy link

SparkQA commented Sep 10, 2015

Test build #42246 has finished for PR 6356 at commit 47685cc.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies

@evacchi
Copy link
Contributor Author

evacchi commented Sep 11, 2015

Rerun tests?

@marmbrus
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Sep 11, 2015

Test build #1746 has finished for PR 6356 at commit 22a01ae.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 12, 2015

Test build #42359 has finished for PR 6356 at commit 22a01ae.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[sql] class SparkPlanner extends sparkexecution.SparkPlanner(this)
    • protected[sql] class QueryExecution(logical: LogicalPlan)
    • class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies

@marmbrus
Copy link
Contributor

Thanks, merging to master!

@asfgit asfgit closed this in 64f0415 Sep 14, 2015
@@ -114,7 +114,7 @@ private[sql] object DataFrame {
@Experimental
class DataFrame private[sql](
@transient val sqlContext: SQLContext,
@DeveloperApi @transient val queryExecution: SQLContext#QueryExecution) extends Serializable {
@DeveloperApi @transient val queryExecution: QueryExecution) extends Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marmbrus This changes the signature of queryExecution. Is there anyway that we can preserve the binary compatibility?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we preserve binary compatibility for developer API? At least MIMA doesn't complain about it...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QueryExecution extends SQLContext#QueryExecution so binary compatibility should be preserved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature of the function to retrieve it has changed though. I think that we should rename this parameter and add a method that constructs a new instance of the inner subclass to return with the name queryExecution. I think its worth rerunning the optimizer to not have to recompile all our perf tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants