[SPARK-29992][CORE] Change getActive access to public #26631
nishkamravi2 wants to merge 2 commits into apache:master
Conversation
Can you explain in which case this is useful?

Also, I don't think exposing an API is minor. Can you file a JIRA?

@HyukjinKwon Thanks for the feedback. This would be useful in cases where we want access to an active context (but only when one exists) without creating a new one in its absence. Opened SPARK-29992.
Test build #114261 has finished for PR 26631 at commit

What's an example of a use case?

I'm not super against it, but when would you not know whether you intend to execute Spark code or not?
  /** Return the current active [[SparkContext]] if any. */
- private[spark] def getActive: Option[SparkContext] = {
+ def getActive: Option[SparkContext] = {
Hi, @nishkamravi2.
Like the other review comments, I'm also wondering why you need this.
I'd like to use the existing official API instead of making another public API.

scala> org.apache.spark.sql.SparkSession.getActiveSession.get.sparkContext
res7: org.apache.spark.SparkContext = org.apache.spark.SparkContext@298154d4

We recommend users use SparkSession instead of SparkContext if possible. And we provide the API you need, as shown above.
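The recommended pattern above can be written more safely without the bare `.get` (a minimal sketch, assuming a running Spark environment; `getActiveSession` returns `None` when no session exists, so mapping over the `Option` avoids an exception):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Obtain the active SparkContext via the public SparkSession API,
// without creating a new context when none exists.
val activeSc: Option[SparkContext] =
  SparkSession.getActiveSession.map(_.sparkContext)

activeSc match {
  case Some(sc) => println(s"Active context: ${sc.appName}")
  case None     => println("No active SparkContext")
}
```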
cc @rxin, @gatorsmile, @cloud-fan

Yea, seems like we can do without this change.

Thanks for the feedback @srowen @dongjoon-hyun @rxin. If an application creates SparkContext outside of SparkSession (which unfortunately is still fairly prevalent), we don't have a public API to get the active context. We can work around this by using a local org.apache.spark package to call this method, or by using reflection to bypass encapsulation. But avoiding the hack would have been preferable. I'm okay with closing this PR in the interest of not taking up the community's time on something we couldn't quickly agree on.
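The package-local workaround mentioned in the comment above could look like the following (a sketch, not code from the PR: the object name `ActiveContextAccessor` is illustrative, and the trick only works because the file declares itself inside the org.apache.spark package, where the `private[spark]` method is visible):

```scala
// Illustrative workaround: a small accessor compiled into the
// org.apache.spark package to reach the private[spark] method.
package org.apache.spark

object ActiveContextAccessor {
  // Delegates to SparkContext.getActive, which is visible here
  // because this file lives in the org.apache.spark package.
  def activeContext: Option[SparkContext] = SparkContext.getActive
}
```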
Does #26631 (comment) not do this?

That would work when the SparkContext is created from a SparkSession, which isn't always the case. FWIW, even Spark examples (e.g., in the mllib folder) do 'new SparkContext' everywhere. But in closing the PR I assumed we're trying to be super-frugal with exposing public APIs.
What changes were proposed in this pull request?
Make access to getActive public (the active context is already exposed through getOrCreate).
Why are the changes needed?
getActive was added to make the active SparkContext available within the Spark package. It is useful to be able to access this publicly (as in the case of StreamingContext.getActive).
How was this patch tested?
Compiled locally; minor change, can be tested here directly.
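If getActive were made public as proposed, call sites could mirror the existing StreamingContext.getActive pattern (a hedged sketch, assuming the PR's visibility change were merged; the body of each branch is illustrative):

```scala
import org.apache.spark.SparkContext

// With the proposed change, callers could check for an active context
// without risking the creation of a new one (unlike getOrCreate).
SparkContext.getActive match {
  case Some(sc) =>
    // Reuse the existing context.
    sc.parallelize(1 to 10).count()
  case None =>
    // No active context; decide explicitly whether to create one.
    println("No active SparkContext")
}
```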