[SPARK-29992][CORE] Change getActive access to public #26631

Closed

nishkamravi2 wants to merge 2 commits into apache:master from nishkamravi2:context_public

Conversation

@nishkamravi2
Contributor

@nishkamravi2 nishkamravi2 commented Nov 22, 2019

What changes were proposed in this pull request?

Make access to getActive public (the active context is already exposed through getOrCreate).

Why are the changes needed?

getActive was added to make the active SparkContext available within the Spark package. It is useful to be able to access it publicly as well (as in the case of StreamingContext.getActive).

How was this patch tested?

Compiled locally; this is a minor change and can be tested here directly.
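The distinction the PR relies on can be illustrated generically. Below is a minimal Java sketch in which a hypothetical `Context` class stands in for SparkContext (this is not actual Spark code): `getActive` returns the active instance only if one exists, while `getOrCreate` creates one when absent.

```java
import java.util.Optional;

// Hypothetical stand-in for SparkContext, illustrating the
// getActive / getOrCreate contract discussed in this PR.
class Context {
    private static Context active = null;
    final String name;

    private Context(String name) { this.name = name; }

    // Returns the active context only if one exists; never creates one.
    static Optional<Context> getActive() {
        return Optional.ofNullable(active);
    }

    // Returns the active context, creating it when absent.
    static Context getOrCreate(String name) {
        if (active == null) {
            active = new Context(name);
        }
        return active;
    }
}

public class Demo {
    public static void main(String[] args) {
        System.out.println(Context.getActive().isPresent());  // false: nothing active yet
        Context c = Context.getOrCreate("app");
        System.out.println(Context.getActive().get() == c);   // true: same instance
    }
}
```

Making `getActive` public gives callers the left-hand behavior: observe the active context without the side effect of creating one.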

@HyukjinKwon
Member

It is useful to be able to access this publicly

Can you explain in which case is it useful?

@HyukjinKwon
Member

Also, I don't think exposing an API is minor. Can you file a JIRA?

@nishkamravi2 nishkamravi2 changed the title [MINOR][CORE] Change getActive access to public [SPARK-29992][CORE] Change getActive access to public Nov 22, 2019
@nishkamravi2
Contributor Author

@HyukjinKwon Thanks for the feedback. This would be useful in cases where we want access to an active context (but only when one exists) without creating a new one in its absence. Opened SPARK-29992.

@SparkQA

SparkQA commented Nov 22, 2019

Test build #114261 has finished for PR 26631 at commit 9cb554a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Nov 22, 2019

What's an example of a use case?

@srowen
Member

srowen commented Nov 22, 2019

I'm not super against it but when would you not know whether you intend to execute Spark code or not?


```diff
 /** Return the current active [[SparkContext]] if any. */
-private[spark] def getActive: Option[SparkContext] = {
+def getActive: Option[SparkContext] = {
```
Member


Hi, @nishkamravi2 .
Like the other review comments, I'm also wondering why you need this.
I'd like to use the existing official API instead of making another public API.

```scala
scala> org.apache.spark.sql.SparkSession.getActiveSession.get.sparkContext
res7: org.apache.spark.SparkContext = org.apache.spark.SparkContext@298154d4
```

We recommend that users use SparkSession instead of SparkContext where possible, and we already provide the API you need, as shown above.

@dongjoon-hyun
Member

cc @rxin , @gatorsmile , @cloud-fan

@rxin
Contributor

rxin commented Nov 24, 2019

Yea seems like we can do without this change.

@nishkamravi2
Contributor Author

Thanks for the feedback @srowen @dongjoon-hyun @rxin. If an application creates SparkContext outside of SparkSession (which unfortunately is still fairly prevalent), we don't have a public API to get the active context. We can work around this by using a local org.apache.spark package to call this method, or by using reflection to bypass encapsulation, but avoiding the hack would have been preferable. I'm okay with closing this PR in the interest of not taking up the community's time on something we couldn't quickly agree on.
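The reflection workaround mentioned in the comment above can be sketched generically. In this Java sketch, a hypothetical `Hidden` class stands in for a class whose accessor is not visible to outside callers (this is not Spark's actual code; the real method is Scala's `private[spark]`):

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical stand-in: a class exposing its "active" instance only
// through a non-public accessor, as with a package-restricted method.
class Hidden {
    private static final Hidden ACTIVE = new Hidden();

    // Not callable from outside this class at compile time.
    private static Optional<Hidden> getActive() {
        return Optional.of(ACTIVE);
    }
}

public class ReflectionWorkaround {
    public static void main(String[] args) throws Exception {
        // Bypass encapsulation: look the method up and force accessibility.
        Method m = Hidden.class.getDeclaredMethod("getActive");
        m.setAccessible(true);
        Optional<?> active = (Optional<?>) m.invoke(null);
        System.out.println(active.isPresent());  // prints "true"
    }
}
```

The point of the comment stands: this kind of `setAccessible(true)` hack works, but it is brittle compared to a supported public API.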

@srowen
Member

srowen commented Nov 24, 2019

Does #26631 (comment) not do this?

@nishkamravi2
Contributor Author

That would work when the SparkContext is created from a SparkSession, which isn't always the case. FWIW, even the Spark examples (e.g., in the mllib folder) do `new SparkContext` everywhere. But in closing the PR I assumed we're trying to be super-frugal about exposing public APIs.

6 participants