
[SPARK-20129][Core] JavaSparkContext should use SparkContext.getOrCreate #20347

Closed
wants to merge 11 commits

Conversation

rekhajoshm
Contributor

What changes were proposed in this pull request?

Use SparkContext.getOrCreate() instead of always creating a new SparkContext in the JavaSparkContext constructor.
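
A minimal sketch of the idea (Scala; the actual diff in this PR may differ slightly):

```scala
import org.apache.spark.{SparkConf, SparkContext}

class JavaSparkContext(val sc: SparkContext) {
  // Current behavior: always construct a fresh SparkContext from the conf.
  // def this(conf: SparkConf) = this(new SparkContext(conf))

  // Proposed behavior (sketch): reuse the active SparkContext if one exists,
  // otherwise create a new one from the given conf.
  def this(conf: SparkConf) = this(SparkContext.getOrCreate(conf))
}
```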

How was this patch tested?

Existing tests

  • Pulling functionality from apache spark
  • pull latest from apache spark
  • Pulling functionality from apache spark
  • Pulling functionality from apache spark
  • pull request from apache/master
  • pull latest from apache spark
  • pull latest from apache spark
  • pull latest apache spark
  • Apache spark pull latest
@SparkQA

SparkQA commented Jan 22, 2018

Test build #86456 has finished for PR 20347 at commit b1ae512.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor

Can you please explain why we need to change to getOrCreate?

@srowen
Member

srowen commented Jan 22, 2018

@mengxr suggested this in the JIRA originally -- what was the reasoning? It makes some sense, but so does leaving the current behavior, where a constructor calls a constructor. It's a behavior change, albeit a slight one.

@jerryshao
Contributor

Using getOrCreate in the constructor seems to change the semantics. Maybe we could add a new static method to JavaSparkContext for such usage.
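
For illustration only, such a helper might look like the following sketch (the object name, method name, and placement are assumptions, not something this PR adds):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.api.java.JavaSparkContext

object JavaSparkContextHelper {
  // Hypothetical helper: reuse the active SparkContext if one exists,
  // otherwise create a new one from the given conf, and wrap it.
  def getOrCreate(conf: SparkConf): JavaSparkContext =
    new JavaSparkContext(SparkContext.getOrCreate(conf))
}
```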

@srowen
Member

srowen commented Jan 23, 2018

Yes, you can already get the new semantics here with new JavaSparkContext(SparkContext.getOrCreate()).

Yes, it's probably better to add a new method, or else decide that it's not worth a new API method just as a shortcut for the above. Maybe that's the right conclusion, unless @mengxr comes back with a particular reason to change the behavior slightly.

@jiangxb1987
Contributor

My major concern is that, if there is an existing SparkContext, some confs you set may not take effect, as described in SparkContext.getOrCreate(). It's hard to enumerate the use cases, but I'm sure there are some that pass in specific confs to create a new JavaSparkContext, so I tend to keep the current behavior here.
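
To illustrate that concern, a small sketch (the conf values are made up for illustration): when a SparkContext is already active, getOrCreate returns it and the conf passed in is silently ignored.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// An active context already exists (illustrative local conf).
val first = new SparkContext(
  new SparkConf().setMaster("local").setAppName("first"))

// A second conf with different settings (illustrative values).
val conf2 = new SparkConf().setMaster("local[2]").setAppName("second")

// getOrCreate returns the already-active context; conf2 is ignored.
val second = SparkContext.getOrCreate(conf2)
assert(second eq first)
// Under the proposed change, `new JavaSparkContext(conf2)` would behave the same way.
```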

On the other hand, the following is copied from the class comment of JavaSparkContext:

 * Only one SparkContext may be active per JVM.  You must `stop()` the active SparkContext before
 * creating a new one.  This limitation may eventually be removed; see SPARK-2243 for more details.

If that is the case, there should be no active SparkContext before we instantiate the JavaSparkContext, so the change doesn't bring any advantage in that respect.

@srowen
Member

srowen commented Jan 26, 2018

@rekhajoshm I think maybe the right resolution here is to do nothing. I haven't heard back from @mengxr on his old JIRA proposing this change. Thank you for chasing down open JIRAs like this, of course.

@rekhajoshm
Contributor Author

Thank you @srowen. I admire the work you do across all the JIRAs/PRs I have studied and followed up on.
If it's OK, I will keep this PR open for a few days and close it if the JIRA moves to a 'Not an issue'/'Won't fix' state. Thanks.

@srowen srowen mentioned this pull request May 11, 2018
@asfgit asfgit closed this in 348ddfd May 12, 2018