-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified #13200
Conversation
Test build #58899 has finished for PR 13200 at commit
|
Test build #58907 has finished for PR 13200 at commit
|
sparkContext.addSparkListener(new SparkListener { | ||
override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = { | ||
defaultSession.set(null) | ||
// TODO(rxin): Do we need to also clear SQL listener? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @zsxwing any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually @zsxwing might be good for you to review the entire pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to clear it. Otherwise, after stopping the SparkContext, we leak it in object SQLContext.
cc @andrewor14 too who wrote some of this |
Test build #58911 has finished for PR 13200 at commit
|
OK I've pushed a commit to handle sql listener properly. |
Test build #58913 has finished for PR 13200 at commit
|
We should fix |
That's been updated? |
Ah, yes. |
but not the builder, which still uses |
Test build #58918 has finished for PR 13200 at commit
|
Test build #58919 has finished for PR 13200 at commit
|
I updated Python docs. The Python change seems slightly larger and since it is not user facing, I'm going to defer it to another pr. |
Test build #58921 has finished for PR 13200 at commit
|
Test build #58931 has finished for PR 13200 at commit
|
@marmbrus i know you were looking at this. Did you end up going through it? |
LGTM |
Thanks - merging in master/2.0. |
…pagate config options to existing sessions if specified ## What changes were proposed in this pull request? Currently SparkSession.Builder use SQLContext.getOrCreate. It should probably the the other way around, i.e. all the core logic goes in SparkSession, and SQLContext just calls that. This patch does that. This patch also makes sure config options specified in the builder are propagated to the existing (and of course the new) SparkSession. ## How was this patch tested? Updated tests to reflect the change, and also introduced a new SparkSessionBuilderSuite that should cover all the branches. Author: Reynold Xin <rxin@databricks.com> Closes #13200 from rxin/SPARK-15075. (cherry picked from commit f2ee0ed) Signed-off-by: Reynold Xin <rxin@databricks.com>
} | ||
session = new SparkSession(sparkContext) | ||
options.foreach { case (k, v) => session.conf.set(k, v) } | ||
defaultSession.set(session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rxin Hi Reynold, i had a minor question just for my understanding. When users do a
new SQLContext() , we create a implicit SparkSession. Should this session be made
the defaultSession ? If we call, 1) new SQLContext 2) builder.getOrCreate() then whats the expected behaviour ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would create a new one in that case ...
I'm not too worried about the legacy corner cases here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rxin Ok. Got it. Thank you.
Test build #58939 has finished for PR 13200 at commit
|
…verriding confs of existing sessions ## What changes were proposed in this pull request? This fixes the python SparkSession builder to allow setting confs correctly. This was a leftover TODO from #13200. ## How was this patch tested? Python doc tests. cc andrewor14 Author: Eric Liang <ekl@databricks.com> Closes #13289 from ericl/spark-15520. (cherry picked from commit 8239fdc) Signed-off-by: Andrew Or <andrew@databricks.com>
…verriding confs of existing sessions ## What changes were proposed in this pull request? This fixes the python SparkSession builder to allow setting confs correctly. This was a leftover TODO from #13200. ## How was this patch tested? Python doc tests. cc andrewor14 Author: Eric Liang <ekl@databricks.com> Closes #13289 from ericl/spark-15520.
What changes were proposed in this pull request?
Currently SparkSession.Builder use SQLContext.getOrCreate. It should probably the the other way around, i.e. all the core logic goes in SparkSession, and SQLContext just calls that. This patch does that.
This patch also makes sure config options specified in the builder are propagated to the existing (and of course the new) SparkSession.
How was this patch tested?
Updated tests to reflect the change, and also introduced a new SparkSessionBuilderSuite that should cover all the branches.