[SPARK-42667][CONNECT][FOLLOW-UP] SparkSession created by newSession should not share the channel #40346
@@ -164,7 +165,7 @@ private[sql] class SparkConnectClient(
   }

   def copy(): SparkConnectClient = {
-    new SparkConnectClient(userContext, channel, userAgent)
+    SparkConnectClient.builder().connectionString(connectString).build()
This only works if you use only a connection string, right? What about SSL, or when the host and port are set manually? Can't we just pass the ChannelBuilder around?
If you look at the current SparkSession builder, I think there are only two ways to build a SparkSession:
- with a ConnectionString.
- with nothing but the default build.
That is why I only keep the ConnectString. But if you think keeping the builder to preserve everything is better, I can update.
spark/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala
Line 497 in a77bb37
class Builder() extends Logging {
I don't buy that argument. The problem is that the client builder supports all these things (for good reason); now we are making the assumption that this thing will only be used to create a SparkSession, while there are also other use cases.
Please keep the builder.
Updated to use the builder, so we can copy the entire channel setup to the new client.
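The idea discussed in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: `ToyChannel`, `ToyChannelBuilder`, and `ToyClient` are hypothetical stand-ins for the gRPC channel, the client's channel builder, and `SparkConnectClient`, and the host and port values are placeholders. It assumes the client keeps its builder and replays it in `copy()`, so that the copy gets its own channel with the same settings (for example SSL) however they were originally supplied.

```scala
// Hypothetical stand-ins; none of these are the real Spark Connect or gRPC classes.
final class ToyChannel(val target: String, val useSsl: Boolean) {
  private var open = true
  def shutdown(): Unit = open = false
  def isShutdown: Boolean = !open
}

// Holds the full connection configuration so that it can be replayed later.
final case class ToyChannelBuilder(host: String, port: Int, useSsl: Boolean = false) {
  def build(): ToyChannel = new ToyChannel(s"$host:$port", useSsl)
}

final class ToyClient(builder: ToyChannelBuilder) {
  // Each client builds its own channel from the builder.
  private val channel: ToyChannel = builder.build()

  // Replaying the builder gives the copy a fresh channel with identical settings.
  def copy(): ToyClient = new ToyClient(builder)

  def shutdown(): Unit = channel.shutdown()
  def isActive: Boolean = !channel.isShutdown
  def usesSsl: Boolean = channel.useSsl
}

object CopyDemo {
  def main(args: Array[String]): Unit = {
    val first = new ToyClient(ToyChannelBuilder("localhost", 15002, useSsl = true))
    val second = first.copy()
    first.shutdown()
    println(s"first active: ${first.isActive}")
    println(s"second active: ${second.isActive}, ssl kept: ${second.usesSsl}")
  }
}
```

In this sketch, shutting down `first` leaves `second` usable, and the SSL flag carries over because the copy is built from the stored builder instead of from a connection string alone.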
@hvanhovell do you know how to fix this?
In fact, the complete error message says how to fix it....
@hvanhovell can you take another look?
Merging.
…should not share the channel

### What changes were proposed in this pull request?
SparkSession created by newSession should not share the channel. A SparkSession may have `stop` called on it, which shuts down the channel it uses. If the channel is shared, the other SparkSessions that share it and have not been stopped will no longer work.

### Why are the changes needed?
This fixes the issue where stopping one SparkSession causes other active SparkSessions to stop working in Spark Connect.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing UT

Closes #40346 from amaliujia/rw-session-do-not-share-channel.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit e5f56e5)
Signed-off-by: Herman van Hovell <herman@databricks.com>
What changes were proposed in this pull request?
SparkSession created by newSession should not share the channel. A SparkSession may have `stop` called on it, which shuts down the channel it uses. If the channel is shared, the other SparkSessions that share it and have not been stopped will no longer work.
Why are the changes needed?
This fixes the issue where stopping one SparkSession causes other active SparkSessions to stop working in Spark Connect.
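To make the failure mode concrete, here is a small illustrative sketch of the pre-fix behaviour. `SharedChannel` and `SketchSession` are hypothetical stand-ins, not the real Spark Connect classes; the only assumption taken from the description is that `newSession` reuses the parent's channel and `stop` shuts that channel down.

```scala
// Hypothetical stand-ins, not the real Spark Connect classes.
final class SharedChannel {
  private var open = true
  def shutdown(): Unit = open = false
  def isOpen: Boolean = open
}

final class SketchSession(channel: SharedChannel) {
  // Pre-fix behaviour: the new session reuses the same channel object.
  def newSession(): SketchSession = new SketchSession(channel)

  // Stopping a session shuts down whatever channel it holds.
  def stop(): Unit = channel.shutdown()

  def run(query: String): String = {
    if (!channel.isOpen) throw new IllegalStateException("channel is shut down")
    s"ran: $query"
  }
}

object SharedChannelDemo {
  def main(args: Array[String]): Unit = {
    val a = new SketchSession(new SharedChannel)
    val b = a.newSession()
    b.stop() // also closes the channel that `a` depends on
    val result = scala.util.Try(a.run("SELECT 1"))
    println(result.isFailure) // prints "true"
  }
}
```

Session `a` was never stopped, yet its query fails because `b.stop()` closed the channel they both hold.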
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing UT