[SPARK-29530][SQL] Make SQLConf in SQL parse process thread safe #26187
ok to test
Test build #112364 has finished for PR 26187 at commit
retest this please
@@ -600,6 +600,7 @@ class SparkSession private(
   * @since 2.0.0
   */
  def sql(sqlText: String): DataFrame = {
    SparkSession.setActiveSession(this)
Who takes responsibility for setting this back if we set the active session in this method? For example, if the current active session is spark1 when spark2.sql(...) is called, the active session becomes spark2. When will we set it back to spark1, or will it always be spark2?
Under normal circumstances a Spark program uses a single SparkSession. If someone writes a program that uses multiple SparkSessions, they should manage the active session themselves.
The analyzer already sets the active session as well:
spark/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
Line 64 in e99a9f7
SparkSession.setActiveSession(sparkSession)
Only SparkSession.cleanupAnyExistingSession() calls SparkSession.clearActiveSession() and SparkSession.clearDefaultSession().
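To make the reviewer's question concrete, here is a minimal sketch of a save-and-restore pattern for a thread-local "active" slot. It is a toy model only: `Session`, `ActiveSession`, and `withActive` are hypothetical names invented for illustration, not Spark classes, and this is not what the PR does.

```scala
// Toy model: hypothetical stand-ins, not Spark classes.
final case class Session(name: String)

object ActiveSession {
  // One slot per thread, similar in spirit to a thread-local active session.
  private val slot = new ThreadLocal[Option[Session]] {
    override def initialValue(): Option[Session] = None
  }

  def get: Option[Session] = slot.get()
  def set(s: Session): Unit = slot.set(Some(s))
  def clear(): Unit = slot.remove()

  // Run `body` with `s` active, then restore whatever was active before,
  // even if `body` throws.
  def withActive[T](s: Session)(body: => T): T = {
    val previous = get
    set(s)
    try body
    finally {
      previous match {
        case Some(p) => set(p)
        case None    => clear()
      }
    }
  }
}

object RestoreDemo {
  def main(args: Array[String]): Unit = {
    val spark1 = Session("spark1")
    val spark2 = Session("spark2")
    ActiveSession.set(spark1)
    // Inside the block the active session is spark2.
    println(ActiveSession.withActive(spark2)(ActiveSession.get))
    // Afterwards the previous value, spark1, is active again.
    println(ActiveSession.get)
  }
}
```

With a helper like this the caller never has to remember to reset the slot. Without one, as the reply above says, the last session that was set simply stays active on that thread.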
@@ -600,6 +600,7 @@ class SparkSession private(
   * @since 2.0.0
   */
  def sql(sqlText: String): DataFrame = {
    SparkSession.setActiveSession(this)
    val tracker = new QueryPlanningTracker
    val plan = tracker.measurePhase(QueryPlanningTracker.PARSING) {
      sessionState.sqlParser.parsePlan(sqlText)
I know that we use SQLConf.get a lot in analyzer/optimizer rules, but is that also true for the parser?
The parser uses SQLConf.get too.
This really does break parser-related configuration when SparkThriftServer runs in multi-threaded mode.
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
Line 94 in e99a9f7
lexer.legacy_setops_precedence_enbled = SQLConf.get.setOpsPrecedenceEnforced
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
Line 95 in e99a9f7
lexer.ansi = SQLConf.get.ansiEnabled
This is the cause of the error reported in #26172.
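The failure mode can be sketched with a toy model, assuming the global lookup resolves through a per-thread slot and falls back to a default when nothing is set on the calling thread. `Conf`, `ActiveConf`, and `GlobalLookupParser` are hypothetical names for illustration, not Spark's actual classes.

```scala
// Toy model: hypothetical stand-ins, not Spark classes.
final case class Conf(ansiEnabled: Boolean)

object ActiveConf {
  // Assumed default used when the calling thread has no active conf.
  private val fallback = Conf(ansiEnabled = false)
  private val slot = new ThreadLocal[Option[Conf]] {
    override def initialValue(): Option[Conf] = None
  }
  def set(c: Conf): Unit = slot.set(Some(c))
  def get: Conf = slot.get().getOrElse(fallback)
}

// A parser that reads its settings through the thread-local lookup.
object GlobalLookupParser {
  def ansiMode(): Boolean = ActiveConf.get.ansiEnabled
}

object LookupDemo {
  // Returns the ANSI setting the parser sees on a fresh worker thread.
  // `activate` controls whether the session's conf is made active first.
  def parseOnWorker(sessionConf: Conf, activate: Boolean): Boolean = {
    var seen = false
    val t = new Thread(new Runnable {
      def run(): Unit = {
        if (activate) ActiveConf.set(sessionConf)
        seen = GlobalLookupParser.ansiMode()
      }
    })
    t.start()
    t.join() // join makes the write to `seen` visible to this thread
    seen
  }
}
```

In this model, a session configured with `Conf(ansiEnabled = true)` is parsed as if ANSI mode were off whenever the worker thread has not activated that session's conf, which is the kind of mismatch described above.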
Shall we define a conf: SQLConf method in ParseDriver so that we can avoid calling SQLConf.get in the parser?
How about the new change? The test for #26172 passes now. Also cc @juliuszsompolski
Test build #112368 has finished for PR 26187 at commit
retest this please
Test build #112383 has finished for PR 26187 at commit
LGTM, can you update the PR title?
Test build #112399 has finished for PR 26187 at commit
Is the PR title OK now? It is brief and shows the purpose.
thanks, merging to master!
@AngersZhuuuu Could you open a PR against the 2.4 branch?
I will do this.
What changes were proposed in this pull request?
As I commented in SPARK-29516, the parse step of SparkSession.sql() does not run under the current SparkSession's conf, so parser-related configuration is not honored in multi-threaded situations.
In this PR, we add a SQLConf parameter to AbstractSqlParser and initialize it with the SessionState's conf.
Each SparkSession's parser then uses its own SessionState's SQLConf, which makes parsing thread safe.
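The idea of handing the conf to the parser at construction time can be sketched as follows. This is a toy model under stated assumptions: `SessionConf`, `ConfAwareParser`, and `ToySession` are hypothetical names, and the real AbstractSqlParser and SessionState are more involved.

```scala
// Toy model: hypothetical stand-ins, not Spark classes.
final case class SessionConf(ansiEnabled: Boolean)

// The parser receives its configuration when it is built,
// so it never consults per-thread global state.
class ConfAwareParser(conf: SessionConf) {
  def ansiMode(): Boolean = conf.ansiEnabled
}

// Each session owns one conf and builds its parser from it.
class ToySession(val conf: SessionConf) {
  val parser = new ConfAwareParser(conf)
}

object InjectionDemo {
  // Returns the ANSI setting the session's parser sees on a fresh thread.
  def parseOnWorker(session: ToySession): Boolean = {
    var seen = false
    val t = new Thread(new Runnable {
      def run(): Unit = { seen = session.parser.ansiMode() }
    })
    t.start()
    t.join() // join makes the write to `seen` visible to this thread
    seen
  }
}
```

Because the conf travels with the parser instance, the result depends only on which session's parser is called, not on which thread calls it or on what that thread last marked as active.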
Why are the changes needed?
Fix bug
Does this PR introduce any user-facing change?
NO
How was this patch tested?
NO