Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29530][SQL] Make SQLConf in SQL parse process thread safe #26187

Closed
wants to merge 2 commits into from

Conversation

AngersZhuuuu
Copy link
Contributor

@AngersZhuuuu AngersZhuuuu commented Oct 21, 2019

What changes were proposed in this pull request?

As I have comment in SPARK-29516
SparkSession.sql() method parse process not under current sparksession's conf, so some configuration about parser is not valid in multi-thread situation.

In this pr, we add a SQLConf parameter to AbstractSqlParser and initial it with SessionState's conf.
Then for each SparkSession's parser process. It will use's it's own SessionState's SQLConf and to be thread safe

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

NO

How was this patch tested?

NO

@AngersZhuuuu
Copy link
Contributor Author

cc @wangyum @cloud-fan @gatorsmile

@wangyum
Copy link
Member

wangyum commented Oct 21, 2019

ok to test

@wangyum wangyum changed the title Reset active SparkSession before parse sql in SparkSession [SPARK-29530][SQL] Reset active SparkSession before parse sql in SparkSession Oct 21, 2019
@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112364 has finished for PR 26187 at commit f456207.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member

wangyum commented Oct 21, 2019

retest this please

@@ -600,6 +600,7 @@ class SparkSession private(
* @since 2.0.0
*/
def sql(sqlText: String): DataFrame = {
SparkSession.setActiveSession(this)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who takes responsibility to set this back if we set the active session in this method? For example, the current active session is spark1 when calling spark2.sql(...), then the active session will become spark2. When we will set it back to spark1, or it will always be spark2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who takes responsibility to set this back if we set the active session in this method? For example, the current active session is spark1 when calling spark2.sql(...), then the active session will become spark2. When we will set it back to spark1, or it will always be spark2?

Under normal circumstances, our spark program use one SparkSession, if some one write program use multi sparksession, they should control these things.

And for analyzer, it will set this.

And only
SparkSession.cleanupAnyExistingSession() will call

  SparkSession.clearActiveSession()
      SparkSession.clearDefaultSession()

@@ -600,6 +600,7 @@ class SparkSession private(
* @since 2.0.0
*/
def sql(sqlText: String): DataFrame = {
SparkSession.setActiveSession(this)
val tracker = new QueryPlanningTracker
val plan = tracker.measurePhase(QueryPlanningTracker.PARSING) {
sessionState.sqlParser.parsePlan(sqlText)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we use SQLConf.get a lot in analyzer/optimizer rules, but does it also true for the parser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we use SQLConf.get a lot in analyzer/optimizer rules, but does it also true for the parser?

Parser use SQLConf.get too.
It truly make some configuration about parser mess in SparkThriftServer multi thread mode.

lexer.legacy_setops_precedence_enbled = SQLConf.get.setOpsPrecedenceEnforced

It is all about error happened in #26172

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we define a conf: SQLConf method in ParserDriver so that we can avoid calling SQLConf.get in the parser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we define a conf: SQLConf method in ParserDriver so that we can avoid calling SQLConf.get in the parser?

How about new change? Test is ok for #26172 . also cc @juliuszsompolski

@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112368 has finished for PR 26187 at commit f456207.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112383 has finished for PR 26187 at commit f456207.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM, can you update the PR title?

@AngersZhuuuu AngersZhuuuu changed the title [SPARK-29530][SQL] Reset active SparkSession before parse sql in SparkSession [SPARK-29530][SQL] SQLParser use SQLConf parameter instead of SQLConf.get in parser process Oct 21, 2019
@AngersZhuuuu AngersZhuuuu changed the title [SPARK-29530][SQL] SQLParser use SQLConf parameter instead of SQLConf.get in parser process [SPARK-29530][SQL] SQLParser use SessionState's SQLConf variable instead of SQLConf.get in parser process Oct 21, 2019
@AngersZhuuuu AngersZhuuuu changed the title [SPARK-29530][SQL] SQLParser use SessionState's SQLConf variable instead of SQLConf.get in parser process [SPARK-29530][SQL] Add SessionState's conf as AbstractSqlParser's parameter instead of SQLConf.get in parser process make it thread safe Oct 21, 2019
@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112399 has finished for PR 26187 at commit ad9dba8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • abstract class AbstractSqlParser(conf: SQLConf) extends ParserInterface with Logging
  • class CatalystSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)
  • class SparkSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)

@AngersZhuuuu AngersZhuuuu changed the title [SPARK-29530][SQL] Add SessionState's conf as AbstractSqlParser's parameter instead of SQLConf.get in parser process make it thread safe [SPARK-29530][SQL] Make SQLConf in SQL parse process thread safe Oct 22, 2019
@AngersZhuuuu
Copy link
Contributor Author

LGTM, can you update the PR title?

PR Title ok now? brief and show purpose.

@cloud-fan cloud-fan closed this in 484f93e Oct 22, 2019
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@gatorsmile
Copy link
Member

@AngersZhuuuu Could you open a PR against 2.4 branches?

@AngersZhuuuu
Copy link
Contributor Author

@AngersZhuuuu Could you open a PR against 2.4 branches?

I will do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
8 participants