Skip to content

[SPARK-26992][STS] Fix STS scheduler pool correct delivery#23895

Closed
cxzl25 wants to merge 2 commits intoapache:masterfrom
cxzl25:thrift_server_scheduler_pool_pollute
Closed

[SPARK-26992][STS] Fix STS scheduler pool correct delivery#23895
cxzl25 wants to merge 2 commits intoapache:masterfrom
cxzl25:thrift_server_scheduler_pool_pollute

Conversation

@cxzl25
Copy link
Contributor

@cxzl25 cxzl25 commented Feb 26, 2019

What changes were proposed in this pull request?

The user sets the value of spark.sql.thriftserver.scheduler.pool.
Spark thrift server saves this value in the LocalProperty of threadlocal type, but does not clean up after running, causing other sessions to run in the previously set pool name.

How was this patch tested?

manual tests

@cxzl25
Copy link
Contributor Author

cxzl25 commented Mar 6, 2019

The user sets the value of spark.sql.thriftserver.scheduler.pool.
Spark thrift server saves this value in the LocalProperty of threadlocal type, but does not clean up after running, causing other sessions to run in the previously set pool name.

For example
The second session does not manually set the pool name. The default pool name should be used, but the pool name of the previous user's settings is used. This is incorrect.

error_session

error_stage

Fix

expect_session

expect_stage

@cxzl25
Copy link
Contributor Author

cxzl25 commented Apr 2, 2019

@srowen
I have already opened the RP.
Can get the current sql commit to which scheduling pool from spark ui.


private def withSchedulerPool[T](body: => T): T = {
val pool = sessionToActivePool.get(parentSession.getSessionHandle)
if (pool != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this looks like it undoes the previous fix, you're saying it will always be null when this method starts right? It seems easier to unconditionally set it, unless there's some overhead I'm missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because only the user submitted sql has set spark.sql.thriftserver.scheduler.pool=xxx, sessionToActivePool will put xxx value, will not be the default scheduler, so the default scheduler pool name is null

Unconditional setting it is worried about ambiguity, because only after the put value, there will be a need to take this value, although there is no overhead

case SetCommand(Some((SQLConf.THRIFTSERVER_POOL.key, Some(value)))) =>
sessionToActivePool.put(parentSession.getSessionHandle, value)
logInfo(s"Setting ${SparkContext.SPARK_SCHEDULER_POOL}=$value for future statements " +
"in this session.")

@SparkQA
Copy link

SparkQA commented Apr 2, 2019

Test build #4678 has finished for PR 23895 at commit 8ae9369.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 3, 2019

It seems reasonable; @caneGuy do you have thoughts on this? would it still address your issue?

@srowen
Copy link
Member

srowen commented Apr 6, 2019

Merged to master

@srowen srowen closed this in 6450c59 Apr 6, 2019
@caneGuy
Copy link
Contributor

caneGuy commented Jun 10, 2019

@srowen sorry for replying so late. I think this solution is reasonable. Thanks @cxzl25

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants