
[SPARK-13906] Ensure that there are at least 2 dispatcher threads. #11728

Closed · wants to merge 1 commit into apache:master from yonran:SPARK-13906

Conversation


@yonran commented Mar 15, 2016

What changes were proposed in this pull request?

Force at least two dispatcher-event-loop threads. Since SparkDeploySchedulerBackend (in AppClient) calls askWithRetry on CoarseGrainedScheduler in the same process, the driver needs at least two dispatcher threads to keep the dispatcher thread from hanging.
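The failure mode above can be reproduced with a minimal JDK sketch (this is not Spark's actual Dispatcher code, just an illustration): a handler running on a one-thread pool blocks waiting for a reply that only that same pool can produce, much like a blocking askWithRetry serviced by a single dispatcher thread.

```java
import java.util.concurrent.*;

public class DispatcherDeadlockDemo {
    // Returns true if a handler that blocks on a same-pool reply starves:
    // with one thread the inner task can never be scheduled while the
    // outer handler occupies the thread, so the inner get() times out.
    static boolean deadlocks(int numThreads) throws Exception {
        ExecutorService dispatcher = Executors.newFixedThreadPool(numThreads);
        try {
            Future<String> reply = dispatcher.submit(() -> {
                // The handler sends a request back into the same pool and
                // blocks for the answer -- like an in-process askWithRetry.
                Future<String> inner = dispatcher.submit(() -> "pong");
                return inner.get(300, TimeUnit.MILLISECONDS);
            });
            return !"pong".equals(reply.get()); // reply arrived: no deadlock
        } catch (ExecutionException e) {
            // The inner task never ran before the timeout.
            return e.getCause() instanceof TimeoutException;
        } finally {
            dispatcher.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread: deadlocks = " + deadlocks(1));  // true
        System.out.println("2 threads: deadlocks = " + deadlocks(2)); // false
    }
}
```

With two threads the inner task runs on the second thread and the blocked handler wakes up, which is exactly why the patch forces a minimum of two.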

How was this patch tested?

Manual.

Author: Yonathan Randolph <yonathan@gmail.com>


srowen commented Mar 15, 2016

@zsxwing do you have an opinion?
I was thinking the max would be applied just to availableProcessors(), but, yeah, maybe it's a good idea to not even let the user set it to 1. Is there any legitimate reason to do that? If we can think of one, then this can more conservatively apply the max(2, ...) to just the default value.


yonran commented Mar 15, 2016

It may be legitimate to set spark.rpc.netty.dispatcher.numThreads=1 for processes other than the driver (sparkMaster, sparkWorker).


srowen commented Mar 15, 2016

OK then I'd move the max inside, so that the property value can still override it to 1 if needed.
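That suggestion could look roughly like this (a hypothetical helper, not Spark's actual configuration code): clamp only the computed default to at least 2, and let an explicitly set spark.rpc.netty.dispatcher.numThreads pass through verbatim, even as 1 for a master or worker process.

```java
import java.util.Properties;

public class DispatcherThreadCount {
    // Hypothetical sketch of "move the max inside": the user-set property
    // wins as-is; only the default derived from the core count is clamped.
    static int numThreads(Properties conf, int availableProcessors) {
        String set = conf.getProperty("spark.rpc.netty.dispatcher.numThreads");
        if (set != null) {
            return Integer.parseInt(set);        // explicit override, even 1
        }
        return Math.max(2, availableProcessors); // default: never below 2
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(numThreads(conf, 1)); // 1-core box, default -> 2
        conf.setProperty("spark.rpc.netty.dispatcher.numThreads", "1");
        System.out.println(numThreads(conf, 8)); // explicit override -> 1
    }
}
```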


zsxwing commented Mar 15, 2016

I'm ok with max(2, availableProcessors).

However, the root cause is that there are blocking calls in the event loops without enough threads to serve them. This could happen in other places too (such as Netty or Akka). Ideally we should avoid blocking calls in all event loops, but it's hard to find all of them in such a huge code base :(
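For contrast, the non-blocking style alluded to here chains a continuation instead of parking the thread on get(), so even a single event-loop thread cannot starve itself (again a JDK sketch, not Spark code):

```java
import java.util.concurrent.*;

public class NonBlockingAsk {
    // The handler registers a continuation for the reply instead of
    // blocking on it, so the lone event-loop thread stays free to run
    // the "pong" task and then the continuation.
    static String askAndHandle() throws Exception {
        ExecutorService loop = Executors.newFixedThreadPool(1);
        try {
            CompletableFuture<String> reply = CompletableFuture
                .supplyAsync(() -> "pong", loop)            // the in-process "ask"
                .thenApplyAsync(r -> "handled " + r, loop); // runs later, same loop
            // Only the caller's thread waits here, never the event loop.
            return reply.get(1, TimeUnit.SECONDS);
        } finally {
            loop.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(askAndHandle()); // -> handled pong
    }
}
```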

@yonran force-pushed the SPARK-13906 branch 2 times, most recently from d8234d8 to b746177 on March 15, 2016 at 17:37

yonran commented Mar 15, 2016

Changed to math.max(2, Runtime.getRuntime.availableProcessors())

## What changes were proposed in this pull request?

Create at least two dispatcher-event-loop threads by default. Since SparkDeploySchedulerBackend (in AppClient) calls askWithRetry to CoarseGrainedScheduler in the same process, the driver needs at least two dispatcher threads to prevent the dispatcher thread from hanging.

## How was this patch tested?

Manual

Author: Yonathan Randolph <yonathan@gmail.com>

srowen commented Mar 15, 2016

Jenkins test this please


SparkQA commented Mar 15, 2016

Test build #53208 has finished for PR 11728 at commit 597f0e6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


srowen commented Mar 16, 2016

Merged to master

@asfgit closed this in 05ab294 on Mar 16, 2016
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
Author: Yonathan Randolph <yonathan@gmail.com>

Author: Yonathan Randolph <yonathan@liftigniter.com>

Closes apache#11728 from yonran/SPARK-13906.