[SPARK-11212][Core][Streaming]Make preferred locations support ExecutorCacheTaskLocation and update… #9181

zsxwing · 2015-10-20T14:00:36Z

… ReceiverTracker and ReceiverSchedulingPolicy to use it

This PR includes the following changes:

Add a new preferred location format, executor_<host>_<executorID> (e.g., "executor_localhost_2"), to support specifying executor locations for an RDD.
Use the new preferred location format in ReceiverTracker to reduce the startup time of Receivers when there are multiple executors on a host.
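
To make the format concrete, here is a minimal sketch of how a string such as "executor_localhost_2" could be split into a host and an executor ID. The object and class names below (PreferredLocationSketch, HostLocation, ExecutorLocation) are hypothetical stand-ins for illustration and are not Spark's own classes. The sketch assumes host names contain no underscores, which holds for RFC 952/1123 host names.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's own classes.
object PreferredLocationSketch {
  sealed trait Location
  final case class HostLocation(host: String) extends Location
  final case class ExecutorLocation(host: String, executorId: String) extends Location

  private val ExecutorPrefix = "executor_"

  /** Parses "executor_<host>_<executorID>"; any other string is treated as a plain host name. */
  def parse(s: String): Location =
    if (s.startsWith(ExecutorPrefix)) {
      s.split("_") match {
        // Exactly three parts are expected: the prefix, the host, and the executor ID.
        case Array(_, host, executorId) => ExecutorLocation(host, executorId)
        case _ => throw new IllegalArgumentException(s"Illegal executor location format: $s")
      }
    } else {
      HostLocation(s)
    }

  /** Renders an executor location back into the string format. */
  def format(loc: ExecutorLocation): String =
    s"$ExecutorPrefix${loc.host}_${loc.executorId}"
}
```

With this sketch, parse("executor_localhost_2") yields an executor-level location, while parse("localhost") falls back to a host-level one.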

The goal of this PR is to enable the streaming scheduler to place receivers (which run as tasks) on specific executors. Basically, I want more control over the placement of receivers so that they are evenly distributed among the executors. We tried to do this without changing the core scheduling logic, but it only allows a preferred location to be specified at the host level, not for a particular executor. So if there are two executors on the same host and I want two receivers to run on them (one on each executor), I cannot specify that. The current code only specifies the host as the preference, which may end up launching both receivers on the same executor. We try to work around this by restarting a receiver when it does not launch on the desired executor and hoping that next time it will start on the right one. But that causes lots of restarts and delays in correctly launching the receivers.

So this change would allow the streaming scheduler to specify the exact executor as the preferred location. Also, this is not exposed to the user; only the streaming scheduler uses it.
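
The two-executors-on-one-host case can be sketched as follows. This is a simplified round-robin assignment written for illustration only, not the scheduling policy this PR implements; ReceiverPlacementSketch, Executor, and assign are hypothetical names.

```scala
// Hypothetical illustration; round-robin is a simplification, not Spark's policy.
object ReceiverPlacementSketch {
  final case class Executor(host: String, executorId: String)

  /** Assigns each receiver ID to one executor in turn, so per-executor counts differ by at most one. */
  def assign(receiverIds: Seq[Int], executors: Seq[Executor]): Map[Int, Executor] = {
    require(executors.nonEmpty, "need at least one executor")
    receiverIds.zipWithIndex.map { case (id, i) =>
      id -> executors(i % executors.size)
    }.toMap
  }
}
```

Given two executors on "localhost" and two receivers, assign places one receiver on each executor. A host-level preference would be "localhost" for both receivers and could not express that distinction.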


zsxwing · 2015-10-20T14:02:20Z

streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverSchedulingPolicy.scala

Use TaskLocation in the return type because a location can be either a host from Receiver.preferredLocation or an ExecutorCacheTaskLocation.
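
One way to picture why a common supertype is needed in the return type: the result may mix host-level and executor-level entries. The sketch below uses hypothetical types and a hypothetical candidates helper, and it does not reproduce the policy's actual logic.

```scala
// Hypothetical types for illustration; not the classes or logic used in Spark.
object CandidateLocationsSketch {
  sealed trait Location
  final case class HostLocation(host: String) extends Location
  final case class ExecutorLocation(host: String, executorId: String) extends Location

  /**
   * A receiver with a preferred host yields a host-level entry;
   * otherwise the known executors are offered directly.
   * Both cases fit the single return type Seq[Location].
   */
  def candidates(
      preferredHost: Option[String],
      executors: Seq[ExecutorLocation]): Seq[Location] =
    preferredHost match {
      case Some(host) => Seq(HostLocation(host))
      case None => executors
    }
}
```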

zsxwing · 2015-10-20T14:18:22Z

I tested this patch with 5 workers, 24 executors, and 24 receivers, and there were no receiver restart logs in the test.

SparkQA · 2015-10-20T16:39:50Z

Test build #43982 has finished for PR 9181 at commit f5e7c4f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2015-10-20T19:38:57Z

core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala

@andrewor14 @kayousterhout
Can you take a look at this change?

Is this necessary because previously, we didn't allow the user to pass in ExecutorCacheTaskLocations and we just tried to figure them out automatically? (and why doesn't that work here?)

The goal of this PR is to enable the streaming scheduler to place receivers (which run as tasks) on specific executors. Basically, I want more control over the placement of receivers so that they are evenly distributed among the executors. We tried to do this without changing the core scheduling logic, but it only allows a preferred location to be specified at the host level, not for a particular executor. So if there are two executors on the same host and I want two receivers to run on them (one on each executor), I cannot specify that. The current code only specifies the host as the preference, which may end up launching both receivers on the same executor. We try to work around this by restarting a receiver when it does not launch on the desired executor and hoping that next time it will start on the right one. But that causes lots of restarts and delays in correctly launching the receivers.

So this change would allow the streaming scheduler to specify the exact executor as the preferred location. Also, this is not exposed to the user; only the streaming scheduler uses it.

Ok sounds good -- can you add this description to the JIRA and to the pull request description (so that it will be in the commit message)? Scheduler changes LGTM.

@zsxwing Please do so. :)

zsxwing · 2015-10-21T00:09:14Z

retest this please

SparkQA · 2015-10-21T01:44:41Z

Test build #44022 has finished for PR 9181 at commit f5e7c4f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

zsxwing · 2015-10-21T01:46:17Z

retest this please

SparkQA · 2015-10-21T03:29:17Z

Test build #44028 has finished for PR 9181 at commit f5e7c4f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

zsxwing · 2015-10-21T03:36:17Z

retest this please

SparkQA · 2015-10-21T05:12:58Z

Test build #44035 has finished for PR 9181 at commit f5e7c4f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2015-10-21T05:46:55Z

@zsxwing I think this duplicated the existing issue / PR #9096 . Does this cover the same logic, and, can you include all of the changes here?

zsxwing · 2015-10-21T05:57:10Z

retest this please

zsxwing · 2015-10-21T05:59:15Z

@zsxwing I think this duplicated the existing issue / PR #9096 . Does this cover the same logic, and, can you include all of the changes here?

This is a different issue that is only about ExecutorCacheTaskLocation. #9096 is a minor patch. I can update this one after #9096 gets merged.

SparkQA · 2015-10-21T08:06:53Z

Test build #44048 has finished for PR 9181 at commit f5e7c4f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2015-10-22T11:00:19Z

@zsxwing I merged #9096

SparkQA · 2015-10-22T16:10:45Z

Test build #44148 has finished for PR 9181 at commit 35f7936.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2015-10-26T22:34:41Z

streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverSchedulingPolicy.scala

nit: grammar. "It will try to schedule receivers such that they are evenly distributed"

tdas · 2015-10-26T23:19:07Z

Overall, LGTM, except a few minor refactorings.

zsxwing · 2015-10-27T01:57:26Z

@tdas addressed your comments.

It's probably a good idea to also add executor information, not just the host.

For this one, I prefer to add the executor info in a separate PR and also add it to UI.

zsxwing · 2015-10-27T02:17:33Z

For this one, I prefer to add the executor info in a separate PR and also add it to UI.

Created https://issues.apache.org/jira/browse/SPARK-11333 to track it

SparkQA · 2015-10-27T04:12:16Z

Test build #44390 has finished for PR 9181 at commit 63c66eb.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2015-10-27T08:24:12Z

streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverSchedulingPolicy.scala

nit: shouldn't this map be on the line above? I think .map { loc => would fit.

SparkQA · 2015-10-27T12:29:56Z

Test build #44420 has finished for PR 9181 at commit 46c5197.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2015-10-27T23:13:22Z

Merging this to master. Thanks @zsxwing

Make preferred locations support ExecutorCacheTaskLocation and update…

f5e7c4f

… ReceiverTracker and ReceiverSchedulingPolicy to use it

zsxwing reviewed Oct 20, 2015

tdas reviewed Oct 20, 2015

srowen mentioned this pull request Oct 21, 2015

[SPARK-11121][Core] Correct the TaskLocation type #9096

Closed

zsxwing added 2 commits October 22, 2015 21:55

Merge branch 'master' into executor-location

d4b6ce6

Add a unit test

35f7936

tdas reviewed Oct 26, 2015

Minor refactorings to address comments

63c66eb

tdas reviewed Oct 27, 2015

Fix the style

46c5197

asfgit closed this in 9fbd75a Oct 27, 2015

zsxwing deleted the executor-location branch October 28, 2015 00:30
