[SPARK-19753][CORE] Un-register all shuffle output on a host in case of slave lost or fetch failure #17088
Changes from 1 commit
@@ -394,6 +394,31 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with Timeou
    assertDataStructuresEmpty()
  }

test("All shuffle files should on the slave should be cleaned up when slave lost") { | ||
// reset the test context with the right shuffle service config | ||
afterEach() | ||
val conf = new SparkConf() | ||
conf.set("spark.shuffle.service.enabled", "true") | ||
Review comment: Should we add another test with …
    init(conf)
    runEvent(ExecutorAdded("exec-hostA1", "hostA"))
    runEvent(ExecutorAdded("exec-hostA2", "hostA"))
    runEvent(ExecutorAdded("exec-hostB", "hostB"))
    val shuffleMapRdd = new MyRDD(sc, 3, Nil)
    val shuffleDep = new ShuffleDependency(shuffleMapRdd, new HashPartitioner(1))
    val shuffleId = shuffleDep.shuffleId
    val reduceRdd = new MyRDD(sc, 1, List(shuffleDep), tracker = mapOutputTracker)
    submit(reduceRdd, Array(0))
    complete(taskSets(0), Seq(
      (Success, makeMapStatus("hostA", 1)),
      (Success, makeMapStatus("hostA", 1)),
      (Success, makeMapStatus("hostB", 1))))
    runEvent(ExecutorLost("exec-hostA1", SlaveLost("", true)))
    val mapStatus = mapOutputTracker.mapStatuses.get(0).get.filter(_ != null)
Review comment: I think there are a couple of problems with this test. I think this is better:

    submit(reduceRdd, Array(0))
    // map stage completes successfully, with one task on each executor
    complete(taskSets(0), Seq(
      (Success,
        MapStatus(BlockManagerId("exec-hostA1", "hostA", 12345), Array.fill[Long](1)(2))),
      (Success,
        MapStatus(BlockManagerId("exec-hostA2", "hostA", 12345), Array.fill[Long](1)(2))),
      (Success, makeMapStatus("hostB", 1))
    ))
    // make sure our test setup is correct
    val initialMapStatus = mapOutputTracker.mapStatuses.get(0).get
    assert(initialMapStatus.count(_ != null) === 3)
    assert(initialMapStatus.map{_.location.executorId}.toSet ===
      Set("exec-hostA1", "exec-hostA2", "exec-hostB"))
    // reduce stage fails with a fetch failure from one host
    complete(taskSets(1), Seq(
      (FetchFailed(BlockManagerId("exec-hostA2", "hostA", 12345), shuffleId, 0, 0, "ignored"),
        null)
    ))
    // Here is the main assertion -- make sure that we de-register the map output from
    // both executors on hostA
    val mapStatus = mapOutputTracker.mapStatuses.get(0).get
    assert(mapStatus.count(_ != null) === 1)
    assert(mapStatus(2).location.executorId === "exec-hostB")
    assert(mapStatus(2).location.host === "hostB")

This version fails until you reverse the if / else I pointed out in the DAGScheduler. It would also be nice if this included map output from multiple stages registered on the given host, so you could check that all output is deregistered, not just the one shuffleId which had an error.

Reply: Thanks for providing a better test case, I also modified it to include map output from multiple stages.
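To make the "map output from multiple stages" idea concrete, here is a minimal, self-contained Scala sketch of host-wide deregistration, assuming a deliberately simplified model of the map-output bookkeeping (the object name, type alias, and removeOutputsOnHost helper below are invented for illustration and are not Spark's MapOutputTracker API):

    object HostWideDeregistrationSketch extends App {
      // Each shuffleId maps to its per-partition map output, recorded as
      // Some((executorId, host)), or None once that partition's output has been removed.
      type MapOutputs = Map[Int, Vector[Option[(String, String)]]]

      // Remove every output located on `host`, for *all* shuffles -- not just the shuffle
      // that reported the fetch failure. This is the behavior the extended test should check.
      def removeOutputsOnHost(outputs: MapOutputs, host: String): MapOutputs =
        outputs.map { case (shuffleId, partitions) =>
          shuffleId -> partitions.map {
            case Some((_, h)) if h == host => None
            case other => other
          }
        }

      // Two shuffles, each with one output on hostA and one on hostB.
      val before: MapOutputs = Map(
        0 -> Vector(Some(("exec-hostA1", "hostA")), Some(("exec-hostB", "hostB"))),
        1 -> Vector(Some(("exec-hostA2", "hostA")), Some(("exec-hostB", "hostB"))))

      val after = removeOutputsOnHost(before, "hostA")

      // Output on hostA is gone from *both* shuffles; hostB output survives.
      assert(after(0) == Vector(None, Some(("exec-hostB", "hostB"))))
      assert(after(1) == Vector(None, Some(("exec-hostB", "hostB"))))
    }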
    assert(mapStatus.size === 1)
    assert(mapStatus(0).location.executorId === "exec-hostB")
    assert(mapStatus(0).location.host === "hostB")
  }

test("zero split job") { | ||
var numResults = 0 | ||
var failureReason: Option[Exception] = None | ||
|
Review comment: You could use bmAddress.host, then you wouldn't need to store another execToHost map (though it would require a little more refactoring).
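A rough sketch of that suggestion in Scala, using simplified stand-ins for BlockManagerId and FetchFailed (the case classes below only mirror the fields relevant here; they are not Spark's definitions):

    object FetchFailureHostSketch extends App {
      // Simplified stand-ins for illustration only; Spark's real classes carry more fields.
      case class BlockManagerId(executorId: String, host: String, port: Int)
      case class FetchFailed(bmAddress: BlockManagerId, shuffleId: Int, mapId: Int,
                             reduceId: Int, message: String)

      // With the host available directly on the failure's BlockManagerId, the handler can
      // derive the failed host without maintaining a separate executorId -> host map.
      def failedHost(failure: FetchFailed): String = failure.bmAddress.host

      val failure = FetchFailed(BlockManagerId("exec-hostA2", "hostA", 12345), 0, 0, 0, "ignored")
      assert(failedHost(failure) == "hostA")
      println(s"unregister all shuffle output on ${failedHost(failure)}")
    }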
I'm not sure it's correct to assume that a FetchFailure means that all of the executors on the slave were lost. You could have a failure because one executor died, but the other executors on the host are OK, right? (UPDATED: I realized this is the same comment @mridulm made above)
@kayousterhout - This change applies only when the external shuffle service is enabled; in that case, a fetch failure would mean that the external shuffle service is unavailable, so we should remove all the output on that host, right? In the case when the shuffle service is not enabled, this change should be a no-op.
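A small, self-contained Scala sketch of that argument, assuming a simplified failure handler (the function and data model below are illustrative only and do not reproduce the DAGScheduler's actual code): host-wide removal only when the external shuffle service is enabled, otherwise only the failed executor's output is removed.

    object FetchFailureHandlingSketch extends App {
      // Illustrative model only: output registered per executor, with the host it lives on.
      final case class Output(executorId: String, host: String)

      // If the external shuffle service is enabled, shuffle files on the failed host are served
      // by that service, so a fetch failure implies everything on that host is unreachable and
      // all of its output should be unregistered. Otherwise only the failed executor's output
      // is removed.
      def outputsToUnregister(outputs: Seq[Output],
                              failedExecutor: String,
                              failedHost: String,
                              externalShuffleServiceEnabled: Boolean): Seq[Output] =
        if (externalShuffleServiceEnabled) outputs.filter(_.host == failedHost)
        else outputs.filter(_.executorId == failedExecutor)

      val registered = Seq(
        Output("exec-hostA1", "hostA"),
        Output("exec-hostA2", "hostA"),
        Output("exec-hostB", "hostB"))

      val withService =
        outputsToUnregister(registered, "exec-hostA2", "hostA", externalShuffleServiceEnabled = true)
      val withoutService =
        outputsToUnregister(registered, "exec-hostA2", "hostA", externalShuffleServiceEnabled = false)

      // With the service enabled, both hostA executors lose their registered output...
      assert(withService.map(_.executorId) == Seq("exec-hostA1", "exec-hostA2"))
      // ...without it, only the single failed executor's output is removed.
      assert(withoutService.map(_.executorId) == Seq("exec-hostA2"))
    }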
@squito - Good point, will do.