This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

set RestartPolicy=Never for executor #367

Merged

Conversation


@honkiko honkiko commented Jul 4, 2017

In the current implementation, the RestartPolicy of the executor pod is not set, so the Kubernetes default ("Always") applies and a failed executor container is restarted in place. This causes a problem.

If an executor terminates unexpectedly, for example by exiting with a java.lang.OutOfMemoryError, it is restarted by Kubernetes with the same executor ID. When the new executor tries to fetch a block held by the previous executor, ShuffleBlockFetcherIterator.splitLocalRemoteBlocks() treats it as a local block and tries to read it from its local directory. However, the executor's local directory has changed, because a randomly generated ID is part of the directory path. A FetchFailedException is raised and the stage fails.

The resulting error message:

17/06/29 01:54:56 WARN KubernetesTaskSetManager: Lost task 0.1 in stage 2.0 (TID 7, 172.16.75.92, executor 1): FetchFailed(BlockManagerId(1, 172.16.75.92, 40539, None), shuffleId=2, mapId=0, reduceId=0, message=
org.apache.spark.shuffle.FetchFailedException: /data2/spark/blockmgr-0e228d3c-8727-422e-aa97-2841a877c42a/32/shuffle_2_0_0.index (No such file or directory)
        at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:357)
        at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
        at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:54)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
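
To make the failure mode concrete, here is a minimal, self-contained sketch; the types and names (BlockAddress, splitLocalRemote) are hypothetical stand-ins for Spark's internals, not the actual implementation:

object LocalRemoteSplitSketch {
  // Hypothetical stand-in for Spark's BlockManagerId; not Spark's actual type.
  case class BlockAddress(executorId: String, host: String)

  // Mirrors the idea behind ShuffleBlockFetcherIterator.splitLocalRemoteBlocks():
  // blocks whose address carries this executor's ID are treated as local.
  def splitLocalRemote(
      myExecutorId: String,
      blocks: Seq[(BlockAddress, String)]): (Seq[String], Seq[String]) = {
    val (local, remote) = blocks.partition { case (addr, _) => addr.executorId == myExecutorId }
    (local.map(_._2), remote.map(_._2))
  }

  def main(args: Array[String]): Unit = {
    // A shuffle block written by executor "1" before Kubernetes restarted its pod.
    val blocks = Seq((BlockAddress("1", "172.16.75.92"), "shuffle_2_0_0"))
    // The restarted container reuses executor ID "1", so the block is classified
    // as local and looked up under the new blockmgr-<random-uuid> directory,
    // where the file no longer exists; hence the FetchFailedException above.
    val (local, remote) = splitLocalRemote(myExecutorId = "1", blocks = blocks)
    println(s"local=$local remote=$remote") // local=List(shuffle_2_0_0) remote=List()
  }
}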

What changes were proposed in this pull request?

set RestartPolicy=Never for executor
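
For illustration, a minimal sketch of what the change amounts to, assuming the fabric8 kubernetes-client PodBuilder used by this backend; the pod and container names below are placeholders, not the exact code in KubernetesClusterSchedulerBackend.scala:

import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}

object ExecutorPodSketch {
  // Builds an executor pod spec with RestartPolicy explicitly set to Never,
  // so a crashed executor is not restarted in place by Kubernetes; the Spark
  // driver is left to request a replacement executor with a fresh ID instead.
  def buildExecutorPod(executorId: String, image: String): Pod =
    new PodBuilder()
      .withNewMetadata()
        .withName(s"spark-exec-$executorId") // placeholder naming scheme
      .endMetadata()
      .withNewSpec()
        .withRestartPolicy("Never") // the fix proposed in this PR
        .addNewContainer()
          .withName("executor")
          .withImage(image)
        .endContainer()
      .endSpec()
      .build()
}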

How was this patch tested?

Tested on my local testbed; the executor RestartPolicy is now "Never". This should work together with #244.
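
One way to check this (a hedged sketch, assuming an executor pod named "spark-exec-1" in the default namespace; substitute the real pod name and namespace) is to read spec.restartPolicy back from the API server:

import io.fabric8.kubernetes.client.DefaultKubernetesClient

object VerifyRestartPolicy {
  def main(args: Array[String]): Unit = {
    // Resolves the connection from the local kubeconfig or in-cluster config.
    val client = new DefaultKubernetesClient()
    try {
      val pod = client.pods().inNamespace("default").withName("spark-exec-1").get()
      // With this patch applied, the expected output is "Never".
      println(pod.getSpec.getRestartPolicy)
    } finally {
      client.close()
    }
  }
}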

@erikerlandson
Member

This seems like it will work in the Dynamic Allocation scenario; if dynamic allocation isn't on, does this mean the cluster size remains diminished by one?

@foxish
Member

foxish commented Jul 6, 2017

We added this for the driver in #303, but the expectation was that the executor would recover. However, it seems that having it reconnect with the same ID doesn't really let us accomplish that. This is okay as long as executor recovery semantics handle this and respawn lost executors anyway. @varunkatta, can you confirm that we'll get the same behavior even if we disable executor restarts?

@foxish
Member

foxish commented Jul 7, 2017

/cc @kimoonkim

@ash211

ash211 commented Jul 19, 2017

Minor merge conflict, but +1 from me. We shouldn't be restarting executors with the same ID, so it's better to do the restart at the Spark driver level than in k8s.

@foxish
Member

foxish commented Jul 19, 2017

Will merge on green

@foxish foxish merged commit e1ff2f0 into apache-spark-on-k8s:branch-2.1-kubernetes Jul 19, 2017
@varunkatta
Member

Executor recovery should work even with restartPolicy set to Never. This change is good to go, I think.

foxish pushed a commit that referenced this pull request Jul 24, 2017
* set RestartPolicy=Never for executor

* Update KubernetesClusterSchedulerBackend.scala
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019