From 05a3f451c37f1586fc9309b7ee18cc3941f0ea03 Mon Sep 17 00:00:00 2001
From: Michael Gummelt
Date: Mon, 25 Apr 2016 11:19:56 -0700
Subject: [PATCH] Fix dynamic allocation docs to address cached data.

---
 docs/job-scheduling.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/job-scheduling.md b/docs/job-scheduling.md
index 083c020caa5db..40b6cd99cc27f 100644
--- a/docs/job-scheduling.md
+++ b/docs/job-scheduling.md
@@ -158,8 +158,9 @@ executors will fetch shuffle files from the service instead of from each other.
 shuffle state written by an executor may continue to be served beyond the executor's lifetime.
 
 In addition to writing shuffle files, executors also cache data either on disk or in memory.
-When an executor is removed, however, all cached data will no longer be accessible. There is
-currently not yet a solution for this in Spark 1.2. In future releases, the cached data may be
+When an executor is removed, however, all cached data will no longer be accessible. To mitigate this,
+by default executors containing cached data are never removed. You can configure this behavior with
+`spark.dynamicAllocation.cachedExecutorIdleTimeout`. In future releases, the cached data may be
 preserved through an off-heap storage similar in spirit to how shuffle files are preserved through
 the external shuffle service.
 