[SPARK-33891][DOCS][CORE] Update dynamic allocation related documents
### What changes were proposed in this pull request?

This PR updates the following:
- Remove the outdated requirement for `spark.shuffle.service.enabled` in `configuration.md`
- Update the dynamic allocation section in `job-scheduling.md`

### Why are the changes needed?

To make the document up-to-date.

### Does this PR introduce _any_ user-facing change?

No, it's a documentation update.

### How was this patch tested?

Manual.

**BEFORE**
![Screen Shot 2020-12-23 at 2 22 04 AM](https://user-images.githubusercontent.com/9700541/102986441-ae647f80-44c5-11eb-97a3-87c2d368952a.png)
![Screen Shot 2020-12-23 at 2 22 34 AM](https://user-images.githubusercontent.com/9700541/102986473-bcb29b80-44c5-11eb-8eae-6802001c6dfa.png)

**AFTER**
![Screen Shot 2020-12-23 at 2 25 36 AM](https://user-images.githubusercontent.com/9700541/102986767-2df24e80-44c6-11eb-8540-e74856a4c313.png)
![Screen Shot 2020-12-23 at 2 21 13 AM](https://user-images.githubusercontent.com/9700541/102986366-8e34c080-44c5-11eb-8054-1efd07c9458c.png)

Closes #30906 from dongjoon-hyun/SPARK-33891.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 47d1aa4)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
dongjoon-hyun authored and HyukjinKwon committed Dec 23, 2020
1 parent a0d51ec commit 6a04775
Showing 2 changed files with 10 additions and 10 deletions.
3 changes: 1 addition & 2 deletions docs/configuration.md
@@ -928,8 +928,7 @@ Apart from these, the following properties are also available, and may be useful
<td>false</td>
<td>
Enables the external shuffle service. This service preserves the shuffle files written by
- executors so the executors can be safely removed. This must be enabled if
- <code>spark.dynamicAllocation.enabled</code> is "true". The external shuffle service
+ executors so the executors can be safely removed. The external shuffle service
must be set up in order to enable it. See
<a href="job-scheduling.html#configuration-and-setup">dynamic allocation
configuration and setup documentation</a> for more information.
17 changes: 9 additions & 8 deletions docs/job-scheduling.md
@@ -79,18 +79,19 @@ are no longer used and request them again later when there is demand. This featu
useful if multiple applications share resources in your Spark cluster.

This feature is disabled by default and available on all coarse-grained cluster managers, i.e.
- [standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html), and
- [Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).
+ [standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html),
+ [Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes) and [K8s mode](running-on-kubernetes.html).


### Configuration and Setup

- There are two requirements for using this feature. First, your application must set
- `spark.dynamicAllocation.enabled` to `true`. Second, you must set up an *external shuffle service*
- on each worker node in the same cluster and set `spark.shuffle.service.enabled` to true in your
- application. The purpose of the external shuffle service is to allow executors to be removed
+ There are two ways for using this feature.
+ First, your application must set both `spark.dynamicAllocation.enabled` and `spark.dynamicAllocation.shuffleTracking.enabled` to `true`.
+ Second, your application must set both `spark.dynamicAllocation.enabled` and `spark.shuffle.service.enabled` to `true`
+ after you set up an *external shuffle service* on each worker node in the same cluster.
+ The purpose of the shuffle tracking or the external shuffle service is to allow executors to be removed
without deleting shuffle files written by them (more detail described
- [below](job-scheduling.html#graceful-decommission-of-executors)). The way to set up this service
- varies across cluster managers:
+ [below](job-scheduling.html#graceful-decommission-of-executors)). While it is simple to enable shuffle tracking, the way to set up the external shuffle service varies across cluster managers:

In standalone mode, simply start your workers with `spark.shuffle.service.enabled` set to `true`.
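
For reference, the two ways of enabling dynamic allocation described in the updated `job-scheduling.md` text can be sketched as two sets of properties. This is only an illustrative sketch: the property names come from the diff above, but the dictionary names and the `to_submit_flags` helper are hypothetical and not part of Spark. It assumes the settings are passed to `spark-submit` as `--conf key=value` pairs.

```python
# Option 1: dynamic allocation with shuffle tracking (no external service).
SHUFFLE_TRACKING = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
}

# Option 2: dynamic allocation with the external shuffle service.
# The service itself must already be set up on each worker node.
EXTERNAL_SHUFFLE_SERVICE = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
}


def to_submit_flags(settings):
    """Hypothetical helper: render properties as spark-submit --conf arguments."""
    flags = []
    for key, value in settings.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Print the flags for each option on its own line.
    print(" ".join(to_submit_flags(SHUFFLE_TRACKING)))
    print(" ".join(to_submit_flags(EXTERNAL_SHUFFLE_SERVICE)))
```

Both options set `spark.dynamicAllocation.enabled`; they differ only in which mechanism keeps shuffle files available after an executor is removed.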
