[SPARK-16817][CORE][WIP] Use Alluxio to improve stability of shuffle by replication of shuffle data #22005

Chopinxb · 2018-08-06T08:56:17Z

What changes were proposed in this pull request?

In this PR, I propose using Alluxio to help store shuffle data in order to improve the stability of complicated OLAP tasks.
Motivation
Currently, when a shuffle fetch fails (for example, because the NodeManager hosting the shuffle service crashed), Spark reruns the previous stage to regenerate the shuffle data. This works well, but in some cases the cost of recomputation is unacceptable.
With this PR, when a shuffle fetch fails, the reducer retries fetching the shuffle data from Alluxio to avoid recomputation.
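
The fallback described above can be sketched roughly as follows. This is only an illustration of the idea in Python, not the code in this patch: `FetchFailed`, `fetch_block`, and the `primary`/`replica` callables are hypothetical names standing in for the shuffle service and the copy kept in Alluxio.

```python
class FetchFailed(Exception):
    """Raised when a shuffle block cannot be read from a source (hypothetical)."""


def fetch_block(block_id, primary, replica, replica_enabled=True):
    """Return (data, source) for block_id, preferring the primary source.

    primary and replica are hypothetical callables that map a block id to
    bytes; they stand in for the shuffle service and the replicated copy.
    """
    try:
        return primary(block_id), "primary"
    except FetchFailed:
        if not replica_enabled:
            # With the feature off, the failure propagates and the caller
            # has to rerun the map stage to regenerate the data.
            raise
        # With the feature on, retry against the replica before giving up.
        return replica(block_id), "replica"
```

If the replica read also fails, the exception still propagates, so the existing stage-rerun path would remain the last resort in this sketch.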
Usage

Enable this feature in spark-defaults.conf:
spark.alluxio.shuffle.enabled true

How was this patch tested?

Manual tests.

AmplabJenkins · 2018-08-06T08:58:03Z

Can one of the admins verify this patch?

jerryshao · 2018-08-13T02:45:41Z

I believe this kind of PR requires an SPIP and community discussion first.

XiaoBang added 3 commits August 6, 2018 13:10

use alluxio to improve stability of shuffle

a4371cf

update style

6565988

update style

20cabe1

Merge branch 'master' into spark-shuffle-alluxio

b6fde21

srowen mentioned this pull request Nov 10, 2018

[INFRA] Close stale PRs #23001

Closed

asfgit closed this in a3ba3a8 Nov 11, 2018
