[SPARK-10469][DOC] Try and document the three options #8638

Closed

holdenk
Contributor

@holdenk holdenk commented Sep 7, 2015

From JIRA:
Add documentation for tungsten-sort.
From the mailing list "I saw a new "spark.shuffle.manager=tungsten-sort" implemented in
https://issues.apache.org/jira/browse/SPARK-7081, but it can't be found its
corresponding description in
http://people.apache.org/~pwendell/spark-releases/spark-1.5.0-rc3-docs/configuration.html (Currently
there are only 'sort' and 'hash' two options)."

@holdenk holdenk changed the title [SPARK-10469] Try and document the three options [SPARK-10469][DOC] Try and document the three options Sep 7, 2015
@SparkQA

SparkQA commented Sep 7, 2015

Test build #42084 has finished for PR 8638 at commit 4978681.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Implementation to use for shuffling data. There are three implementations available:
<code>sort</code>, <code>hash</code> and the new <code>tungsten-sort</code>.
Sort-based shuffle is more memory-efficient and is the default option starting in 1.2.
Tungsten-sort is similar to sort based shuffle, with a direct binary cache-friendly
Member

Nits: "to the sort-based". Does this need qualification as being experimental, and available from 1.5 onward?
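
For anyone reading along, here is a minimal sketch of how one of the three values named in the snippet above might be passed to `spark-submit` through its `--conf` flag. The helper name `shuffle_manager_conf` and the script name `app.py` are hypothetical, and the check covers only the three short names mentioned in this PR; it is an illustration, not part of the patch.

```python
# The three short names documented in this PR (Spark 1.5 era).
SHUFFLE_MANAGERS = ("sort", "hash", "tungsten-sort")


def shuffle_manager_conf(name):
    """Return spark-submit arguments that select the given shuffle manager.

    Hypothetical helper: it only accepts the three short names listed above.
    """
    if name not in SHUFFLE_MANAGERS:
        raise ValueError(
            "unknown shuffle manager %r, expected one of %s"
            % (name, ", ".join(SHUFFLE_MANAGERS))
        )
    return ["--conf", "spark.shuffle.manager=" + name]


# Build a command line for a hypothetical application script.
command = ["spark-submit"] + shuffle_manager_conf("tungsten-sort") + ["app.py"]
print(" ".join(command))
```

The same key could presumably also go in `conf/spark-defaults.conf`; the command-line form is used here only because it keeps the example self-contained.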

@SparkQA

SparkQA commented Sep 7, 2015

Test build #42088 has finished for PR 8638 at commit 0aada24.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor

LGTM, merging into master and 1.5.

asfgit pushed a commit that referenced this pull request Sep 10, 2015
Author: Holden Karau <holden@pigscanfly.ca>

Closes #8638 from holdenk/SPARK-10469-document-tungsten-sort.

(cherry picked from commit a76bde9)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in a76bde9 Sep 10, 2015