
[SPARK-21243][Core] Limit no. of map outputs in a shuffle fetch #18691

Closed
wants to merge 1 commit

Conversation

dhruve
Contributor

@dhruve dhruve commented Jul 20, 2017

For configurations with external shuffle enabled, we have observed that if a very large number of blocks is being fetched from a remote host, it puts the NodeManager (NM) under extra pressure and can crash it. This change introduces a configuration, `spark.reducer.maxBlocksInFlightPerAddress`, to limit the number of map outputs being fetched from a given remote address. The change applies both when external shuffle is enabled and when it is disabled.
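
For reference, a minimal sketch of setting this limit when building a job. The config name is the one introduced by this patch; the value 100 and the app name are arbitrary examples, not recommendations:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Cap the number of map output blocks fetched in flight from any single
// remote address. 100 is an arbitrary example value, not a tuned
// recommendation; left unset, the limit is effectively unbounded.
val conf = new SparkConf()
  .setAppName("ShuffleFetchLimitExample") // hypothetical app name
  .set("spark.reducer.maxBlocksInFlightPerAddress", "100")

val sc = new SparkContext(conf)
```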

Ran the job with the default configuration, which does not change the existing behavior, and with a few lower values (10, 20, 50, 100). The job ran fine and there was no change in the output. (I will update the NM-related metrics in some time.)

Author: Dhruve Ashar dhruveashar@gmail.com

Closes #18487 from dhruve/impr/SPARK-21243.
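
The throttling itself happens inside Spark's shuffle fetch path. As a rough, simplified sketch of the idea only (not the actual ShuffleBlockFetcherIterator code; `FetchRequest` and the `send` callback are hypothetical stand-ins), a per-address in-flight counter defers requests once an address is at its limit and replays them as blocks complete:

```scala
import scala.collection.mutable

// Hypothetical stand-in for Spark's internal fetch request type.
case class FetchRequest(address: String, blockIds: Seq[String])

// Simplified per-address throttle; the real logic lives in
// ShuffleBlockFetcherIterator and also splits oversized requests.
class PerAddressLimiter(maxBlocksInFlightPerAddress: Int) {
  private val blocksInFlight =
    mutable.Map.empty[String, Int].withDefaultValue(0)
  private val deferred =
    mutable.Map.empty[String, mutable.Queue[FetchRequest]]

  // Send the request if the target address has capacity, else queue it.
  def trySend(req: FetchRequest)(send: FetchRequest => Unit): Unit = {
    if (blocksInFlight(req.address) + req.blockIds.size <=
        maxBlocksInFlightPerAddress) {
      blocksInFlight(req.address) += req.blockIds.size
      send(req)
    } else {
      deferred.getOrElseUpdate(req.address, mutable.Queue.empty).enqueue(req)
    }
  }

  // Release capacity as blocks arrive and replay any deferred requests
  // that now fit under the limit.
  def onBlocksReceived(address: String, count: Int)
                      (send: FetchRequest => Unit): Unit = {
    blocksInFlight(address) -= count
    while (deferred.get(address).exists(_.nonEmpty) &&
           blocksInFlight(address) +
             deferred(address).head.blockIds.size <=
             maxBlocksInFlightPerAddress) {
      val req = deferred(address).dequeue()
      blocksInFlight(address) += req.blockIds.size
      send(req)
    }
  }
}
```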

@SparkQA

SparkQA commented Jul 20, 2017

Test build #79802 has finished for PR 18691 at commit 036eb17.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dhruve
Contributor Author

dhruve commented Jul 21, 2017

@cloud-fan @tgravescs I have resolved the merge conflicts for 2.2. This was just related to removing the extra configs.

@tgravescs
Contributor

+1

asfgit pushed a commit that referenced this pull request Jul 21, 2017

Author: Dhruve Ashar <dhruveashar@gmail.com>

Closes #18691 from dhruve/branch-2.2.
@tgravescs
Contributor

merged

@dhruve
Contributor Author

dhruve commented Jul 21, 2017

Thanks @tgravescs. Closing the PR.

@dhruve dhruve closed this Jul 22, 2017
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018