Conversation

@q2w commented Jun 14, 2021

What changes were proposed in this pull request?

This PR adds a config that makes the block manager decommissioner migrate block data to disk only.
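
For readers unfamiliar with Spark internals, a minimal sketch of what such a config entry could look like is below, following the `ConfigBuilder` pattern Spark uses for core configs. The key name, doc text, version, and default are assumptions for illustration, not the PR's actual definition.

```scala
// Sketch only: ConfigBuilder is Spark-internal (private[spark]), so a real
// definition would live inside Spark's own source tree, for example in the
// org.apache.spark.internal.config package object.
package org.apache.spark.internal.config

object DecommissionConfigSketch {
  // Hypothetical entry: the key, doc text, version, and default are assumptions.
  val MIGRATE_BLOCKS_TO_DISK_ONLY =
    ConfigBuilder("spark.storage.decommission.migrateBlocksToDiskOnly")
      .doc("When true, blocks migrated during block manager decommissioning " +
        "are stored on disk only on the receiving executor, so they never " +
        "evict blocks already cached in memory there.")
      .version("3.2.0")
      .booleanConf
      .createWithDefault(false)
}
```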

Why are the changes needed?

While migrating block data, if enough memory is not available on the peer block managers, existing blocks are dropped to make room. After this PR, migrating blocks won't drop any existing blocks.
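
To make the mechanism concrete, here is a hedged sketch of the decision the decommissioner would make when offloading a cached block; the object, function, and flag names are hypothetical, not taken from the PR.

```scala
import org.apache.spark.storage.StorageLevel

object MigrationLevelSketch {
  // Hypothetical helper (not the PR's code): choose the storage level used
  // when offloading a cached block to a peer during decommissioning.
  // Writing DISK_ONLY means the receiving block manager needs no free
  // memory, so none of its existing in-memory blocks are dropped.
  def migrationStorageLevel(
      originalLevel: StorageLevel,
      diskOnlyMigration: Boolean): StorageLevel = {
    if (diskOnlyMigration) StorageLevel.DISK_ONLY else originalLevel
  }
}
```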

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT in BlockManagerSuite

@github-actions github-actions bot added the CORE label Jun 14, 2021
@q2w q2w marked this pull request as ready for review June 14, 2021 11:05
@q2w q2w changed the title [SPARK-35754][CORE][WIP] Add config to put migrating blocks on disk only [SPARK-35754][CORE] Add config to put migrating blocks on disk only Jun 14, 2021
@q2w (Author) commented Jun 16, 2021

@Ngone51 Please have a look.

@dongjoon-hyun (Member) commented:

cc @holdenk

@dongjoon-hyun (Member) commented:

ok to test

@dongjoon-hyun (Member) commented Jun 27, 2021

Could you check the UT failure, @q2w? If it passes in your environment, please re-trigger the GitHub Action run.

[info] *** 1 TEST FAILED ***
[error] Failed: Total 3061, Failed 1, Errors 0, Passed 3060, Ignored 7, Canceled 1
[error] Failed tests:
[error] 	org.apache.spark.storage.BlockManagerDecommissionUnitSuite

…in number of arguments in replicateBlocks method in BlockManager
@q2w q2w closed this Jun 29, 2021
@q2w q2w reopened this Jun 29, 2021
@HyukjinKwon (Member) commented:

@q2w, can you rebase and create a new PR? It seems the GitHub Actions run in this PR is somehow messed up.

@SparkQA commented Jun 29, 2021

Test build #140375 has finished for PR 32902 at commit c462df9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm (Contributor) commented Jun 30, 2021

This would be inconsistent with the expectation of the storage level set on the RDD/DF.
Instead, what we can do is prefer disk when it is available as an option (i.e., if storageLevel.useDisk == true).
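
A hedged sketch of the alternative being suggested here, with illustrative names that are not the PR's code: keep the user's storage level, and only fall back to a disk-only replica when that level already permits disk.

```scala
import org.apache.spark.storage.StorageLevel

object PreferDiskSketch {
  // Hypothetical helper for the behavior suggested above: if the block's
  // level already permits disk (useDisk == true), migrate the replica as
  // DISK_ONLY so the peer's memory is untouched; otherwise keep the
  // original level, staying consistent with what the user requested.
  def preferDiskForMigration(level: StorageLevel): StorageLevel = {
    if (level.useDisk) StorageLevel.DISK_ONLY else level
  }
}
```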

@q2w (Author) commented Jul 1, 2021

> This would be inconsistent with the expectation of the storage level set on the RDD/DF.
> Instead, what we can do is prefer disk when it is available as an option (i.e., if storageLevel.useDisk == true).

@mridulm That's correct. In the current decommissioning flow, the blocks being offloaded can replace existing cached blocks on a peer block manager if enough memory is not available there; we save one cached block by losing another.
With this change in place, we avoid losing existing cached blocks just to make room for the cached blocks being offloaded.
This PR came out of the discussion on another related PR: #32677.

@github-actions bot commented:

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Oct 10, 2021
@github-actions github-actions bot closed this Oct 11, 2021