
Conversation

@dongjoon-hyun (Member) commented Jun 4, 2021

### What changes were proposed in this pull request?

This PR aims to change `DiskBlockManager` as follows, so that `ShuffleDataIO` can decide the behavior of shuffle file deletion.

```scala
- private[spark] class DiskBlockManager(conf: SparkConf, deleteFilesOnStop: Boolean)
+ private[spark] class DiskBlockManager(conf: SparkConf, var deleteFilesOnStop: Boolean)
```
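
For illustration only (this sketch is not code from this PR): once `deleteFilesOnStop` is a `var`, a driver-side `ShuffleDataIO` plugin could disable deletion when it takes over shuffle-file lifecycle management. The plugin class below is hypothetical, and it is assumed to live under an `org.apache.spark` package so it can reach the `private[spark]` `DiskBlockManager`.

```scala
package org.apache.spark.shuffle.custom

import java.util.{Collections, Map => JMap}

import org.apache.spark.SparkEnv
import org.apache.spark.shuffle.api.ShuffleDriverComponents

// Hypothetical driver-side plugin: keep shuffle files on stop because an
// external system is expected to manage (and eventually clean up) the data.
class KeepShuffleFilesDriverComponents extends ShuffleDriverComponents {

  override def initializeApplication(): JMap[String, String] = {
    // With deleteFilesOnStop as a var, the plugin can override the default
    // chosen in the BlockManager constructor.
    SparkEnv.get.blockManager.diskBlockManager.deleteFilesOnStop = false
    Collections.emptyMap[String, String]()
  }

  override def cleanupApplication(): Unit = {
    // Cleanup of the retained shuffle files would be coordinated here.
  }
}
```

A full plugin would also provide a `ShuffleDataIO` implementation that returns these driver components and is registered through the shuffle IO plugin configuration (`spark.shuffle.sort.io.plugin.class`).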

### Why are the changes needed?

`SparkContext`:

1. creates `SparkEnv` (with `BlockManager` and its `DiskBlockManager`),
2. loads `ShuffleDataIO`, and
3. initializes the block manager:
```scala
_env = createSparkEnv(_conf, isLocal, listenerBus)
...
_shuffleDriverComponents = ShuffleDataIOUtils.loadShuffleDataIO(config).driver()
_shuffleDriverComponents.initializeApplication().asScala.foreach { case (k, v) =>
  _conf.set(ShuffleDataIOUtils.SHUFFLE_SPARK_CONF_PREFIX + k, v)
}
...
_env.blockManager.initialize(_applicationId)
...
```

`DiskBlockManager` is created first, in the `BlockManager` constructor, so we cannot change `deleteFilesOnStop` later from `ShuffleDataIO`. By switching it to a `var`, we can implement an enhanced shuffle data management feature via `ShuffleDataIO`, like #32730.

```scala
  val diskBlockManager = {
    // Only perform cleanup if an external service is not serving our shuffle files.
    val deleteFilesOnStop =
      !externalShuffleServiceEnabled || executorId == SparkContext.DRIVER_IDENTIFIER
    new DiskBlockManager(conf, deleteFilesOnStop)
  }
```
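
As a quick sketch of what the `var` enables (not a test from this PR, and it only compiles from code inside an `org.apache.spark` package because `DiskBlockManager` is `private[spark]`):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.DiskBlockManager

val conf = new SparkConf()

// Construct with the default decided above (delete files on stop) ...
val dbm = new DiskBlockManager(conf, deleteFilesOnStop = true)

// ... then override it afterwards, which was impossible while the
// parameter was an immutable constructor argument.
dbm.deleteFilesOnStop = false
```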

### Does this PR introduce _any_ user-facing change?

No. This is a private class.

### How was this patch tested?

N/A

@dongjoon-hyun (Member, Author) commented:

Hi, @mridulm and @viirya. I spun this part off from the original PR.
Could you review it independently, please?

@github-actions bot added the CORE label Jun 4, 2021
@SparkQA commented Jun 4, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43861/

@SparkQA commented Jun 4, 2021

Test build #139339 has finished for PR 32784 at commit 1a86882.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member) commented:

cc @Ngone51 too FYI

@dongjoon-hyun (Member, Author) commented:

Thank you so much, @mridulm, @viirya, @HyukjinKwon.
Merged to master.

@dongjoon-hyun deleted the SPARK-35654 branch June 6, 2021 16:21
flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
…teFilesOnStop

Closes apache#32784 from dongjoon-hyun/SPARK-35654.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>