[SPARK-46947][CORE] Delay memory manager initialization until Driver plugin is loaded #45052
Conversation
FYI, this is a documented API, @sunchao.
I believe we can add a new constructor for the driver-plugin use case only.
I'll add some tests later. Marking as a draft for now to run through all existing tests.
Took a quick look, and it looks fine to me overall.
Please do ping me when it is ready for review though!
Sure, thanks @mridulm in advance!
@@ -77,6 +76,12 @@ class SparkEnv (

  def shuffleManager: ShuffleManager = _shuffleManager

  // We initialize the MemoryManager later in SparkContext after DriverPlugin is loaded
  // to allow the plugin to overwrite memory configurations
  private var _memoryManager: MemoryManager = _
nit: move the definition along with `_shuffleManager` above?
Sure, will do.
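For reference, the deferred-initialization shape under discussion looks roughly like the following self-contained sketch. `EnvSketch` and `MemoryManagerSketch` are illustrative stand-in names, not Spark's classes; `initializeMemoryManager` mirrors the method this PR adds to `SparkEnv`:

```scala
// Sketch of deferring MemoryManager creation until after the driver plugin
// has had a chance to adjust the configuration. Names are illustrative.
class MemoryManagerSketch(conf: Map[String, String], numUsableCores: Int)

class EnvSketch(conf: Map[String, String]) {
  // Starts out null; assigned exactly once from SparkContext after the
  // DriverPlugin has run.
  private var _memoryManager: MemoryManagerSketch = _

  def memoryManager: MemoryManagerSketch = {
    require(_memoryManager != null, "memory manager not initialized yet")
    _memoryManager
  }

  def initializeMemoryManager(numUsableCores: Int): Unit = {
    require(_memoryManager == null, "memory manager already initialized")
    _memoryManager = new MemoryManagerSketch(conf, numUsableCores)
  }
}
```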
 */
 private[spark] class BlockManager(
     val executorId: String,
     rpcEnv: RpcEnv,
     val master: BlockManagerMaster,
     val serializerManager: SerializerManager,
     val conf: SparkConf,
-    memoryManager: MemoryManager,
+    var memoryManager: MemoryManager,
Do we want to follow the same pattern as what `shuffleManager` does here?
I tried that at the beginning, but found that in certain cases there may be race conditions in:

private lazy val memoryManager = Option(_memoryManager).getOrElse(SparkEnv.get.memoryManager)

since a different thread can call `SparkEnv.set` right after the `memoryManager` is updated in the current `SparkEnv`. As a result, the `memoryManager` could be null. This is revealed in `JobCancellationSuite`.

The current approach makes `memoryManager` a mutable field that is updated later, when the driver plugin is loaded.
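To make the hazard concrete, here is a minimal, self-contained sketch of that race. `Env` and `BlockManagerSketch` are illustrative stand-ins, not Spark's classes: the lazy fallback reads the global env on first access, and a concurrent `set` of a not-yet-initialized env makes that read return null.

```scala
// Illustrative sketch of the race described above; stand-in names, not Spark code.
object Env {
  @volatile private var active: Env = _
  def get: Env = active
  def set(e: Env): Unit = active = e
}

class Env {
  // Assigned some time after the Env itself is published via Env.set.
  @volatile var memoryManager: AnyRef = _
}

// On executors the manager is passed in; on the driver it is null and we
// fall back to the global env -- which is where the race lives.
class BlockManagerSketch(passedIn: AnyRef) {
  lazy val memoryManager: AnyRef =
    // If another thread just called Env.set with a fresh Env whose
    // memoryManager has not been assigned yet, this evaluates to null.
    Option(passedIn).getOrElse(Env.get.memoryManager)
}
```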
I am not sure I follow the corner case - can you point me to which test is causing this issue? Thanks!
The `BlockManager` being used would be associated with the corresponding `SparkEnv` - and if `SparkEnv` is being mutated, the new env is what we should be referencing.
Sure. It is "job group with interruption". I think it is flaky though and doesn't always happen; when I tried it locally it didn't always reproduce. The job link: https://github.com/sunchao/spark/actions/runs/7923522243/job/21637267400
[info] - job group with interruption *** FAILED *** (34 milliseconds)
[info] java.lang.NullPointerException: Cannot invoke "org.apache.spark.memory.MemoryManager.maxOnHeapStorageMemory()" because the return value of "org.apache.spark.storage.BlockManager.memoryManager()" is null
[info] at org.apache.spark.storage.BlockManager.maxOnHeapMemory$lzycompute(BlockManager.scala:243)
[info] at org.apache.spark.storage.BlockManager.maxOnHeapMemory(BlockManager.scala:243)
[info] at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:565)
[info] at org.apache.spark.SparkContext.<init>(SparkContext.scala:633)
[info] at org.apache.spark.SparkContext.<init>(SparkContext.scala:159)
[info] at org.apache.spark.SparkContext.<init>(SparkContext.scala:172)
[info] at org.apache.spark.JobCancellationSuite.$anonfun$new$41(JobCancellationSuite.scala:397)
[info] at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
[info] at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
[info] at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
[info] at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
[info] at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
I think we should be able to change `shuffleManager` in the same manner to avoid the potential concurrency issue.
I will need to look more - I don't think my hypothesis is the root cause here ... even though `_env.blockManager.initialize` happens after `_taskScheduler.start()`, `_env.initializeMemoryManager` happens before and should have initialized the state.
The stack trace of the NPE that we saw earlier was part of `SparkContext` initialization ... not an access from a task, right?
Thanks @mridulm for checking! I think that stack trace doesn't reveal the root cause of the issue. I added a bunch of debugging messages in the code and found the task that was causing the issue:
setting active env to org.apache.spark.SparkEnv@5ab3ee8b in pool-1-thread-1-ScalaTest-running-JobCancellationSuite
active env = org.apache.spark.SparkEnv@5ab3ee8b, thread = Executor task launch worker for task 0.0 in stage 0.0 (TID 0)
java.base/java.lang.Thread.getStackTrace(Thread.java:1619)
org.apache.spark.storage.BlockManager.memoryManager$lzycompute(BlockManager.scala:210)
org.apache.spark.storage.BlockManager.memoryManager(BlockManager.scala:204)
org.apache.spark.storage.BlockManager.memoryStore$lzycompute(BlockManager.scala:248)
org.apache.spark.storage.BlockManager.memoryStore(BlockManager.scala:247)
org.apache.spark.scheduler.Task.$anonfun$run$3(Task.scala:146)
org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
org.apache.spark.scheduler.Task.run(Task.scala:144)
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:633)
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:96)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:636)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:840)
memory manager of org.apache.spark.SparkEnv@5ab3ee8b is null, _memoryManager = null, thread = Executor task launch worker for task 0.0 in stage 0.0 (TID 0)
set memory manager for org.apache.spark.SparkEnv@5ab3ee8b, threadName = pool-1-thread-1-ScalaTest-running-JobCancellationSuite
java.base/java.lang.Thread.getStackTrace(Thread.java:1619)
org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
org.apache.spark.SparkContext.<init>(SparkContext.scala:141)
org.apache.spark.JobCancellationSuite.$anonfun$new$45(JobCancellationSuite.scala:430)
org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:155)
org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
org.scalatest.Transformer.apply(Transformer.scala:22)
org.scalatest.Transformer.apply(Transformer.scala:20)
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
The "setting active env" and "set memory manager" messages are logged in SparkContext
initialization, while the "active env =" and "memory manager of " are logged in BlockManager
when trying to access the memoryManager
. The first stack trace shows it is from the separate worker thread.
Sorry, the line numbers may not match since I added several changes in my local repo for debugging purposes.
@sunchao Can you please take a look at this?
It should fix the issue we are discussing - the test is for illustration purposes only, please do adapt and clean it up :-)
Essentially it is a minor modification to your fix - the issue is that the `blockManager` referenced in the task cleanup in `finally` is incorrect, as you had fixed. The only change I introduced is to not require it to be passed in at the `Task` level, but to simply grab it at task start time; I also added a test which validates that this is indeed the issue.
This also means that, given the risk of `blockManager` being in a potentially inconsistent state until initialization is complete, we have to add some documentation to it, so that this buggy pattern does not get reintroduced in the future.
Thanks @mridulm. I like your solution, which is simpler. Saving the `blockManager` at the beginning of `Task.run` should be sufficient. Let me adapt the code and the test case in this PR.
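The idea is illustrated by the minimal, self-contained sketch below (stand-in names, not the actual `Task` code): read the block manager from the global env exactly once at task start, so the cleanup in `finally` uses the same instance even if the env is swapped mid-task, e.g. by a `SparkContext` restart in a local-mode test.

```scala
// Stand-in names; illustrates capturing the block manager once at task start.
object GlobalEnv {
  @volatile var current: EnvState = _
}
class EnvState(val blockManager: AnyRef)

class TaskSketch {
  def run(): Unit = {
    // Capture once. GlobalEnv.current may be replaced by another thread
    // while the task is running.
    val blockManager = GlobalEnv.current.blockManager
    try {
      // ... run the task body ...
    } finally {
      // Cleanup uses the captured reference; re-reading
      // GlobalEnv.current.blockManager here could return an instance from
      // a newer, not-yet-initialized env.
      require(blockManager != null)
    }
  }
}
```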
+CC @dongjoon-hyun and @tgravescs
+1, LGTM for the AS-IS PR from my side.
For the ongoing discussion, feel free to continue.
 */
 private[spark] class BlockManager(
     val executorId: String,
     rpcEnv: RpcEnv,
     val master: BlockManagerMaster,
     val serializerManager: SerializerManager,
     val conf: SparkConf,
-    memoryManager: MemoryManager,
+    var memoryManager: MemoryManager,
I'm missing how this is failing, can you clarify? The `BlockManager` initialize is called on line 632, after the memory manager is initialized in the block manager, so how do we get a null?
Looks good to me, thanks for working on this @sunchao and for going over the various corner cases!
+1
Thanks @mridulm @dongjoon-hyun @tgravescs for the review! Merged to master.
releaseTaskSem.acquire()
} catch {
  case _: InterruptedException =>
    // ignore thread interruption
@sunchao What's the purpose of leaving a running task from the old `SparkContext`?
OK, I see - it's probably for exercising the correct usage of `BlockManager` in the local test with a `SparkContext` restart.
What changes were proposed in this pull request?

This changes the initialization of `SparkEnv.memoryManager` to after the `DriverPlugin` is loaded, to allow the plugin to customize memory-related configurations.

A minor fix has been made to `Task` to make sure that it uses the same `BlockManager` throughout the task execution. Previously, a different `BlockManager` could be used in some corner cases. Also added a test for the fix.

Why are the changes needed?

Today, there is no way for a custom `DriverPlugin` to override memory configurations such as `spark.executor.memory`, `spark.executor.memoryOverhead`, `spark.memory.offHeap.size`, etc. This is because the memory manager is initialized before the `DriverPlugin` is loaded. A similar change has been made to `shuffleManager` in #43627.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests. Also added new tests.

Was this patch authored or co-authored using generative AI tooling?

No
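As a hedged illustration of what this enables, the sketch below shows a hypothetical plugin that raises off-heap memory before the `MemoryManager` is created. The class name and package are made up, not from this PR; and since `SparkContext.conf` is `private[spark]`, the sketch is placed under the `org.apache.spark` package to make the access compile - a real third-party plugin would need an appropriate way to update the driver conf.

```scala
package org.apache.spark.example

import java.util.{Collections, Map => JMap}

import org.apache.spark.SparkContext
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}

// Hypothetical plugin: because (after this PR) the MemoryManager is created
// only after DriverPlugin.init has run, the settings below take effect.
class OffHeapTuningPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = new DriverPlugin {
    override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
      // sc.conf is private[spark]; accessible here only because this sketch
      // lives under the org.apache.spark package.
      sc.conf.set("spark.memory.offHeap.enabled", "true")
      sc.conf.set("spark.memory.offHeap.size", "2g")
      Collections.emptyMap[String, String]()
    }
  }

  // No executor-side behavior needed for this sketch.
  override def executorPlugin(): ExecutorPlugin = null
}
```

It would be enabled via `spark.plugins=org.apache.spark.example.OffHeapTuningPlugin`.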