
[SPARK-43221][CORE][3.5] Host local block fetching should use a block status of a block stored on disk#50260

Closed
attilapiros wants to merge 2 commits into apache:branch-3.5 from attilapiros:SPARK-43221_branch-3.5

Conversation

@attilapiros
Contributor

This is a backport to branch-3.5 from master.

Thanks to @yorksity, who reported this error and even provided a PR for it.
This solution is very different from #40883, as `BlockManagerMasterEndpoint#getLocationsAndStatus()` needed some refactoring.

What changes were proposed in this pull request?

This PR fixes an error that manifests in the following exception:

```
25/02/20 09:58:31 ERROR util.Utils: [Executor task launch worker for task 61.0 in stage 67.0 (TID 9391)]: Exception encountered
java.lang.ArrayIndexOutOfBoundsException: 0
  at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBlocks$1(TorrentBroadcast.scala:185) ~[spark-core_2.12-3.3.2.3.3.7190.5-2.jar:3.3.2.3.3.7190.5-2]
  at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23) ~[scala-library-2.12.15.jar:?]
  at scala.collection.immutable.List.foreach(List.scala:431) ~[scala-library-2.12.15.jar:?]
  at org.apache.spark.broadcast.TorrentBroadcast.readBlocks(TorrentBroadcast.scala:171) ~[spark-core_2.12-3.3.2.3.3.7190.5-2.jar:3.3.2.3.3.7190.5-2]
```

The PR changes `BlockManagerMasterEndpoint#getLocationsAndStatus()`.

The `BlockManagerMasterEndpoint#getLocationsAndStatus()` function returns an optional `BlockLocationsAndStatus`, which consists of three parts:

  • `locations`: all the locations where the block can be found (as a sequence of block manager IDs)
  • `status`: one block status
  • `localDirs`: optional directory paths which can be used to read the block if it is stored on the disk of an executor running on the same host

A block (whether an RDD block, shuffle block, or torrent block) can be stored on many executors with different storage levels: disk or memory.

This PR changes how the block status and the block manager ID for the `localDirs` are found, guaranteeing that they belong together.
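The paired selection can be sketched as follows. This is a simplified illustration with hypothetical names (`statusOf`, `localDirsOf`, `locationsAndStatus`); the actual `BlockManagerMasterEndpoint` code differs:

```scala
// Sketch: pick the status and the localDirs from the SAME block manager,
// so a disk-based localDirs can never be paired with an in-memory status.
// All names here are illustrative, not the actual Spark API.
case class BlockManagerId(executorId: String, host: String)
case class BlockStatus(memSize: Long, diskSize: Long)

def locationsAndStatus(
    locations: Seq[BlockManagerId],
    statusOf: BlockManagerId => Option[BlockStatus],
    localDirsOf: BlockManagerId => Option[Seq[String]],
    requesterHost: String): Option[(BlockStatus, Option[Seq[String]])] = {
  // Prefer a host-local replica that is actually on disk ...
  val hostLocalOnDisk = locations.find { id =>
    id.host == requesterHost && statusOf(id).exists(_.diskSize > 0)
  }
  hostLocalOnDisk match {
    // ... and take its status together with its local dirs.
    case Some(id) => statusOf(id).map(s => (s, localDirsOf(id)))
    // Otherwise fall back to the first available status, with no local dirs.
    case None =>
      locations.flatMap(id => statusOf(id)).headOption.map(s => (s, None))
  }
}
```

For example, if executor A on a remote host holds the block in memory and executor B on the requester's host holds it on disk, the function returns B's disk status together with B's local dirs, never a mixed pair.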

Why are the changes needed?

Before this PR, `BlockManagerMasterEndpoint#getLocationsAndStatus()` searched for the block status (`status`) and the `localDirs` separately. The block status was simply computed from the very first location where the block was found. As a result, the status could easily describe an in-memory replica (whose disk size is 0, since the block is stored in memory) while `localDirs` was filled out from a host-local replica stored on disk.
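The mismatch can be illustrated with a small sketch (names and values below are hypothetical, for illustration only):

```scala
// Hypothetical pre-fix scenario: status and localDirs picked independently.
case class BlockStatus(memSize: Long, diskSize: Long)

// Executor A (remote) holds the block in memory; executor B (same host) on disk.
val statusFromA = BlockStatus(memSize = 1024, diskSize = 0) // first status found
val localDirsFromB = Some(Seq("/tmp/blockmgr-b"))           // host-local dirs

// The fetcher is then told to read statusFromA.diskSize (= 0) bytes from B's
// disk file: an empty buffer, which later surfaces as the
// ArrayIndexOutOfBoundsException in TorrentBroadcast.readBlocks.
val bytesToRead = statusFromA.diskSize
```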

This situation can occur very frequently, but it only causes problems (exceptions as above) when encryption is on (`spark.io.encryption.enabled=true`), since for an unencrypted block the whole file containing the block is read; see
https://github.com/apache/spark/blob/branch-3.5/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L1244

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Host local block fetching was already covered by some existing unit tests, but a new unit test is provided for this exact case: "SPARK-43221: Host local block fetching should use a block status with disk size".

The number of block managers and the order of the blocks were chosen after some experimentation, as the block status order depends on a `HashSet`; see:

```scala
  private val blockLocations = new JHashMap[BlockId, mutable.HashSet[BlockManagerId]]
```
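For illustration, `mutable.HashSet` iterates in hash order rather than insertion order, which is why the test inputs had to be found experimentally. This is a generic sketch with made-up executor names, not the suite's actual code:

```scala
import scala.collection.mutable

// Insertion order is "exec-3", "exec-1", "exec-2", but iteration order follows
// the elements' hash codes, so "the first location" in the set is not
// necessarily the first one inserted.
val locations = mutable.HashSet.empty[String]
locations += "exec-3"
locations += "exec-1"
locations += "exec-2"
println(locations.toSeq) // order is hash-based, not insertion-based
```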

This test was executed with the old code too, to validate that the issue is reproduced:

```
BlockManagerSuite:
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
- SPARK-43221: Host local block fetching should use a block status with disk size *** FAILED ***
  0 was not greater than 0 The block size must be greater than 0 for a nonempty block! (BlockManagerSuite.scala:491)
Run completed in 6 seconds, 705 milliseconds.
Total number of tests run: 1
Suites: completed 1, aborted 0
Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
```

Was this patch authored or co-authored using generative AI tooling?

No.

(cherry picked from commit 997e599)

Closes apache#50122 from attilapiros/SPARK-43221.

Authored-by: attilapiros <piros.attila.zsolt@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 997e599)
Member

@dongjoon-hyun dongjoon-hyun left a comment


Could you fix the compilation failure, @attilapiros ?

```
[error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala:865:10: value orElse is not a member of Equals
[error] possible cause: maybe a semicolon is missing before `value orElse'?
[error]         .orElse {
[error]          ^
[error] one error found
```

@attilapiros
Contributor Author

Fixed. It was a difference between Scala 2.12 and 2.13:
`Option#zip` returns an `Iterable` on Scala 2.12 instead of an `Option`:

```scala
$ scala
Welcome to Scala 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_442).
Type in expressions for evaluation. Or try :help.

scala> val option = Some(1)
option: Some[Int] = Some(1)

scala> option.zip(option)
res1: Iterable[(Int, Int)] = List((1,1))

scala> None.zip(None)
res2: Iterable[(Nothing, Nothing)] = List()

scala> res1.headOption
res3: Option[(Int, Int)] = Some((1,1))

scala> res2.headOption
res4: Option[(Nothing, Nothing)] = None
```

This is even though the documentation does not make the difference obvious: https://www.scala-lang.org/api/2.12.8/scala/Option.html#zip[B](that:scala.collection.GenIterable[B]):Option[(A,B)]
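One cross-version-safe way to get an `Option` result on both Scala 2.12 and 2.13 is a for-comprehension instead of `zip`. This is a sketch of the general pattern, not necessarily the exact fix used in this PR:

```scala
val a: Option[Int] = Some(1)
val b: Option[String] = Some("x")

// Behaves identically on 2.12 and 2.13: None if either side is empty,
// Some of the pair otherwise.
val zipped: Option[(Int, String)] = for { x <- a; y <- b } yield (x, y)
// zipped == Some((1, "x"))
```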

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM (Pending CIs). Thank you, @attilapiros .

dongjoon-hyun pushed a commit that referenced this pull request Mar 13, 2025

Closes #50260 from attilapiros/SPARK-43221_branch-3.5.

Authored-by: attilapiros <piros.attila.zsolt@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

Merged to branch-3.5.
