[SPARK-25081][Core]Nested spill in ShuffleExternalSorter should not access released memory page (branch-2.2) #22072

Closed
wants to merge 1 commit into from

Commits on Aug 10, 2018

  1. Nested spill in ShuffleExternalSorter should not access released memory page
    
    This issue is pretty similar to [SPARK-21907](https://issues.apache.org/jira/browse/SPARK-21907).
    
    `allocateArray` in [ShuffleInMemorySorter.reset](https://github.com/apache/spark/blob/9b8521e53e56a53b44c02366a99f8a8ee1307bbf/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java#L99) may trigger a spill and cause ShuffleInMemorySorter to access the already released `array`. Because the released page goes back to the pool, another task may obtain the same memory page, so two tasks end up accessing the same page. When a task reads memory written by another task, many kinds of failures can occur. Here are some examples I have seen:
    
    - JVM crash. (This is easy to reproduce in a unit test because newly allocated and deallocated memory is filled with 0xa5 and 0x5a bytes, so a value read from it usually points to an invalid memory address.)
    - java.lang.IllegalArgumentException: Comparison method violates its general contract!
    - java.lang.NullPointerException at org.apache.spark.memory.TaskMemoryManager.getPage(TaskMemoryManager.java:384)
    - java.lang.UnsupportedOperationException: Cannot grow BufferHolder by size -536870912 because the size after growing exceeds size limitation 2147483632
    
    This PR resets the sorter's state in `ShuffleInMemorySorter.reset` before calling `allocateArray` to fix the issue.
    
    The new unit test crashes the JVM without the fix.
    
    Closes apache#22062 from zsxwing/SPARK-25081.
    
    Authored-by: Shixiong Zhu <zsxwing@gmail.com>
    Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
    zsxwing committed Aug 10, 2018
    1a6452e
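
The ordering problem described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the actual Spark patch: `ToyMemoryManager`, `ToySorter`, `forceSpillOnNextAllocate`, and `touchedFreedArray` are hypothetical names invented for illustration, and the "spill" here only records whether it would have read an array that was already released. The sketch assumes, as the commit message states, that allocation inside `reset` can trigger a spill on the same sorter.

```java
final class NestedSpillSketch {

    /** Hypothetical memory manager: an allocation may force the consumer to spill first. */
    static final class ToyMemoryManager {
        Runnable spillCallback = () -> {};
        boolean forceSpillOnNextAllocate = false;

        long[] allocate(int size) {
            if (forceSpillOnNextAllocate) {
                forceSpillOnNextAllocate = false;
                spillCallback.run(); // nested spill, triggered from inside reset()
            }
            return new long[size];
        }
    }

    /** Hypothetical sorter holding a pointer array and an insert position. */
    static final class ToySorter {
        private final ToyMemoryManager mm;
        private final boolean fixedOrdering;
        private long[] array;
        private boolean arrayFreed = false;
        private int pos = 0;
        boolean touchedFreedArray = false;

        ToySorter(ToyMemoryManager mm, boolean fixedOrdering) {
            this.mm = mm;
            this.fixedOrdering = fixedOrdering;
            mm.spillCallback = this::spill;
            this.array = mm.allocate(4);
        }

        void insert(long value) {
            array[pos++] = value;
        }

        /** Simplified spill: a real one would write array[0..pos) to disk. */
        void spill() {
            if (pos == 0) {
                return; // empty sorter, nothing to read
            }
            if (arrayFreed) {
                touchedFreedArray = true; // would read a page that was already released
            }
        }

        void reset() {
            arrayFreed = true; // models releasing the array back to the pool
            if (fixedOrdering) {
                // Clear state BEFORE allocating, so a nested spill is a no-op.
                array = null;
                pos = 0;
            }
            array = mm.allocate(4); // may call spill() before returning
            arrayFreed = false;
            pos = 0; // in the buggy ordering, state is only cleared here
        }
    }

    static boolean nestedSpillTouchesFreedArray(boolean fixedOrdering) {
        ToyMemoryManager mm = new ToyMemoryManager();
        ToySorter sorter = new ToySorter(mm, fixedOrdering);
        sorter.insert(1L);
        sorter.insert(2L);
        mm.forceSpillOnNextAllocate = true;
        sorter.reset();
        return sorter.touchedFreedArray;
    }

    public static void main(String[] args) {
        System.out.println("buggy ordering touches freed array: " + nestedSpillTouchesFreedArray(false));
        System.out.println("fixed ordering touches freed array: " + nestedSpillTouchesFreedArray(true));
    }
}
```

With the state cleared only after allocation, the nested spill still sees `pos == 2` and reads the released array; with the state cleared first, the nested spill finds an empty sorter and returns immediately.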