Skip to content

Commit

Permalink
[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in Byt…
Browse files Browse the repository at this point in the history
…esToBytesMap

## What changes were proposed in this pull request?

BytesToBytesMap currently does not release the in-memory storage (the longArray variable) after it spills to disk. This is typically not a problem during aggregation because the longArray should be much smaller than the pages, and because we grow the longArray at a conservative rate.

However this can lead to an OOM when an already running task is allocated more than its fair share, this can happen because of a scheduling delay. In this case the longArray can grow beyond the fair share of memory for the task. This becomes problematic when the task spills and the long array is not freed, that causes subsequent memory allocation requests to be denied by the memory manager resulting in an OOM.

This PR fixes this issuing by freeing the longArray when the BytesToBytesMap spills.

## How was this patch tested?

Existing tests and tested on realworld workloads.

Author: Jie Xiong <jiexiong@fb.com>
Author: jiexiong <jiexiong@gmail.com>

Closes #15722 from jiexiong/jie_oom_fix.

(cherry picked from commit c496d03)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
  • Loading branch information
Jie Xiong authored and hvanhovell committed Dec 7, 2016
1 parent f5c5a07 commit e05ad88
Showing 1 changed file with 5 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,8 @@ public final class BytesToBytesMap extends MemoryConsumer {

private long peakMemoryUsedBytes = 0L;

private final int initialCapacity;

private final BlockManager blockManager;
private final SerializerManager serializerManager;
private volatile MapIterator destructiveIterator = null;
Expand Down Expand Up @@ -201,6 +203,7 @@ public BytesToBytesMap(
throw new IllegalArgumentException("Page size " + pageSizeBytes + " cannot exceed " +
TaskMemoryManager.MAXIMUM_PAGE_SIZE_BYTES);
}
this.initialCapacity = initialCapacity;
allocate(initialCapacity);
}

Expand Down Expand Up @@ -897,12 +900,12 @@ public LongArray getArray() {
public void reset() {
numKeys = 0;
numValues = 0;
longArray.zeroOut();

freeArray(longArray);
while (dataPages.size() > 0) {
MemoryBlock dataPage = dataPages.removeLast();
freePage(dataPage);
}
allocate(initialCapacity);
currentPage = null;
pageCursor = 0;
}
Expand Down

0 comments on commit e05ad88

Please sign in to comment.