[SPARK-13210] [SQL] catch OOM when allocate memory and expand array #11095

Closed

Conversation

@davies (Contributor) commented Feb 5, 2016

There is a bug when we try to grow the buffer: an OOM is wrongly ignored (the assert is also skipped by the JVM), so we try to grow the array again; this second attempt triggers spilling, which frees the current page, and the record we just inserted becomes invalid.

The root cause is that the JVM has less free memory than the MemoryManager thinks, so allocating a page can OOM without triggering spilling. We should catch the OOM and acquire memory again to trigger spilling.

Also, we should not grow the array in insertRecord of InMemorySorter (it was only there to make testing easy).
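
For readers who have not seen the patched TaskMemoryManager, here is a minimal, self-contained sketch of the catch-and-retry idea described above. The Bookkeeper and PageAllocator interfaces and the class name are hypothetical stand-ins, not Spark's real API; only the control flow (catch the OutOfMemoryError, keep the reservation as acquiredButNotUsed, then re-enter allocatePage so the retry triggers spilling) mirrors the description.

```java
import java.util.BitSet;

// Hypothetical, simplified stand-ins for Spark's memory bookkeeping and the
// Tungsten allocator; none of these names are the real Spark API.
public class AllocateWithRetrySketch {

  interface Bookkeeper {
    // Reserves execution memory; may spill other consumers to satisfy the request.
    long acquireExecutionMemory(long required);
  }

  interface PageAllocator {
    // Asks the JVM for the actual memory; may throw OutOfMemoryError.
    long[] allocate(long size);
  }

  private final Bookkeeper bookkeeper;
  private final PageAllocator allocator;
  private final BitSet allocatedPages = new BitSet(8192);
  private long acquiredButNotUsed = 0L;

  AllocateWithRetrySketch(Bookkeeper bookkeeper, PageAllocator allocator) {
    this.bookkeeper = bookkeeper;
    this.allocator = allocator;
  }

  long[] allocatePage(long size) {
    long acquired = bookkeeper.acquireExecutionMemory(size);
    int pageNumber;
    synchronized (this) {
      pageNumber = allocatedPages.nextClearBit(0);
      allocatedPages.set(pageNumber);
    }
    try {
      return allocator.allocate(acquired);
    } catch (OutOfMemoryError e) {
      // The JVM actually has less free memory than the bookkeeper thought.
      // Keep the reservation (count it as acquired-but-not-used) so the
      // bookkeeper's view of free memory shrinks, give back the page number,
      // and retry: the next acquire will see less free memory and spill.
      synchronized (this) {
        acquiredButNotUsed += acquired;
        allocatedPages.clear(pageNumber);
      }
      return allocatePage(size);
    }
  }
}
```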

@davies (author) commented Feb 5, 2016

cc @JoshRosen

try {
  page = memoryManager.tungstenMemoryAllocator().allocate(acquired);
} catch (OutOfMemoryError e) {
  // there is no enough memory actually, it means the actual free memory is smaller than
Contributor

we should log something here?

Contributor Author

INFO or WARN?

Contributor

I'd do WARN
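
For illustration, a WARN at the point where the OutOfMemoryError is caught might look like the sketch below. The class, method, logger setup, and message text are assumptions for this example, not the code that was merged.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: logging the allocation failure at WARN before falling back
// to the acquire-again path. Class, method, and message are hypothetical.
public class WarnOnAllocationFailure {
  private static final Logger logger =
      LoggerFactory.getLogger(WarnOnAllocationFailure.class);

  static byte[] tryAllocate(int bytes) {
    try {
      return new byte[bytes];
    } catch (OutOfMemoryError e) {
      // Surfacing this at WARN makes the mismatch between JVM free memory and
      // the memory manager's estimate visible in executor logs.
      logger.warn("Failed to allocate a page ({} bytes), try again.", bytes);
      return null; // caller would re-acquire memory, which triggers spilling
    }
  }
}
```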

@SparkQA commented Feb 5, 2016

Test build #50835 has finished for PR 11095 at commit ff77170.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Feb 6, 2016

Test build #50845 has finished for PR 11095 at commit 7ec7660.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies (author) commented Feb 6, 2016

@JoshRosen Does this look good to you?

@JoshRosen (Contributor)

This change makes sense to me. An OOM in allocatePage means that user code (or some other source) is using memory that is not tracked by Spark's memory manager, so Spark mistakenly believes that it has more memory available for managed use than it actually does. The key idea behind this patch is that the unaccounted-for memory use can be handled by updating the memory bookkeeping structures after an OOM: if we OOM, we count the size of the original failed request as acquiredButNotUsed and then re-request, which will cause Spark to evict / spill pages because its own estimate of how much managed memory is available will now be more accurate.

This strategy effectively counts the unmanaged memory as belonging to an arbitrary task (the one that triggered the OOM) and assumes that the memory will not be freed until that task finishes. This isn't perfectly accurate, but I don't really see how we can do much better: we don't have any clue as to where the memory came from, so if we didn't attribute it to an arbitrary task then we'd have the problem of determining when to consider the arbitrary memory to be freed. I suppose we could try to measure the difference between the JVM's total memory usage and the sum of all tasks' managed memory in order to estimate the amount of unmanaged memory, but that approach seems much more complex and might not even be possible.

Therefore, this seems reasonable to me. Also, note that this shouldn't really come into play unless spark.memoryFraction is inaccurate.
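
A compact sketch of the bookkeeping idea in the comment above, using hypothetical names (recordFailedAllocation and cleanUpAtTaskEnd are not the real TaskMemoryManager methods): the bytes behind the failed request stay attributed to the current task until it finishes, and only then are they handed back to the pool.

```java
// Hypothetical sketch of the "attribute unmanaged memory to the current task"
// strategy; method names are illustrative, not Spark's TaskMemoryManager API.
public class UnmanagedMemoryBookkeeping {
  private long acquiredButNotUsed = 0L;

  // Called from the OutOfMemoryError handler: keep the reservation instead of
  // releasing it, so the manager's estimate of free memory matches reality.
  synchronized void recordFailedAllocation(long acquiredBytes) {
    acquiredButNotUsed += acquiredBytes;
  }

  // Called when the task completes: the arbitrary attribution ends here and the
  // reserved-but-never-used bytes are returned to the shared pool.
  synchronized long cleanUpAtTaskEnd() {
    long toRelease = acquiredButNotUsed;
    acquiredButNotUsed = 0L;
    return toRelease; // caller passes this back to the memory manager
  }
}
```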

@@ -75,6 +75,9 @@ public void testBasicSorting() throws Exception {
    // Write the records into the data page and store pointers into the sorter
    long position = dataPage.getBaseOffset();
    for (String str : dataToSort) {
      if (!sorter.hasSpaceForAnotherRecord()) {
        sorter.expandPointerArray(consumer.allocateArray(sorter.numRecords() * 2 * 2));
Contributor

I think this should only be * 2 because the shuffle sorter only uses one array entry per record instead of the pair of entries used by the more general prefix sorter. I can fix this up myself on merge.
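
To make the sizing comment concrete: the shuffle in-memory sorter stores one entry per record, while the general prefix sorter stores a pointer plus a key prefix, i.e. two entries per record, so doubling the capacity needs numRecords * 2 slots in the first case and numRecords * 2 * 2 in the second. The sketch below is illustrative arithmetic only, not Spark code.

```java
// Illustrative arithmetic for the array-sizing discussion above (not Spark code).
public class PointerArraySizing {
  // Shuffle sorter: 1 packed pointer entry per record, doubled for growth.
  static long shuffleSorterNewLength(long numRecords) {
    return numRecords * 2;
  }

  // General prefix sorter: 2 entries per record (pointer + prefix), doubled.
  static long prefixSorterNewLength(long numRecords) {
    return numRecords * 2 * 2;
  }

  public static void main(String[] args) {
    System.out.println(shuffleSorterNewLength(1024)); // 2048
    System.out.println(prefixSorterNewLength(1024));  // 4096
  }
}
```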

@JoshRosen (Contributor)

LGTM, so I'm merging to master and cherry-picking to branch-1.6.

asfgit closed this in 37bc203 on Feb 8, 2016
asfgit pushed a commit that referenced this pull request Feb 8, 2016
There is a bug when we try to grow the buffer: an OOM is wrongly ignored (the assert is also skipped by the JVM), so we try to grow the array again; this second attempt triggers spilling, which frees the current page, and the record we just inserted becomes invalid.

The root cause is that the JVM has less free memory than the MemoryManager thinks, so allocating a page can OOM without triggering spilling. We should catch the OOM and acquire memory again to trigger spilling.

Also, we should not grow the array in `insertRecord` of `InMemorySorter` (it was only there to make testing easy).

Author: Davies Liu <davies@databricks.com>

Closes #11095 from davies/fix_expand.
    allocatedPages.clear(pageNumber);
  }
  // this could trigger spilling to free some pages.
  return allocatePage(size, consumer);
Member

I just now saw this commit, but:
Is this tail recursion a problem? If you keep running out of memory, it keeps calling itself.
Does acquiredButNotUsed += acquired need to be synchronized too?

Contributor Author

Since we continue to hold some memory, the amount of free memory becomes smaller and smaller, so the recursion will fail to acquire memory soon rather than calling itself indefinitely.

Yes, it's better to move acquiredButNotUsed += acquired into the synchronized section. If acquiredButNotUsed is not calculated correctly (because of race conditions), you would only see a warning message at the end of a task.
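
A sketch of the fix being agreed on here, with illustrative names only (not the actual TaskMemoryManager code): the bookkeeping update happens inside the same synchronized section that guards the page table, so a race between concurrent failures cannot drop an update to acquiredButNotUsed.

```java
import java.util.BitSet;

// Hypothetical sketch: updating acquiredButNotUsed under the same lock that
// protects the page table, so concurrent failures cannot lose an update.
public class SynchronizedBookkeeping {
  private final Object lock = new Object();
  private final BitSet allocatedPages = new BitSet(8192);
  private long acquiredButNotUsed = 0L;

  void onAllocationFailure(long acquired, int pageNumber) {
    synchronized (lock) {
      acquiredButNotUsed += acquired;   // now updated while holding the lock
      allocatedPages.clear(pageNumber); // return the page number to the pool
    }
  }
}
```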

asfgit pushed a commit that referenced this pull request Mar 18, 2016
## What changes were proposed in this pull request?

This change fixes the executor OOM which was recently introduced in PR #11095

## How was this patch tested?
Tested by running a spark job on the cluster.

… Sorter

Author: Sital Kedia <skedia@fb.com>

Closes #11794 from sitalkedia/SPARK-13958.

(cherry picked from commit 2e0c528)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
asfgit pushed a commit that referenced this pull request Mar 18, 2016
## What changes were proposed in this pull request?

This change fixes the executor OOM which was recently introduced in PR #11095

## How was this patch tested?
Tested by running a spark job on the cluster.

… Sorter

Author: Sital Kedia <skedia@fb.com>

Closes #11794 from sitalkedia/SPARK-13958.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
## What changes were proposed in this pull request?

This change fixes the executor OOM which was recently introduced in PR apache#11095

## How was this patch tested?
Tested by running a spark job on the cluster.

… Sorter

Author: Sital Kedia <skedia@fb.com>

Closes apache#11794 from sitalkedia/SPARK-13958.