[SPARK-13210] [SQL] catch OOM when allocate memory and expand array #11095
Conversation
cc @JoshRosen

```java
try {
  page = memoryManager.tungstenMemoryAllocator().allocate(acquired);
} catch (OutOfMemoryError e) {
  // there is not enough memory actually; it means the actual free memory is smaller than
```
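The diff above catches the JVM-level `OutOfMemoryError` instead of letting it escape. The sketch below is a hypothetical, self-contained illustration of that pattern (the names `FlakyAllocator` and `allocateWithSpill` are made up for this example; it is not Spark's actual `TaskMemoryManager`): catch the OOM from the allocator, log a warning, free memory via spilling, and retry.

```java
import java.util.logging.Logger;

// Hypothetical sketch of the catch-OOM-then-spill pattern discussed in this PR.
// FlakyAllocator and allocateWithSpill are illustrative names, not Spark's API.
public class SpillOnOomDemo {
    private static final Logger logger = Logger.getLogger(SpillOnOomDemo.class.getName());

    /** Allocator that throws OutOfMemoryError whenever the request exceeds free memory. */
    static class FlakyAllocator {
        long free;
        FlakyAllocator(long free) { this.free = free; }
        long[] allocate(long size) {
            if (size > free) throw new OutOfMemoryError("requested " + size + ", free " + free);
            free -= size;
            return new long[(int) size];
        }
        void spill(long freed) { free += freed; }  // stand-in for spilling pages to disk
    }

    /** Catch the JVM-level OOM, warn (the reviewers settled on WARN), spill, then retry. */
    static long[] allocateWithSpill(FlakyAllocator alloc, long size) {
        try {
            return alloc.allocate(size);
        } catch (OutOfMemoryError e) {
            // The JVM has less free memory than the memory manager thought:
            // log a warning and free some memory by spilling before retrying.
            logger.warning("Failed to allocate " + size + " bytes, spilling and retrying");
            alloc.spill(size);
            return alloc.allocate(size);
        }
    }

    public static void main(String[] args) {
        FlakyAllocator alloc = new FlakyAllocator(8);
        long[] page = allocateWithSpill(alloc, 16);  // first attempt OOMs, retry succeeds
        System.out.println(page.length);  // 16
    }
}
```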
we should log something here?
INFO or WARN?
I'd do WARN
Test build #50835 has finished for PR 11095 at commit
Test build #50845 has finished for PR 11095 at commit
@JoshRosen Does this look good to you?
This change makes sense to me. An OOM in …

This strategy effectively counts the unmanaged memory as belonging to an arbitrary task (the one that triggered the OOM) and assumes that the memory will not be freed until that task finishes. This isn't perfectly accurate, but I don't really see how we can do much better: we don't have any clue as to where the memory came from, so if we didn't attribute it to an arbitrary task then we'd have the problem of determining when to consider the arbitrary memory to be freed. I suppose we could try to measure the difference between the JVM's total memory usage and the sum of all tasks' managed memory in order to estimate the amount of unmanaged memory, but that approach seems much more complex and might not even be possible. Therefore, this seems reasonable to me.

Also, note that this shouldn't really come into play unless …
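The accounting strategy described in the comment above can be sketched as follows. This is a hypothetical illustration (the class and method names are made up, and Spark's real bookkeeping lives in its memory manager): when an allocation hits a JVM OOM, the shortfall is charged to the task that triggered it, and that memory is assumed freed only when the task finishes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "charge unmanaged memory to the triggering task".
// Not Spark's actual MemoryManager API; names are illustrative only.
public class UnmanagedMemoryAccounting {
    private final Map<Long, Long> chargedToTask = new HashMap<>();
    private long assumedFree;  // the manager's (possibly optimistic) view of free memory

    UnmanagedMemoryAccounting(long assumedFree) { this.assumedFree = assumedFree; }

    /** On OOM, charge the shortfall to the triggering task; free memory shrinks. */
    synchronized void chargeOomShortfall(long taskId, long shortfall) {
        chargedToTask.merge(taskId, shortfall, Long::sum);
        assumedFree -= shortfall;
    }

    /** When the task finishes, assume its unmanaged memory has been freed too. */
    synchronized void taskFinished(long taskId) {
        Long charged = chargedToTask.remove(taskId);
        if (charged != null) assumedFree += charged;
    }

    synchronized long assumedFree() { return assumedFree; }
}
```

As the comment notes, this attribution is a heuristic: the unmanaged memory may not actually belong to that task, but charging it somewhere gives a well-defined point at which to consider it freed.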
```java
@@ -75,6 +75,9 @@ public void testBasicSorting() throws Exception {
     // Write the records into the data page and store pointers into the sorter
     long position = dataPage.getBaseOffset();
     for (String str : dataToSort) {
+      if (!sorter.hasSpaceForAnotherRecord()) {
+        sorter.expandPointerArray(consumer.allocateArray(sorter.numRecords() * 2 * 2));
```
I think this should only be `* 2`, because the shuffle sorter only uses one array entry per record instead of the pair of entries used by the more general prefix sorter. I can fix this up myself on merge.
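The arithmetic behind that comment can be made explicit. The helper names below are hypothetical (the real sizing happens inline in Spark's sorters): to double capacity, a sorter that stores one 8-byte entry per record needs `numRecords * 2` entries, while one that stores a (record pointer, key prefix) pair per record needs `numRecords * 2 * 2`.

```java
// Illustrative sizing arithmetic for the review comment above.
// Helper names are hypothetical, not Spark's API.
public class PointerArraySizing {
    /** Shuffle sorter: one packed-pointer entry per record, so doubling needs n * 2. */
    static long shuffleGrowSize(int numRecords) { return (long) numRecords * 2; }

    /** General prefix sorter: two entries per record (pointer + prefix), so n * 2 * 2. */
    static long prefixGrowSize(int numRecords) { return (long) numRecords * 2 * 2; }

    public static void main(String[] args) {
        System.out.println(shuffleGrowSize(1024)); // 2048 entries for a doubled shuffle array
        System.out.println(prefixGrowSize(1024));  // 4096 entries for a doubled pointer+prefix array
    }
}
```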
LGTM, so I'm merging to master and cherry-picking to branch-1.6.
There is a bug when we try to grow the buffer: the OOM is wrongly ignored (the assert is also skipped by the JVM), then we try to grow the array again, and this second attempt triggers spilling, which frees the current page, so the record we just inserted becomes invalid. The root cause is that the JVM has less free memory than the MemoryManager thought, so it OOMs when allocating a page without triggering spilling. We should catch the OOM and acquire memory again to trigger spilling. Also, we should not grow the array in `insertRecord` of `InMemorySorter` (it was there just for easy testing). Author: Davies Liu <davies@databricks.com> Closes #11095 from davies/fix_expand.
```java
  allocatedPages.clear(pageNumber);
}
// this could trigger spilling to free some pages.
return allocatePage(size, consumer);
```
I just now saw this commit, but:

Is this tail recursion a problem? If you keep running out of memory, it keeps calling itself.

Does `acquiredButNotUsed += acquired` need to be synchronized too?
Since we continue to hold some memory, the amount of free memory should become smaller and smaller, so the retry will fail to acquire memory soon.

Yes, it would be better to move `acquiredButNotUsed += acquired` into the synchronized sections. If `acquiredButNotUsed` is not calculated correctly (because of race conditions), you would only see a warning message at the end of a task.
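Both review concerns above can be sketched together: replace the self-recursion with a retry loop, and update the bookkeeping counter while holding the lock. This is a hypothetical, self-contained sketch (the class name and the simplified allocator are made up; Spark's real code lives in `TaskMemoryManager.allocatePage`), under the assumption that spilling frees enough memory for the retry to converge.

```java
// Hypothetical sketch addressing both review concerns: a loop instead of
// tail recursion, and acquiredButNotUsed updated under the same lock.
// Names and the simplified allocator are illustrative, not Spark's API.
public class AllocateLoopSketch {
    private final Object lock = new Object();
    private long acquiredButNotUsed = 0;
    private long freeMemory;

    AllocateLoopSketch(long freeMemory) { this.freeMemory = freeMemory; }

    /** Throws OutOfMemoryError when the JVM has less free memory than requested. */
    long[] tryAllocate(long size) {
        if (size > freeMemory) throw new OutOfMemoryError();
        freeMemory -= size;
        return new long[(int) size];
    }

    long[] allocatePage(long size) {
        while (true) {  // loop rather than recursing into allocatePage again
            long acquired = size;  // stand-in for memory the manager granted us
            try {
                return tryAllocate(acquired);
            } catch (OutOfMemoryError e) {
                synchronized (lock) {
                    // Record the granted-but-unusable memory while holding the
                    // lock, so concurrent tasks see a consistent count.
                    acquiredButNotUsed += acquired;
                    freeMemory += acquired;  // stand-in for spilling freeing pages
                }
                // Retry: because we keep holding the acquired memory, free memory
                // shrinks across attempts and the retries terminate quickly.
            }
        }
    }
}
```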
## What changes were proposed in this pull request?

This change fixes the executor OOM which was recently introduced in PR #11095.

## How was this patch tested?

Tested by running a spark job on the cluster.

… Sorter

Author: Sital Kedia <skedia@fb.com> Closes #11794 from sitalkedia/SPARK-13958. (cherry picked from commit 2e0c528) Signed-off-by: Davies Liu <davies.liu@gmail.com>