[SPARK-31952][SQL]Fix incorrect memory spill metric when doing Aggregate #28780
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -104,11 +104,13 @@ public static UnsafeExternalSorter createWithExistingInMemorySorter(` |
| 104 | 104 | `int initialSize,` |
| 105 | 105 | `long pageSizeBytes,` |
| 106 | 106 | `int numElementsForSpillThreshold,` |
| 107 | | `- UnsafeInMemorySorter inMemorySorter) throws IOException {` |
| | 107 | `+ UnsafeInMemorySorter inMemorySorter,` |
| | 108 | `+ long existingMemoryConsumption) throws IOException {` |
| 108 | 109 | `UnsafeExternalSorter sorter = new UnsafeExternalSorter(taskMemoryManager, blockManager,` |
| 109 | 110 | `serializerManager, taskContext, recordComparatorSupplier, prefixComparator, initialSize,` |
| 110 | 111 | `pageSizeBytes, numElementsForSpillThreshold, inMemorySorter, false /* ignored */);` |
| 111 | 112 | `sorter.spill(Long.MAX_VALUE, sorter);` |
| | 113 | `+ taskContext.taskMetrics().incMemoryBytesSpilled(existingMemoryConsumption);` |
| 112 | 114 | `// The external sorter will be used to insert records, in-memory sorter is not needed.` |
| 113 | 115 | `sorter.inMemSorter = null;` |
| 114 | 116 | `return sorter;` |

Review comments on the added `incMemoryBytesSpilled` line:

Just for reference, where do we update the disk spill metrics?

You can see it at the end of

At the end of
Why didn't you set this value on the caller side (`UnsafeKVExternalSorter`)?
I have no strong preference on this. One benefit is that other callers of `createWithExistingInMemorySorter` will not forget to update the memory-spilled metric (though only `UnsafeKVExternalSorter` uses this function currently).
cc @maropu @cloud-fan
Is it the same to use `inMemorySorter.getMemoryUsage`? Also, shall we update `sorter.totalSpillBytes`?
Or, can we do `final long spillSize = freeMemory() + inMemorySorter.getMemoryUsage()` in `sorter.spill()`? It seems that the size of `inMemorySorter` is included in the log info by `getMemoryUsage()`, while `spillSize` does not include it.
Regarding `inMemorySorter.getMemoryUsage`:

When we insert a record into `UnsafeExternalSorter`, the record itself is copied to the sorter's `allocatedPages`, and its memory address is stored in the `InMemorySorter`. So when we spill, the memory size is the sum of `inMemorySorter.getMemoryUsage` and `allocatedPages`; that is what `UnsafeExternalSorter.spill` does.

But when we call `sorter.spill` in `UnsafeExternalSorter.createWithExistingInMemorySorter`, the sorter's `allocatedPages` is empty, since we didn't copy any records to it. The memory addresses in `inMemorySorter` point to the memory pages in `BytesToBytesMap`. The passed parameter denotes the size of the memory pages in `BytesToBytesMap`.

`sorter.totalSpillBytes`
I see, thanks for your explanation.
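For readers following the thread, the accounting problem can be sketched with a toy model. This is only an illustration under stated assumptions, not Spark's code: `ToyMetrics` and `ToySorter` are hypothetical stand-ins, `spill()` is simplified to count only the pages the sorter itself owns, and the pointer-array size discussed above is left out. It shows why a sorter built on top of an existing in-memory sorter reports zero spilled bytes unless the factory adds the caller-supplied `existingMemoryConsumption`.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the spill accounting discussed in the thread; not Spark's classes. */
final class SpillAccountingSketch {

  /** Hypothetical stand-in for task metrics: tracks memory bytes spilled only. */
  static final class ToyMetrics {
    private long memoryBytesSpilled = 0L;

    void incMemoryBytesSpilled(long bytes) { memoryBytesSpilled += bytes; }

    long memoryBytesSpilled() { return memoryBytesSpilled; }
  }

  /** Hypothetical stand-in for an external sorter that owns its record pages. */
  static final class ToySorter {
    private final List<Long> ownPageSizes = new ArrayList<>();
    private final ToyMetrics metrics;

    ToySorter(ToyMetrics metrics) { this.metrics = metrics; }

    /** Models copying records into a page owned by this sorter. */
    void allocatePage(long pageBytes) { ownPageSizes.add(pageBytes); }

    /** Releases every page this sorter owns and returns the bytes released. */
    long freeMemory() {
      long freed = 0L;
      for (long size : ownPageSizes) {
        freed += size;
      }
      ownPageSizes.clear();
      return freed;
    }

    /** Simplified spill: only this sorter's own pages are counted. */
    long spill() {
      long spillSize = freeMemory();
      metrics.incMemoryBytesSpilled(spillSize);
      return spillSize;
    }
  }

  /**
   * Models the factory from the diff: the new sorter owns no pages, because the
   * records live in pages owned by someone else (the map, in the thread), so the
   * caller passes that size in and the factory records it.
   */
  static ToySorter createWithExistingInMemorySorter(
      ToyMetrics metrics, long existingMemoryConsumption) {
    ToySorter sorter = new ToySorter(metrics);
    sorter.spill(); // adds 0 bytes: the sorter owns no pages yet
    metrics.incMemoryBytesSpilled(existingMemoryConsumption);
    return sorter;
  }

  public static void main(String[] args) {
    ToyMetrics regular = new ToyMetrics();
    ToySorter sorter = new ToySorter(regular);
    sorter.allocatePage(64L);
    sorter.allocatePage(64L);
    sorter.spill();
    System.out.println("regular spill: " + regular.memoryBytesSpilled());

    ToyMetrics fromExisting = new ToyMetrics();
    createWithExistingInMemorySorter(fromExisting, 256L);
    System.out.println("existing-sorter spill: " + fromExisting.memoryBytesSpilled());
  }
}
```

Recording the value inside the factory, as in the sketch, mirrors the trade-off raised in the review: callers cannot forget the update, at the cost of one extra parameter.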