-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-21860][core]Improve memory reuse for heap memory in HeapMemoryAllocator
#19077
Conversation
This and the JIRA need a better title |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel so qualified to review this but it does make sense. In practice, is it common to allocate arrays of size +/- 8 bytes? just wondering if we have any idea of the impact
@@ -47,24 +47,25 @@ private boolean shouldPool(long size) { | |||
|
|||
@Override | |||
public MemoryBlock allocate(long size) throws OutOfMemoryError { | |||
if (shouldPool(size)) { | |||
int arraySize = (int)((size + 7) / 8); | |||
if (shouldPool(arraySize * 8)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Factor out arraySize * 8
as alignedSize
or something
f168325
to
6862403
Compare
HeapMemoryAllocator
Test build #81208 has finished for PR 19077 at commit
|
Test build #81212 has finished for PR 19077 at commit
|
6862403
to
b00b685
Compare
Test build #81251 has finished for PR 19077 at commit
|
if (pool != null) { | ||
while (!pool.isEmpty()) { | ||
final WeakReference<MemoryBlock> blockReference = pool.pop(); | ||
final MemoryBlock memory = blockReference.get(); | ||
if (memory != null) { | ||
assert (memory.size() == size); | ||
assert ((int)((memory.size() + 7) / 8) == arraySize); | ||
memory.resetSize(size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, from my understanding the size of MemoryBlock
is always the actual size, not the aligned size, so looks like we dont need to reset the size here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, MemoryBlock
is always the actual size, if we reuse the previous memory,we should use the actual size to modify the size of MemoryBlock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it, thanks for the explanation.
Can you please add some unit test to verify your changes. |
@jerryshao Thanks,i will add unit tests. |
fc8b895
to
ba5717e
Compare
Test build #81272 has finished for PR 19077 at commit
|
Test build #81273 has finished for PR 19077 at commit
|
This PR generally looks fine to me, my concern is that will this change bring in subtle impact on the code which leverage it. CC @JoshRosen to take a review. |
Just curious: do you know where are we allocating these close-in-size chunks of memory? I understand the motivation, but just curious to know what's causing this pattern. I think the original idea here was that most allocations would come from a small set of sizes (usually the page size, or a configurable buffer size) and would not generally be arbitrary sized allocations. |
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) { | |||
|
|||
@Override | |||
public MemoryBlock allocate(long size) throws OutOfMemoryError { | |||
if (shouldPool(size)) { | |||
int arraySize = (int)((size + 7) / 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to use ByteAraryMethods.roundNumberOfBytesToNearestWord
for this, which we'e done for similar rounding elsewhere. Makes it a bit easier to spot what's happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the type of input parameter for roundNumberOfBytesToNearestWord
is int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should make the method to tackle long values.
@jerryshao @JoshRosen yes, it would not generally be arbitrary sized allocations. Basically, we allocate memory in multiples of 4 or 8 bytes,even so, I think this change is also beneficial . |
ba5717e
to
729df24
Compare
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) { | |||
|
|||
@Override | |||
public MemoryBlock allocate(long size) throws OutOfMemoryError { | |||
if (shouldPool(size)) { | |||
long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe minor but some small allocations will be counted for pooling mechanism but they are not before, e.g. POOLING_THRESHOLD_BYTES
- 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah,I think it's acceptable
@@ -48,6 +48,13 @@ public long size() { | |||
} | |||
|
|||
/** | |||
* Reset the size of the memory block. | |||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is dangerous to reset to a invalid size. We should add a check here or put a WARNING in the method comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks,i will add a check.
729df24
to
0c6647c
Compare
Test build #81491 has finished for PR 19077 at commit
|
Test build #81492 has finished for PR 19077 at commit
|
retest this please |
Test build #81508 has finished for PR 19077 at commit
|
gentle ping @jerryshao for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change itself looks fine to me. However, I'm not sure if there's any potential impact on the code which relies on it, hopes someone could take another look.
val unsafeArraySizeInBytes = | ||
UnsafeArrayData.calculateHeaderPortionInBytes(numElements) + | ||
ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements) | ||
ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: why don't we inline this instead of creating a new variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size of this line is larger than 200 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should really inline that.
…MemoryAllocator` apache#19077 In HeapMemoryAllocator, when allocating memory from pool, and the key of pool is memory size. Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes. In this case, we can improve memory reuse.
PR-apache#19077 introduced a Java style error (too long line). Quick fix.
What changes were proposed in this pull request?
In
HeapMemoryAllocator
, when allocating memory from pool, and the key of pool is memory size.Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes.
In this case, we can improve memory reuse.
How was this patch tested?
Existing tests and added unit tests