[SPARK-21860][core]Improve memory reuse for heap memory in HeapMemoryAllocator #19077

Closed
wants to merge 1 commit into from

Conversation

10110346
Copy link
Contributor

@10110346 10110346 commented Aug 29, 2017

What changes were proposed in this pull request?

HeapMemoryAllocator allocates memory from a pool, and the key of that pool is the memory size in bytes.
Because we allocate memory in multiples of 8 bytes, some sizes, such as 1025, 1026, ..., 1032 bytes, can be treated as the same size.
Treating them as the same key lets these requests share pooled blocks, which improves memory reuse.
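
To make the arithmetic above concrete, here is a minimal sketch of how byte sizes map onto a backing `long[]`. The class and method names (`AlignedSizeDemo`, `wordsNeeded`, `alignedSize`) are illustrative assumptions and do not come from the patch; only the `(size + 7) / 8` expression appears in the diff under review.

```java
// Illustrative only: AlignedSizeDemo and its methods are hypothetical names.
final class AlignedSizeDemo {
  /** Number of 8-byte words a long[] needs in order to hold `size` bytes. */
  static int wordsNeeded(long size) {
    return (int) ((size + 7) / 8);
  }

  /** Bytes actually reserved for a request of `size` bytes. */
  static long alignedSize(long size) {
    return wordsNeeded(size) * 8L;
  }

  public static void main(String[] args) {
    // Every size from 1025 to 1032 needs the same long[129] (1032 bytes).
    for (long size = 1025; size <= 1032; size++) {
      System.out.println(size + " bytes -> long[" + wordsNeeded(size) + "] = "
          + alignedSize(size) + " bytes");
    }
  }
}
```

Under this mapping, a pool keyed by the aligned size would see one key (1032) where a pool keyed by the requested size sees eight.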

How was this patch tested?

Existing tests and added unit tests

@srowen
Copy link
Member

srowen commented Aug 29, 2017

This and the JIRA need a better title

Copy link
Member

@srowen srowen left a comment

I don't feel especially qualified to review this, but it does make sense. In practice, is it common to allocate arrays whose sizes differ by fewer than 8 bytes? Just wondering whether we have any idea of the impact.

@@ -47,24 +47,25 @@ private boolean shouldPool(long size) {

@Override
public MemoryBlock allocate(long size) throws OutOfMemoryError {
if (shouldPool(size)) {
int arraySize = (int)((size + 7) / 8);
if (shouldPool(arraySize * 8)) {
Copy link
Member

Factor out arraySize * 8 into a variable named alignedSize or something similar.

@10110346 10110346 changed the title [SPARK-21860][core]Optimize heap memory allocator [SPARK-21860][core]Improve memory reuse for heap memory in HeapMemoryAllocator Aug 29, 2017
@SparkQA
Copy link

SparkQA commented Aug 29, 2017

Test build #81208 has finished for PR 19077 at commit f168325.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 29, 2017

Test build #81212 has finished for PR 19077 at commit 6862403.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 30, 2017

Test build #81251 has finished for PR 19077 at commit b00b685.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (pool != null) {
  while (!pool.isEmpty()) {
    final WeakReference<MemoryBlock> blockReference = pool.pop();
    final MemoryBlock memory = blockReference.get();
    if (memory != null) {
      assert (memory.size() == size);
      assert ((int)((memory.size() + 7) / 8) == arraySize);
      memory.resetSize(size);
Copy link
Contributor

Hmm, from my understanding the size of a MemoryBlock is always the actual size, not the aligned size, so it looks like we don't need to reset the size here.

Copy link
Contributor Author

@10110346 10110346 Aug 31, 2017

Yes, the size of a MemoryBlock is always the actual size. That is why, when we reuse a previously allocated block, we should set its size to the actual size of the new request.

Copy link
Contributor

I got it, thanks for the explanation.
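
The exchange above can be sketched as a small pool keyed by aligned size. This is a simplified, single-threaded illustration under stated assumptions: `Block` and `AlignedPool` are hypothetical stand-ins, not Spark's `MemoryBlock` or `HeapMemoryAllocator`, and the weak references and pooling threshold used by the real allocator are left out.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a memory block backed by a long[].
final class Block {
  final long[] array;  // backing storage, always a whole number of 8-byte words
  private long size;   // size in bytes that the current owner asked for

  Block(long[] array, long size) {
    this.array = array;
    this.size = size;
  }

  long size() { return size; }

  void resetSize(long newSize) { this.size = newSize; }
}

// Hypothetical pool that keys freed blocks by their aligned size.
final class AlignedPool {
  private final Map<Long, ArrayDeque<Block>> poolByAlignedSize = new HashMap<>();

  private static long align(long size) {
    return ((size + 7) / 8) * 8;
  }

  Block allocate(long size) {
    ArrayDeque<Block> pool = poolByAlignedSize.get(align(size));
    if (pool != null && !pool.isEmpty()) {
      Block reused = pool.pop();
      // The pooled block still reports the previous owner's size,
      // so it is updated to the size of the new request.
      reused.resetSize(size);
      return reused;
    }
    return new Block(new long[(int) (align(size) / 8)], size);
  }

  void free(Block block) {
    poolByAlignedSize
        .computeIfAbsent(align(block.size()), k -> new ArrayDeque<>())
        .push(block);
  }
}
```

In this sketch, freeing a 1025-byte block and then requesting 1030 bytes hands back the same backing array, and the size reset is what keeps `size()` reporting 1030 rather than 1025.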

@jerryshao
Copy link
Contributor

Can you please add some unit tests to verify your changes?

@10110346
Copy link
Contributor Author

@jerryshao Thanks, I will add unit tests.

@10110346 10110346 force-pushed the headmemoptimize branch 2 times, most recently from fc8b895 to ba5717e Compare August 31, 2017 01:43
@SparkQA
Copy link

SparkQA commented Aug 31, 2017

Test build #81272 has finished for PR 19077 at commit fc8b895.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 31, 2017

Test build #81273 has finished for PR 19077 at commit ba5717e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Copy link
Contributor

This PR generally looks fine to me. My concern is whether this change will have a subtle impact on the code that relies on it.

CC @JoshRosen to take a review.

@JoshRosen
Copy link
Contributor

Just curious: do you know where we are allocating these close-in-size chunks of memory? I understand the motivation, but I'd like to know what's causing this pattern. I think the original idea here was that most allocations would come from a small set of sizes (usually the page size, or a configurable buffer size) and would not generally be arbitrarily sized.

@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {

@Override
public MemoryBlock allocate(long size) throws OutOfMemoryError {
if (shouldPool(size)) {
int arraySize = (int)((size + 7) / 8);
Copy link
Contributor

You might be able to use ByteArrayMethods.roundNumberOfBytesToNearestWord for this, which we've done for similar rounding elsewhere. It makes it a bit easier to spot what's happening.

Copy link
Contributor Author

But the input parameter of roundNumberOfBytesToNearestWord is an int.

Copy link
Contributor

Maybe we should make the method handle long values.
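
A long-valued rounding helper along the lines suggested here might look like the sketch below. `WordRounding` is a hypothetical class used only for illustration, and the method body is an assumption about one reasonable way to write it, not the code that was actually added to `ByteArrayMethods`.

```java
// Hypothetical helper for illustration; WordRounding is not a Spark class.
final class WordRounding {
  /** Rounds numBytes up to the next multiple of 8. Assumes numBytes >= 0. */
  static long roundNumberOfBytesToNearestWord(long numBytes) {
    long remainder = numBytes & 0x07; // equals numBytes % 8 for non-negative input
    return remainder == 0 ? numBytes : numBytes + (8 - remainder);
  }
}
```

Taking a `long` avoids narrowing the requested size to `int` before rounding, which matters because `allocate` receives its size as a `long`.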

@10110346
Copy link
Contributor Author

10110346 commented Sep 1, 2017

@jerryshao @JoshRosen Yes, allocations are generally not arbitrarily sized. We mostly allocate memory in multiples of 4 or 8 bytes; even so, I think this change is still beneficial.
Also, I think this change will not affect the code that relies on it, because MemoryBlock is not changed.

@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {

@Override
public MemoryBlock allocate(long size) throws OutOfMemoryError {
if (shouldPool(size)) {
long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
Copy link
Member

Maybe minor, but some small allocations that were not pooled before will now go through the pooling mechanism, e.g. POOLING_THRESHOLD_BYTES - 1.

Copy link
Contributor Author

Yeah, I think it's acceptable.
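
The edge case raised here can be illustrated with a small sketch. The threshold value and the `>=` comparison in `shouldPool` are assumptions made for the example, since the thread does not state them, and `PoolingThresholdDemo` is a hypothetical class name.

```java
// Illustration only; the threshold value and comparison are assumed.
final class PoolingThresholdDemo {
  static final long POOLING_THRESHOLD_BYTES = 1024 * 1024;

  static boolean shouldPool(long size) {
    return size >= POOLING_THRESHOLD_BYTES;
  }

  static long align(long size) {
    return ((size + 7) / 8) * 8;
  }

  public static void main(String[] args) {
    long size = POOLING_THRESHOLD_BYTES - 1;
    // Checking the requested size keeps this allocation out of the pool,
    // while checking the aligned size lets it in.
    System.out.println("requested: " + shouldPool(size));
    System.out.println("aligned:   " + shouldPool(align(size)));
  }
}
```

Only requests within 7 bytes below the threshold change behavior, which is consistent with the author's view that the difference is acceptable.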

@@ -48,6 +48,13 @@ public long size() {
  }

  /**
   * Reset the size of the memory block.
   */
Copy link
Member

It is dangerous to reset to an invalid size. We should add a check here or put a WARNING in the method comment.

Copy link
Contributor Author

Thanks, I will add a check.
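
One possible shape for such a check is sketched below. `CheckedBlock` is a hypothetical class, and the choice to compare against the capacity of a backing `long[]` and to throw `IllegalArgumentException` is an assumption; the thread does not say what check was eventually added.

```java
// Hypothetical block type for illustration; not Spark's MemoryBlock.
final class CheckedBlock {
  private final long[] array; // backing storage in 8-byte words
  private long size;          // logical size in bytes

  CheckedBlock(long[] array, long size) {
    this.array = array;
    resetSize(size);
  }

  long size() { return size; }

  /** Sets the logical size; rejects values the backing array cannot hold. */
  void resetSize(long newSize) {
    long capacity = array.length * 8L;
    if (newSize < 0 || newSize > capacity) {
      throw new IllegalArgumentException(
          "size " + newSize + " is outside [0, " + capacity + "]");
    }
    this.size = newSize;
  }
}
```

With a guard like this, a caller that resets a 129-word block to 1033 bytes fails immediately instead of later reading or writing past the end of the array.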

@SparkQA
Copy link

SparkQA commented Sep 7, 2017

Test build #81491 has finished for PR 19077 at commit 729df24.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 7, 2017

Test build #81492 has finished for PR 19077 at commit 0c6647c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@10110346
Copy link
Contributor Author

10110346 commented Sep 7, 2017

retest this please

@SparkQA
Copy link

SparkQA commented Sep 7, 2017

Test build #81508 has finished for PR 19077 at commit 0c6647c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Oct 6, 2017

gentle ping @jerryshao for review

Copy link
Contributor

@jerryshao jerryshao left a comment

The change itself looks fine to me. However, I'm not sure whether there's any potential impact on the code that relies on it; hopefully someone can take another look.

val unsafeArraySizeInBytes =
  UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
  ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
  ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
Copy link
Contributor

Minor: why don't we inline this instead of creating a new variable?

Copy link
Contributor Author

If we inline it, the line becomes longer than 200 bytes.

Copy link
Contributor

We should really inline that.

zzcclp added a commit to zzcclp/spark that referenced this pull request Sep 20, 2019
…MemoryAllocator` apache#19077

In HeapMemoryAllocator, when allocating memory from pool, and the key of pool is memory size.
Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes.
In this case, we can improve memory reuse.
zzcclp added a commit to zzcclp/spark that referenced this pull request Sep 20, 2019
PR-apache#19077 introduced a Java style error (too long line). Quick fix.