PARQUET-2429: Reduce direct input buffer churn #1270
Conversation
Currently input buffers are grown one chunk at a time as the compressor or decompressor receives successive setInput calls. When decompressing a 64MB block using a 4KB chunk size, this leads to thousands of allocations and deallocations totaling GBs of memory. By growing the buffer 2x each time, we avoid this and instead use a modest number of allocations.
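The growth strategy described above can be sketched as follows. This is a hypothetical, simplified version using heap buffers (the actual patch in NonBlockedCompressor/NonBlockedDecompressor works with direct ByteBuffers and releases the old buffer explicitly); the class name and setInput signature here are illustrative, while INITIAL_INPUT_BUFFER_SIZE and the growth formula mirror the diff discussed below.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: grow the input buffer geometrically instead of by one
// chunk per setInput call, so the number of reallocations stays logarithmic
// in the total input size.
public class GrowableInputBuffer {
  private static final int INITIAL_INPUT_BUFFER_SIZE = 4 * 1024; // 4KB floor

  private ByteBuffer inputBuffer = ByteBuffer.allocate(0);

  public void setInput(byte[] chunk, int off, int len) {
    if (inputBuffer.capacity() - inputBuffer.position() < len) {
      int newBufferSize;
      if (inputBuffer.capacity() == 0) {
        // First chunk: start at the 4KB floor, or the chunk size if larger.
        newBufferSize = Math.max(INITIAL_INPUT_BUFFER_SIZE, len);
      } else {
        // Geometric growth: at least enough for the pending data, otherwise
        // double the current capacity (the PR later settles on 1.2x growth).
        newBufferSize = Math.max(inputBuffer.position() + len, inputBuffer.capacity() * 2);
      }
      ByteBuffer newBuffer = ByteBuffer.allocate(newBufferSize);
      inputBuffer.flip();
      newBuffer.put(inputBuffer); // copy existing data into the larger buffer
      inputBuffer = newBuffer;
    }
    inputBuffer.put(chunk, off, len);
  }

  public int capacity() { return inputBuffer.capacity(); }
  public int position() { return inputBuffer.position(); }
}
```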
Force-pushed from 4880577 to 2eaca26.
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/codec/NonBlockedCompressor.java
Is that a real use case? Usually we don't expect a page to be as large as this.
I did encounter this in the real world, on some Snappy-compressed Parquet files that were written by Spark. I don't have access to the Spark cluster or job info, though, so unfortunately I don't have more details than that.
Could you please make the CI happy?
@gianm, I agree with @wgtmac's concern about the expected size. For compression/decompression we are targeting the page size. The page size is limited by two configs,
I did encounter these in the real world, although it's always possible that they were built with some abnormally large values for some reason.
I'm ok with doing whichever. FWIW, the setting
@gianm, page size is managed by ParquetProperties.getPageSizeThreshold(); the default value is ParquetProperties.DEFAULT_PAGE_SIZE.
@gszadovszky I'm trying to switch the codecs to use I don't see a way to get the relevant Any suggestions are welcome.
I could also go back to the approach where the initial buffer size isn't configurable, and hard-code it at 4KB or 1MB or whatever seems most reasonable. With the doubling-every-allocation approach introduced in this patch, it isn't going to be the end of the world if the initial size is too small.
In this case I wouldn't spend too much time on actually passing the configured value, and as you said, it might not even be possible because of the caching.
This reverts commit 996a1e9.
OK, thanks for the feedback. I have pushed up a change to start with the max of 4KB and the initial chunk passed to
if (inputBuffer.capacity() == 0) {
  newBufferSize = Math.max(INITIAL_INPUT_BUFFER_SIZE, len);
} else {
  newBufferSize = Math.max(inputBuffer.position() + len, inputBuffer.capacity() * 2);
}
Should we set an upper bound to it instead of blindly doubling the capacity? In the new code, we may see much larger peak memory compared to the past.
Some analysis:
- With doubling, peak memory usage could be up to about double the really required memory.
- If the target size is 64MB (the abnormally large size that I encountered in the wild), starting at 4KB and doubling gets us there in 14 iterations, allocating and deallocating 134MB of total memory.
- We could set an upper bound for each allocation at 1MB, so peak memory usage would be at most 1MB more than the amount of really required memory. If we start at 4KB and double up to 1MB, then go in 1MB increments, we get there in 71 iterations, allocating and deallocating 2GB of total memory.
- We could also use * 1.2 instead of * 2, which would make the peak memory usage at most 20% more than the amount of really required memory. Starting at 4KB and increasing by 20% each allocation gets us there in 53 iterations, allocating and deallocating 380MB of total memory.
Perhaps 20% growth is a good balance, since it still gets us to target pretty quickly compared to using a 1MB cap, and peak memory usage is at most 20% higher than what is really needed. Please let me know what you think.
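As a sanity check on the arithmetic above, the three strategies can be simulated by counting reallocations and cumulative bytes while growing from 4KB toward a 64MiB target. The class and method names below are illustrative, not part of the patch. Note that with ceiling rounding and a binary 64MiB target, the 1.2x strategy takes 54 steps rather than the 53 quoted above; the exact count depends on rounding and on whether MB is read as decimal or binary.

```java
// Simulate buffer-growth strategies: geometric growth by `factor`, with an
// optional cap (in bytes) on the per-step increment (0 = no cap).
public class GrowthSimulation {
  // Returns {iterations, totalBytesAllocated} to grow from `start` to `target`.
  public static long[] simulate(long start, long target, double factor, long cap) {
    long size = start, iterations = 0, total = start;
    while (size < target) {
      long grown = (long) Math.ceil(size * factor);
      long next = (cap > 0) ? Math.min(grown, size + cap) : grown;
      if (next <= size) next = size + 1; // guard against stalling
      size = next;
      iterations++;
      total += size; // each step allocates a whole new buffer of `size` bytes
    }
    return new long[] {iterations, total};
  }

  public static void main(String[] args) {
    long start = 4 * 1024, target = 64L * 1024 * 1024;
    long[] doubling = simulate(start, target, 2.0, 0);
    long[] capped = simulate(start, target, 2.0, 1024 * 1024);
    long[] gentle = simulate(start, target, 1.2, 0);
    System.out.printf("2x:          %d iterations, %d bytes%n", doubling[0], doubling[1]);
    System.out.printf("2x, 1MB cap: %d iterations, %d bytes%n", capped[0], capped[1]);
    System.out.printf("1.2x:        %d iterations, %d bytes%n", gentle[0], gentle[1]);
  }
}
```

The simulation reproduces the headline numbers: 14 iterations and about 134MB of cumulative allocation for plain doubling, and 71 iterations with roughly 2GB of cumulative allocation for the 1MB-capped variant.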
Thanks for the thorough analysis! * 1.2 sounds reasonable to me. Did you measure the time spent with the different strides?
I haven't measured it, but the time spent is probably a function of the number of iterations and the total amount of memory allocated and deallocated. Compared to what I was seeing without any minimum-increase factor at all, * 1.2 and * 2 are both really big improvements.
I just changed the patch to do * 1.2.
Thanks! LGTM
Hi, would it be possible to commit this, now that it's approved?
Sorry for the delay. I just merged this. Thanks!
Thank you!
Addresses https://issues.apache.org/jira/browse/PARQUET-2429.