ARROW-6910: [C++][Python] Set jemalloc default configuration to release dirty pages more aggressively back to the OS, add C++ / Python option to configure this #5701
Conversation
@ursabot benchmark
AMD64 Ubuntu 18.04 C++ Benchmark (#71877) builder failed. Revision: e0b2a70 Archery:
@fsaintjacques seems that the benchmark command got broken
Guess it could be a legitimate failure. I'll look later
Have you tried "background_thread:true" instead, as mentioned in jemalloc/jemalloc#1128? I fear that returning memory immediately to the OS (which setting the "decay" options to 0 may be doing) may reduce performance on some workloads.
I tried background_thread:true with 1 second decay instead of 10 seconds (which is recommended in that issue thread) and here's the output, with some 1 second sleep inserted
Does this seem acceptable? It seems to do a better job of not letting peak RSS get out of hand
(I don't know how much CPU the background thread takes up, hopefully not a lot)
Probably less than doing things eagerly (also it should avoid adding latency to the main program).
LGTM, just a couple details.
cpp/src/arrow/memory_pool.h
Outdated
/// \brief Set jemalloc memory page purging behavior for future-created arenas
/// to the indicated number of milliseconds. See dirty_decay_ms and
/// muzzy_decay_ms options in jemalloc for a description of what these do. The
/// default is configured to 0 which releases memory more aggressively to the
Update comment? It doesn't default to 0 anymore.
done
  } \
} while (0)
Status jemalloc_set_decay_ms(int ms) {
Can we add a test for this? The mallctl calls are not trivial.
done
    ("oversize_threshold:0,"
     "dirty_decay_ms:1000,"
     "muzzy_decay_ms:1000,"
     "background_thread:true");
It would be nice to add a comment explaining how we came to these settings.
done
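As an aside, the conf string discussed above uses jemalloc's standard `key:value` comma-separated option syntax. A minimal, illustrative sketch (not part of Arrow or jemalloc) of parsing such a string, which can make the chosen settings easier to inspect or log:

```python
def parse_malloc_conf(conf: str) -> dict:
    """Split a jemalloc-style "key:value,key:value" conf string
    (as passed via malloc_conf or MALLOC_CONF) into a dict.
    Illustrative helper only, not a jemalloc or Arrow API."""
    options = {}
    for pair in conf.split(","):
        key, _, value = pair.partition(":")
        options[key.strip()] = value.strip()
    return options

# The settings from the patch under review:
conf = ("oversize_threshold:0,"
        "dirty_decay_ms:1000,"
        "muzzy_decay_ms:1000,"
        "background_thread:true")
print(parse_malloc_conf(conf))
# → {'oversize_threshold': '0', 'dirty_decay_ms': '1000',
#    'muzzy_decay_ms': '1000', 'background_thread': 'true'}
```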
Yes, I think that's the main goal here.
Thanks for the review. I'll update the comments and add a test. It seems that this change wreaks havoc on the Plasma valgrind tests. I'm just going to disable valgrind in the Plasma Python tests and hope for the best.
Hmm, I wasn't aware that Valgrind was still enabled somewhere in our test suite. Are you sure that is the case?
https://github.com/apache/arrow/blob/master/.travis.yml#L97 I don't think it's worth the effort of figuring out how to fix this.
Just to confirm, it is a failed assert rather than an archery benchmark failure?
@wesm Changes look good from my side. It would be nice if you could do a mini-blogpost somewhere. These micro-optimizations are a really good argument for why someone should use Arrow's own implementation of the memory format.
…unction to set the values to something else
@ursabot benchmark
AMD64 Ubuntu 18.04 C++ Benchmark (#72178) builder failed. Revision: 9290470 Archery:
Now some OS X / C GLib build fails for another reason that I don't understand (can't find LLVM?). I'd be tempted to merge anyway, as it seems clear that this build is currently broken, including on master. @kou please speak up quickly if you oppose. (also cc @xhochy )
AppVeyor build: https://ci.appveyor.com/project/pitrou/arrow/builds/28288488
This reverts commit ab67abb.
I'll merge without the protobuf warnings fix, actually, since those warnings also fail on git master.
No objection. We can fix it later.
…se dirty pages more aggressively back to the OS, add C++ / Python option to configure this

The current default behavior causes applications dealing in large datasets to hold on to a large amount of physical operating system memory. While this may improve performance in some cases, it empirically seems to be causing problems for users.

There's some discussion of this issue in some other contexts here: jemalloc/jemalloc#1128

Here is a test script I used to check the RSS while reading a large Parquet file (~10GB in memory) in a loop (requires downloading the file http://public-parquet-test-data.s3.amazonaws.com/big.snappy.parquet): https://gist.github.com/wesm/c75ad3b6dcd37231aaacf56a80a5e401

This patch enables jemalloc background page reclamation and reduces the time decay from 10 seconds to 1 second so that memory is returned to the OS more aggressively.

Closes apache#5701 from wesm/ARROW-6910 and squashes the following commits:

8fc8aa8 <Antoine Pitrou> Revert "Try to fix protobuf-related clang warning"
ab67abb <Antoine Pitrou> Try to fix protobuf-related clang warning
9290470 <Wes McKinney> Review comments, disable PLASMA_VALGRIND in Travis
8c4d367 <Wes McKinney> Use background_thread:true and 1000ms decay
daa5416 <Wes McKinney> Set jemalloc dirty_decay_ms and muzzy_decay_ms to 0 by default, add function to set the values to something else

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
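The actual test script lives in the linked gist; as a rough, Unix-only sketch of the same kind of RSS check, one can watch peak resident set size across iterations of an allocation-heavy workload. The `workload` function here is a hypothetical placeholder standing in for something like `pq.read_table("big.snappy.parquet")`:

```python
# Unix-only sketch (uses the stdlib `resource` module). Not the script
# from the gist; `workload` is a hypothetical stand-in for a large
# Parquet read or similar allocation-heavy operation.
import resource
import sys

def peak_rss_bytes() -> int:
    """Return this process's peak RSS in bytes. ru_maxrss is reported
    in kilobytes on Linux and in bytes on macOS."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss if sys.platform == "darwin" else rss * 1024

def workload():
    # Placeholder: allocate and drop ~50 MB to exercise the allocator.
    return bytearray(50 * 1024 * 1024)

for i in range(3):
    data = workload()
    del data
    print(f"iteration {i}: peak RSS = {peak_rss_bytes() / 2**20:.1f} MiB")
```

With a default jemalloc configuration, peak RSS in a loop like this can stay elevated between iterations; the point of the patch is that dirty pages are returned to the OS within about a second instead.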