-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heap Profiler #2035
Heap Profiler #2035
Conversation
c3882d2
to
f1bc67d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me from a (arguably casual) glance.
This introduces a new C API: ``` TILEDB_EXPORT int32_t tiledb_heap_profiler_enable( const char* file_name_prefix, uint64_t dump_interval_ms, uint64_t dump_interval_bytes, uint64_t dump_threshold_bytes); ``` Executing the API does the following: - The singleton HeapProfiler is enabled, which may only be done once in the lifetime of the process. - On an uncaught `std::bad_alloc` (anywhere in the process) we will perform a stat dump and kill the process. - If `dump_interval_ms` is non-zero, a thread is started which will dump the heap stats periodically. - If `dump_interval_bytes` is non-zero, the next allocation that increases the lifetime total allocation byte size by this amount will perform a dump. - If `dump_threshold_bytes` non-zero, labeled allocations with a total alloc size less than this amount will be ignored. This is used to reduce verbosity of the dump for trivial allocations. This is dead code at the moment. The idea is that all dynamic memory allocations in our source code will be accompanied by a call HeapProfiler::record_alloc(). When called, this save stats. Additionally, failed dynamic memory allocations may invoke HeapProfiler::dump_and_terminate(). Similarly for dellocations. The HeapProfiler is responsible for: - Setting up the uncaught `std::bad_alloc` hook. - Setting up a periodic dump thread. - Reserving memory at initialization. This is an opaque block of dynamic memory that is freed when entering dump_and_terminate(). The intention is to release memory after a fail allocation to provide enough memory for the dump to be successful. - Keeps a "labels cache" which is basically an object pool of strings to prevent duplication of string labels. For example, a caller may do: ``` tiledb::common::heap_profiler.record_alloc(p1, size1, "reader.cc::1923"); tiledb::common::heap_profiler.record_alloc(p2, size2, "reader.cc::1923"); tiledb::common::heap_profiler.record_alloc(p3, size3, "reader.cc::1923"); ``` In this scenario, we don't want to haev 3x "reader.cc::1923" strings allocated. We share them in the HeapProfiler. The next step is to refactor our code base to find-and-replace all dynamic memory APIs with a macro/typedef that will record alloc/dealloc stats as they are used.
f1bc67d
to
1cad516
Compare
The history file needs updating |
Thanks -- I'll get it on the next related patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 Minor comment, otherwise LGTM.
void HeapProfiler::dump_and_terminate_internal() { | ||
// Free our reserved heap memory in an attempt to | ||
// ensure there is enough memory to dump the stats. | ||
free(reserved_memory_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good and useful as-is, but down the line we might want to do a simple bump allocator and placement new into this reserved space for the dump path data structures (we're still at the mercy of the memory manager if there's something else continuously eating memory on the system).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah -- agreed. My initial approach was to allocate everything up-front but I decided that the label map would be valuable and that trying to manage+tune for a bunch of strings and hashes would be more trouble than it's immediately worth. Additionally, we'll still be at the mercy of the system if anything else in the dump path needs heap memory (for example, opening + writing to the output file). Finally, we don't have robust hooks for when we actually run out of memory. For example: if a malloc fails elsewhere in the process, we won't catch it.
For all those reasons, I see the preferred profiling as a periodic/interval dump. If we're lucky, we'll also get a dump at the time of going out of memory.
This introduces a new C API:
Executing the API does the following:
lifetime of the process.
std::bad_alloc
(anywhere in the process) we will performa stat dump and kill the process.
dump_interval_ms
is non-zero, a thread is started which will dumpthe heap stats periodically.
dump_interval_bytes
is non-zero, the next allocation that increasesthe lifetime total allocation byte size by this amount will perform a dump.
dump_threshold_bytes
non-zero, labeled allocations with a totalalloc size less than this amount will be ignored. This is used to reduce
verbosity of the dump for trivial allocations.
This is dead code at the moment. The idea is that all dynamic memory allocations
in our source code will be accompanied by a call HeapProfiler::record_alloc().
When called, this save stats. Additionally, failed dynamic memory allocations
may invoke HeapProfiler::dump_and_terminate(). Similarly for dellocations.
The HeapProfiler is responsible for:
std::bad_alloc
hook.that is freed when entering dump_and_terminate(). The intention is to release
memory after a fail allocation to provide enough memory for the dump to be
successful.
duplication of string labels. For example, a caller may do:
std::string(__FILE__ + "::" + __LINE__)
.In this scenario, we don't want to haev 3x "reader.cc::1923" strings allocated.
We share them in the HeapProfiler.
The next step is to refactor our code base to find-and-replace all dynamic
memory APIs with a macro/typedef that will record alloc/dealloc stats as they
are used.