Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap Profiler #2035

Merged
merged 1 commit into from
Jan 25, 2021
Merged

Heap Profiler #2035

merged 1 commit into from
Jan 25, 2021

Conversation

joe-maley
Copy link
Contributor

@joe-maley joe-maley commented Jan 22, 2021

This introduces a new C API:

TILEDB_EXPORT int32_t tiledb_heap_profiler_enable(
    const char* file_name_prefix,
    uint64_t dump_interval_ms,
    uint64_t dump_interval_bytes,
    uint64_t dump_threshold_bytes);

Executing the API does the following:

  • The singleton HeapProfiler is enabled, which may only be done once in the
    lifetime of the process.
  • On an uncaught std::bad_alloc (anywhere in the process) we will perform
    a stat dump and kill the process.
  • If dump_interval_ms is non-zero, a thread is started which will dump
    the heap stats periodically.
  • If dump_interval_bytes is non-zero, the next allocation that increases
    the lifetime total allocation byte size by this amount will perform a dump.
  • If dump_threshold_bytes non-zero, labeled allocations with a total
    alloc size less than this amount will be ignored. This is used to reduce
    verbosity of the dump for trivial allocations.

This is dead code at the moment. The idea is that all dynamic memory allocations
in our source code will be accompanied by a call HeapProfiler::record_alloc().
When called, this save stats. Additionally, failed dynamic memory allocations
may invoke HeapProfiler::dump_and_terminate(). Similarly for dellocations.

The HeapProfiler is responsible for:

  • Setting up the uncaught std::bad_alloc hook.
  • Setting up a periodic dump thread.
  • Reserving memory at initialization. This is an opaque block of dynamic memory
    that is freed when entering dump_and_terminate(). The intention is to release
    memory after a fail allocation to provide enough memory for the dump to be
    successful.
  • Keeps a "labels cache" which is basically an object pool of strings to prevent
    duplication of string labels. For example, a caller may do:
tiledb::common::heap_profiler.record_alloc(p1, size1, "reader.cc::1923");
tiledb::common::heap_profiler.record_alloc(p2, size2, "reader.cc::1923");
tiledb::common::heap_profiler.record_alloc(p3, size3, "reader.cc::1923");
  • Where "reader.cc::1923" could be generated by compilers macros, e.g. std::string(__FILE__ + "::" + __LINE__).
    In this scenario, we don't want to haev 3x "reader.cc::1923" strings allocated.
    We share them in the HeapProfiler.

The next step is to refactor our code base to find-and-replace all dynamic
memory APIs with a macro/typedef that will record alloc/dealloc stats as they
are used.

Copy link
Member

@eddelbuettel eddelbuettel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me from a (arguably casual) glance.

This introduces a new C API:
```
TILEDB_EXPORT int32_t tiledb_heap_profiler_enable(
    const char* file_name_prefix,
    uint64_t dump_interval_ms,
    uint64_t dump_interval_bytes,
    uint64_t dump_threshold_bytes);
```

Executing the API does the following:
- The singleton HeapProfiler is enabled, which may only be done once in the
lifetime of the process.
- On an uncaught `std::bad_alloc` (anywhere in the process) we will perform
a stat dump and kill the process.
- If `dump_interval_ms` is non-zero, a thread is started which will dump
the heap stats periodically.
- If `dump_interval_bytes` is non-zero, the next allocation that increases
the lifetime total allocation byte size by this amount will perform a dump.
- If `dump_threshold_bytes` non-zero, labeled allocations with a total
alloc size less than this amount will be ignored. This is used to reduce
verbosity of the dump for trivial allocations.

This is dead code at the moment. The idea is that all dynamic memory allocations
in our source code will be accompanied by a call HeapProfiler::record_alloc().
When called, this save stats. Additionally, failed dynamic memory allocations
may invoke HeapProfiler::dump_and_terminate(). Similarly for dellocations.

The HeapProfiler is responsible for:
- Setting up the uncaught `std::bad_alloc` hook.
- Setting up a periodic dump thread.
- Reserving memory at initialization. This is an opaque block of dynamic memory
  that is freed when entering dump_and_terminate(). The intention is to release
  memory after a fail allocation to provide enough memory for the dump to be
  successful.
- Keeps a "labels cache" which is basically an object pool of strings to prevent
  duplication of string labels. For example, a caller may do:
```
tiledb::common::heap_profiler.record_alloc(p1, size1, "reader.cc::1923");
tiledb::common::heap_profiler.record_alloc(p2, size2, "reader.cc::1923");
tiledb::common::heap_profiler.record_alloc(p3, size3, "reader.cc::1923");
```
  In this scenario, we don't want to haev 3x "reader.cc::1923" strings allocated.
  We share them in the HeapProfiler.

The next step is to refactor our code base to find-and-replace all dynamic
memory APIs with a macro/typedef that will record alloc/dealloc stats as they
are used.
@Shelnutt2
Copy link
Member

The history file needs updating

@joe-maley
Copy link
Contributor Author

The history file needs updating

Thanks -- I'll get it on the next related patch.

Copy link
Member

@ihnorton ihnorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 Minor comment, otherwise LGTM.

void HeapProfiler::dump_and_terminate_internal() {
// Free our reserved heap memory in an attempt to
// ensure there is enough memory to dump the stats.
free(reserved_memory_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good and useful as-is, but down the line we might want to do a simple bump allocator and placement new into this reserved space for the dump path data structures (we're still at the mercy of the memory manager if there's something else continuously eating memory on the system).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- agreed. My initial approach was to allocate everything up-front but I decided that the label map would be valuable and that trying to manage+tune for a bunch of strings and hashes would be more trouble than it's immediately worth. Additionally, we'll still be at the mercy of the system if anything else in the dump path needs heap memory (for example, opening + writing to the output file). Finally, we don't have robust hooks for when we actually run out of memory. For example: if a malloc fails elsewhere in the process, we won't catch it.

For all those reasons, I see the preferred profiling as a periodic/interval dump. If we're lucky, we'll also get a dump at the time of going out of memory.

@Shelnutt2 Shelnutt2 deleted the jpm/heap-profiler branch June 8, 2021 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants