Release the lifetime (deallocation) profiler.
Lifetime profiling reports a list of object lifetimes as pairs of allocation and deallocation records. Profiling is initiated by calling MallocExtension::StartLifetimeProfiling(). Profiling continues until `Stop` is invoked on the token. Lifetimes are only reported for objects where allocation *and* deallocation are observed while profiling is active.

The design and prototype of the deallocation-oriented lifetime profiler came out of an internship by Mohammad Mahdi Javanmard. Martin Maas brought the prototype to production, where it has been used since mid-2019.

PiperOrigin-RevId: 477807613
Change-Id: I36a35a4e5ef4e90aec6c83940f28b0ae45909ee4
snehasish authored and Copybara-Service committed Sep 29, 2022
1 parent f08977a commit 4d048a4
Showing 16 changed files with 1,197 additions and 0 deletions.
11 changes: 11 additions & 0 deletions docs/sampling.md
@@ -51,3 +51,14 @@ profile.

While the allocation sampler is active it is added to the list of samplers for
allocations and removed from the list when it is claimed.

## How Do We Handle Lifetime Profiling

Lifetime profiling reports a list of object lifetimes as pairs of allocation and
deallocation records. Profiling is initiated by calling
[MallocExtension::StartLifetimeProfiling()](https://github.com/google/tcmalloc/blob/master/tcmalloc/malloc_extension.h).
Profiling continues until `Stop` is invoked on the token. Lifetimes are only
reported for objects where allocation *and* deallocation are observed while
profiling is active. A description of the sampling-based lifetime profiler can
be found in Section 4 of
["Learning-based Memory Allocation for C++ Server Workloads, ASPLOS 2020"](https://research.google/pubs/pub49008/).
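
The pairing rule described above (a lifetime is reported only when both the allocation and the deallocation are seen while profiling is active) can be illustrated with a minimal, single-threaded sketch. Everything in it is hypothetical and written for illustration: `LifetimeRecorder`, `Lifetime`, `ExampleRun`, the integer timestamps, and the use of `std::unordered_map` are assumptions, not TCMalloc's implementation or API. Only the names `ReportMalloc` and `ReportFree` are borrowed from the calls that appear in this commit's diff, and their signatures here are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical handle type; the real AllocHandle is defined by TCMalloc.
using AllocHandle = std::uint64_t;

// One completed lifetime: an allocation record paired with its deallocation.
struct Lifetime {
  AllocHandle handle;
  std::size_t size;
  std::uint64_t alloc_time;
  std::uint64_t free_time;
};

// Hypothetical recorder, not thread-safe, for illustration only.
class LifetimeRecorder {
 public:
  void Start() { active_ = true; }

  // Ends profiling and returns the completed lifetimes. Allocations that are
  // still live at this point are dropped, because their free was not observed.
  std::vector<Lifetime> Stop() {
    active_ = false;
    pending_.clear();
    std::vector<Lifetime> out;
    out.swap(completed_);
    return out;
  }

  void ReportMalloc(AllocHandle handle, std::size_t size, std::uint64_t now) {
    if (!active_) return;  // Allocation happened outside the profiling window.
    pending_[handle] = Pending{size, now};
  }

  void ReportFree(AllocHandle handle, std::uint64_t now) {
    if (!active_) return;
    auto it = pending_.find(handle);
    if (it == pending_.end()) return;  // Allocation was never observed.
    completed_.push_back(
        Lifetime{handle, it->second.size, it->second.alloc_time, now});
    pending_.erase(it);
  }

 private:
  struct Pending {
    std::size_t size;
    std::uint64_t alloc_time;
  };

  bool active_ = false;
  std::unordered_map<AllocHandle, Pending> pending_;
  std::vector<Lifetime> completed_;
};

// Walks through the three cases: allocation before profiling, allocation and
// free both during profiling, and allocation that outlives profiling.
std::vector<Lifetime> ExampleRun() {
  LifetimeRecorder recorder;
  recorder.ReportMalloc(1, 64, 10);   // Before Start(): ignored.
  recorder.Start();
  recorder.ReportMalloc(2, 128, 20);  // Observed.
  recorder.ReportMalloc(3, 256, 30);  // Observed, but never freed.
  recorder.ReportFree(1, 40);         // No matching allocation: ignored.
  recorder.ReportFree(2, 50);         // Both ends observed: one lifetime.
  return recorder.Stop();             // Handle 3 is dropped here.
}
```

In this sketch only handle 2 produces a lifetime. A production profiler would additionally need synchronization, since allocation and deallocation can happen on different threads.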
3 changes: 3 additions & 0 deletions tcmalloc/BUILD
@@ -154,6 +154,7 @@ common_srcs = [
"common.h",
"cpu_cache.cc",
"cpu_cache.h",
"deallocation_profiler.cc",
"legacy_size_classes.cc",
"experimental_pow2_size_class.cc",
"experimental_cfl_aware_size_class.cc",
@@ -212,6 +213,7 @@ common_hdrs = [
"central_freelist.h",
"common.h",
"cpu_cache.h",
"deallocation_profiler.h",
"global_stats.h",
"guarded_page_allocator.h",
"hinted_tracker_lists.h",
@@ -259,6 +261,7 @@ common_deps = [
"@com_google_absl//absl/base:core_headers",
"@com_google_absl//absl/base:dynamic_annotations",
"@com_google_absl//absl/container:fixed_array",
"@com_google_absl//absl/container:flat_hash_map", # For lifetime (deallocation) profiler
"@com_google_absl//absl/debugging:stacktrace",
"@com_google_absl//absl/debugging:symbolize",
"@com_google_absl//absl/synchronization",
9 changes: 9 additions & 0 deletions tcmalloc/allocation_sampling.h
@@ -237,6 +237,9 @@ static void* SampleifyAllocation(State& state, Policy policy,
tmp.requested_alignment = requested_alignment;
tmp.requested_size_returning = capacity != nullptr;
tmp.allocated_size = allocated_size;
tmp.sampled_alloc_handle = state.sampled_alloc_handle_generator.fetch_add(
1, std::memory_order_relaxed) +
1;
tmp.access_hint = static_cast<uint8_t>(policy.access());
tmp.cold_allocated = allocated_cold;
tmp.weight = weight;
@@ -257,6 +260,8 @@ static void* SampleifyAllocation(State& state, Policy policy,

state.allocation_samples.ReportMalloc(tmp);

state.deallocation_samples.ReportMalloc(tmp);

// The SampledAllocation object is visible to readers after this. Readers only
// care about its various metadata (e.g. stack trace, weight) to generate the
// heap profile, and won't need any information from Span::Sample() next.
@@ -295,6 +300,8 @@ inline void MaybeUnsampleAllocation(State& state, void* ptr, Span* span) {
// frequency (weight) and its size.
const double allocation_estimate =
static_cast<double>(weight) / (requested_size + 1);
AllocHandle sampled_alloc_handle =
sampled_allocation->sampled_stack.sampled_alloc_handle;
state.sampled_allocation_recorder().Unregister(sampled_allocation);

// Adjust our estimate of internal fragmentation.
@@ -309,6 +316,8 @@ inline void MaybeUnsampleAllocation(State& state, void* ptr, Span* span) {
state.sampled_internal_fragmentation_.Add(-sampled_fragmentation);
}

state.deallocation_samples.ReportFree(sampled_alloc_handle);

if (proxy) {
const auto policy = CppPolicy().InSameNumaPartitionAs(proxy);
size_t size_class;
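
The handle assignment in `SampleifyAllocation` above uses `fetch_add(1, std::memory_order_relaxed) + 1`. The following standalone sketch shows what that expression yields. `HandleGenerator`, its `Next()` method, the integer type chosen for `AllocHandle`, and the initial counter value of 0 are assumptions made for illustration; the diff shown here does not include the declaration of `sampled_alloc_handle_generator`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed integer type for illustration; TCMalloc defines the real AllocHandle.
using AllocHandle = std::int64_t;

// Hypothetical wrapper around an atomic counter.
class HandleGenerator {
 public:
  // fetch_add returns the value held *before* the increment, so adding 1
  // yields the post-increment value. With a counter that starts at 0 (an
  // assumption here), the first handle is 1 and 0 is never handed out.
  AllocHandle Next() {
    return counter_.fetch_add(1, std::memory_order_relaxed) + 1;
  }

 private:
  std::atomic<AllocHandle> counter_{0};
};
```

Relaxed ordering is enough for uniqueness: `fetch_add` is an atomic read-modify-write, so concurrent callers each observe a distinct previous value even though no ordering is imposed on surrounding memory operations.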