2.25.2.0-b61

@kai-franz kai-franz tagged this 26 Feb 19:17
Summary:
Adds a YSQL interface for TCMalloc's sampling/heap snapshot functions, allowing us to see the heap snapshot for PG backend processes.

NOTE: Postgres has its own internal memory allocators that it uses on top of `malloc`. Postgres code usually calls `palloc`, which calls the allocator for the current memory context (usually `AllocSetAlloc`). For AllocSet contexts, the allocator only mallocs large chunks at a time, then divides these chunks into smaller pieces that it returns when `palloc` is called. As a result, TCMalloc can only sample the call stack when `AllocSetAlloc` has to request a new chunk from `malloc`; it has no way to sample every `palloc`. So the TCMalloc heap snapshot will not offer a complete view of all the allocations Postgres is doing, even if TCMalloc is configured to sample every allocation. In practice, however, if a call site is using a lot of memory, the Postgres allocator will call `malloc` many times from there and it will show up in the heap snapshot.
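
To illustrate the note above, here is a toy Python model of the chunking behavior. It is not Postgres code: `CountingMalloc` and `ToyAllocSet` are hypothetical names, and the fixed 8 KiB block size is an assumption (real AllocSet contexts grow their block sizes and add per-chunk headers). It only shows why a sampler hooked into `malloc` sees a small fraction of `palloc` calls:

```python
class CountingMalloc:
    """Stand-in for malloc that records the size of every request."""

    def __init__(self):
        self.calls = []

    def malloc(self, size):
        self.calls.append(size)


class ToyAllocSet:
    """Toy arena that carves small requests out of large malloc'd blocks."""

    def __init__(self, backend, block_size=8192):  # block size is an assumption
        self.backend = backend
        self.block_size = block_size
        self.free_in_block = 0

    def palloc(self, size):
        if size > self.block_size:
            # Oversized requests get a dedicated block.
            self.backend.malloc(size)
            return
        if size > self.free_in_block:
            # Current block is exhausted: this is the only point malloc is hit.
            self.backend.malloc(self.block_size)
            self.free_in_block = self.block_size
        self.free_in_block -= size


backend = CountingMalloc()
ctx = ToyAllocSet(backend)
for _ in range(1000):
    ctx.palloc(64)

# 1000 palloc calls reach malloc only 8 times in this model.
print(len(backend.calls))
```

In this model a sampler attached to `malloc` has at most 8 chances to record a call stack, regardless of how the sample period is set.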

### Adjusting sample period
The TCMalloc sampling period is exposed as the GUC `yb_tcmalloc_sample_period` (also the gflag `ysql_yb_tcmalloc_sample_period`), which is 0 (disabled) by default. It can be used as follows:
```
yugabyte=# show yb_tcmalloc_sample_period;
 yb_tcmalloc_sample_period
-------------------------
 0
(1 row)

yugabyte=# set yb_tcmalloc_sample_period = 128;
SET
yugabyte=# show yb_tcmalloc_sample_period;
 yb_tcmalloc_sample_period
-------------------------
 128
(1 row)

yugabyte=# set yb_tcmalloc_sample_period = '1MB';
SET
yugabyte=# show yb_tcmalloc_sample_period;
 yb_tcmalloc_sample_period
-------------------------
 1048576
(1 row)
```
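
As a rough guide to picking a value, the sketch below converts a setting string to bytes and estimates how many samples a given volume of `malloc` traffic would produce. It assumes the GUC accepts the usual Postgres byte units (`B`, `kB`, `MB`, `GB`, `TB`) and that the period means "one sample per N allocated bytes on average", which is how TCMalloc describes its sampling interval. `period_to_bytes` and `expected_samples` are hypothetical helpers, not part of this diff:

```python
import re

# Multipliers for byte-sized settings (assumed to apply to this GUC).
UNIT_BYTES = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def period_to_bytes(setting: str) -> int:
    """Convert a setting string such as '128' or '1MB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", setting)
    if match is None:
        raise ValueError(f"cannot parse {setting!r}")
    number, unit = int(match.group(1)), match.group(2) or "B"
    if unit not in UNIT_BYTES:
        raise ValueError(f"unknown unit {unit!r}")
    return number * UNIT_BYTES[unit]


def expected_samples(allocated_bytes: int, period_bytes: int) -> float:
    """Average number of samples for a given volume of malloc'd bytes."""
    if period_bytes == 0:  # 0 means sampling is disabled
        return 0.0
    return allocated_bytes / period_bytes


print(period_to_bytes("128"))    # 128
print(period_to_bytes("1MB"))    # 1048576
print(expected_samples(64 * 1024**2, period_to_bytes("1MB")))  # 64.0
```

A small period such as 128 samples nearly every `malloc` call and costs more; a period of `1MB` is much cheaper but yields only about one sample per mebibyte allocated.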

### Getting the heap snapshot in YSQL
The current TCMalloc heap snapshot is exposed through a set-returning function, `yb_backend_heap_snapshot()`. The peak heap snapshot is also accessible through `yb_backend_heap_snapshot_peak()`.
```
yugabyte=# select * from yb_backend_heap_snapshot();
 estimated_bytes | estimated_count | avg_bytes_per_allocation | sampled_bytes | sample_count |                                call_stack
-----------------+-----------------+--------------------------+---------------+--------------+---------------------------------------------------------------------------
                 |                 |                          |       2208080 |           72 | tcmalloc::allocate_full_malloc_oom()                                     +
                 |                 |                          |               |              | _malloc_zone_malloc_instrumented_or_legacy                               +
                 |                 |                          |               |              | AllocSetAlloc                                                            +
                 |                 |                          |               |              | palloc                                                                   +
                 |                 |                          |               |              | CatalogCacheCreateEntry                                                  +
                 |                 |                          |               |              | SetCatCacheList                                                          +
                 |                 |                          |               |              | YbPreloadCatalogCache                                                    +
                 |                 |                          |               |              | YbFillCache                                                              +
                 |                 |                          |               |              | YbPreloadRelCacheImpl                                                    +
                 |                 |                          |               |              | YbRunWithPrefetcherImpl                                                  +
                 |                 |                          |               |              | YbRunWithPrefetcher                                                      +
                 |                 |                          |               |              | YBPreloadRelCache                                                        +
                 |                 |                          |               |              | YBRefreshCache                                                           +
                 |                 |                          |               |              | YBCheckSharedCatalogCacheVersion                                         +
                 |                 |                          |               |              | PostgresMain                                                             +
                 |                 |                          |               |              | report_fork_failure_to_client                                            +
                 |                 |                          |               |              | BackendStartup                                                           +
                 |                 |                          |               |              | ServerLoop                                                               +
                 |                 |                          |               |              | PostmasterMain                                                           +
                 |                 |                          |               |              | startup_hacks                                                            +
                 |                 |                          |               |              | main                                                                     +
                 |                 |                          |               |              |
                 |                 |                          |       2208080 |           72 | tcmalloc::allocate_full_malloc_oom()                                     +
                 |                 |                          |               |              | _malloc_zone_malloc_instrumented_or_legacy                               +
                 |                 |                          |               |              | AllocSetAlloc                                                            +
...
```

The function is restricted to superusers and the yb_db_admin role.
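
Because every stack starts with the same allocator frames, it can help to group rows by the first frame below `palloc`. The following is a minimal client-side sketch, assuming the rows have already been fetched as `(sampled_bytes, sample_count, call_stack)` tuples with one frame per line. `first_caller` and `bytes_by_caller` are hypothetical helpers, and the second row is invented for illustration:

```python
from collections import defaultdict

# Frames that belong to the allocators themselves, taken from the output above.
ALLOCATOR_FRAMES = {
    "tcmalloc::allocate_full_malloc_oom()",
    "_malloc_zone_malloc_instrumented_or_legacy",
    "AllocSetAlloc",
    "palloc",
}


def first_caller(call_stack: str) -> str:
    """Return the innermost frame that is not an allocator frame."""
    for frame in call_stack.splitlines():
        frame = frame.strip()
        if frame and frame not in ALLOCATOR_FRAMES:
            return frame
    return "<unknown>"


def bytes_by_caller(rows):
    """Sum sampled_bytes per first non-allocator frame, largest first."""
    totals = defaultdict(int)
    for sampled_bytes, _sample_count, call_stack in rows:
        totals[first_caller(call_stack)] += sampled_bytes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


rows = [
    # Values from the first row of the output above (stack shortened).
    (2208080, 72,
     "tcmalloc::allocate_full_malloc_oom()\n"
     "_malloc_zone_malloc_instrumented_or_legacy\n"
     "AllocSetAlloc\npalloc\nCatalogCacheCreateEntry\nSetCatCacheList"),
    # Hypothetical row, invented for illustration only.
    (4096, 1,
     "tcmalloc::allocate_full_malloc_oom()\nAllocSetAlloc\npalloc\n"
     "hypothetical_caller"),
]
print(bytes_by_caller(rows))
```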

### Dumping the heap snapshot to PG logs

This diff also adds the function `yb_log_backend_heap_snapshot(pid)`, which dumps the given backend's heap snapshot to the PG logs. To log the current backend's heap snapshot, run `SELECT yb_log_backend_heap_snapshot(pg_backend_pid())`.

This diff also adds the `YBCDumpTcMallocHeapProfile` function from D31084, which is meant to be called from a debugger. Calling it causes the current backend to dump its heap snapshot to PG logs:

```
I1214 19:54:49.425766 521265 ybc_pggate.cc:504] Heap Profile:
I1214 19:54:49.425778 521265 ybc_pggate.cc:508] estimated bytes: 198,443,008, estimated count: 757, sampled_allocated bytes: 198,443,008, sampled count: 757, call stack:
tcmalloc::tcmalloc_internal::SampleifyAllocation<>()
slow_alloc<>()
__libc_malloc
yb::HeapBufferAllocator::AllocateInternal()
yb::PreallocatedBufferAllocator::AllocateInternal()
yb::internal::ArenaBase<>::NewBuffer()
yb::internal::ArenaBase<>::AllocateBytesFallback()
yb::pggate::PgDocReadOp::DoPopulateDmlByYbctidOps()
yb::pggate::PgDocOp::PopulateDmlByYbctidOps()
yb::pggate::PgDml::ProcessSecondaryIndexRequest()
yb::pggate::PgDml::FetchDataFromServer()
yb::pggate::PgDml::Fetch()
yb::pggate::PgApiImpl::DmlFetch()
YBCPgDmlFetch
ybc_getnext_heaptuple
ybcingettuple
index_getnext_tid
index_getnext
IndexNext
ExecScan
ExecModifyTable
standard_ExecutorRun
pgss_ExecutorRun
pullRpczEntries
ProcessQuery
PortalRunMulti
PortalRun
yb_exec_simple_query_impl
yb_exec_query_wrapper_one_attempt
PostgresMain
Failed to symbolize
Failed to symbolize
PostmasterMain
Failed to symbolize
main
__libc_start_main
================
...
```
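
For post-processing, the logged format above can be parsed with a few lines of Python. This is a sketch that assumes the layout shown in the excerpt (a header line with comma-grouped counters, one frame per line, and a `================` separator between entries); `parse_heap_profile` is a hypothetical helper, not something this diff provides:

```python
import re

# Matches the four counters on a header line; numbers use comma grouping.
FIELD_RE = re.compile(
    r"(estimated bytes|estimated count|sampled_allocated bytes|sampled count): "
    r"(\d+(?:,\d{3})*)"
)
SEPARATOR = "================"


def parse_heap_profile(lines):
    """Group log lines into entries of header counters plus stack frames."""
    entries, current = [], None
    for line in lines:
        line = line.strip()
        fields = FIELD_RE.findall(line)
        if fields:
            # A header line starts a new entry.
            current = {name: int(num.replace(",", "")) for name, num in fields}
            current["frames"] = []
            entries.append(current)
        elif line == SEPARATOR:
            current = None
        elif current is not None and line:
            current["frames"].append(line)
    return entries


# Lines copied from the excerpt above, with the middle of the stack omitted.
sample = [
    "I1214 19:54:49.425766 521265 ybc_pggate.cc:504] Heap Profile:",
    "I1214 19:54:49.425778 521265 ybc_pggate.cc:508] estimated bytes: 198,443,008, "
    "estimated count: 757, sampled_allocated bytes: 198,443,008, "
    "sampled count: 757, call stack:",
    "tcmalloc::tcmalloc_internal::SampleifyAllocation<>()",
    "slow_alloc<>()",
    "__libc_malloc",
    "main",
    "================",
]
entries = parse_heap_profile(sample)
print(entries[0]["estimated bytes"], len(entries[0]["frames"]))
```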

### Other changes

This diff also renames all mentions of "sample frequency" to "sample period" for clarity. This includes renaming the `profiler_sample_freq_bytes` flag to `profile_sample_period_bytes` and deprecating the old flag. The only exception to this is the YSQL metric `yb_pg_metrics.webserver_profiler_sample_freq_bytes`, which we don't touch in case it is in use by customers.
Jira: DB-10752

Test Plan:
```
./yb_build.sh release --cxx-test pg_heap_snapshot-test
```

Manual testing:
1. Make 2 connections to ysqlsh.
2. On connection #1, run `SET yb_tcmalloc_sample_period = 128;`
3. On connection #2, create a table and trigger a catalog version bump: `CREATE TABLE t (a INT); ALTER TABLE t ADD COLUMN b INT;`
4. On connection #1, trigger a cache refresh: `SELECT * FROM t;`
5. On connection #1, check the heap snapshot: `SELECT * FROM yb_backend_heap_snapshot();`
6. Attach LLDB to the backend process for connection #1 and dump the heap snapshot to logs:
```
proc interrupt
call YBCDumpTcMallocHeapProfile(false, 10)
continue
```

Reviewers: telgersma, asrivastava, #db-approvers

Reviewed By: telgersma, asrivastava, #db-approvers

Subscribers: smishra, svc_phabricator, mihnea, bogdan, esheng, ybase, jason, yql

Differential Revision: https://phorge.dev.yugabyte.com/D33735