Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more columns/dimensions to pg_wait_sampling #97

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ MODULE_big = pg_wait_sampling
OBJS = pg_wait_sampling.o collector.o

EXTENSION = pg_wait_sampling
DATA = pg_wait_sampling--1.1.sql pg_wait_sampling--1.0--1.1.sql
DATA = pg_wait_sampling--1.1.sql pg_wait_sampling--1.0--1.1.sql pg_wait_sampling--1.1--1.2.sql

REGRESS = load queries

135 changes: 107 additions & 28 deletions README.md
Original file line number Diff line number Diff line change
@@ -9,13 +9,13 @@ Introduction

PostgreSQL provides information about current wait event of particular
process. However, in order to gather descriptive statistics of server
behavior user have to sample current wait event multiple times.
behavior users have to sample current wait events multiple times.
`pg_wait_sampling` is an extension for collecting sampling statistics of wait
events.

The module must be loaded by adding `pg_wait_sampling` to
`shared_preload_libraries` in postgresql.conf, because it requires additional
shared memory and launches background worker. This means that a server restart
shared memory and launches a background worker. This means that a server restart
is needed to add or remove the module.

When used with `pg_stat_statements` it is recommended to put `pg_stat_statements`
@@ -25,17 +25,16 @@ utility statements are not rewritten by the former.
When `pg_wait_sampling` is enabled, it collects two kinds of statistics.

* History of waits events. It's implemented as in-memory ring buffer where
samples of each process wait events are written with given (configurable)
samples of each process' wait events are written with given (configurable)
period. Therefore, for each running process user can see some number of
recent samples depending on history size (configurable). Assuming there is
a client who periodically read this history and dump it somewhere, user
can have continuous history.
* Waits profile. It's implemented as in-memory hash table where count
of samples are accumulated per each process and each wait event
(and each query with `pg_stat_statements`). This hash
table can be reset by user request. Assuming there is a client who
periodically dumps profile and resets it, user can have statistics of
intensivity of wait events among time.
recent samples depending on history size (configurable). Assuming there is
a client who periodically reads this history and dumps it somewhere, user
can have continuous history of wait events.
* Waits profile. It's implemented as in-memory hash table where samples
are accumulated per each wait event and can be divided by process,
query and other dimensions. This hash table can be reset by user request.
Assuming there is a client who periodically dumps profile and resets it,
user can have statistics of wait events over time.

In combination with `pg_stat_statements` this extension can also provide
per query statistics.
@@ -66,10 +65,10 @@ Manual build
higher. Before build and install you should ensure following:

* PostgreSQL version is 13 or higher.
* You have development package of PostgreSQL installed or you built
* You have development package of PostgreSQL installed or you have built
PostgreSQL from source.
* Your PATH variable is configured so that `pg_config` command available, or
set PG_CONFIG variable.
PG_CONFIG variable is set.

Typical installation procedure may look like this:

@@ -98,9 +97,9 @@ Usage
`pg_wait_sampling` interacts with user by set of views and functions.

`pg_wait_sampling_current` view – information about current wait events for
all processed including background workers.
all processes including background workers.

| Column name | Column type | Description |
| Column name | Column type | Description |
| ----------- | ----------- | ----------------------- |
| pid | int4 | Id of process |
| event_type | text | Name of wait event type |
@@ -110,41 +109,108 @@ all processed including background workers.
`pg_wait_sampling_get_current(pid int4)` returns the same table for single given
process.

`pg_wait_sampling_current_extended` view – information about current wait events for
all processes including background workers. Structure of this view can be changed
between verions.

| Column name | Column type | Description |
| ------------------- | ----------- | --------------------------- |
| pid | int4 | Id of process |
| event_type | text | Name of wait event type |
| event | text | Name of wait event |
| queryid | int8 | Id of query |
| role_id | int4 | Id of role |
| database_id | int4 | Id of database |
| parallel_leader_pid | int4 | Id of parallel query leader |
| backend_type | text | Name of backend type |
| backend_state | text | Name of backend state |
| proc_start | timestamptz | Timestamp of process start |
| client_addr | text | Client address |
| client_hostname | text | Client hostname |
| appname | text | Application name |

`pg_wait_sampling_get_current_extended(pid int4)` returns the same table for single given
process.

`pg_wait_sampling_history` view – history of wait events obtained by sampling into
in-memory ring buffer.

| Column name | Column type | Description |
| Column name | Column type | Description |
| ----------- | ----------- | ----------------------- |
| pid | int4 | Id of process |
| ts | timestamptz | Sample timestamp |
| event_type | text | Name of wait event type |
| event | text | Name of wait event |
| queryid | int8 | Id of query |

`pg_wait_sampling_history_extended` view – history of wait events obtained by
sampling into in-memory ring buffer. Structure of this view can be changed
between verions

| Column name | Column type | Description |
| ------------------- | ----------- | --------------------------- |
| pid | int4 | Id of process |
| ts | timestamptz | Sample timestamp |
| event_type | text | Name of wait event type |
| event | text | Name of wait event |
| queryid | int8 | Id of query |
| role_id | int4 | Id of role |
| database_id | int4 | Id of database |
| parallel_leader_pid | int4 | Id of parallel query leader |
| backend_type | text | Name of backend type |
| backend_state | text | Name of backend state |
| proc_start | timestamptz | Timestamp of process start |
| client_addr | text | Client address |
| client_hostname | text | Client hostname |
| appname | text | Application name |

`pg_wait_sampling_profile` view – profile of wait events obtained by sampling into
in-memory hash table.

| Column name | Column type | Description |
| Column name | Column type | Description |
| ----------- | ----------- | ----------------------- |
| pid | int4 | Id of process |
| event_type | text | Name of wait event type |
| event | text | Name of wait event |
| queryid | int8 | Id of query |
| count | text | Count of samples |
| count | int8 | Count of samples |

`pg_wait_sampling_profile_extended` view – history of wait events obtained by
sampling into in-memory ring buffer. Structure of this view can be changed
between verions

| Column name | Column type | Description |
| ------------------- | ----------- | --------------------------- |
| pid | int4 | Id of process |
| event_type | text | Name of wait event type |
| event | text | Name of wait event |
| queryid | int8 | Id of query |
| role_id | int4 | Id of role |
| database_id | int4 | Id of database |
| parallel_leader_pid | int4 | Id of parallel query leader |
| backend_type | text | Name of backend type |
| backend_state | text | Name of backend state |
| proc_start | timestamptz | Timestamp of process start |
| client_addr | text | Client address |
| client_hostname | text | Client hostname |
| appname | text | Application name |
| count | int8 | Count of samples |

`pg_wait_sampling_reset_profile()` function resets the profile.

The work of wait event statistics collector worker is controlled by following
GUCs.

| Parameter name | Data type | Description | Default value |
|----------------------------------| --------- |---------------------------------------------|--------------:|
| pg_wait_sampling.history_size | int4 | Size of history in-memory ring buffer | 5000 |
| pg_wait_sampling.history_period | int4 | Period for history sampling in milliseconds | 10 |
| pg_wait_sampling.profile_period | int4 | Period for profile sampling in milliseconds | 10 |
| pg_wait_sampling.profile_pid | bool | Whether profile should be per pid | true |
| pg_wait_sampling.profile_queries | enum | Whether profile should be per query | top |
| pg_wait_sampling.sample_cpu | bool | Whether on CPU backends should be sampled | true |
| Parameter name | Data type | Description | Default value |
|-------------------------------------| --------- |---------------------------------------------|--------------:|
| pg_wait_sampling.history_size | int4 | Size of history in-memory ring buffer | 5000 |
| pg_wait_sampling.history_period | int4 | Period for history sampling in milliseconds | 10 |
| pg_wait_sampling.profile_period | int4 | Period for profile sampling in milliseconds | 10 |
| pg_wait_sampling.profile_pid | bool | Whether profile should be per pid | true |
| pg_wait_sampling.profile_queries | enum | Whether profile should be per query | top |
| pg_wait_sampling.sample_cpu | bool | Whether on CPU backends should be sampled | true |
| pg_wait_sampling.history_dimensions | text | Additional columns in extended history view | 'none' |
| pg_wait_sampling.profile_dimensions | text | Additional columns in extended profile view | 'none' |

If `pg_wait_sampling.profile_pid` is set to false, sampling profile wouldn't be
collected in per-process manner. In this case the value of pid could would
@@ -158,6 +224,19 @@ If `pg_wait_sampling.sample_cpu` is set to true then processes that are not
waiting on anything are also sampled. The wait event columns for such processes
will be NULL.

`pg_wait_sampling.history_dimenstions` and `pg_wait_sampling.profile_dimensions`
determine what additional columns will be sampled in `history/profile_extended`
views. Possible values are `none`, `all`, `role_id`, `database_id`,
`parallel_leader_pid`, `backend_type`, `backend_state`, `backend_start_time`,
`client_addr`, `client_hostname`, `appname` and any combination of column names.
`none` and `all` cannot be used together with any other values and must be used alone.

> [!WARNING]
> Turning on any of the following columns: `backend_type`, `backend_state`,
> `backend_start_time`, `client_addr`, `client_hostname`, `appname` will reduce
> performance compared to sampling none of those due to the need to look into
> BackendStatusTable. This is especially noticeable with PostgreSQL 13-16

Values of these GUC variables can be changed only in config file or with ALTER SYSTEM.
Then you need to reload server's configuration (such as with pg_reload_conf function)
for changes to take effect.
@@ -170,7 +249,7 @@ Contribution
------------

Please, notice, that `pg_wait_sampling` is still under development and while
it's stable and tested, it may contains some bugs. Don't hesitate to raise
it's stable and tested, it may contain some bugs. Don't hesitate to raise
[issues at github](https://github.com/postgrespro/pg_wait_sampling/issues) with
your bug reports.

119 changes: 106 additions & 13 deletions collector.c
Original file line number Diff line number Diff line change
@@ -10,6 +10,7 @@
#include "postgres.h"

#include <signal.h>
#include <time.h>

#include "compat.h"
#include "miscadmin.h"
@@ -30,6 +31,13 @@
#include "utils/resowner.h"
#include "utils/timestamp.h"

#define check_bestatus_dimensions(dimensions) \
(dimensions & (PGWS_DIMENSIONS_BE_TYPE |\
PGWS_DIMENSIONS_BE_STATE |\
PGWS_DIMENSIONS_BE_START_TIME |\
PGWS_DIMENSIONS_CLIENT_ADDR |\
PGWS_DIMENSIONS_CLIENT_HOSTNAME |\
PGWS_DIMENSIONS_APPNAME))
static volatile sig_atomic_t shutdown_requested = false;

static void handle_sigterm(SIGNAL_ARGS);
@@ -162,25 +170,103 @@ probe_waits(History *observations, HTAB *profile_hash,
LWLockAcquire(ProcArrayLock, LW_SHARED);
for (i = 0; i < ProcGlobal->allProcCount; i++)
{
HistoryItem item,
HistoryItem item_history,
*observation;
ProfileItem item_profile;
PGPROC *proc = &ProcGlobal->allProcs[i];
int pid;
uint32 wait_event_info;

if (!pgws_should_sample_proc(proc, &item.pid, &item.wait_event_info))
/* Check if we need to sample this process */
if (!pgws_should_sample_proc(proc, &pid, &wait_event_info))
continue;

/* We zero whole HistoryItem to avoid doing it field-by-field */
memset(&item_history, 0, sizeof(HistoryItem));
memset(&item_profile, 0, sizeof(ProfileItem));

item_history.pid = pid;
item_profile.pid = pid;

item_history.wait_event_info = wait_event_info;
item_profile.wait_event_info = wait_event_info;

if (pgws_profileQueries)
item.queryId = pgws_proc_queryids[i];
else
item.queryId = 0;
{
item_history.queryId = pgws_proc_queryids[i];
item_profile.queryId = pgws_proc_queryids[i];
}

item.ts = ts;
item_history.ts = ts;

/* Copy everything we need from PGPROC */
if (pgws_history_dimensions & PGWS_DIMENSIONS_ROLE_ID)
item_history.role_id = proc->roleId;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_ROLE_ID)
item_profile.role_id = proc->roleId;

if (pgws_history_dimensions & PGWS_DIMENSIONS_DB_ID)
item_history.database_id = proc->databaseId;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_DB_ID)
item_profile.database_id = proc->databaseId;

if (pgws_history_dimensions & PGWS_DIMENSIONS_PARALLEL_LEADER_PID)
item_history.parallel_leader_pid = (proc->lockGroupLeader ?
proc->lockGroupLeader->pid :
0);
if (pgws_profile_dimensions & PGWS_DIMENSIONS_PARALLEL_LEADER_PID)
item_profile.parallel_leader_pid = (proc->lockGroupLeader ?
proc->lockGroupLeader->pid :
0);
/* Look into BackendStatus only if necessary */
if (check_bestatus_dimensions(pgws_history_dimensions) ||
check_bestatus_dimensions(pgws_profile_dimensions))
{
#if PG_VERSION_NUM >= 170000
PgBackendStatus *bestatus = pgstat_get_beentry_by_proc_number(GetNumberFromPGProc(proc));
#else
PgBackendStatus *bestatus = get_beentry_by_procpid(proc->pid);
#endif
/* Copy everything we need from BackendStatus */
if (bestatus)
{
if (pgws_history_dimensions & PGWS_DIMENSIONS_BE_TYPE)
item_history.backend_type = bestatus->st_backendType;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_BE_TYPE)
item_profile.backend_type = bestatus->st_backendType;

if (pgws_history_dimensions & PGWS_DIMENSIONS_BE_STATE)
item_history.backend_state = bestatus->st_state;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_BE_STATE)
item_profile.backend_state = bestatus->st_state;

if (pgws_history_dimensions & PGWS_DIMENSIONS_BE_START_TIME)
item_history.proc_start = bestatus->st_proc_start_timestamp;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_BE_START_TIME)
item_profile.proc_start = bestatus->st_proc_start_timestamp;

if (pgws_history_dimensions & PGWS_DIMENSIONS_CLIENT_ADDR)
item_history.client_addr = bestatus->st_clientaddr;
if (pgws_profile_dimensions & PGWS_DIMENSIONS_CLIENT_ADDR)
item_profile.client_addr = bestatus->st_clientaddr;

if (pgws_history_dimensions & PGWS_DIMENSIONS_CLIENT_HOSTNAME)
strcpy(item_history.client_hostname, bestatus->st_clienthostname);
if (pgws_profile_dimensions & PGWS_DIMENSIONS_CLIENT_HOSTNAME)
strcpy(item_profile.client_hostname, bestatus->st_clienthostname);

if (pgws_history_dimensions & PGWS_DIMENSIONS_APPNAME)
strcpy(item_history.appname, bestatus->st_appname);
if (pgws_profile_dimensions & PGWS_DIMENSIONS_APPNAME)
strcpy(item_profile.appname, bestatus->st_appname);
}
}

/* Write to the history if needed */
if (write_history)
{
observation = get_next_observation(observations);
*observation = item;
*observation = item_history;
}

/* Write to the profile if needed */
@@ -190,16 +276,21 @@ probe_waits(History *observations, HTAB *profile_hash,
bool found;

if (!profile_pid)
item.pid = 0;
item_profile.pid = 0;

profileItem = (ProfileItem *) hash_search(profile_hash, &item, HASH_ENTER, &found);
profileItem = (ProfileItem *) hash_search(profile_hash, &item_profile, HASH_ENTER, &found);
if (found)
profileItem->count++;
else
profileItem->count = 1;
}
}
LWLockRelease(ProcArrayLock);
#if PG_VERSION_NUM >= 140000
pgstat_clear_backend_activity_snapshot();
#else
pgstat_clear_snapshot();
#endif
}

/*
@@ -287,10 +378,12 @@ make_profile_hash()
{
HASHCTL hash_ctl;

if (pgws_profileQueries)
hash_ctl.keysize = offsetof(ProfileItem, count);
else
hash_ctl.keysize = offsetof(ProfileItem, queryId);
/*
* Since adding additional dimensions we include everyting except count
* into hashtable key. This is fine for cases when some fields are 0 since
* it doesn't impede our ability to search the hash table for entries
*/
hash_ctl.keysize = offsetof(ProfileItem, count);

hash_ctl.entrysize = sizeof(ProfileItem);
return hash_create("Waits profile hash", 1024, &hash_ctl,
Loading
Oops, something went wrong.