Change custom wait events to use dynamic shared hash tables
Currently, the names of custom wait events must be registered in each
backend, requiring all of them to link to the shared memory area of
an extension, even if they are not loaded with
shared_preload_libraries.

This patch relaxes the constraints related to this infrastructure by
storing the wait events and their names in two dynamic hash tables in
shared memory.  This simplifies the registration of custom wait events
to a single routine call that returns an event ID ready for
consumption:
    uint32 WaitEventExtensionNew(const char *wait_event_name);

The caller of this routine can then cache the returned ID locally, to be
used for pgstat_report_wait_start(), WaitLatch() or a similar routine.
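
The caching pattern described above might look like the following minimal
sketch.  It is standalone for illustration only: WaitEventExtensionNew() here
is a stub standing in for the real server routine, and the event name
"MyExtensionMainLoop", the helper my_extension_wait_event(), the counter
stub_calls and the value returned by the stub are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how often the stub below is invoked (illustration only). */
static int stub_calls = 0;

/*
 * Stand-in for the server's WaitEventExtensionNew().  The real routine looks
 * up or allocates the event in shared memory; this stub only returns an
 * arbitrary nonzero placeholder.
 */
static uint32_t
WaitEventExtensionNew(const char *wait_event_name)
{
	(void) wait_event_name;
	stub_calls++;
	return 0x07000001U;
}

/* Hypothetical helper: fetch the ID once, then reuse the cached copy. */
static uint32_t
my_extension_wait_event(void)
{
	static uint32_t cached_info = 0;	/* 0 means "not looked up yet" */

	if (cached_info == 0)
		cached_info = WaitEventExtensionNew("MyExtensionMainLoop");
	return cached_info;
}
```

In a real extension, the value returned by such a helper would be what gets
passed to pgstat_report_wait_start() or WaitLatch(); only the first call in a
given process reaches the shared lookup.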

The implementation uses two hash tables: one with a key based on the
event name to avoid duplicates and a second using the event ID as key
for event lookups, such as those done for pg_stat_activity.  These tables
are initially sized for 16 entries and can hold at most 128, which should
be plenty.
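
As a rough model of the two lookup directions described above (by name to
avoid duplicates, by ID to recover the name), here is a standalone sketch.
It deliberately swaps the shared hash tables for a plain array with linear
search and does no locking, so it only illustrates the semantics.  The
constants NAME_LEN, FIRST_USER_ID and CLASS_EXTENSION are assumed values, and
event_new(), find_id_by_name() and find_name_by_id() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NAME_LEN        64			/* assumed value of NAMEDATALEN */
#define MAX_EVENTS      128			/* upper bound from the commit message */
#define FIRST_USER_ID   1			/* assumed first user-defined event ID */
#define CLASS_EXTENSION 0x07000000U	/* assumed value of PG_WAIT_EXTENSION */

/* One array serves both lookups; the index is the event ID. */
static char event_names[MAX_EVENTS][NAME_LEN];
static uint16_t next_id = FIRST_USER_ID;

/* By-name lookup: returns the event ID, or 0 if the name is unknown. */
static uint16_t
find_id_by_name(const char *name)
{
	for (uint16_t id = FIRST_USER_ID; id < next_id; id++)
		if (strcmp(event_names[id], name) == 0)
			return id;
	return 0;
}

/* By-ID lookup: returns the name, or NULL if the ID was never allocated. */
static const char *
find_name_by_id(uint16_t id)
{
	if (id < FIRST_USER_ID || id >= next_id)
		return NULL;
	return event_names[id];
}

/* Register a name or return the existing info; 0 signals an error. */
static uint32_t
event_new(const char *name)
{
	uint16_t	id;

	if (strlen(name) >= NAME_LEN)
		return 0;				/* name too long */

	id = find_id_by_name(name);
	if (id == 0)
	{
		if (next_id >= MAX_EVENTS)
			return 0;			/* registry is full */
		id = next_id++;
		strcpy(event_names[id], name);	/* length was checked above */
	}
	return CLASS_EXTENSION | id;
}
```

Calling event_new() twice with the same name returns the same value, which is
the property that lets several processes register the same event without
coordinating beforehand.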

The code changes done in worker_spi show how things are simplified (most
of the code removed in this commit comes from there):
- worker_spi_init() is gone.
- No more shared memory hooks required (size requested and
initialization).
- The custom wait event ID is cached in the process that needs to set
it, with one single call to WaitEventExtensionNew() to retrieve it.

Per suggestion from Andres Freund.

Author: Masahiro Ikeda, with a few tweaks from me.
Discussion: https://postgr.es/m/20230801032349.aaiuvhtrcvvcwzcx@awork3.anarazel.de
michaelpq committed Aug 14, 2023
1 parent 2a8b40e commit af720b4
Showing 10 changed files with 165 additions and 238 deletions.
5 changes: 2 additions & 3 deletions doc/src/sgml/monitoring.sgml
@@ -1121,9 +1121,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<literal>LWLock</literal> types
to the list shown in <xref linkend="wait-event-extension-table"/> and
<xref linkend="wait-event-lwlock-table"/>. In some cases, the name
assigned by an extension will not be available in all server processes;
so an <literal>Extension</literal> or <literal>LWLock</literal> wait
event might be reported as just
of <literal>LWLock</literal> assigned by an extension will not be
available in all server processes; it might be reported as just
<quote><literal>extension</literal></quote> rather than the
extension-assigned name.
</para>
26 changes: 4 additions & 22 deletions doc/src/sgml/xfunc.sgml
@@ -3454,33 +3454,15 @@ if (!ptr)
</sect2>

<sect2 id="xfunc-addin-wait-events">
<title>Shared Memory and Custom Wait Events</title>
<title>Custom Wait Events</title>

<para>
Add-ins can define custom wait events under the wait event type
<literal>Extension</literal>. The add-in's shared library must be
preloaded by specifying it in <literal>shared_preload_libraries</literal>,
and register a <literal>shmem_request_hook</literal> and a
<literal>shmem_startup_hook</literal> in its
<function>_PG_init</function> function.
<literal>shmem_request_hook</literal> can request a shared memory size
to be later used at startup by calling:
<literal>Extension</literal> by calling:
<programlisting>
void RequestAddinShmemSpace(int size)
</programlisting>
</para>
<para>
<literal>shmem_startup_hook</literal> can allocate in shared memory
custom wait events by calling while holding the LWLock
<function>AddinShmemInitLock</function> to avoid any race conditions:
<programlisting>
uint32 WaitEventExtensionNew(void)
</programlisting>
Next, each process needs to associate the wait event allocated previously
to a user-facing custom string, which is something done by calling:
<programlisting>
void WaitEventExtensionRegisterName(uint32 wait_event_info, const char *wait_event_name)
uint32 WaitEventExtensionNew(const char *wait_event_name)
</programlisting>
The wait event is associated to a user-facing custom string.
An example can be found in <filename>src/test/modules/worker_spi</filename>
in the PostgreSQL source tree.
</para>
1 change: 1 addition & 0 deletions src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
WaitEventExtensionLock 48
218 changes: 137 additions & 81 deletions src/backend/utils/activity/wait_event.c
@@ -45,6 +45,41 @@ uint32 *my_wait_event_info = &local_my_wait_event_info;
#define WAIT_EVENT_CLASS_MASK 0xFF000000
#define WAIT_EVENT_ID_MASK 0x0000FFFF

/*
* Hash tables for storing custom wait event ids and their names in
* shared memory.
*
* WaitEventExtensionHashById is used to find the name from an event ID.
* Any backend can search it to find custom wait events.
*
* WaitEventExtensionHashByName is used to find the event ID from a name.
* It is used to ensure that no duplicated entries are registered.
*
* The size of the hash table is based on the assumption that
* WAIT_EVENT_EXTENSION_HASH_INIT_SIZE is enough for most cases, and it seems
* unlikely that the number of entries will reach
* WAIT_EVENT_EXTENSION_HASH_MAX_SIZE.
*/
static HTAB *WaitEventExtensionHashById; /* find names from IDs */
static HTAB *WaitEventExtensionHashByName; /* find IDs from names */

#define WAIT_EVENT_EXTENSION_HASH_INIT_SIZE 16
#define WAIT_EVENT_EXTENSION_HASH_MAX_SIZE 128

/* hash table entries */
typedef struct WaitEventExtensionEntryById
{
uint16 event_id; /* hash key */
char wait_event_name[NAMEDATALEN]; /* custom wait event name */
} WaitEventExtensionEntryById;

typedef struct WaitEventExtensionEntryByName
{
char wait_event_name[NAMEDATALEN]; /* hash key */
uint16 event_id; /* wait event ID */
} WaitEventExtensionEntryByName;


/* dynamic allocation counter for custom wait events in extensions */
typedef struct WaitEventExtensionCounterData
{
@@ -59,58 +94,118 @@ static WaitEventExtensionCounterData *WaitEventExtensionCounter;
#define NUM_BUILTIN_WAIT_EVENT_EXTENSION \
(WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)

/*
* This is indexed by event ID minus NUM_BUILTIN_WAIT_EVENT_EXTENSION, and
* stores the names of all dynamically-created event IDs known to the current
* process. Any unused entries in the array will contain NULL.
*/
static const char **WaitEventExtensionNames = NULL;
static int WaitEventExtensionNamesAllocated = 0;
/* wait event info for extensions */
#define WAIT_EVENT_EXTENSION_INFO(eventId) (PG_WAIT_EXTENSION | eventId)

static const char *GetWaitEventExtensionIdentifier(uint16 eventId);

/*
* Return the space for dynamic allocation counter.
* Return the space for dynamic shared hash tables and dynamic allocation counter.
*/
Size
WaitEventExtensionShmemSize(void)
{
return sizeof(WaitEventExtensionCounterData);
Size sz;

sz = MAXALIGN(sizeof(WaitEventExtensionCounterData));
sz = add_size(sz, hash_estimate_size(WAIT_EVENT_EXTENSION_HASH_MAX_SIZE,
sizeof(WaitEventExtensionEntryById)));
sz = add_size(sz, hash_estimate_size(WAIT_EVENT_EXTENSION_HASH_MAX_SIZE,
sizeof(WaitEventExtensionEntryByName)));
return sz;
}

/*
* Allocate shmem space for dynamic allocation counter.
* Allocate shmem space for dynamic shared hash and dynamic allocation counter.
*/
void
WaitEventExtensionShmemInit(void)
{
bool found;
HASHCTL info;

WaitEventExtensionCounter = (WaitEventExtensionCounterData *)
ShmemInitStruct("WaitEventExtensionCounterData",
WaitEventExtensionShmemSize(), &found);
sizeof(WaitEventExtensionCounterData), &found);

if (!found)
{
/* initialize the allocation counter and its spinlock. */
WaitEventExtensionCounter->nextId = NUM_BUILTIN_WAIT_EVENT_EXTENSION;
SpinLockInit(&WaitEventExtensionCounter->mutex);
}

/* initialize or attach the hash tables to store custom wait events */
info.keysize = sizeof(uint16);
info.entrysize = sizeof(WaitEventExtensionEntryById);
WaitEventExtensionHashById = ShmemInitHash("WaitEventExtension hash by id",
WAIT_EVENT_EXTENSION_HASH_INIT_SIZE,
WAIT_EVENT_EXTENSION_HASH_MAX_SIZE,
&info,
HASH_ELEM | HASH_BLOBS);

/* key is a NULL-terminated string */
info.keysize = sizeof(char[NAMEDATALEN]);
info.entrysize = sizeof(WaitEventExtensionEntryByName);
WaitEventExtensionHashByName = ShmemInitHash("WaitEventExtension hash by name",
WAIT_EVENT_EXTENSION_HASH_INIT_SIZE,
WAIT_EVENT_EXTENSION_HASH_MAX_SIZE,
&info,
HASH_ELEM | HASH_STRINGS);
}

/*
* Allocate a new event ID and return the wait event.
* Allocate a new event ID and return the wait event info.
*
* If the wait event name is already defined, this does not allocate a new
* entry; it returns the wait event information associated to the name.
*/
uint32
WaitEventExtensionNew(void)
WaitEventExtensionNew(const char *wait_event_name)
{
uint16 eventId;
bool found;
WaitEventExtensionEntryByName *entry_by_name;
WaitEventExtensionEntryById *entry_by_id;

/* Check the limit of the length of the event name */
if (strlen(wait_event_name) >= NAMEDATALEN)
elog(ERROR,
"cannot use custom wait event string longer than %u characters",
NAMEDATALEN - 1);

/*
* Check if the wait event info associated to the name is already defined,
* and return it if so.
*/
LWLockAcquire(WaitEventExtensionLock, LW_SHARED);
entry_by_name = (WaitEventExtensionEntryByName *)
hash_search(WaitEventExtensionHashByName, wait_event_name,
HASH_FIND, &found);
LWLockRelease(WaitEventExtensionLock);
if (found)
return WAIT_EVENT_EXTENSION_INFO(entry_by_name->event_id);

Assert(LWLockHeldByMeInMode(AddinShmemInitLock, LW_EXCLUSIVE));
/*
* Allocate and register a new wait event. Recheck if the event name
* exists, as it could be possible that a concurrent process has inserted
* one with the same name since the LWLock acquired again here was
* previously released.
*/
LWLockAcquire(WaitEventExtensionLock, LW_EXCLUSIVE);
entry_by_name = (WaitEventExtensionEntryByName *)
hash_search(WaitEventExtensionHashByName, wait_event_name,
HASH_FIND, &found);
if (found)
{
LWLockRelease(WaitEventExtensionLock);
return WAIT_EVENT_EXTENSION_INFO(entry_by_name->event_id);
}

/* Allocate a new event Id */
SpinLockAcquire(&WaitEventExtensionCounter->mutex);

if (WaitEventExtensionCounter->nextId > PG_UINT16_MAX)
if (WaitEventExtensionCounter->nextId >= WAIT_EVENT_EXTENSION_HASH_MAX_SIZE)
{
SpinLockRelease(&WaitEventExtensionCounter->mutex);
ereport(ERROR,
@@ -122,64 +217,23 @@ WaitEventExtensionNew(void)

SpinLockRelease(&WaitEventExtensionCounter->mutex);

return PG_WAIT_EXTENSION | eventId;
}

/*
* Register a dynamic wait event name for extension in the lookup table
* of the current process.
*
* This routine will save a pointer to the wait event name passed as an argument,
* so the name should be allocated in a backend-lifetime context
* (shared memory, TopMemoryContext, static constant, or similar).
*
* The "wait_event_name" will be user-visible as a wait event name, so try to
* use a name that fits the style for those.
*/
void
WaitEventExtensionRegisterName(uint32 wait_event_info,
const char *wait_event_name)
{
uint32 classId;
uint16 eventId;

classId = wait_event_info & WAIT_EVENT_CLASS_MASK;
eventId = wait_event_info & WAIT_EVENT_ID_MASK;

/* Check the wait event class. */
if (classId != PG_WAIT_EXTENSION)
ereport(ERROR,
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid wait event class %u", classId));

/* This should only be called for user-defined wait event. */
if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION)
ereport(ERROR,
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid wait event ID %u", eventId));
/* Register the new wait event */
entry_by_id = (WaitEventExtensionEntryById *)
hash_search(WaitEventExtensionHashById, &eventId,
HASH_ENTER, &found);
Assert(!found);
strlcpy(entry_by_id->wait_event_name, wait_event_name,
sizeof(entry_by_id->wait_event_name));

/* Convert to array index. */
eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
entry_by_name = (WaitEventExtensionEntryByName *)
hash_search(WaitEventExtensionHashByName, wait_event_name,
HASH_ENTER, &found);
Assert(!found);
entry_by_name->event_id = eventId;

/* If necessary, create or enlarge array. */
if (eventId >= WaitEventExtensionNamesAllocated)
{
uint32 newalloc;

newalloc = pg_nextpower2_32(Max(8, eventId + 1));

if (WaitEventExtensionNames == NULL)
WaitEventExtensionNames = (const char **)
MemoryContextAllocZero(TopMemoryContext,
newalloc * sizeof(char *));
else
WaitEventExtensionNames =
repalloc0_array(WaitEventExtensionNames, const char *,
WaitEventExtensionNamesAllocated, newalloc);
WaitEventExtensionNamesAllocated = newalloc;
}
LWLockRelease(WaitEventExtensionLock);

WaitEventExtensionNames[eventId] = wait_event_name;
return WAIT_EVENT_EXTENSION_INFO(eventId);
}

/*
@@ -188,23 +242,25 @@ WaitEventExtensionRegisterName(uint32 wait_event_info,
static const char *
GetWaitEventExtensionIdentifier(uint16 eventId)
{
bool found;
WaitEventExtensionEntryById *entry;

/* Built-in event? */
if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION)
return "Extension";

/*
* It is a user-defined wait event, so look at WaitEventExtensionNames[].
* However, it is possible that the name has never been registered by
* calling WaitEventExtensionRegisterName() in the current process, in
* which case give up and return "extension".
*/
eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
/* It is a user-defined wait event, so look it up in the hash table. */
LWLockAcquire(WaitEventExtensionLock, LW_SHARED);
entry = (WaitEventExtensionEntryById *)
hash_search(WaitEventExtensionHashById, &eventId,
HASH_FIND, &found);
LWLockRelease(WaitEventExtensionLock);

if (eventId >= WaitEventExtensionNamesAllocated ||
WaitEventExtensionNames[eventId] == NULL)
return "extension";
if (!entry)
elog(ERROR, "could not find custom wait event name for ID %u",
eventId);

return WaitEventExtensionNames[eventId];
return entry->wait_event_name;
}


1 change: 1 addition & 0 deletions src/backend/utils/activity/wait_event_names.txt
@@ -317,6 +317,7 @@ WAIT_EVENT_DOCONLY LogicalRepWorker "Waiting to read or update the state of logi
WAIT_EVENT_DOCONLY XactTruncation "Waiting to execute <function>pg_xact_status</function> or update the oldest transaction ID available to it."
WAIT_EVENT_DOCONLY WrapLimitsVacuum "Waiting to update limits on transaction id and multixact consumption."
WAIT_EVENT_DOCONLY NotifyQueueTail "Waiting to update limit on <command>NOTIFY</command> message storage."
WAIT_EVENT_DOCONLY WaitEventExtension "Waiting to read or update custom wait events information for extensions."

WAIT_EVENT_DOCONLY XactBuffer "Waiting for I/O on a transaction status SLRU buffer."
WAIT_EVENT_DOCONLY CommitTsBuffer "Waiting for I/O on a commit timestamp SLRU buffer."
18 changes: 9 additions & 9 deletions src/include/utils/wait_event.h
@@ -44,12 +44,14 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
* Use this category when the server process is waiting for some condition
* defined by an extension module.
*
* Extensions can define their own wait events in this category. First,
* they should call WaitEventExtensionNew() to get one or more wait event
* IDs that are allocated from a shared counter. These can be used directly
* with pgstat_report_wait_start() or equivalent. Next, each individual
* process should call WaitEventExtensionRegisterName() to associate a wait
* event string to the number allocated previously.
* Extensions can define their own wait events in this category. They should
* call WaitEventExtensionNew() with a wait event string. If the wait event
* associated to a string is already allocated, it returns the wait event
* information to use. If not, it gets one wait event ID allocated from
* a shared counter, associates the string to the ID in the shared dynamic
* hash and returns the wait event information.
*
* The ID retrieved can be used with pgstat_report_wait_start() or equivalent.
*/
typedef enum
{
@@ -60,9 +62,7 @@ typedef enum
extern void WaitEventExtensionShmemInit(void);
extern Size WaitEventExtensionShmemSize(void);

extern uint32 WaitEventExtensionNew(void);
extern void WaitEventExtensionRegisterName(uint32 wait_event_info,
const char *wait_event_name);
extern uint32 WaitEventExtensionNew(const char *wait_event_name);

/* ----------
* pgstat_report_wait_start() -
