Skip to content

Commit

Permalink
Fix recovery conflict SIGUSR1 handling.
Browse files Browse the repository at this point in the history
We shouldn't be doing non-trivial work in signal handlers in general,
and in this case the handler could reach unsafe code and corrupt state.
It also clobbered its own "reason" code.

Move all recovery conflict decision logic into the next
CHECK_FOR_INTERRUPTS(), and have the signal handler just set flags and
the latch, following the standard pattern.  Since there are several
different "reasons", use a separate flag for each.

With this refactoring, the recovery conflict system no longer
piggy-backs on top of the regular query cancelation mechanism, but
instead raises an error directly if it decides that is necessary.  It
still needs to respect QueryCancelHoldoffCount, because otherwise the
FEBE protocol might get out of sync (see commit 2b3a8b2).

This fixes one class of intermittent failure in the new
031_recovery_conflict.pl test added by commit 9f8a050, though the buggy
coding is much older.  Failures outside contrived testing seem to be
very rare (or perhaps incorrectly attributed) in the field, based on
lack of reports.

No back-patch for now due to complexity and release schedule.  We have
the option to back-patch into 16 later, as 16 has prerequisite commit
bea3d7e.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Michael Paquier <michael@paquier.xyz> (earlier version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier version)
Tested-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACVr8au2J_9D88UfRCi0JdWhyQDDxAcSVav0B0irx9nXEg%40mail.gmail.com
  • Loading branch information
macdice committed Sep 7, 2023
1 parent 8c16ad3 commit 0da096d
Show file tree
Hide file tree
Showing 5 changed files with 186 additions and 170 deletions.
4 changes: 2 additions & 2 deletions src/backend/storage/buffer/bufmgr.c
Expand Up @@ -4923,8 +4923,8 @@ LockBufferForCleanup(Buffer buffer)
}

/*
* Check called from RecoveryConflictInterrupt handler when Startup
* process requests cancellation of all pin holders that are blocking it.
* Check called from ProcessRecoveryConflictInterrupts() when Startup process
* requests cancellation of all pin holders that are blocking it.
*/
bool
HoldingBufferPinThatDelaysRecovery(void)
Expand Down
14 changes: 7 additions & 7 deletions src/backend/storage/ipc/procsignal.c
Expand Up @@ -662,25 +662,25 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
HandleParallelApplyMessageInterrupt();

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_TABLESPACE))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_TABLESPACE);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_LOCK))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOCK);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOCK);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);

if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);

SetLatch(MyLatch);

Expand Down

0 comments on commit 0da096d

Please sign in to comment.