Fix failure at promotion with 2PC transactions and archiving enabled
When archiving is enabled, a promotion request would fail with the
following error when some 2PC transaction needs to be recovered from
WAL, preventing the promotion from completing:
FATAL:  requested WAL segment pg_wal/000000010000000000000001 has already been removed

The origin of the problem is that the last partial segment of the old
timeline is renamed before recovering the 2PC data via
RecoverPreparedTransactions() at the end of recovery, causing the FATAL
because the segment wanted is now renamed with a .partial suffix.  This
commit slightly reorders the end-of-recovery actions so that the
execution of recovery_end_command, the cleanup of the old segments of
the old timeline (RemoveNonParentXlogFiles) and the last partial segment
rename are done after the 2PC transaction data is recovered with
RecoverPreparedTransactions().  This makes the order of these
end-of-recovery actions more consistent with v15 and newer versions,
with the exception of the end-of-recovery checkpoint that still needs to
happen before all the actions reordered here in v13 and v14, contrary to
what v15 and newer versions do.
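
The ordering hazard described above can be reproduced outside of
PostgreSQL with a small sketch.  This is not code from the commit: the
function names simulate_end_of_recovery() and read_segment() are
hypothetical, the scratch directory under /tmp is an assumption, and an
ordinary file stands in for the last WAL segment of the old timeline.
It only illustrates that reading a file by its original name fails once
the file has been renamed with a .partial suffix, and succeeds when the
read happens first.

```c
#define _XOPEN_SOURCE 700		/* for mkdtemp() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for reading 2PC data back from a WAL segment. */
static int
read_segment(const char *path)
{
	FILE	   *f = fopen(path, "r");

	if (f == NULL)
		return -1;				/* segment not found under this name */
	fclose(f);
	return 0;
}

/*
 * Simulate the end of recovery in a scratch directory (illustration only).
 * rename_first != 0 mimics the old ordering: rename, then read.
 * rename_first == 0 mimics the reordered sequence: read, then rename.
 * Returns 0 if the read succeeded, -1 if it failed, -2 on setup failure.
 */
int
simulate_end_of_recovery(int rename_first)
{
	char		dir[] = "/tmp/walsimXXXXXX";
	char		seg[256];
	char		partial[300];
	FILE	   *f;
	int			n;
	int			rc;

	if (mkdtemp(dir) == NULL)
		return -2;

	n = snprintf(seg, sizeof(seg), "%s/000000010000000000000001", dir);
	assert(n > 0 && (size_t) n < sizeof(seg));
	snprintf(partial, sizeof(partial), "%s.partial", seg);

	/* Create the fake segment. */
	f = fopen(seg, "w");
	if (f == NULL)
	{
		rmdir(dir);
		return -2;
	}
	fputs("2PC data", f);
	fclose(f);

	if (rename_first)
	{
		/* Old ordering: the segment is gone by the time it is read. */
		if (rename(seg, partial) != 0)
			rc = -2;
		else
			rc = read_segment(seg);
	}
	else
	{
		/* Reordered: read the 2PC data first, rename afterwards. */
		rc = read_segment(seg);
		if (rename(seg, partial) != 0)
			rc = -2;
	}

	/* Clean up whichever name the file ended up with. */
	remove(partial);
	remove(seg);
	rmdir(dir);
	return rc;
}
```

Calling simulate_end_of_recovery(1) models the pre-fix sequence and
simulate_end_of_recovery(0) models the sequence after this commit; a
caller can compare the two return values to see the difference.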

v15 and newer versions have "fixed" this problem somewhat accidentally
with 811051c, where the end-of-recovery actions got reordered.  In this
case, the recovery of 2PC transactions happens before the renaming of
the last partial segment of the old timeline.

v13 and v14 are the versions that can easily see this problem because of
the refactoring of 38a9573 where XLogReaderState is reset in
XLogBeginRead() before reading the 2PC transaction data.  v11 and v12
could also see this problem, but may end up reading the 2PC data from
some of the WAL buffers instead.  Perhaps something could be done for
these two branches, but I am not really excited about doing something on
these per the lack of complaints and per the fact that v11 is going to
be EOL'd soon (there is always a risk of breaking something).

Note that the TAP test 009_twophase.pl is able to exhibit the issue if
it enables archiving on the primary node, which does not impact the test
coverage as restore_command would remain unused.  This is something that
should be changed on v15 and HEAD as well, so this will be changed in a
separate commit for clarity.

Author: Julian Markwort
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/743b9b45a2d4013bd90b6a5cba8d6faeb717ee34.camel@cybertec.at
Backpatch-through: 13
michaelpq committed Jun 20, 2023
1 parent 73f1c17 commit f663b00
Showing 1 changed file with 51 additions and 51 deletions.
102 changes: 51 additions & 51 deletions src/backend/access/transam/xlog.c
@@ -8016,6 +8016,57 @@ StartupXLOG(void)
CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
}

+    /*
+     * Preallocate additional log files, if wanted.
+     */
+    PreallocXlogFiles(EndOfLog);
+
+    /*
+     * Okay, we're officially UP.
+     */
+    InRecovery = false;
+
+    /* start the archive_timeout timer and LSN running */
+    XLogCtl->lastSegSwitchTime = (pg_time_t) time(NULL);
+    XLogCtl->lastSegSwitchLSN = EndOfLog;
+
+    /* also initialize latestCompletedXid, to nextXid - 1 */
+    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+    ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+    FullTransactionIdRetreat(&ShmemVariableCache->latestCompletedXid);
+    LWLockRelease(ProcArrayLock);
+
+    /*
+     * Start up subtrans, if not already done for hot standby. (commit
+     * timestamps are started below, if necessary.)
+     */
+    if (standbyState == STANDBY_DISABLED)
+        StartupSUBTRANS(oldestActiveXID);
+
+    /*
+     * Perform end of recovery actions for any SLRUs that need it.
+     */
+    TrimCLOG();
+    TrimMultiXact();
+
+    /* Reload shared-memory state for prepared transactions */
+    RecoverPreparedTransactions();
+
+    /* Shut down xlogreader */
+    if (readFile >= 0)
+    {
+        close(readFile);
+        readFile = -1;
+    }
+    XLogReaderFree(xlogreader);
+
+    /*
+     * If any of the critical GUCs have changed, log them before we allow
+     * backends to write WAL.
+     */
+    LocalSetXLogInsertAllowed();
+    XLogReportParameters();
+
if (ArchiveRecoveryRequested)
{
/*
@@ -8097,57 +8148,6 @@ StartupXLOG(void)
}
}

-    /*
-     * Preallocate additional log files, if wanted.
-     */
-    PreallocXlogFiles(EndOfLog);
-
-    /*
-     * Okay, we're officially UP.
-     */
-    InRecovery = false;
-
-    /* start the archive_timeout timer and LSN running */
-    XLogCtl->lastSegSwitchTime = (pg_time_t) time(NULL);
-    XLogCtl->lastSegSwitchLSN = EndOfLog;
-
-    /* also initialize latestCompletedXid, to nextXid - 1 */
-    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
-    ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
-    FullTransactionIdRetreat(&ShmemVariableCache->latestCompletedXid);
-    LWLockRelease(ProcArrayLock);
-
-    /*
-     * Start up subtrans, if not already done for hot standby. (commit
-     * timestamps are started below, if necessary.)
-     */
-    if (standbyState == STANDBY_DISABLED)
-        StartupSUBTRANS(oldestActiveXID);
-
-    /*
-     * Perform end of recovery actions for any SLRUs that need it.
-     */
-    TrimCLOG();
-    TrimMultiXact();
-
-    /* Reload shared-memory state for prepared transactions */
-    RecoverPreparedTransactions();
-
-    /* Shut down xlogreader */
-    if (readFile >= 0)
-    {
-        close(readFile);
-        readFile = -1;
-    }
-    XLogReaderFree(xlogreader);
-
-    /*
-     * If any of the critical GUCs have changed, log them before we allow
-     * backends to write WAL.
-     */
-    LocalSetXLogInsertAllowed();
-    XLogReportParameters();
-
/*
* Local WAL inserts enabled, so it's time to finish initialization of
* commit timestamp.