Skip to content

Commit

Permalink
Truncate files during delta restore when they are larger than expected.
Browse files Browse the repository at this point in the history
Previously the behavior was to download the file from the repository when it was not exactly the same size in PGDATA. However, it may just be that the file was extended and the contents are the same up to the file size recorded in the manifest. This could also be very valuable for files that are always append only, like logs.

Change info.size to file->size in one place. Both are technically correct but file->size makes more sense.

Use the new fileName variable in a few existing places.

Also adjust some existing comments to make them clearer.
  • Loading branch information
dwsteele committed May 24, 2022
1 parent c98baab commit 7ec51e7
Show file tree
Hide file tree
Showing 3 changed files with 60 additions and 8 deletions.
13 changes: 13 additions & 0 deletions doc/xml/release.xml
Expand Up @@ -16,6 +16,19 @@
<release-list>
<release date="XXXX-XX-XX" version="2.40dev" title="UNDER DEVELOPMENT">
<release-core-list>
<release-improvement-list>
<release-item>
<github-pull-request id="1758"/>

<release-item-contributor-list>
<release-item-contributor id="david.steele"/>
<release-item-reviewer id="stephen.frost"/>
</release-item-contributor-list>

<p>Truncate files during delta <cmd>restore</cmd> when they are larger than expected.</p>
</release-item>
</release-improvement-list>

<release-development-list>
<release-item>
<github-pull-request id="1749"/>
Expand Down
47 changes: 39 additions & 8 deletions src/command/restore/file.c
Expand Up @@ -72,34 +72,65 @@ List *restoreFile(
// Else use size and checksum
else
{
// Only continue delta if the file size is as expected
if (info.size == file->size)
// Only continue delta if the file size is as expected or larger
if (info.size >= file->size)
{
const char *const fileName = strZ(storagePathP(storagePg(), file->name));

// If the file was extended since the backup, then truncate it to the size it was during the backup as
// it might have only been appended to with the earlier portion being unchanged (we will verify this
// using the checksum below)
if (info.size > file->size)
{
// Open the file for write
int fd = open(fileName, O_WRONLY, 0);
THROW_ON_SYS_ERROR_FMT(fd == -1, FileReadError, STORAGE_ERROR_WRITE_OPEN, fileName);

TRY_BEGIN()
{
// Truncate to original size
THROW_ON_SYS_ERROR_FMT(
ftruncate(fd, (off_t)file->size) == -1, FileWriteError, "unable to truncate file '%s'",
fileName);

// Sync
THROW_ON_SYS_ERROR_FMT(fsync(fd) == -1, FileSyncError, STORAGE_ERROR_WRITE_SYNC, fileName);
}
FINALLY()
{
THROW_ON_SYS_ERROR_FMT(close(fd) == -1, FileCloseError, STORAGE_ERROR_WRITE_CLOSE, fileName);
}
TRY_END();

// Update info
info = storageInfoP(storagePg(), file->name, .followLink = true);
}

// Generate checksum for the file if size is not zero
IoRead *read = NULL;

if (info.size != 0)
if (file->size != 0)
{
read = storageReadIo(storageNewReadP(storagePgWrite(), file->name));
ioFilterGroupAdd(ioReadFilterGroup(read), cryptoHashNew(HASH_TYPE_SHA1_STR));
ioReadDrain(read);
}

// If the checksum is also equal (or file is zero size) then no need to copy the file
// If the checksum is the same (or file is zero size) then no need to copy the file
if (file->size == 0 ||
strEq(
file->checksum,
pckReadStrP(ioFilterGroupResultP(ioReadFilterGroup(read), CRYPTO_HASH_FILTER_TYPE))))
{
// Even if hash/size are the same set the time back to backup time. This helps with unit testing,
// but also presents a pristine version of the database after restore.
// If the hash/size are now the same but the time is not, then set the time back to the backup time.
// This helps with unit testing, but also presents a pristine version of the database after restore.
if (info.timeModified != file->timeModified)
{
THROW_ON_SYS_ERROR_FMT(
utime(
strZ(storagePathP(storagePg(), file->name)),
fileName,
&((struct utimbuf){.actime = file->timeModified, .modtime = file->timeModified})) == -1,
FileInfoError, "unable to set time for '%s'", strZ(storagePathP(storagePg(), file->name)));
FileInfoError, "unable to set time for '%s'", fileName);
}

fileResult->result = restoreResultPreserve;
Expand Down
8 changes: 8 additions & 0 deletions test/src/module/command/restoreTest.c
Expand Up @@ -2921,6 +2921,14 @@ testRun(void)
// Change timestamp so it will be updated
HRN_STORAGE_TIME(storagePgWrite(), "global/999", 777);

// Enlarge a file so it gets truncated. Keep timestamp the same to prove that it gets updated after the truncate.
HRN_STORAGE_PUT_Z(
storagePgWrite(), "base/16384/" PG_FILE_PGVERSION, PG_VERSION_94_STR "\n\n", .modeFile = 0600,
.timeModified = 1482182860);

// Enlarge a zero-length file so it gets truncated
HRN_STORAGE_PUT_Z(storagePgWrite(), "global/888", "X", .modeFile = 0600);

// Change size so delta will skip based on size
HRN_STORAGE_PUT_Z(storagePgWrite(), "base/1/2", BOGUS_STR);
HRN_STORAGE_MODE(storagePgWrite(), "base/1/2", 0600);
Expand Down

0 comments on commit 7ec51e7

Please sign in to comment.