ADBDEV-4789: Fix an incorrect memory assumption for hash join statistics #676

bandetto · 2024-01-18T10:30:50Z

Fix an incorrect memory assumption for hash join statistics

When a hash join's hash table exceeds its allowed memory, it splits the hash
table into batches and stores each batch on disk. With EXPLAIN ANALYZE,
additional statistics are reported about the number of batches and the memory
they used. (With plain EXPLAIN, the query is not executed, so no batches are
created.) Information about the disk space occupied by each batch is kept in
the HashJoinTableStats structure, whose batchstats member is an array of
HashJoinBatchStats structures, one per batch.

Usually, the number of batches is calculated at the beginning of the hash join,
but sometimes the initial estimate turns out to be too low. In that case,
ExecHashIncreaseNumBatches() is called, which doubles the current number of
batches and allocates additional memory for the structures that need it.

Each structure of type HashJoinBatchStats takes 80 bytes, as opposed to 8 bytes
for a pointer. Since the batchstats array holds the structures themselves, and
not pointers to them, the size of the repalloc() call for batchstats in
ExecHashIncreaseNumBatches() can exceed MaxAllocSize and cause an error.
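To make the size difference concrete, here is a minimal sketch in C. It assumes the sizes quoted above (80 bytes per HashJoinBatchStats, 8 bytes per pointer) and MaxAllocSize = 0x3fffffff, as defined in PostgreSQL's memutils.h. The `sketch_*` names are hypothetical and are not part of nodeHash.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: MaxAllocSize as in PostgreSQL's memutils.h, element
 * sizes as quoted in the description (64-bit build). */
#define SKETCH_MAX_ALLOC_SIZE  UINT64_C(0x3fffffff)   /* 1 GB - 1 */
#define SKETCH_POINTER_SIZE    UINT64_C(8)
#define SKETCH_BATCHSTATS_SIZE UINT64_C(80)

/* Bytes needed for an array of nbatch elements of elem_size bytes each. */
static uint64_t
sketch_array_bytes(uint64_t nbatch, uint64_t elem_size)
{
    return nbatch * elem_size;
}

/*
 * Starting from initial_nbatch, keep doubling (as the batch-growing code
 * does) and return the first nbatch whose array of elem_size-byte elements
 * no longer fits into MaxAllocSize.
 */
static uint64_t
sketch_first_too_large_nbatch(uint64_t initial_nbatch, uint64_t elem_size)
{
    uint64_t nbatch = initial_nbatch;

    assert(initial_nbatch > 0);     /* doubling zero would never terminate */
    while (sketch_array_bytes(nbatch, elem_size) <= SKETCH_MAX_ALLOC_SIZE)
        nbatch *= 2;
    return nbatch;
}
```

Under these assumptions, starting from one batch, the array of 80-byte structures crosses the limit at nbatch = 16 777 216, while an array of 8-byte pointers would only do so at nbatch = 134 217 728.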

At the beginning of ExecHashIncreaseNumBatches(), there is a check that
assumes that each item in any of the reallocated arrays is no larger than 8
bytes (sizeof(void *)). oldnbatch is equal to nbatch / 2. Once the size of
HashJoinBatchStats is taken into account, this check turns out to be incorrect
for this particular allocation: it underestimates the size of the faulty
repalloc() request by a factor of 10.

The issue is that batchstats grows faster than the other arrays, which contain
pointers, so this check does not prevent the faulty reallocation of the batch
statistics. Once nbatch * sizeof(stats->batchstats[0]) exceeds MaxAllocSize,
the attempt to reallocate the batchstats member fails with an error, because
the requested allocation size is larger than MaxAllocSize.
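The mismatch can be illustrated with a small sketch. The first function is only a paraphrase of the check, modeled on the limit MaxAllocSize / (sizeof(void *) * 2) quoted in this description; the exact expression in nodeHash.c may differ. The sizes (8 and 80 bytes), the value of MaxAllocSize, and all `sketch_*` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ALLOC_SIZE  UINT64_C(0x3fffffff)   /* assumed MaxAllocSize */
#define SKETCH_POINTER_SIZE    UINT64_C(8)            /* sizeof(void *) on 64-bit */
#define SKETCH_BATCHSTATS_SIZE UINT64_C(80)           /* size quoted in the description */

/*
 * Paraphrased pointer-sized check: growth is allowed as long as oldnbatch
 * does not exceed MaxAllocSize / (sizeof(void *) * 2).
 */
static bool
sketch_pointer_check_allows_growth(uint64_t oldnbatch)
{
    return oldnbatch <= SKETCH_MAX_ALLOC_SIZE / (SKETCH_POINTER_SIZE * 2);
}

/*
 * Whether an array of nbatch 80-byte structures would be accepted by an
 * allocator that rejects requests larger than MaxAllocSize.
 */
static bool
sketch_batchstats_fits_max_alloc(uint64_t nbatch)
{
    /* keep the multiplication below from wrapping around in this sketch */
    assert(nbatch <= UINT64_MAX / SKETCH_BATCHSTATS_SIZE);
    return nbatch * SKETCH_BATCHSTATS_SIZE <= SKETCH_MAX_ALLOC_SIZE;
}
```

For oldnbatch = 8 388 608 the paraphrased check still allows growth, yet the doubled value nbatch = 16 777 216 needs 1 342 177 280 bytes for batchstats, which is above MaxAllocSize.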

Use repalloc_huge() instead of repalloc() for statistics, to allow
requesting memory up to:
MaxAllocSize / (sizeof(void *) * 2) = 67 108 863;
67 108 863 * 80 = 5 368 709 040 bytes, which is slightly less than 5GB.
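The estimate above can be reproduced with a few lines of C. This is only a worked version of the arithmetic, assuming MaxAllocSize = 0x3fffffff, 8-byte pointers, and the 80-byte structure size quoted earlier; the `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_ALLOC_SIZE  UINT64_C(0x3fffffff)   /* assumed MaxAllocSize */
#define SKETCH_POINTER_SIZE    UINT64_C(8)            /* sizeof(void *) on 64-bit */
#define SKETCH_BATCHSTATS_SIZE UINT64_C(80)           /* size quoted in the description */

/* Largest batch count admitted by the pointer-sized limit. */
static uint64_t
sketch_max_checked_nbatch(void)
{
    return SKETCH_MAX_ALLOC_SIZE / (SKETCH_POINTER_SIZE * 2);
}

/* Bytes needed for that many 80-byte statistics structures. */
static uint64_t
sketch_max_batchstats_bytes(void)
{
    return sketch_max_checked_nbatch() * SKETCH_BATCHSTATS_SIZE;
}

/* Sanity check of the claim that the request stays just under 5 GB. */
static void
sketch_verify_estimate(void)
{
    assert(sketch_max_batchstats_bytes() < UINT64_C(5) * 1024 * 1024 * 1024);
}
```

With these inputs, sketch_max_checked_nbatch() yields 67 108 863 and sketch_max_batchstats_bytes() yields 5 368 709 040, which is 80 bytes short of 5 * 2^30.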

On 32-bit machines, nbatch * sizeof(stats->batchstats[0]) could exceed
MaxAllocHugeSize, which is SIZE_MAX / 2 (about 2GB there). Add a separate check
and stop if nbatch > MaxAllocHugeSize / sizeof(HashJoinBatchStats) to avoid
integer overflow.
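A sketch of why the check divides instead of multiplying. It models a 32-bit size_t with uint32_t so that it behaves the same on any host, and assumes MaxAllocHugeSize is SIZE_MAX / 2, as stated in the later commit message in this thread. The `sketch32_*` names and the 80-byte element size are illustrative assumptions, not the code of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: MaxAllocHugeSize = SIZE_MAX / 2 with a 32-bit size_t. */
#define SKETCH32_MAX_ALLOC_HUGE_SIZE (UINT32_MAX / 2)

/*
 * Naive size computation, truncated the way a 32-bit size_t would be.
 * The result can wrap around and look deceptively small.
 */
static uint32_t
sketch32_naive_bytes(uint32_t nbatch, uint32_t elem_size)
{
    return (uint32_t) ((uint64_t) nbatch * elem_size);
}

/*
 * Overflow-safe variant: compare nbatch against the limit divided by the
 * element size, so the product is never computed.
 */
static bool
sketch32_nbatch_fits(uint32_t nbatch, uint32_t elem_size)
{
    assert(elem_size > 0);          /* avoid division by zero */
    return nbatch <= SKETCH32_MAX_ALLOC_HUGE_SIZE / elem_size;
}
```

For nbatch = 67 108 864 and 80-byte elements, the truncated product is 1 073 741 824, which would pass a naive comparison against the limit, while the division-based check correctly rejects it.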

On 64-bit machines, the maximum allowed number of batches is unchanged.

This reverts commit 4b6273b.

BenderArenadata · 2024-01-18T12:08:44Z

Allure report https://allure-ee.adsw.io/launch/62045

BenderArenadata · 2024-01-18T12:26:51Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/977661

BenderArenadata · 2024-01-18T15:05:54Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/977662

src/backend/executor/nodeHash.c

BenderArenadata · 2024-01-19T08:23:16Z

Allure report https://allure-ee.adsw.io/launch/62101

BenderArenadata · 2024-01-19T08:33:30Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/980215

BenderArenadata · 2024-01-19T10:28:26Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/980216

BenderArenadata · 2024-01-19T10:29:37Z

Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/980209

BenderArenadata · 2024-01-19T11:07:34Z

Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/980210

src/backend/executor/nodeHash.c

BenderArenadata · 2024-01-22T07:02:55Z

Allure report https://allure-ee.adsw.io/launch/62201

BenderArenadata · 2024-01-22T07:13:04Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/984620

BenderArenadata · 2024-01-22T08:08:56Z

Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/984614

BenderArenadata · 2024-01-22T08:30:30Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/984621

RekGRpth · 2024-01-22T11:10:46Z

provide last commit message

On 32-bit machines, `nbatch * sizeof(stats->batchstats[0])` could exceed MaxAllocHugeSize, which is SIZE_MAX / 2. Stop if `nbatch > MaxAllocHugeSize / sizeof(HashJoinBatchStats)` to avoid integer overflow.

BenderArenadata · 2024-01-22T12:24:26Z

Allure report https://allure-ee.adsw.io/launch/62245

BenderArenadata · 2024-01-22T12:34:18Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/986017

BenderArenadata · 2024-01-22T12:41:28Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/986018

RekGRpth · 2024-01-23T03:16:03Z

provide last commit message

I meant the commit message that will go into the resulting patch.

src/backend/executor/nodeHash.c

BenderArenadata · 2024-01-24T08:53:37Z

Allure report https://allure-ee.adsw.io/launch/62415

BenderArenadata · 2024-01-24T09:03:34Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/992699

BenderArenadata · 2024-01-24T10:50:00Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/992700

BenderArenadata · 2024-01-25T10:15:16Z

Allure report https://allure-ee.adsw.io/launch/62523

BenderArenadata · 2024-01-25T10:25:32Z

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/997908

BenderArenadata · 2024-01-25T10:32:26Z

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/997909

BenderArenadata · 2024-01-25T11:58:30Z

Failed job Behave tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/997906

(cherry picked from commit 625bcf5)

bandetto added 3 commits January 18, 2024 13:15

Allocate memory for explain batches separately

3dee9cc

Revert "Allocate memory for explain batches separately"

0d33d84

This reverts commit 4b6273b.

Use repalloc_huge instead of repalloc

1bfebbe

bandetto marked this pull request as ready for review January 18, 2024 10:36

RekGRpth reviewed Jan 19, 2024

src/backend/executor/nodeHash.c Outdated

andr-sokolov requested a review from bimboterminator1 January 19, 2024 05:47

HashJoinBatchStats -> stats->batchstats[0]

778e90f

RekGRpth reviewed Jan 22, 2024

src/backend/executor/nodeHash.c

Add an overflow check for 32 bit systems

665598a

On 32-bit machines, `nbatch * sizeof(stats->batchstats[0])` could exceed MaxAllocHugeSize, which is SIZE_MAX / 2. Stop if `nbatch > MaxAllocHugeSize / sizeof(HashJoinBatchStats)` to avoid integer overflow.

bandetto force-pushed the ADBDEV-4789 branch from 8b6d150 to 665598a Compare January 22, 2024 11:49

bimboterminator1 reviewed Jan 23, 2024

src/backend/executor/nodeHash.c

bandetto added 2 commits January 24, 2024 11:17

Fix the estimation and clarify that it's for MaxAllocSize = 1GB

0e0c0bf

Rewrap the comment to 80 columns

f8d1814

bimboterminator1 approved these changes Jan 25, 2024


RekGRpth approved these changes Jan 25, 2024


Merge branch 'adb-6.x-dev' into ADBDEV-4789

b4cb13f

andr-sokolov merged commit 625bcf5 into adb-6.x-dev Jan 26, 2024
5 checks passed

andr-sokolov deleted the ADBDEV-4789 branch January 26, 2024 05:17

Stolb27 mentioned this pull request Feb 19, 2024

Release 6.26.0_arenadata54 #821

Merged

Stolb27 mentioned this pull request Mar 13, 2024

Arenadata patchset #55 #890

Merged
