The statistic value of temporary data size always increases #87118

@JiaQiTang98

Description

Company or project name

No response

Describe what's wrong

The statistic for temporary data size only ever increases and is never reduced:

std::atomic<size_t> compressed_size;

void TemporaryDataOnDiskScope::deltaAllocAndCheck(ssize_t compressed_delta, ssize_t uncompressed_delta)
{
    if (parent)
        parent->deltaAllocAndCheck(compressed_delta, uncompressed_delta);

    /// check that we don't go negative
    if ((compressed_delta < 0 && stat.compressed_size < static_cast<size_t>(-compressed_delta)) ||
        (uncompressed_delta < 0 && stat.uncompressed_size < static_cast<size_t>(-uncompressed_delta)))
    {
        throw Exception(ErrorCodes::LOGICAL_ERROR, "Negative temporary data size");
    }

    size_t new_consumption = stat.compressed_size + compressed_delta;
    if (compressed_delta > 0 && settings.max_size_on_disk && new_consumption > settings.max_size_on_disk)
        throw Exception(ErrorCodes::TOO_MANY_ROWS_OR_BYTES,
            "Limit for temporary files size exceeded (would consume {} / {} bytes)", new_consumption, settings.max_size_on_disk);

    stat.compressed_size += compressed_delta;
    stat.uncompressed_size += uncompressed_delta;
}

So when max_temporary_data_on_disk_size is set, the tracked temporary data size keeps growing and eventually reaches the limit, even though the actual amount of data on disk is still below max_temporary_data_on_disk_size.
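To make the reported symptom concrete, here is a minimal standalone sketch of this kind of accounting. It is not the ClickHouse implementation: `TempDataScope`, `TempFile`, `countSuccessfulQueries` and the `release_on_destroy` flag are hypothetical names made up for illustration, and only the compressed counter is modelled. It assumes, as this report suggests but does not prove, that no negative delta is applied when a temporary file goes away.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical, simplified stand-in for the accounting scope quoted above.
class TempDataScope
{
public:
    TempDataScope(TempDataScope * parent_, std::size_t max_size_on_disk_)
        : parent(parent_), max_size_on_disk(max_size_on_disk_) {}

    void deltaAllocAndCheck(std::ptrdiff_t compressed_delta)
    {
        // Propagate to the parent first, as in the quoted function.
        if (parent)
            parent->deltaAllocAndCheck(compressed_delta);

        const std::size_t current = compressed_size.load();
        if (compressed_delta < 0 && current < static_cast<std::size_t>(-compressed_delta))
            throw std::logic_error("Negative temporary data size");

        // Unsigned wrap-around makes adding a negative delta a subtraction.
        const std::size_t new_consumption = current + static_cast<std::size_t>(compressed_delta);
        if (compressed_delta > 0 && max_size_on_disk && new_consumption > max_size_on_disk)
            throw std::runtime_error("Limit for temporary files size exceeded");

        compressed_size += static_cast<std::size_t>(compressed_delta);
    }

    std::size_t size() const { return compressed_size.load(); }

private:
    TempDataScope * parent;
    std::size_t max_size_on_disk;  // 0 means "no limit"
    std::atomic<std::size_t> compressed_size{0};
};

// Hypothetical temporary file. The flag switches between "never give bytes
// back" (the assumed buggy behaviour) and "release them on destruction".
class TempFile
{
public:
    TempFile(TempDataScope & scope_, bool release_on_destroy_)
        : scope(scope_), release_on_destroy(release_on_destroy_) {}

    void write(std::size_t bytes)
    {
        scope.deltaAllocAndCheck(static_cast<std::ptrdiff_t>(bytes));
        written += bytes;
    }

    ~TempFile()
    {
        if (!release_on_destroy || written == 0)
            return;
        try { scope.deltaAllocAndCheck(-static_cast<std::ptrdiff_t>(written)); }
        catch (...) {}  // destructors must not throw
    }

private:
    TempDataScope & scope;
    bool release_on_destroy;
    std::size_t written = 0;
};

// Runs up to max_queries "queries" against one shared global scope and
// returns how many completed before the limit was hit.
std::size_t countSuccessfulQueries(bool release_on_destroy, std::size_t limit,
                                   std::size_t bytes_per_query, std::size_t max_queries)
{
    assert(bytes_per_query > 0);
    TempDataScope global(nullptr, limit);
    std::size_t ok = 0;
    for (std::size_t i = 0; i < max_queries; ++i)
    {
        TempDataScope query_scope(&global, 0);  // per-query child scope
        try
        {
            TempFile file(query_scope, release_on_destroy);
            file.write(bytes_per_query);
            ++ok;
        }
        catch (const std::runtime_error &) { break; }
    }
    return ok;
}
```

With a limit of 10240 bytes and an arbitrary 3000 bytes spilled per query, `countSuccessfulQueries(false, 10240, 3000, 10)` stops after 3 queries because the global counter keeps accumulating. `countSuccessfulQueries(true, 10240, 3000, 10)` completes all 10, since each file returns its bytes to the counter when it is destroyed.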

Does it reproduce on the most recent release?

Yes

How to reproduce

Set max_temporary_data_on_disk_size to a small value (such as 10240).
Create a table and fill it with data:

CREATE TABLE default.t_proj_external
(
    `k1` UInt32,
    `k2` UInt32,
    `k3` UInt32,
    `value` UInt32
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS index_granularity = 8192;
INSERT INTO t_proj_external SELECT 1, number%2, number%4, number FROM numbers(50000);

Run the following query several times (about 4 times):

SELECT k1, k2, k3, sum(value) v FROM t_proj_external GROUP BY k1, k2, k3 ORDER BY k1, k2, k3 SETTINGS optimize_aggregation_in_order = 0, max_bytes_before_external_group_by = 1, max_bytes_ratio_before_external_group_by = 0, group_by_two_level_threshold = 1;

Expected behavior

No response

Error message and/or stacktrace

No response

Additional context

No response

Labels

bug (Confirmed user-visible misbehaviour in official release)
