Skip to content

Merging #86935: Fix logical error when enable enable_filesystem_query_cache_limit#101428

Merged
alexey-milovidov merged 26 commits into
masterfrom
fix_fs_cache_query_limit
May 1, 2026
Merged

Merging #86935: Fix logical error when enable enable_filesystem_query_cache_limit#101428
alexey-milovidov merged 26 commits into
masterfrom
fix_fs_cache_query_limit

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Mar 31, 2026

Continuation of #86935.

After rebasing onto current master, only one of the original three changes from #86935 is still applicable:

IFileCachePriority::check now treats max_size = 0 and max_elements = 0 as "no limit", consistent with canFit (which already treats 0 as unlimited). This is a defensive guard for the per-query priority, which is constructed with max_elements = 0 (matching the 0 == unlimited convention used elsewhere).

The original PR's other two changes are no longer needed:

  • The FileCacheQueryLimit::QueryContext::remove -> tryRemove rename was needed because EvictionCandidates::finalize called query_context->remove(...) for every evicted entry. That call site was removed from master in commit 62ffdf7, so tryRemove would have no callers.
  • The integration test test_caches_with_query_limit could not reliably reproduce the bug from Use s3 with filesystem cache got Logical error: Cache limits violated. #86855: with the current tryReserve shape, query_priority_iterator->incrementSize is unreachable on the first reservation (the iterator is only assigned inside the !main_priority_iterator branch, and add does not return it), so the per-query check is no longer hit. The "Bugfix validation" CI job confirmed this — the test passed against the master binary.

Original issue: #86855

CI report (previous run with the wider scope): https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101428&sha=95a5413b6250fe9b49e79ae60f50d1babb31909b&name_0=PR

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Treat max_size = 0 and max_elements = 0 in IFileCachePriority::check as "no limit", consistent with how canFit interprets these values. This silences spurious Cache limits violated logical errors on priorities created with zero limits (such as the per-query filesystem cache priority used when enable_filesystem_query_cache_limit is on).

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 31, 2026

Workflow [PR], commit [3a6ecaf]

Summary:


AI Review

Summary

This PR updates IFileCachePriority::check so max_size = 0 and max_elements = 0 are treated as unlimited, matching existing canFit semantics and avoiding false LOGICAL_ERROR exceptions for zero-limit priorities. The change is small, coherent with current cache-limit conventions, and I did not find correctness, safety, or rollout issues in the current diff.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-ci label Mar 31, 2026
);
"""
)
node.query("insert into fs_cache_query_limit select number,randomString(4096) from system.numbers limit 1000000")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This test writes 1_000_000 rows with randomString(4096), i.e. roughly ~4 GiB of random payload before compression, which is very heavy for an integration test and can significantly increase CI runtime/flakiness.

Could we reduce the data volume while keeping the same coverage goal (triggering eviction with enable_filesystem_query_cache_limit)? For example, lower the row count and/or set a much smaller per-query write limit so we still exercise the same path with less data.

@kssenii kssenii self-assigned this Apr 1, 2026
@alexey-milovidov
Copy link
Copy Markdown
Member Author

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@@ -49,7 +49,7 @@ std::string IFileCachePriority::Entry::toString(const std::string & prefix) cons

void IFileCachePriority::check(const CacheStateGuard::Lock & lock) const
{
if (getSize(lock) > max_size || getElementsCount(lock) > max_elements)
if ((max_size != 0 && getSize(lock) > max_size) || (max_elements != 0 && getElementsCount(lock) > max_elements))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog category in the PR template is currently CI Fix or Improvement, but this PR changes runtime cache behavior (IFileCachePriority::check and query-limit eviction handling), not CI/test infrastructure.

Please switch the category to a runtime-facing one (likely Bug Fix or Improvement) so release notes classification stays accurate.

alexey-milovidov and others added 2 commits April 9, 2026 20:12
…limit

# Conflicts:
#	tests/integration/test_filesystem_cache/test.py
The test was writing 1M rows * 4096 bytes (~4 GiB) which is excessive
for a 1 MiB cache. Reduce to 1000 rows (~4 MiB) which still exceeds
the cache size and triggers eviction, while being much lighter on CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexey-milovidov
Copy link
Copy Markdown
Member Author

The flaky check failure is fixed in #102148, let's update the branch.

@@ -51,7 +51,7 @@ std::string IFileCachePriority::Entry::toString(const std::string & prefix) cons

void IFileCachePriority::check(const CacheStateGuard::Lock & lock) const
{
if (getSize(lock) > max_size || getElementsCount(lock) > max_elements)
if ((max_size != 0 && getSize(lock) > max_size) || (max_elements != 0 && getElementsCount(lock) > max_elements))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry is currently Fix #86855, which is too vague for CHANGELOG.md users.

Please replace it with a user-readable sentence describing the behavior change and scope, for example:
Fix a logical error in filesystem cache query-limit mode: max_size=0 and max_elements=0 are now treated as unlimited, and query-limit eviction no longer throws when an entry was already removed by the main cache priority path.

@clickhouse-gh clickhouse-gh Bot added pr-bugfix Pull request with bugfix, not backported by default and removed pr-ci labels Apr 24, 2026
assert int(elapsed_time) > 1
assert int(elapsed_time) < 5

def test_caches_with_query_limit(cluster):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This test exercises query-limit eviction, but it does not cover the max_size = 0 / max_elements = 0 behavior changed in IFileCachePriority::check.

Please add a case (in this test or a dedicated one) with enable_filesystem_query_cache_limit = 'true' and zero limits, and assert the query finishes without LOGICAL_ERROR. Otherwise the first fix can regress without test coverage.

The original test in this PR was reduced from `1_000_000` to `1000` rows to
keep CI fast, but the smaller volume no longer reliably triggered the bug
(`Bugfix validation (integration tests)` showed the test passing on the
unfixed master binary, which means it was not exercising the fixed code
paths).

The bug only fires on the eviction/re-add path in `FileCache::tryReserve`
(`src/Interpreters/FileCache/FileCache.cpp` around lines 1219-1246), where
`query_priority_iterator` becomes non-null because an entry is still in the
per-query priority but has been evicted from the main priority. Reaching
that path within a single query requires significantly more data than the
cache size (1 MiB), and ideally re-reads of the same segments.

Increase to `100000` rows of `randomString(4096)` (~400 MiB of incompressible
data, ~400x the cache capacity) and run the `SELECT` three times within the
test to amplify eviction/re-add pressure on the per-query priority.

Report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101428&sha=018ad05eaddbfb4e8961766a5cc955805f8be278&name_0=PR&name_1=Bugfix%20validation%20%28integration%20tests%29
throw Exception(ErrorCodes::LOGICAL_ERROR, "There is no {}:{} in query context", key, offset);
return;

record->second->remove(lock);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tryRemove can still hit an exception when the key exists in records but its stored iterator was already invalidated/removed by a prior eviction path. In that case record->second->remove(lock) calls assertValid and throws LOGICAL_ERROR (Attempt to use invalid iterator).

Please guard the removal with isValid(lock) and always erase from records, e.g.:

auto record = records.find({key, offset});
if (record == records.end())
    return;
if (record->second->isValid(lock))
    record->second->remove(lock);
records.erase(record);

…limit

# Conflicts:
#	programs/keeper-bench/Runner.cpp
#	src/Core/ServerSettings.cpp
…move`

`FileCacheQueryLimit::QueryContext::tryRemove` previously returned only
when the entry was missing from `records`. If the entry was present but
its stored iterator had already been invalidated by a prior eviction
path, `record->second->remove(lock)` would call `assertValid` and throw
`LOGICAL_ERROR ("Attempt to use invalid iterator")`.

Guard the call with `isValid(lock)` and always erase the record entry
from the map.
The previous version of the test did not set
`filesystem_cache_max_download_size`, so `FileCache::getQueryContextHolder`
returned a null holder and the per-query `QueryContext` was never
created. That meant the test passed both before and after the fix and
the bugfix-validation CI job correctly reported "Failed to reproduce
the bug".

Set `filesystem_cache_max_download_size` on each `SELECT` so that the
per-query `QueryContext` is constructed (with `max_elements = 0`) and
the per-query priority's `LRUFileCachePriority::LRUIterator::incrementSize`
calls `IFileCachePriority::check`, which is the path that used to throw
`LOGICAL_ERROR`. Setting `skip_download_if_exceeds_query_cache = 0` also
exercises the recache-on-exceed path so that entries can be re-added to
the per-query priority after eviction.

`enable_filesystem_cache_on_write_operations` is set to `0` for the
INSERT to keep cache state deterministic and to avoid filling the cache
on the write path before reads even start.

CI reports for context:
- https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101428&sha=d119180d9433be7fae7d6d164b9ca8985968dbf8&name_0=PR&name_1=Bugfix%20validation%20%28integration%20tests%29

record->second->remove(lock);
records.erase({key, offset});
if (record->second->isValid(lock))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QueryContext::tryRemove is introduced here, but I can't find any call site in the tree (rg "tryRemove\(" src/Interpreters/FileCache only matches this definition/declaration).

That means the second fix in the PR description is still effectively inactive: records keeps stale iterators for entries evicted through other paths, and a later lookup can still return an invalid iterator instead of recreating a fresh record.

Can we wire tryRemove into the eviction finalization path (for candidates removed by main/query priority), so records is kept in sync with actual queue state?

alexey-milovidov and others added 4 commits April 28, 2026 21:23
After rebasing onto current master, two of the original three changes
in #86935 are no longer applicable:

1. The `FileCacheQueryLimit::QueryContext::remove` -> `tryRemove`
   rename was needed because `EvictionCandidates::finalize` called
   `query_context->remove(...)` for every evicted entry. That call
   site was removed from master in 62ffdf7, so `tryRemove` had
   no callers and was effectively dead code. Reverting the
   `QueryLimit.cpp` / `QueryLimit.h` changes to match master.

2. The integration test `test_caches_with_query_limit` was meant to
   reproduce the bug from #86855, but with the current `tryReserve`
   shape `query_priority_iterator->incrementSize` is unreachable on
   the first reservation (the iterator is only assigned inside the
   `!main_priority_iterator` branch and `add` does not return it),
   so the per-query `check` is no longer hit. The "Bugfix validation"
   job confirms this - the test passes against the master binary.
   Removing the test since it does not validate the fix.

What remains is the `IFileCachePriority::check` change: treat
`max_size = 0` and `max_elements = 0` as "no limit" (consistent with
`canFit`, which already does so). This is a defensive guard for the
per-query priority, which is constructed with `max_elements = 0`,
matching the `0 == unlimited` convention used elsewhere.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101428&sha=95a5413b6250fe9b49e79ae60f50d1babb31909b&name_0=PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A trailing blank line between `test_finished_download_time` and
`test_concurrent_eviction` was inadvertently dropped. Restore it
so the net diff against master is just the one-line change in
`IFileCachePriority::check`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh clickhouse-gh Bot added pr-improvement Pull request with some product improvements and removed pr-bugfix Pull request with bugfix, not backported by default labels Apr 29, 2026
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 30, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.10% +0.10%
Functions 91.10% 91.10% +0.00%
Branches 76.50% 76.60% +0.10%

Changed lines: 50.00% (1/2) · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov added this pull request to the merge queue May 1, 2026
Merged via the queue into master with commit c316f0f May 1, 2026
165 checks passed
@alexey-milovidov alexey-milovidov deleted the fix_fs_cache_query_limit branch May 1, 2026 05:10
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants