Fix starrocks and doris cheating at cold runs. by alexey-milovidov · Pull Request #845 · ClickHouse/ClickBench

alexey-milovidov · 2026-05-04T01:54:31Z

No description provided.

Both StarRocks and Doris run a long-lived BE daemon with a process-internal `storage_page_cache` (default ~20% of RAM) that holds decoded column data across queries. The benchmark's `run.sh` only does `echo 3 > /proc/sys/vm/drop_caches`, which clears the OS page cache but does NOT touch the BE's in-process memory.

As a result, the "cold run" (first of three tries) is served from the BE's warm in-memory cache and underreports cold-run latency - a clear violation of benchmark rules (README "Caching" section: cold runs require all database caches to be cleared, not only the OS page cache). This is effectively cheating: every system with internal in-memory caching that does not clear it before the first run gets an unearned advantage on the cold-run leaderboard.

Both systems' existing results are already tagged `lukewarm-cold-run`, but they are still displayed under the cold metric on the website.

Fix: disable the relevant in-process caches in `be.conf` before starting the BE, so that all reads must go through the OS page cache (which `run.sh` does clear).

starrocks/benchmark.sh:
    disable_storage_page_cache = true
    datacache_enable = false   # covers unified Data Cache in v3.3+

doris/benchmark.sh:
    disable_storage_page_cache = true
    segment_cache_capacity = 0

Existing results still carry the stale `lukewarm-cold-run` tag and need to be re-collected on AWS hardware to reflect the corrected configuration.

DuckDB does not have this problem: its `run.sh` launches a fresh `duckdb` CLI process per query, so the buffer pool is empty at the start of each cold run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
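For illustration, the StarRocks settings named in the commit message could be applied to `be.conf` along the following lines. This is only a sketch, not the PR's actual diff: the `set_conf` helper and the `BE_CONF` variable are hypothetical, and the file defaults to a scratch temp file so the snippet can be run without touching a real installation.

```shell
#!/usr/bin/env bash
# Hypothetical config path; defaults to a scratch file for safe experimentation.
BE_CONF="${BE_CONF:-$(mktemp)}"

# Hypothetical helper: set "key = value" in be.conf, replacing an existing
# assignment of the same key or appending a new line if the key is absent.
set_conf() {
    local key="$1" value="$2"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$BE_CONF"; then
        sed -i.bak -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$BE_CONF"
    else
        printf '%s = %s\n' "$key" "$value" >> "$BE_CONF"
    fi
}

# Settings listed in the commit message for starrocks/benchmark.sh.
set_conf disable_storage_page_cache true
set_conf datacache_enable false

cat "$BE_CONF"
```

The settings must be written before the BE starts, as the commit message notes. For Doris the commit message lists `disable_storage_page_cache = true` and `segment_cache_capacity = 0` instead.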

Commit 1753902 added an inline comment after the line-continuation backslash:

    -H "timeout:1000" \ # see #740

In bash this is *not* a continuation: the backslash escapes the space (not the newline), the `#` then starts an end-of-line comment, and the unescaped newline terminates the curl command. Curl runs without its URL and fails:

    curl: (3) URL using bad/illegal format or missing URL

so the data never gets loaded into StarRocks.

Move the comment to its own line above the curl invocation. No similar pattern was found in any other benchmark.sh / run.sh.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
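The failure mode described in this commit message can be reproduced without curl. The sketch below uses a hypothetical `fake_curl` function and a placeholder URL. It also assumes two spaces between the backslash and the `#`, because the exact spacing in the original commit is not recoverable from this page. In the broken form the escaped space becomes the final argument and the URL line is parsed as a separate command.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for curl: reports the last argument it received.
fake_curl() {
    shift $(( $# - 1 ))
    printf 'last argument: [%s]\n' "$1"
}

URL="http://127.0.0.1:8030/placeholder"   # placeholder, not a real endpoint

broken_invocation() {
    # "\ " escapes the space, not the newline, so the command ends here;
    # the next line is then executed as its own (nonexistent) command.
    fake_curl -H "timeout:1000" \  # see #740
    "$URL" 2>/dev/null || true
}

fixed_invocation() {
    # see #740 (comment moved to its own line above the command)
    fake_curl -H "timeout:1000" \
    "$URL"
}

broken=$(broken_invocation)
fixed=$(fixed_invocation)
printf 'broken -> %s\nfixed  -> %s\n' "$broken" "$fixed"
```

In the broken form `fake_curl` never sees the URL, which matches the missing-URL error quoted in the commit message.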

rschu1ze · 2026-05-04T10:54:56Z

@HappenLee Can you please clarify what information the segment cache and the storage page cache in Doris store and how they work (lifecycle)? I found some scattered bits of information in the Doris docs (e.g. https://doris.apache.org/docs/3.x/admin-manual/trouble-shooting/memory-management/memory-analysis/doris-cache-memory-analysis) but it is not really well documented. In particular, is there a way to clear these caches before each first cold query? If yes, then let's do so instead. Note that "doris-parquet/run.sh" (but not "doris/run.sh") already runs `curl -sS http://127.0.0.1:8040/api/clear_cache/all` - will that do the trick?

This PR disables the caches globally, which also impacts hot runs (and that may be unfair).
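The alternative raised here, clearing caches before the cold run rather than disabling them globally, might look roughly like the sketch below. The `clear_cache/all` call is the one quoted from "doris-parquet/run.sh". Whether it actually empties both the segment cache and the storage page cache is exactly the open question in this thread. The `run` and `prepare_cold_run` helpers, the `DRY_RUN` switch, and the `sudo sh -c` wrapper are assumptions made for illustration. By default the snippet only prints the commands.

```shell
#!/usr/bin/env bash
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of executing them

# Hypothetical helper: execute a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps that would precede the first (cold) try of each query.
prepare_cold_run() {
    # Clear the OS page cache (needs root; the sudo wrapper is an assumption).
    run sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
    # Ask the Doris BE to clear its in-process caches (endpoint from doris-parquet/run.sh).
    run curl -sS http://127.0.0.1:8040/api/clear_cache/all
}

plan=$(prepare_cold_run)
printf '%s\n' "$plan"
```

Under this approach the hot runs (second and third tries) would still benefit from the in-process caches, which is the fairness concern with disabling them globally.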

@murphyatwork I have a similar question for StarRocks. However, in the case of the data cache, the docs say

> Currently, Data Cache does not provide a direct interface to clear the cached data.

so disabling the data cache globally seems fair.

Can you please also explain what the block cache is doing (mentioned here)? It is not disabled by this PR. Should it be? If not, can it be cleared between queries?

HappenLee · 2026-05-08T06:50:53Z

@rschu1ze @alexey-milovidov Hello, here are my two questions regarding this issue:

First, we consider the page cache mechanism to be reasonable. Its logic is similar to that of DuckDB's buffer pool: previously accessed disk files are pinned and cached in the queue. In real-world production environments, users actually use it this way. So why is it considered unreasonable? If that is the case, could I equally argue that DuckDB's results are unreasonable as well?

Second, if this is indeed unreasonable, shouldn’t we clarify the rules clearly and check each database's results accordingly? For closed-source databases, how can we ensure fairness and verifiability under such rules?

Revert #845

rschu1ze · 2026-05-09T11:11:21Z

Never mind, I reverted this PR, sorry for the confusion.

alexey-milovidov · 2026-05-09T11:32:23Z

@HappenLee, @rschu1ze, cold runs should execute with no caches. Otherwise, the results are non-representative.

Revert "Revert #845"

alexey-milovidov and others added 4 commits May 2, 2026 23:48

Update results

13bf638

Merge branch 'fix-cold-run-cheating-starrocks-doris' of github.com:ClickHouse/ClickBench into fix-cold-run-cheating-starrocks-doris

efe0a7b

alexey-milovidov self-assigned this May 4, 2026

alexey-milovidov merged commit e3b12cd into main May 4, 2026

rschu1ze mentioned this pull request May 9, 2026

Revert #845 #884

Merged

rschu1ze added a commit that referenced this pull request May 9, 2026

Merge pull request #884 from ClickHouse/revert-845

a4bef2a

Revert #845

alexey-milovidov added a commit that referenced this pull request May 9, 2026

Merge pull request #885 from ClickHouse/revert-884-revert-845

3acef83

Revert "Revert #845"

lukasvogel pushed a commit to lukasvogel/ClickBench that referenced this pull request May 11, 2026

Revert ClickHouse#845

223580c

lukasvogel pushed a commit to lukasvogel/ClickBench that referenced this pull request May 11, 2026

Revert "Revert ClickHouse#845"

61c339c

alexey-milovidov merged 4 commits into main from fix-cold-run-cheating-starrocks-doris

3 participants
