
Re-instate failing fast on row limits#62804

Merged
alexey-milovidov merged 1 commit into ClickHouse:master from seandhaynes:shaynes/row-limits-fail-fast
Nov 23, 2025

Conversation

@seandhaynes
Contributor

@seandhaynes seandhaynes commented Apr 19, 2024

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fail fast when queries reach row limits. Resolves #61872

Documentation entry for user-facing changes

No documentation required

Details

In ClickHouse <= v23.3, we evaluated the row limit settings max_rows_to_read and max_rows_to_read_leaf in MergeTreeDataSelectExecutor code when processing parts. This worked by using a row counter that was updated as threads called process_part. This optimization was added here: #13677

The benefit of this optimization was that a particularly expensive query, such as one involving a large volume of part processing or range scans, would fail almost immediately (on the order of hundreds of milliseconds) instead of taking a significant amount of time (potentially minutes) before presenting the user with a "Row limit exceeded" exception anyway.

So it gave users feedback very quickly, and saved ClickHouse from doing a lot of wasted work.

The optimization was removed here as part of work on projections.

I believe it was removed because the projection code also looks at the ranges and the total number of mark files that would be used for an "ordinary" query as well as for all projections. So even when projection analysis could result in ClickHouse selecting a projection that uses fewer marks, we still did the range scan of the "ordinary" parts, which meant the projection code saw row-limit exceptions.

The consequence of this optimization being absent can be observed with the query below against a ReplicatedMergeTree table that contains ~870 billion rows:

SELECT
    (intDiv(toUInt32(timestamp), 900) * 900) * 1000 AS t,
    sum(_sample_interval) / 900,
    name
FROM r0.table_name
WHERE timestamp >= toDateTime(1710673528)
GROUP BY
    t,
    name
ORDER BY t ASC

Query id: 1a2d3045-7fe4-4773-b365-3c69a1ec9add


0 rows in set. Elapsed: 84.102 sec.

Received exception from server (version 23.3.9):
Code: 158. DB::Exception: Limit for rows (controlled by 'max_rows_to_read' setting) exceeded, max rows: 10.00 million, current rows: 812 billion rows. (TOO_MANY_ROWS)

Note the execution time of 84 seconds and number of current rows processed.

Here's a flamegraph showing where we spent most of the time:

[Flamegraph screenshot, 2024-04-18]

As we can see, most of the time is spent doing range scans via MergeTreeDataSelectExecutor::filterPartsByPrimaryKeyAndSkipIndexes.

With the changes in this PR applied on our clusters, here's the same query again, which previously took 84 seconds:

SELECT
    (intDiv(toUInt32(timestamp), 900) * 900) * 1000 AS t,
    sum(_sample_interval) / 900,
    name
FROM r0.table_name
WHERE timestamp >= toDateTime(1710673528)
GROUP BY
    t,
    name
ORDER BY t ASC

Query id: 1a2d3045-7fe4-4773-b365-3c69a1ec9add

0 rows in set. Elapsed: 0.898 sec.

Received exception from server (version 23.3.9):
Code: 158. DB::Exception: Limit for rows (controlled by 'max_rows_to_read' setting) exceeded, max rows: 10.00 million, current rows: 480.15 million. (TOO_MANY_ROWS)

This PR also includes an optimization for projections that utilizes this "failing fast" on row limits.

While working on analysis for projections, I noticed that it could benefit from failing fast on these row limits too. For example, we do a full range scan on the ordinary table even if we end up determining that a projection uses fewer marks and selecting it.

So I've included a change: if we encounter row limits while doing a range scan on the ordinary table, we dismiss it as a candidate. This should be safe, because if the ordinary table were the candidate that selects the least data, the analysis of the other projections would also hit the row limits and return an error to the client. And since dismissing the ordinary table takes on the order of milliseconds even for queries that touch a lot of data, projections should be significantly faster whenever users would hit row limits on the ordinary table.

In all honesty, we (Cloudflare) don't use projections in production, so it would be great to get some opinions on this to make sure I haven't missed anything else.

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from 5687a03 to 80b8cce Compare April 19, 2024 20:45
@seandhaynes seandhaynes changed the title CLICKHOUSE-2974: Re-instate failing fast on row limits Re-instate failing fast on row limits Apr 19, 2024
@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Apr 20, 2024
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-performance Pull request with some performance improvements label Apr 20, 2024
@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch 4 times, most recently from ba73a6c to 949f63e Compare April 20, 2024 23:03
@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from 949f63e to 37ed8ee Compare October 24, 2025 10:51
@clickhouse-gh
Contributor

clickhouse-gh bot commented Oct 24, 2025

Workflow [PR], commit [96ee1c5]

Summary:

- Build (amd_compat): failure — Cmake configuration failure (cidb)
- Stateless tests (amd_binary, ParallelReplicas, s3 storage, parallel): failure — 02494_query_cache_ignore_output_settings FAIL (cidb)
- Stateless tests (amd_tsan, s3 storage, parallel): failure — 03640_load_marks_synchronously FAIL (cidb)
- Integration tests (arm_binary, distributed plan, 3/4): failure — test_peak_memory_usage/test.py::test_clickhouse_client_max_peak_memory_usage_distributed FAIL (cidb)
- AST fuzzer (amd_debug): failure — Logical error: 'Column '__table7.k UInt64 Replicated(size = 2, Nullable(size = 2, UInt64(size = 2), UInt8(size = 2)), UInt8(size = 2))' is expected to be nullable'. FAIL (cidb)
- BuzzHouse (amd_debug): failure — Logical error: 'Inconsistent AST formatting: the query: FAIL (cidb)
- BuzzHouse (amd_ubsan): failure — /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Databases/DDLDependencyVisitor.cpp:253:45: runtime error: member access within null pointer of type 'element_type' (aka 'DB::IAST') FAIL (cidb)

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from 37ed8ee to c65a507 Compare October 24, 2025 10:52
@seandhaynes
Contributor Author

seandhaynes commented Oct 24, 2025

@alexey-milovidov Hey! I'm so sorry for the delay with updating this one to get projection tests passing.

I've tested changes I've just pushed against the master branch, and confirmed that projections now handle this optimisation correctly. All projection tests now also pass.

PTAL if you have the time, or let me know if there is someone else we can speak with.

Thank you as always!

@alexey-milovidov
Member

This is great!

Although I don't like catching exceptions (during normal control flow). How practical will it be to remove try/catch in this case?

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from c65a507 to 8dcc873 Compare October 24, 2025 11:00
@seandhaynes
Contributor Author

seandhaynes commented Oct 24, 2025

This is great!

Although I don't like catching exceptions (during normal control flow). How practical will it be to remove try/catch in this case?

Thanks Alexey! This is a fair point, and I went back and forth between using exceptions and finding a way of marking an AnalysisResult unusable.

The main reason why I went for exceptions is I was worried about other use cases being introduced calling getAnalysisResult, and it somewhat failing "silently", giving the caller an unexpected result.

I will try again though to see if I can find a slightly different approach that allows us to avoid exception handling. Thank you for this feedback.

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from 8dcc873 to 053a76f Compare October 24, 2025 11:22
@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch 2 times, most recently from 3b16b4d to 6d8e619 Compare November 11, 2025 21:14
@clickhouse-gh clickhouse-gh bot added the submodule changed At least one submodule changed in this PR. label Nov 11, 2025
@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch 2 times, most recently from 6204d2d to c25e3b0 Compare November 11, 2025 21:26
@seandhaynes
Contributor Author

@alexey-milovidov thanks again for the feedback. I've removed the use of exceptions in projections code as part of normal control flow by marking an AnalysisResult as unusable.

All projection and row limit tests still pass when running against master with these changes. PTAL 🙏

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch 4 times, most recently from a2f2c57 to e3e4422 Compare November 12, 2025 12:07
@seandhaynes
Contributor Author

seandhaynes commented Nov 12, 2025

@alexey-milovidov all tests have passed with the exception of 1 test type that is the source of all failures: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=62804&sha=e3e4422488c81fc9463242b82018a858c29f0395&name_0=PR&name_1=BuzzHouse%20%28amd_debug%29

30. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__type_traits/invoke.h:217: void* std::__thread_proxy[abi:se210105]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x0000000015371980
31. ? @ 0x0000000000094ac3
. (NOT_IMPLEMENTED)
(query: TRUNCATE d0.`t30` SETTINGS schema_inference_make_json_columns_nullable = 1 PARALLEL WITH INSERT INTO TABLE d1.`t14` (`c1`, `c0`) SELECT `c1`, `c0` FROM generateRandom('c1 Nullable(Decimal(18)), c0 Bool', 2677439836090237281, 6865, 3) LIMIT 74 PARALLEL WITH INSERT INTO TABLE d3.`t38` (`c1`, `c0`) SELECT `c1`, `c0` FROM generateRandom('c1 Nullable(Decimal(18)), c0 Bool', 2680113005207954932, 5849, 1) LIMIT 257 PARALLEL WITH INSERT INTO TABLE `t3` (`c2`, `c0`, `c1`) SELECT `c2`, `c0`, `c1` FROM generateRandom('c2 Dynamic, c0 Date32, c1 Array(Decimal)', 13002814911059457014, 7593, 4) LIMIT 136;)
Received exception from server (version 25.11.1):
Code: 291. DB::Exception: Received from localhost:9000. DB::Exception: File `/var/lib/clickhouse/user_files/query.data` is not inside `/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/user_files`. Stack trace:

0. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__exception/exception.h:113: Poco::Exception::Exception(String const&, int) @ 0x000000002530ed32
1. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Common/Exception.cpp:129: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x0000000015224869
2. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Common/Exception.h:123: DB::Exception::Exception(String&&, int, String, bool) @ 0x000000000d57cfce
3. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Common/Exception.h:58: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000d57cb11
4. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Common/Exception.h:141: DB::Exception::Exception<String const&, String const&>(int, FormatStringHelperImpl<std::type_identity<String const&>::type, std::type_identity<String const&>::type>, String const&, String const&) @ 0x000000000d584d76
5. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Storages/StorageFile.cpp:261: DB::(anonymous namespace)::checkCreationIsAllowed(std::shared_ptr<DB::Context const>, String const&, String const&, bool) @ 0x000000001e123669
6. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Storages/StorageFile.cpp:347: DB::(anonymous namespace)::getPathsList(String const&, String const&, std::shared_ptr<DB::Context const> const&, unsigned long&) @ 0x000000001e11bc76
7. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Storages/StorageFile.cpp:996: DB::StorageFile::FileSource::parse(String const&, std::shared_ptr<DB::Context const> const&, std::optional<bool>) @ 0x000000001e11a78d
8. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/TableFunctions/TableFunctionFile.cpp:35: DB::TableFunctionFile::parseFirstArguments(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const> const&) @ 0x00000000174c69d7
9. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/TableFunctions/ITableFunctionFileLike.cpp:63: DB::ITableFunctionFileLike::parseArgumentsImpl(absl::lts_20250512::InlinedVector<std::shared_ptr<DB::IAST>, 7ul, std::allocator<std::shared_ptr<DB::IAST>>>&, std::shared_ptr<DB::Context const> const&) @ 0x00000000174ca988
10. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/TableFunctions/ITableFunctionFileLike.cpp:52: DB::ITableFunctionFileLike::parseArguments(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const>) @ 0x00000000174ca7ca
11. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/TableFunctions/TableFunctionFactory.cpp:53: DB::TableFunctionFactory::get(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const>) const @ 0x00000000192b01e6
12. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Interpreters/InterpreterInsertQuery.cpp:135: DB::InterpreterInsertQuery::getTable(DB::ASTInsertQuery&) @ 0x000000001b2305b5
13. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Interpreters/InterpreterInsertQuery.cpp:854: DB::InterpreterInsertQuery::execute() @ 0x000000001b2386db
14. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Interpreters/executeQuery.cpp:1608: DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, std::unique_ptr<DB::ReadBuffer, std::default_delete<DB::ReadBuffer>>&, std::shared_ptr<DB::IAST>&, std::shared_ptr<DB::ImplicitTransactionControlExecutor>, std::function<void ()>) @ 0x000000001b5eea6f
15. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Interpreters/executeQuery.cpp:1817: DB::executeQuery(String const&, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum) @ 0x000000001b5e83f2
16. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Server/TCPHandler.cpp:763: DB::TCPHandler::runImpl() @ 0x000000001efa9e4a
17. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/src/Server/TCPHandler.cpp:2856: DB::TCPHandler::run() @ 0x000000001efc4a64
18. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/base/poco/Net/src/TCPServerConnection.cpp:40: Poco::Net::TCPServerConnection::start() @ 0x00000000253c6c47
19. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/base/poco/Net/src/TCPServerDispatcher.cpp:115: Poco::Net::TCPServerDispatcher::run() @ 0x00000000253c7225
20. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/base/poco/Foundation/src/ThreadPool.cpp:205: Poco::PooledThread::run() @ 0x00000000253648ff
21. /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/base/poco/Foundation/src/Thread_POSIX.cpp:341: Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000025362091
22. ? @ 0x0000000000094ac3
23. __clone @ 0x0000000000125a74
. (DATABASE_ACCESS_DENIED)
(query: INSERT INTO TABLE FUNCTION file('/var/lib/clickhouse/user_files/query.data', 'TabSeparatedRaw') SELECT * FROM (SELECT (greaterOrEquals(`t1d0`.`number`, 9927595660033683655::UInt256) AS `a0`), murmurHash3_32(`t0d0`.`c1`.`Nullable(UInt32)`, `a0`.`JSON(max_dynamic_types=29)`), `t0d0`.`c1`.`c0.c1`, `a0`, sumOrDefaultForEachDistinct('-233:09:51'::Time) OVER (ORDER BY `t0d0`.`c0`) FROM `t3` AS t0d0 ANY FULL JOIN numbers_mt(14218351581512645013::UInt16, 564755, 443813) AS t1d0 ON (NOT and(`t0d0`.`c1.size0` = `t1d0`.`number`, (NOT `t0d0`.`c2`[0] = `t1d0`.`number`)))) ORDER BY ALL SETTINGS output_format_write_statistics = 0;)
Received exception from server (version 25.11.1):
Code: 291. DB::Exception: Received from localhost:9000. DB::Exception: File `/var/lib/clickhouse/user_files/query.data` is not inside `/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/user_files`. Stack trace:

(stack trace identical to the one above)
. (DATABASE_ACCESS_DENIED)

It looks unrelated, and I can see it on some other PRs too. Is it OK getting some help to get this merged please?

@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch 3 times, most recently from be7fbc6 to 4695552 Compare November 17, 2025 22:01
@seandhaynes
Copy link
Copy Markdown
Contributor Author

Hey @alexey-milovidov , sorry to ping you again, but I just wondered whether you were happy with this change now, or if you'd like me to check anything else?

I'm still only seeing 2 unrelated test failures after rebasing.

In v22.3 we evaluated both max_rows_to_read and max_rows_to_read_leaf
in MergeTreeDataSelectExecutor code when scanning part ranges. This
meant that the moment we exceeded the row counts while processing
ranges, we returned an error to clients, saving them from waiting for
queries to finish that were going to fail anyway.

This was removed in:

ClickHouse@f524dae

It was removed because the projection code also looks at the ranges
and the total number of mark files that would be used for an
"ordinary" query as well as for all projections. So even though query
analysis results in us using a projection, we hit the row limits when
enumerating mark files and saw exceptions.

This re-adds the limits alongside making them compatible with
projections.
@seandhaynes seandhaynes force-pushed the shaynes/row-limits-fail-fast branch from 4695552 to 96ee1c5 Compare November 20, 2025 16:05
@alexey-milovidov
Member

I'm happy with this change, already reviewed the code...

@alexey-milovidov alexey-milovidov self-assigned this Nov 23, 2025
@alexey-milovidov alexey-milovidov removed the submodule changed At least one submodule changed in this PR. label Nov 23, 2025
@alexey-milovidov alexey-milovidov merged commit 53baade into ClickHouse:master Nov 23, 2025
122 of 132 checks passed
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Nov 23, 2025
@seandhaynes
Contributor Author

Thanks very much, @alexey-milovidov !


Development

Successfully merging this pull request may close these issues.

SELECT queries no longer fail fast when hitting row limits
