Skip to content

Add ability to limit requests for keeper (max_request_size setting)#87952

Merged
azat merged 1 commit intoClickHouse:masterfrom
azat:keeper-max_request_size
Oct 3, 2025
Merged

Add ability to limit requests for keeper (max_request_size setting)#87952
azat merged 1 commit intoClickHouse:masterfrom
azat:keeper-max_request_size

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Sep 30, 2025

I found that the simple query will lead to:

insert into system.zookeeper (name, path, value) select number::String, '/test_soft_limit', repeat('a', 3000) from numbers(1e6)
2025.09.30 22:00:10.699397 [ 10382 ] {} <Error> void DB::KeeperDispatcher::requestThread(): Code: 361. DB::Exception: Seek position is out of bounds. Offset: -1 224 078 329, Max: 3 070 889 012. (SEEK_POSITION_OUT_OF_BOUND), Stack trace (when copying this message, always include the lines below)
Details
0. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__exception/exception.h:113: Poco::Exception::Exception(String const&, int) @ 0x000000001ede2a79
1. /src/ch/clickhouse/src/Common/Exception.cpp:128: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000f74ad62
2. /src/ch/clickhouse/src/Common/Exception.h:123: DB::Exception::Exception(String&&, int, String, bool) @ 0x0000000008b2b211
3. /src/ch/clickhouse/src/Common/Exception.h:58: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000008b2adfb
4. /src/ch/clickhouse/src/Common/Exception.h:141: DB::Exception::Exception<long&, String>(int, FormatStringHelperImpl<std::type_identity<long&>::type, std::type_identity<String>::type>, long&, String&&) @ 0x000000000d44f12f
5. /src/ch/clickhouse/src/IO/ReadBufferFromMemory.cpp:45: DB::ReadBufferFromMemoryHelper<DB::ReadBufferFromMemory>::seekImpl(long, int) @ 0x000000000f820bf7
6. /src/ch/clickhouse/src/IO/ReadBufferFromMemory.h:36: DB::IKeeperStateMachine::parseRequest(nuraft::buffer&, bool, DB::IKeeperStateMachine::ZooKeeperLogSerializationVersion*, unsigned long*) @ 0x00000000198fa56c
7. /src/ch/clickhouse/src/Coordination/KeeperServer.cpp:984: DB::KeeperServer::callbackFunc(nuraft::cb_func::Type, nuraft::cb_func::Param*) @ 0x00000000198bccff
8. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x000000001c615ba2
9. /src/ch/clickhouse/contrib/NuRaft/src/handle_client_request.cxx:93: nuraft::raft_server::handle_cli_req_prelock(nuraft::req_msg&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c614d0f
10. /src/ch/clickhouse/contrib/NuRaft/src/raft_server.cxx:747: nuraft::raft_server::process_req(nuraft::req_msg&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c60b899
11. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:184: nuraft::raft_server::send_msg_to_leader(std::shared_ptr<nuraft::req_msg>&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c62a205
12. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:109: nuraft::raft_server::append_entries_ext(std::vector<std::shared_ptr<nuraft::buffer>, std::allocator<std::shared_ptr<nuraft::buffer>>> const&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c62c0b0
13. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:84: nuraft::raft_server::append_entries(std::vector<std::shared_ptr<nuraft::buffer>, std::allocator<std::shared_ptr<nuraft::buffer>>> const&) @ 0x000000001c62be71
14. /src/ch/clickhouse/src/Coordination/KeeperServer.cpp:681: DB::KeeperServer::putRequestBatch(std::vector<DB::KeeperRequestForSession, std::allocator<DB::KeeperRequestForSession>> const&) @ 0x00000000198bc2df
15. /src/ch/clickhouse/src/Coordination/KeeperDispatcher.cpp:271: DB::KeeperDispatcher::requestThread() @ 0x0000000019896a3e
16. /src/ch/clickhouse/src/Coordination/KeeperDispatcher.cpp:465: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::KeeperDispatcher::initialize(Poco::Util::Abstract
Configuration const&, bool, bool, std::shared_ptr<DB::Macros const> const&)::$_1>(DB::KeeperDispatcher::initialize(Poco::Util::AbstractConfiguration const&, bool, bool, std::shared_ptr<DB::Macros const> const&)::$_1&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x0000000
0198a548b
17. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x000000000f87e998
18. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolIm
pl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x000000000f884342
19. /usr/src/debug/glibc/glibc/nptl/pthread_create.c:448: start_thread @ 0x00000000000969cb
20. ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78: __clone3 @ 0x000000000011aa0c
 (version 25.10.1.1)

The reason for this is that the request length is not verified at all, so let's introduce a separate setting for this - max_request_size

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add ability to limit requests for Keeper (max_request_size setting, same as jute.maxbuffer for ZooKeeper, default OFF for backward compatibility, will be set in the next releases)

@azat azat requested a review from antonio2368 September 30, 2025 21:01
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Sep 30, 2025

Workflow [PR], commit [7b91e93]

Summary:

job_name test_name status info comment
Stress test (amd_ubsan) failure
Server died FAIL
Hung check failed, possible deadlock found (see hung_check.log) FAIL
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Sep 30, 2025
@antonio2368 antonio2368 self-assigned this Oct 1, 2025
@antonio2368
Copy link
Copy Markdown
Member

Looks good, but 10MiB limit will break for sure some setups.
Not sure what would be safe enough value, 100MiB maybe?

@azat
Copy link
Copy Markdown
Member Author

azat commented Oct 1, 2025

Actually let's even disable it for now and enable separately

@azat azat changed the title Add ability to limit requests for keeper (with max_request_size, default 10MiB) Add ability to limit requests for keeper (max_request_size setting) Oct 1, 2025
@azat azat enabled auto-merge October 1, 2025 13:38
@azat azat disabled auto-merge October 1, 2025 15:27
@azat azat marked this pull request as draft October 1, 2025 17:30
@azat azat marked this pull request as ready for review October 2, 2025 11:07
@azat azat force-pushed the keeper-max_request_size branch from b11b8c5 to 6d62be3 Compare October 3, 2025 12:54
@azat azat enabled auto-merge October 3, 2025 17:17
…fault)

I found that the simple query will lead to:

  insert into system.zookeeper (name, path, value) select number::String, '/test_soft_limit', repeat('a', 3000) from numbers(1e6)

  2025.09.30 22:00:10.699397 [ 10382 ] {} <Error> void DB::KeeperDispatcher::requestThread(): Code: 361. DB::Exception: Seek position is out of bounds. Offset: -1 224 078 329, Max: 3 070 889 012. (SEEK_POSITION_OUT_OF_BOUND), Stack trace (when copying this message, always include the lines below)

<details>

```
0. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__exception/exception.h:113: Poco::Exception::Exception(String const&, int) @ 0x000000001ede2a79
1. /src/ch/clickhouse/src/Common/Exception.cpp:128: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000f74ad62
2. /src/ch/clickhouse/src/Common/Exception.h:123: DB::Exception::Exception(String&&, int, String, bool) @ 0x0000000008b2b211
3. /src/ch/clickhouse/src/Common/Exception.h:58: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000008b2adfb
4. /src/ch/clickhouse/src/Common/Exception.h:141: DB::Exception::Exception<long&, String>(int, FormatStringHelperImpl<std::type_identity<long&>::type, std::type_identity<String>::type>, long&, String&&) @ 0x000000000d44f12f
5. /src/ch/clickhouse/src/IO/ReadBufferFromMemory.cpp:45: DB::ReadBufferFromMemoryHelper<DB::ReadBufferFromMemory>::seekImpl(long, int) @ 0x000000000f820bf7
6. /src/ch/clickhouse/src/IO/ReadBufferFromMemory.h:36: DB::IKeeperStateMachine::parseRequest(nuraft::buffer&, bool, DB::IKeeperStateMachine::ZooKeeperLogSerializationVersion*, unsigned long*) @ 0x00000000198fa56c
7. /src/ch/clickhouse/src/Coordination/KeeperServer.cpp:984: DB::KeeperServer::callbackFunc(nuraft::cb_func::Type, nuraft::cb_func::Param*) @ 0x00000000198bccff
8. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x000000001c615ba2
9. /src/ch/clickhouse/contrib/NuRaft/src/handle_client_request.cxx:93: nuraft::raft_server::handle_cli_req_prelock(nuraft::req_msg&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c614d0f
10. /src/ch/clickhouse/contrib/NuRaft/src/raft_server.cxx:747: nuraft::raft_server::process_req(nuraft::req_msg&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c60b899
11. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:184: nuraft::raft_server::send_msg_to_leader(std::shared_ptr<nuraft::req_msg>&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c62a205
12. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:109: nuraft::raft_server::append_entries_ext(std::vector<std::shared_ptr<nuraft::buffer>, std::allocator<std::shared_ptr<nuraft::buffer>>> const&, nuraft::raft_server::req_ext_params const&) @ 0x000000001c62c0b0
13. /src/ch/clickhouse/contrib/NuRaft/src/handle_user_cmd.cxx:84: nuraft::raft_server::append_entries(std::vector<std::shared_ptr<nuraft::buffer>, std::allocator<std::shared_ptr<nuraft::buffer>>> const&) @ 0x000000001c62be71
14. /src/ch/clickhouse/src/Coordination/KeeperServer.cpp:681: DB::KeeperServer::putRequestBatch(std::vector<DB::KeeperRequestForSession, std::allocator<DB::KeeperRequestForSession>> const&) @ 0x00000000198bc2df
15. /src/ch/clickhouse/src/Coordination/KeeperDispatcher.cpp:271: DB::KeeperDispatcher::requestThread() @ 0x0000000019896a3e
16. /src/ch/clickhouse/src/Coordination/KeeperDispatcher.cpp:465: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::KeeperDispatcher::initialize(Poco::Util::Abstract
Configuration const&, bool, bool, std::shared_ptr<DB::Macros const> const&)::$_1>(DB::KeeperDispatcher::initialize(Poco::Util::AbstractConfiguration const&, bool, bool, std::shared_ptr<DB::Macros const> const&)::$_1&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x0000000
0198a548b
17. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x000000000f87e998
18. /src/ch/clickhouse/contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolIm
pl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x000000000f884342
19. /usr/src/debug/glibc/glibc/nptl/pthread_create.c:448: start_thread @ 0x00000000000969cb
20. ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78: __clone3 @ 0x000000000011aa0c
 (version 25.10.1.1)
```

</details>

The reason for this is that the request length is not verified at all,
so let's introduce a separate setting for this - `max_request_size`

v2: Disable max_request_size by default
v3: Avoid using LimitedReadBuffer if max_request_size is 0 in keeper
v4: Set max_request_size in CI only for newer ClickHouse
@azat azat force-pushed the keeper-max_request_size branch from 6d62be3 to 7b91e93 Compare October 3, 2025 17:21
@azat azat added this pull request to the merge queue Oct 3, 2025
Merged via the queue into ClickHouse:master with commit 02f387e Oct 3, 2025
121 of 123 checks passed
@azat azat deleted the keeper-max_request_size branch October 3, 2025 22:09
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Oct 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants