Resubmit 'aggregated zookeeper log' #87208

Merged
mstetsyuk merged 3 commits into master from fix-aggregated-zookeeper-log on Sep 17, 2025
@mstetsyuk
Member

Original PR: #85102, revert: #87185.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Adds a new system.aggregated_zookeeper_log table. The table contains statistics (e.g. number of operations, average latency, errors) of ZooKeeper operations grouped by session id, parent path and operation type, and is periodically flushed to disk.

This table is meant to be used in production by default because, in contrast to system.zookeeper_log, it's much more lightweight.
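
To make the grouping described in the changelog entry concrete, here is a minimal sketch of aggregating operations by session id, parent path and operation type, with a flush step that hands over the accumulated rows. It is an illustration only, not the implementation in this PR: `OpKey`, `OpStats`, `parentPath` and `AggregatedLogSketch` are hypothetical names, and the real table's columns and flush mechanism may differ.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key: the three grouping dimensions named in the changelog entry.
struct OpKey
{
    int64_t session_id;
    std::string parent_path;
    std::string operation;

    bool operator<(const OpKey & rhs) const
    {
        return std::tie(session_id, parent_path, operation)
            < std::tie(rhs.session_id, rhs.parent_path, rhs.operation);
    }
};

// Hypothetical per-key statistics: operation count, error count, summed latency.
struct OpStats
{
    uint64_t count = 0;
    uint64_t errors = 0;
    uint64_t total_latency_us = 0;

    double averageLatencyUs() const
    {
        return count ? static_cast<double>(total_latency_us) / static_cast<double>(count) : 0.0;
    }
};

// Returns the parent of a ZooKeeper-style path: "/a/b/c" -> "/a/b", "/a" -> "/".
std::string parentPath(const std::string & path)
{
    const auto pos = path.find_last_of('/');
    if (pos == std::string::npos || pos == 0)
        return "/";
    return path.substr(0, pos);
}

class AggregatedLogSketch
{
public:
    // Folds one finished operation into the row for its (session, parent path, operation) key.
    void observe(int64_t session_id, const std::string & path, const std::string & operation,
                 uint64_t latency_us, bool is_error)
    {
        OpStats & row = stats[OpKey{session_id, parentPath(path), operation}];
        ++row.count;
        row.total_latency_us += latency_us;
        if (is_error)
            ++row.errors;
    }

    // Hands the accumulated rows to the caller and starts a fresh interval.
    std::map<OpKey, OpStats> flush()
    {
        std::map<OpKey, OpStats> out;
        out.swap(stats);
        return out;
    }

private:
    std::map<OpKey, OpStats> stats;
};
```

Because only one row per key is kept between flushes, the memory and write volume scale with the number of distinct keys rather than with the number of operations, which is the sense in which such a table is lighter than a per-operation log.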

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Contributor

clickhouse-gh bot commented Sep 16, 2025

Workflow [PR], commit [f9563d9]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Integration tests (amd_tsan, 3/6) | | failure | | |
| | test_replication_credentials/test.py::test_same_credentials | FAIL | | |

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Sep 16, 2025
@antaljanosbenjamin antaljanosbenjamin self-assigned this Sep 16, 2025
@antaljanosbenjamin
Member

The race condition that caused the revert is this one:

E           Exception: Sanitizer assert found for instance ==================
E           WARNING: ThreadSanitizer: data race (pid=8)
E             Write of size 8 at 0x727800002400 by main thread (mutexes: write M0):
E               #0 DB::ContextData::resetSharedContext() ci/tmp/build/./src/Interpreters/Context.cpp:1153:12 (clickhouse+0x1be0f9a9) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #1 DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)::$_3::operator()() const ci/tmp/build/./programs/server/Server.cpp:1302:5 (clickhouse+0x1373121d) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #2 BasicScopeGuard<DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)::$_3>::invoke() ci/tmp/build/./base/base/../base/scope_guard.h:101:9 (clickhouse+0x1370cfa0) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #3 BasicScopeGuard<DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)::$_3>::~BasicScopeGuard() ci/tmp/build/./base/base/../base/scope_guard.h:50:26 (clickhouse+0x1370cfa0)
E               #4 DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&) ci/tmp/build/./programs/server/Server.cpp:2900:1 (clickhouse+0x1370cfa0)
E               #5 Poco::Util::Application::run() ci/tmp/build/./base/poco/Util/src/Application.cpp:315:8 (clickhouse+0x2b4b117e) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #6 DB::Server::run() ci/tmp/build/./programs/server/Server.cpp:628:25 (clickhouse+0x136ecdd6) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #7 Poco::Util::ServerApplication::run(int, char**) ci/tmp/build/./base/poco/Util/src/ServerApplication.cpp:131:9 (clickhouse+0x2b4ced20) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #8 mainEntryClickHouseServer(int, char**) ci/tmp/build/./programs/server/Server.cpp:415:20 (clickhouse+0x136e994c) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #9 main ci/tmp/build/./programs/main.cpp:381:21 (clickhouse+0x922ab84) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E           
E             Previous read of size 8 at 0x727800002400 by thread T676:
E               #0 DB::Context::getZooKeeperLog() const ci/tmp/build/./src/Interpreters/Context.cpp:5159:10 (clickhouse+0x1be3c6b4) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #1 Coordination::ZooKeeper::getZooKeeperLog() ci/tmp/build/./src/Common/ZooKeeper/ZooKeeperImpl.cpp:1832:55 (clickhouse+0x234eca97) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #2 Coordination::ZooKeeper::logOperationIfNeeded(std::__1::shared_ptr<Coordination::ZooKeeperRequest> const&, std::__1::shared_ptr<Coordination::ZooKeeperResponse> const&, bool, unsigned long) ci/tmp/build/./src/Common/ZooKeeper/ZooKeeperImpl.cpp:1865:25 (clickhouse+0x234e8884) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #3 Coordination::ZooKeeper::receiveEvent() ci/tmp/build/./src/Common/ZooKeeper/ZooKeeperImpl.cpp:1109:9 (clickhouse+0x234eab44) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #4 Coordination::ZooKeeper::receiveThread() ci/tmp/build/./src/Common/ZooKeeper/ZooKeeperImpl.cpp:910:17 (clickhouse+0x234e95cb) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #5 Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1::operator()() const ci/tmp/build/./src/Common/ZooKeeper/ZooKeeperImpl.cpp:472:56 (clickhouse+0x234f5110) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #6 decltype(std::declval<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&>()()) std::__1::__invoke[abi:ne190107]<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&) ci/tmp/build/./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:149:25 (clickhouse+0x234f5110)
E               #7 decltype(auto) std::__1::__apply_tuple_impl[abi:ne190107]<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&, std::__1::tuple<>&>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&, std::__1::tuple<>&, std::__1::__tuple_indices<...>) ci/tmp/build/./contrib/llvm-project/libcxx/include/tuple:1354:5 (clickhouse+0x234f5110)
E               #8 decltype(auto) std::__1::apply[abi:ne190107]<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&, std::__1::tuple<>&>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&, std::__1::tuple<>&) ci/tmp/build/./contrib/llvm-project/libcxx/include/tuple:1358:5 (clickhouse+0x234f5110)
E               #9 ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'()::operator()() ci/tmp/build/./src/Common/ThreadPool.h:312:13 (clickhouse+0x234f5110)
E               #10 decltype(std::declval<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>()()) std::__1::__invoke[abi:ne190107]<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'()&>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&) ci/tmp/build/./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:149:25 (clickhouse+0x234f5110)
E               #11 void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ne190107]<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'()&>(ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'()&) ci/tmp/build/./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:224:5 (clickhouse+0x234f5110)
E               #12 std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'(), void ()>::operator()[abi:ne190107]() ci/tmp/build/./contrib/llvm-project/libcxx/include/__functional/function.h:210:12 (clickhouse+0x234f5110)
E               #13 void std::__1::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1>(Coordination::ZooKeeper::ZooKeeper(std::__1::vector<zkutil::ShuffleHost, std::__1::allocator<zkutil::ShuffleHost>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>, std::__1::shared_ptr<DB::AggregatedZooKeeperLog>)::$_1&&)::'lambda'(), void ()>>(std::__1::__function::__policy_storage const*) ci/tmp/build/./contrib/llvm-project/libcxx/include/__functional/function.h:610:12 (clickhouse+0x234f5110)
E               #14 std::__1::__function::__policy_func<void ()>::operator()[abi:ne190107]() const ci/tmp/build/./contrib/llvm-project/libcxx/include/__functional/function.h:716:12 (clickhouse+0x13443be2) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #15 std::__1::function<void ()>::operator()() const ci/tmp/build/./contrib/llvm-project/libcxx/include/__functional/function.h:989:10 (clickhouse+0x13443be2)
E               #16 ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::worker() ci/tmp/build/./src/Common/ThreadPool.cpp:812:17 (clickhouse+0x13443be2)
E               #17 decltype(*std::declval<ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*>().*std::declval<void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)()>()()) std::__1::__invoke[abi:ne190107]<void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*, void>(void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*&&) ci/tmp/build/./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117:25 (clickhouse+0x1344cc9b) (BuildId: f275f366de4b5809e64c6b3eb6880b73268c3b32)
E               #18 void std::__1::__thread_execute[abi:ne190107]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*>&, std::__1::__tuple_indices<2ul>) ci/tmp/build/./contrib/llvm-project/libcxx/include/__thread/thread.h:192:3 (clickhouse+0x1344cc9b)
E               #19 void* std::__1::__thread_proxy[abi:ne190107]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*>>(void*) ci/tmp/build/./contrib/llvm-project/libcxx/include/__thread/thread.h:201:3 (clickhouse+0x1344cc9b)
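
The report above pairs a write in `DB::ContextData::resetSharedContext()` on the main thread with an unsynchronized read in `DB::Context::getZooKeeperLog()` on a ZooKeeper receive thread. As a generic illustration of that pattern (not the fix applied in this PR), the sketch below shows one conventional way to avoid it: stop and join the reading thread before resetting the state it reads. All names (`SharedStateSketch`, `OwnerSketch`) are hypothetical.

```cpp
#include <atomic>
#include <memory>
#include <thread>

// Hypothetical stand-in for state owned by a shared context.
struct SharedStateSketch
{
    int value = 42;
};

class OwnerSketch
{
public:
    OwnerSketch() : shared(std::make_unique<SharedStateSketch>()) {}

    ~OwnerSketch() { shutdown(); }

    // Starts a background thread that keeps reading through the owning pointer.
    // Intended to be called at most once per object.
    void startReader()
    {
        reader = std::thread([this]
        {
            while (!stop_requested.load(std::memory_order_acquire))
            {
                last_seen.store(shared->value, std::memory_order_relaxed);
                std::this_thread::yield();
            }
        });
    }

    // Resetting `shared` while the reader is still running would be the same kind
    // of race as in the report. Joining first orders every read before the reset.
    void shutdown()
    {
        stop_requested.store(true, std::memory_order_release);
        if (reader.joinable())
            reader.join();
        shared.reset();
    }

    bool hasSharedState() const { return shared != nullptr; }
    int lastSeen() const { return last_seen.load(); }

private:
    std::unique_ptr<SharedStateSketch> shared;
    std::atomic<bool> stop_requested{false};
    std::atomic<int> last_seen{0};
    std::thread reader;
};
```

Swapping the order inside `shutdown()` (reset first, join second) reproduces the shape of the reported race: a write to the pointer on one thread with a concurrent read on another.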

@mstetsyuk mstetsyuk added this pull request to the merge queue Sep 17, 2025
Merged via the queue into master with commit baa7fd7 Sep 17, 2025
121 of 123 checks passed
@mstetsyuk mstetsyuk deleted the fix-aggregated-zookeeper-log branch September 17, 2025 16:01
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 17, 2025
