Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ThreadPool in PipelineExecutor #48146

Merged
merged 2 commits into from
Apr 13, 2023

Conversation

azat
Copy link
Collaborator

@azat azat commented Mar 29, 2023

PipelineExecutor is one of the heavy user of the threads, by using pool
there number of used threads for query execution can be tracked in
system.metrics.

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Use ThreadPool in PipelineExecutor

Cc: @KochetovNicolai @vdimir

@robot-ch-test-poll robot-ch-test-poll added the pr-not-for-changelog This PR should not be mentioned in the changelog label Mar 29, 2023
@azat azat force-pushed the threadpool-pipeline-executor branch from 521e340 to 9760cdf Compare March 30, 2023 10:49
@azat azat marked this pull request as ready for review March 30, 2023 10:49
@azat azat marked this pull request as draft March 30, 2023 18:16
@azat azat force-pushed the threadpool-pipeline-executor branch from 9760cdf to 849403d Compare March 31, 2023 08:50
@azat azat marked this pull request as ready for review March 31, 2023 08:50
@azat
Copy link
Collaborator Author

azat commented Apr 1, 2023

Stateless tests (msan) [1/6] — fail: 1, passed: 849, skipped: 4
Stateless tests (release) — fail: 1, passed: 5181, skipped: 4

Integration tests (tsan) [1/6] — fail: 1, passed: 364, flaky: 0

@azat
Copy link
Collaborator Author

azat commented Apr 4, 2023

@KochetovNicolai or @vdimir can you please take a look?

@KochetovNicolai KochetovNicolai self-assigned this Apr 6, 2023
@KochetovNicolai
Copy link
Member

If this change is only to add metrics for pipeline executor threads, maybe we'd better add CurrentMetrics::Increment in a few places? I think it would be ~ 10 lines of code.

…tor.h)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
@azat azat force-pushed the threadpool-pipeline-executor branch 2 times, most recently from 1783103 to 8037aed Compare April 11, 2023 19:00
@azat
Copy link
Collaborator Author

azat commented Apr 11, 2023

If this change is only to add metrics for pipeline executor threads, maybe we'd better add CurrentMetrics::Increment in a few places? I think it would be ~ 10 lines of code.

The motivation was to avoid using threads directly, but via ThreadPool, this:

  • simplifies the code (at least from my perspective)
  • metrics (and you will not forget to add them, because you cannot create ThreadPool that way)

But maybe the PipelineExecutor is special and I'm missing something and ThreadPool cannot be used there?

@serxa
Copy link
Member

serxa commented Apr 11, 2023

I also think we should use only ThreadPool and not threads directly, ideally remove that ugly opentelemety_trace propagation difference

@serxa
Copy link
Member

serxa commented Apr 12, 2023

Looks like there is still a problem

[ 84286 ] {} <Fatal> BaseDaemon: ########################################
 [ 84286 ] {} <Fatal> BaseDaemon: (version 23.4.1.1, build id: AF7F8427E29B97411388FCCF4A0C972463DA3A2C) (from thread 9250) (query_id: 11c18c6d-e5e4-43e3-a144-1be12e9b34d6) (query: insert into tsv(a) select number from numbers(10000000);) Received signal Segmentation fault (11)
 [ 84286 ] {} <Fatal> BaseDaemon: Address: 0x10. Access: read. Address not mapped to object.
 [ 84286 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fad0f615fc4 0x205d1bc1 0x467b98cb 0x207aa0db 0x368c4845 0x368c0e15 0x368c068e 0x368be3de 0x207a4d2f 0x207b17e0 0x7fad0f613609 0x7fad0f538133
 [ 84286 ] {} <Fatal> BaseDaemon: 3. __pthread_mutex_lock @ 0x7fad0f615fc4 in ?
 [ 84286 ] {} <Fatal> BaseDaemon: 4. ./build_docker/./src/Common/ThreadFuzzer.cpp:0: pthread_mutex_lock @ 0x205d1bc1 in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 5. ./build_docker/./contrib/llvm-project/libcxx/src/mutex.cpp:39: std::mutex::lock() @ 0x467b98cb in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 6. ./build_docker/./src/Common/ThreadPool.cpp:258: ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::wait() @ 0x207aa0db in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 7. ./build_docker/./base/base/../base/scope_guard.h:48: BasicScopeGuard<DB::PipelineExecutor::executeImpl(unsigned long)::$_1>::~BasicScopeGuard() @ 0x368c4845 in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 8. ./build_docker/./src/Processors/Executors/PipelineExecutor.cpp:382: DB::PipelineExecutor::executeImpl(unsigned long) @ 0x368c0e15 in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 9.1. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:274: std::unique_ptr<DB::ExecutingGraph, std::default_delete<DB::ExecutingGraph>>::operator->[abi:v15000]() const
 [ 84286 ] {} <Fatal> BaseDaemon: 9. ./build_docker/./src/Processors/Executors/PipelineExecutor.cpp:116: DB::PipelineExecutor::execute(unsigned long) @ 0x368c068e in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 10.1. inlined from ./build_docker/./src/Processors/Executors/CompletedPipelineExecutor.cpp:56: DB::threadFunction(DB::CompletedPipelineExecutor::Data&, std::shared_ptr<DB::ThreadGroup>, unsigned long)
 [ 84286 ] {} <Fatal> BaseDaemon: 10.2. inlined from ./build_docker/./src/Processors/Executors/CompletedPipelineExecutor.cpp:84: operator()
 [ 84286 ] {} <Fatal> BaseDaemon: 10.3. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__functional/invoke.h:394: decltype(std::declval<DB::CompletedPipelineExecutor::execute()::$_0&>()()) std::__invoke[abi:v15000]<DB::CompletedPipelineExecutor::execute()::$_0&>(DB::CompletedPipel
 [ 84286 ] {} <Fatal> BaseDaemon: 10.4. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/tuple:1789: decltype(auto) std::__apply_tuple_impl[abi:v15000]<DB::CompletedPipelineExecutor::execute()::$_0&, std::tuple<>&>(DB::CompletedPipelineExecutor::execute()::$_0&, std::tuple<>&, std::
 [ 84286 ] {} <Fatal> BaseDaemon: 10.5. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/tuple:1798: decltype(auto) std::apply[abi:v15000]<DB::CompletedPipelineExecutor::execute()::$_0&, std::tuple<>&>(DB::CompletedPipelineExecutor::execute()::$_0&, std::tuple<>&)
 [ 84286 ] {} <Fatal> BaseDaemon: 10.6. inlined from ./build_docker/./src/Common/ThreadPool.h:228: operator()
 [ 84286 ] {} <Fatal> BaseDaemon: 10.7. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__functional/invoke.h:394: decltype(std::declval<DB::CompletedPipelineExecutor::execute()::$_0>()()) std::__invoke[abi:v15000]<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<DB::Comple
 [ 84286 ] {} <Fatal> BaseDaemon: 10.8. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__functional/invoke.h:479: void std::__invoke_void_return_wrapper<void, true>::__call<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<DB::CompletedPipelineExecutor::execute()::$_0>(DB::
 [ 84286 ] {} <Fatal> BaseDaemon: 10.9. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__functional/function.h:235: std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<DB::CompletedPipelineExecutor::execute()::$_0>(DB::CompletedPipelineEx
 [ 84286 ] {} <Fatal> BaseDaemon: 10. ./build_docker/./contrib/llvm-project/libcxx/include/__functional/function.h:716: void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<DB::CompletedPipelineExe
 [ 84286 ] {} <Fatal> BaseDaemon: 11. ./build_docker/./contrib/llvm-project/libcxx/include/__functional/function.h:0: ThreadPoolImpl<std::thread>::worker(std::__list_iterator<std::thread, void*>) @ 0x207a4d2f in /usr/bin/clickhouse
 [ 84286 ] {} <Fatal> BaseDaemon: 12.1. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:302: std::unique_ptr<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::
 [ 84286 ] {} <Fatal> BaseDaemon: 12.2. inlined from ./build_docker/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:259: ~unique_ptr
 [ 84286 ] {} <Fatal> BaseDaemon: 12. ./build_docker/./contrib/llvm-project/libcxx/include/thread:297: void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function
 [ 84286 ] {} <Fatal> BaseDaemon: 13. ? @ 0x7fad0f613609 in ?
 [ 84286 ] {} <Fatal> BaseDaemon: 14. clone @ 0x7fad0f538133 in ?

PipelineExecutor is one of the heavy user of the threads, by using pool
there number of used threads for query execution can be tracked in
system.metrics.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
@azat azat force-pushed the threadpool-pipeline-executor branch from 8037aed to 37d430d Compare April 12, 2023 18:01
@azat
Copy link
Collaborator Author

azat commented Apr 12, 2023

Looks like there is still a problem

@serxa Yep, thanks! Fixed.

@serxa
Copy link
Member

serxa commented Apr 13, 2023

Upgrade check failed, but it looks unrelated

2023.04.13 04:36:20.155693 [ 300181 ] {} <Error> Application: Code: 489. DB::Exception: Invalid Primary key type for simple dictionary: key. Must be native unsigned integer type. To avoid checking it, please set check_dictionary_primary_key=false: While processing  PRIMARY KEY key SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'dicttbl' DB 'dictdb_01043')) LIFETIME(MIN 0 MAX 1) LAYOUT(FLAT()). (INCORRECT_DICTIONARY_DEFINITION), Stack trace (when copying this message, always include the lines below):
2023.04.13 04:36:20.156006 [ 300181 ] {} <Error> Application: DB::Exception: Invalid Primary key type for simple dictionary: key. Must be native unsigned integer type. To avoid checking it, please set check_dictionary_primary_key=false: While processing  PRIMARY KEY key SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'dicttbl' DB 'dictdb_01043')) LIFETIME(MIN 0 MAX 1) LAYOUT(FLAT())
2023.04.13 04:36:29.469797 [ 300665 ] {} <Error> Application: Caught exception while loading metadata: Code: 489. DB::Exception: Invalid Primary key type for simple dictionary: key. Must be native unsigned integer type. To avoid checking it, please set check_dictionary_primary_key=false: While processing  PRIMARY KEY key SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'dicttbl' DB 'dictdb_01043')) LIFETIME(MIN 0 MAX 1) LAYOUT(FLAT()). (INCORRECT_DICTIONARY_DEFINITION), Stack trace (when copying this message, always include the lines below):

@serxa serxa merged commit 86d3791 into ClickHouse:master Apr 13, 2023
136 checks passed
@azat azat deleted the threadpool-pipeline-executor branch April 13, 2023 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-not-for-changelog This PR should not be mentioned in the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants