Couldn't start replication (table will be in readonly mode): Transaction failed: Op #0 #1275

Closed
morkalfon opened this issue Nov 20, 2023 · 3 comments

Comments

@morkalfon

We run a setup of three ClickHouse (CH) servers and three CH Keeper nodes, managed by version 0.22.0 of the operator.

Keeper is based on: https://github.com/Altinity/clickhouse-operator/blob/master/deploy/clickhouse-keeper/clickhouse-keeper-3-nodes.yaml.

After the setup had been working fine for some time, we started getting the following error:

clickhouse 0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c6487f7 in /usr/bin/clickhouse
clickhouse 1. DB::Exception::Exception<unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type>, unsigned long&) @ 0x000000000715ba9c in /usr/bin/clickhouse
clickhouse 2. zkutil::KeeperMultiException::KeeperMultiException(Coordination::Error, unsigned long, std::vector<std::shared_ptr<Coordination::Request>, std::allocator<std::shared_ptr<Coordination::Request>>> const&, std::vector<std::shared_ptr<Coordination::Response>, std::allocator<std::shared_ptr<Coordination::Response>>> const&) @ 0x0000000013806b84 in /usr/bin/clickhouse
clickhouse 3. zkutil::KeeperMultiException::check(Coordination::Error, std::vector<std::shared_ptr<Coordination::Request>, std::allocator<std::shared_ptr<Coordination::Request>>> const&, std::vector<std::shared_ptr<Coordination::Response>, std::allocator<std::shared_ptr<Coordination::Response>>> const&) @ 0x00000000137fbc13 in /usr/bin/clickhouse
clickhouse 4. DB::ReplicatedMergeTreeRestartingThread::tryStartup() @ 0x0000000012f130de in /usr/bin/clickhouse
clickhouse 5. DB::ReplicatedMergeTreeRestartingThread::run() @ 0x0000000012f11654 in /usr/bin/clickhouse
clickhouse 6. DB::BackgroundSchedulePool::threadFunction() @ 0x00000000110c859f in /usr/bin/clickhouse
clickhouse 7. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x00000000110c95d1 in /usr/bin/clickhouse
clickhouse 8. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000c7298c4 in /usr/bin/clickhouse
clickhouse 9. ? @ 0x00007f879f8a0609 in ?
clickhouse 10. ? @ 0x00007f879f7c5133 in ?
clickhouse  (version 23.8.7.24 (official build))
clickhouse 2023.11.20 04:51:39.607924 [ 117 ] {} <Warning> *reducted*.*reducted* (ReplicatedMergeTreeRestartingThread): Table was in readonly mode. Will try to activate it.
clickhouse 2023.11.20 04:51:39.609955 [ 117 ] {} <Error> *reducted*.*reducted* (ReplicatedMergeTreeRestartingThread): Couldn't start replication (table will be in readonly mode): Transaction failed: Op #0, path: /clickhouse/tables/550a1058-7e46-467d-a803-88c52b4819f7/2/replicas/chi-clickhouse-clickhouse-2-0/is_active. Code: 999. Coordination::Exception: Transaction failed: Op #0, path: /clickhouse/tables/550a1058-7e46-467d-a803-88c52b4819f7/2/replicas/chi-clickhouse-clickhouse-2-0/is_active. (KEEPER_EXCEPTION), Stack trace (when copying this message, always include the lines below):
clickhouse 
clickhouse 0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c6487f7 in /usr/bin/clickhouse
clickhouse 1. DB::Exception::Exception<unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type>, unsigned long&) @ 0x000000000715ba9c in /usr/bin/clickhouse
clickhouse 2. zkutil::KeeperMultiException::KeeperMultiException(Coordination::Error, unsigned long, std::vector<std::shared_ptr<Coordination::Request>, std::allocator<std::shared_ptr<Coordination::Request>>> const&, std::vector<std::shared_ptr<Coordination::Response>, std::allocator<std::shared_ptr<Coordination::Response>>> const&) @ 0x0000000013806b84 in /usr/bin/clickhouse
clickhouse 3. zkutil::KeeperMultiException::check(Coordination::Error, std::vector<std::shared_ptr<Coordination::Request>, std::allocator<std::shared_ptr<Coordination::Request>>> const&, std::vector<std::shared_ptr<Coordination::Response>, std::allocator<std::shared_ptr<Coordination::Response>>> const&) @ 0x00000000137fbc13 in /usr/bin/clickhouse
clickhouse 4. DB::ReplicatedMergeTreeRestartingThread::tryStartup() @ 0x0000000012f130de in /usr/bin/clickhouse
clickhouse 5. DB::ReplicatedMergeTreeRestartingThread::run() @ 0x0000000012f11654 in /usr/bin/clickhouse
clickhouse 6. DB::BackgroundSchedulePool::threadFunction() @ 0x00000000110c859f in /usr/bin/clickhouse
clickhouse 7. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, StrongTypedef<unsigned long, CurrentMetrics::MetricTag>, char const*)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x00000000110c95d1 in /usr/bin/clickhouse
clickhouse 8. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000c7298c4 in /usr/bin/clickhouse
clickhouse 9. ? @ 0x00007f879f8a0609 in ?
clickhouse 10. ? @ 0x00007f879f7c5133 in ?
clickhouse  (version 23.8.7.24 (official build))

We tried upgrading the CH server and CH Keeper images to the latest 23.8 versions, but that did not help, so the problem does not seem to be related to the version.

Issue #1231 might be related.
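
For anyone triaging the same log, the failing Keeper node can be pulled out of the error line programmatically. The following is only a minimal sketch: the function name `parse_failed_op` is hypothetical, and the regular expression assumes the path layout that appears in the log above (`/clickhouse/tables/<table-uuid>/<shard>/replicas/<replica>/<node>`), which may differ in other deployments.

```python
import re

# Assumption: the Keeper path follows the layout seen in the log above:
#   /clickhouse/tables/<table-uuid>/<shard>/replicas/<replica>/<node>
# Other installations may configure a different path template.
FAILED_OP = re.compile(
    r"Transaction failed: Op #(?P<op>\d+), path: "
    r"(?P<path>/clickhouse/tables/(?P<uuid>[0-9a-f-]+)/(?P<shard>[^/]+)"
    r"/replicas/(?P<replica>[^/]+)/(?P<node>[^/.\s]+))"
)


def parse_failed_op(line):
    """Return the failed operation's details as a dict, or None if no match."""
    match = FAILED_OP.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["op"] = int(info["op"])  # index of the failed op in the multi-request
    return info


if __name__ == "__main__":
    sample = (
        "Couldn't start replication (table will be in readonly mode): "
        "Transaction failed: Op #0, path: /clickhouse/tables/"
        "550a1058-7e46-467d-a803-88c52b4819f7/2/replicas/"
        "chi-clickhouse-clickhouse-2-0/is_active. Code: 999."
    )
    info = parse_failed_op(sample)
    print(info["shard"], info["replica"], info["node"])
```

Here the failed operation is Op #0 on the `is_active` node of replica `chi-clickhouse-clickhouse-2-0`. Grouping log lines by replica and node this way can show whether every replica is affected or only one.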

@Slach
Collaborator

Slach commented Nov 20, 2023

@Slach Slach closed this as completed Nov 20, 2023
@morkalfon
Author

@Slach Will this be addressed in v0.23.0 of the operator?

@Slach
Collaborator

Slach commented Nov 20, 2023

I hope so. Subscribe to #1234 and watch the progress there.
