
Cannot drop table after a wrong create distributed table statement executed #37316

Open
zouyonghao opened this issue May 18, 2022 · 3 comments
Labels: question

@zouyonghao (Contributor)

Describe the unexpected behaviour
A table cannot be dropped after a wrong CREATE TABLE ... ENGINE = Distributed statement was executed.

How to reproduce

  • Which ClickHouse server version to use: 22.4.5.9

I mistakenly used a wrong CREATE TABLE statement for a Distributed table, as follows.
The first CREATE TABLE statement executes successfully; the second one fails as expected, but the client seems to get stuck.

create table t_bnzx on cluster test_cluster ( 
c_uo INTEGER ,
c_wv073xk INTEGER ,
c_dxi INTEGER ,
primary key(c_uo)
)
engine=ReplicatedMergeTree('/clickhouse/tables/{shard}/t_bnzx', '{replica}');
create table t_bnzx on cluster test_cluster as t_bnzx
engine=Distributed('test_cluster','test','t_bnzx', rand()); -- stuck?
drop table t_bnzx on cluster test_cluster; -- stuck?
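
For reference, the TABLE_ALREADY_EXISTS error in the logs below comes from the second statement reusing the name t_bnzx that the first statement already created. A corrected second statement would presumably give the Distributed wrapper its own name, along these lines (a sketch only; the name t_bnzx_dist and the target database are illustrative assumptions, not taken from the original report):

create table t_bnzx_dist on cluster test_cluster as t_bnzx -- hypothetical distinct name for the Distributed wrapper
engine=Distributed('test_cluster', 'default', 't_bnzx', rand()); -- 'default' assumed to be the database holding the local replicated table (the server logs reference default.t_bnzx)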

When I then try to drop the table, the client also gets stuck and does not return.

fbead2fa5858 :) drop table t_bnzx on cluster test_cluster

DROP TABLE t_bnzx ON CLUSTER test_cluster

Query id: 006f1a17-f5ef-4ecf-847d-e7bbb0633e89

┌─host──────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 127.0.0.1 │ 5001 │      0 │       │                   3 │                1 │
│ 127.0.0.1 │ 5002 │      0 │       │                   2 │                1 │
│ 127.0.0.1 │ 5003 │      0 │       │                   1 │                1 │
└───────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
↑ Progress: 3.00 rows, 159.00 B (0.02 rows/s., 0.88 B/s.)  74%

Expected behavior
The second statement should fail without hanging, and the third statement should succeed without hanging.
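
As an aside, when an ON CLUSTER query appears to hang like this, the state of the pending DDL task can usually be inspected from any node via the system.distributed_ddl_queue system table (available in recent ClickHouse versions). A minimal sketch, assuming that table is present on this version:

SELECT * -- select all columns, since the exact column set varies between versions
FROM system.distributed_ddl_queue
WHERE cluster = 'test_cluster'
ORDER BY entry DESC
LIMIT 20;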

Error message and/or stacktrace
The cluster is on ports 5000, 5001, 5002, and 5003.
The client says 5001, 5002, and 5003 succeed, so I only paste the logs from the server on port 5000.
It seems the warning message is important.

2022.05.18 09:16:46.028430 [ 3394111 ] {76c14f23-7645-4648-82cb-3d2e0d90396d} <Warning> default.t_bnzx (513b4184-9f2f-4db8-b87a-773560d8d376): It looks like the table /clickhouse/tables/01/t_bnzx was created by another server at the same moment, will retry
2022.05.18 09:16:56.383462 [ 3394111 ] {94c00386-7d75-4a97-8a59-234466df4e39} <Error> executeQuery: Code: 57. DB::Exception: Table default.t_bnzx already exists. (TABLE_ALREADY_EXISTS) (version 22.4.5.9 (official build)) (from 0.0.0.0:0) (in query: /* ddl_entry=query-0000000002 */ CREATE TABLE default.t_bnzx UUID '5f204632-9258-436a-ae86-e920eeea41c5' AS t_bnzx ENGINE = Distributed('test_cluster', 'test', 't_bnzx', rand())), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xb6fc2fa in /usr/bin/clickhouse
1. DB::Exception::Exception<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(int, fmt::v8::basic_format_string<char, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&) @ 0xb52b52c in /usr/bin/clickhouse
2. DB::InterpreterCreateQuery::doCreateTable(DB::ASTCreateQuery&, DB::InterpreterCreateQuery::TableProperties const&) @ 0x1604d5a8 in /usr/bin/clickhouse
3. DB::InterpreterCreateQuery::createTable(DB::ASTCreateQuery&) @ 0x16047d9e in /usr/bin/clickhouse
4. DB::InterpreterCreateQuery::execute() @ 0x1604fc43 in /usr/bin/clickhouse
5. ? @ 0x1642be95 in /usr/bin/clickhouse
6. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::__1::shared_ptr<DB::Context>, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>, std::__1::optional<DB::FormatSettings> const&) @ 0x1642f267 in /usr/bin/clickhouse
7. DB::DDLWorker::tryExecuteQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::DDLTaskBase&, std::__1::shared_ptr<zkutil::ZooKeeper> const&) @ 0x15d4ff2e in /usr/bin/clickhouse
8. DB::DDLWorker::processTask(DB::DDLTaskBase&, std::__1::shared_ptr<zkutil::ZooKeeper> const&) @ 0x15d4e97f in /usr/bin/clickhouse
9. DB::DDLWorker::scheduleTasks(bool) @ 0x15d4c703 in /usr/bin/clickhouse
10. DB::DDLWorker::runMainThread() @ 0x15d462e5 in /usr/bin/clickhouse
11. ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::DDLWorker::*)(), DB::DDLWorker*>(void (DB::DDLWorker::*&&)(), DB::DDLWorker*&&)::'lambda'()::operator()() @ 0x15d5a3b7 in /usr/bin/clickhouse
12. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xb7a7ba7 in /usr/bin/clickhouse
13. ? @ 0xb7ab5dd in /usr/bin/clickhouse
14. start_thread @ 0x76db in /lib/x86_64-linux-gnu/libpthread-2.27.so
15. /build/glibc-uZu3wS/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97: __clone @ 0x12161f in /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.27.so

2022.05.18 09:16:56.384136 [ 3394111 ] {94c00386-7d75-4a97-8a59-234466df4e39} <Error> DDLWorker: Query CREATE TABLE default.t_bnzx UUID '5f204632-9258-436a-ae86-e920eeea41c5' AS t_bnzx ENGINE = Distributed('test_cluster', 'test', 't_bnzx', rand()) wasn't finished successfully: Code: 57. DB::Exception: Table default.t_bnzx already exists. (TABLE_ALREADY_EXISTS), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xb6fc2fa in /usr/bin/clickhouse
1. DB::Exception::Exception<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(int, fmt::v8::basic_format_string<char, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&) @ 0xb52b52c in /usr/bin/clickhouse
2. DB::InterpreterCreateQuery::doCreateTable(DB::ASTCreateQuery&, DB::InterpreterCreateQuery::TableProperties const&) @ 0x1604d5a8 in /usr/bin/clickhouse
3. DB::InterpreterCreateQuery::createTable(DB::ASTCreateQuery&) @ 0x16047d9e in /usr/bin/clickhouse
4. DB::InterpreterCreateQuery::execute() @ 0x1604fc43 in /usr/bin/clickhouse
5. ? @ 0x1642be95 in /usr/bin/clickhouse
6. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::__1::shared_ptr<DB::Context>, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>, std::__1::optional<DB::FormatSettings> const&) @ 0x1642f267 in /usr/bin/clickhouse
7. DB::DDLWorker::tryExecuteQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::DDLTaskBase&, std::__1::shared_ptr<zkutil::ZooKeeper> const&) @ 0x15d4ff2e in /usr/bin/clickhouse
8. DB::DDLWorker::processTask(DB::DDLTaskBase&, std::__1::shared_ptr<zkutil::ZooKeeper> const&) @ 0x15d4e97f in /usr/bin/clickhouse
9. DB::DDLWorker::scheduleTasks(bool) @ 0x15d4c703 in /usr/bin/clickhouse
10. DB::DDLWorker::runMainThread() @ 0x15d462e5 in /usr/bin/clickhouse
11. ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::DDLWorker::*)(), DB::DDLWorker*>(void (DB::DDLWorker::*&&)(), DB::DDLWorker*&&)::'lambda'()::operator()() @ 0x15d5a3b7 in /usr/bin/clickhouse
12. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xb7a7ba7 in /usr/bin/clickhouse
13. ? @ 0xb7ab5dd in /usr/bin/clickhouse
14. start_thread @ 0x76db in /lib/x86_64-linux-gnu/libpthread-2.27.so
15. /build/glibc-uZu3wS/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97: __clone @ 0x12161f in /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.27.so
 (version 22.4.5.9 (official build))
@tavplubix (Member)

Please provide more logs from the server on port 5000:

zgrep -Fa "DDLWorker: " server.log

Also, what finally happened to the query drop table t_bnzx on cluster test_cluster (006f1a17-f5ef-4ecf-847d-e7bbb0633e89)? It must fail when distributed_ddl_task_timeout is exceeded; I have no idea how it could get stuck.
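
For context, distributed_ddl_task_timeout is a regular session setting, so it can be checked and adjusted on the client. A minimal sketch (the value 600 is an arbitrary example, not a recommendation from this thread):

SELECT name, value FROM system.settings WHERE name = 'distributed_ddl_task_timeout'; -- show the current wait limit for ON CLUSTER queries
SET distributed_ddl_task_timeout = 600; -- example: raise the per-session wait, in seconds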

@zouyonghao (Contributor, Author)

@tavplubix Sorry, I tried to reproduce this issue today but failed; the issue did occur yesterday.
The log has since been deleted, but as far as I remember, all the related log entries are the ones I pasted.

The client eventually failed with a timeout.

Maybe it's a concurrency bug? I ask because I saw <Warning> default.t_bnzx (513b4184-9f2f-4db8-b87a-773560d8d376): It looks like the table /clickhouse/tables/01/t_bnzx was created by another server at the same moment, will retry, and I do not see this message today.

@tavplubix (Member)

> Maybe it's a concurrent bug? Because I see default.t_bnzx (513b4184-9f2f-4db8-b87a-773560d8d376): It looks like the table /clickhouse/tables/01/t_bnzx was created by another server at the same moment, will retry, and I do not see this message today.

No, this message is irrelevant.

tavplubix added the question label and removed the unexpected behaviour label on May 19, 2022