Distributed and corresponding local tables are not getting created in the new Shard #308
@yuzhichang, could you describe how you are testing? See test_013: https://github.com/Altinity/clickhouse-operator/blob/master/tests/test_operator.py
My test steps:
I see. It looks like you closed port 8123, which is used for HTTP connections. The operator uses an HTTP client to manage the schema, so if port 8123 is not available it cannot create the schema on a new shard. You can check the operator logs -- they should be complaining about that.
@alex-zaitsev I'm not aware of how port 8123 got closed. I adjusted the above manifest to the following and retested, but no luck:
@yuzhichang, I think the problem is the old ClickHouse version. Could you try with a newer one? Altinity Stable is currently 19.16.14.65.
@alex-zaitsev I tried 19.16.14.65.
The operator creates the table, but I noticed there's an error inside:
Hi @yuzhichang, the logs are very helpful. The operator uses the following query to get Distributed tables. Does it return everything properly on your system?
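The exact query is not quoted in the thread; a minimal sketch of this kind of check against the standard system.tables catalog (not necessarily the operator's exact query) is:

```sql
-- List all tables on this server that use the Distributed engine.
-- A sketch of the kind of query the operator runs; the real query
-- in the operator source may differ.
SELECT database, name
FROM system.tables
WHERE engine = 'Distributed';
```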
The error message is not good. The operator uses a dedicated user to connect, but when the remote() function is invoked it goes through the default user, which is restricted to the cluster nodes, but it should still work. Btw, please upgrade the operator to 0.9.6 (you'll have to re-install it though, see the release notes: https://github.com/Altinity/clickhouse-operator/releases)
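For context on the default-user remark above: the remote() table function accepts explicit credentials, so a query can avoid falling back to the default user. A sketch with placeholder host and credentials (none of these values come from the thread):

```sql
-- Without the last two arguments, remote() connects as the
-- 'default' user; passing credentials explicitly avoids that.
-- Host, user, and password here are placeholders.
SELECT *
FROM remote('chi-example-0-0:9000', 'system', 'one', 'some_user', 'some_password');
```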
Hi @alex-zaitsev. I upgraded the operator to 0.9.6, destroyed my ClickHouse cluster, and retested. The operator created the database. The operator log is below. The query on the existing 1st and 2nd shards both gives:
@yuzhichang, could it be something with DNS resolution? The operator uses the load balancer service.
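One way to inspect this from ClickHouse itself (a suggestion, not a step from the thread) is to compare host names and resolved addresses in system.clusters:

```sql
-- host_address is the IP that host_name resolved to; a missing or
-- unexpected address on a shard points at a DNS resolution problem.
SELECT cluster, shard_num, replica_num, host_name, host_address, port
FROM system.clusters;
```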
@alex-zaitsev, operator 0.9.7 behaves the same as 0.9.6. Here's the operator log:
@yuzhichang, thanks. I was finally able to see a similar issue in our environment, though it is not clear why it is happening.
@yuzhichang, we have tweaked the schema creation logic in 0.9.9 (the image is updated without a release). It should be much more reliable. Could you check on your side, please?
This issue still occurs with altinity/clickhouse-operator:0.9.9. Here's the operator log when expanding the ClickHouse cluster to 4 shards:
I also tried altinity/clickhouse-operator:0.10.0. All deployed ClickHouse pods have been stuck in a restart loop since updating the operator to 0.10.0. The following is the clickhouse-server stdout:
Hi @yuzhichang, 0.10.0 is not released yet, but it is not clear what is happening to ClickHouse. Could you share your ClickHouseInstallation yaml, please?
Here's my ClickHouseInstallation yaml:
Thanks, @yuzhichang! The bug where ClickHouse was not starting with the 0.10.0 operator is fixed. The schema problem is still present sometimes, but we know what to do.
@yuzhichang, we have finally found and fixed a race condition in the code that could result in the schema not being propagated to new shards. This is fixed in 0.11.0, to be released this week.
Could you confirm https://github.com/Altinity/clickhouse-operator/releases/tag/0.11.0 fixes the issue, please?
@alex-zaitsev I confirm. Thanks a lot!
This is false according to my test. Operator version: 0.9.1