
clickhouse-keeper-client doesn't support reconfig #54129

Closed
Slach opened this issue Aug 31, 2023 · 6 comments


Slach commented Aug 31, 2023

Describe the unexpected behaviour
Need to add reconfig support for clickhouse-keeper-client

How to reproduce

  • Which ClickHouse Keeper version to use: docker image clickhouse/clickhouse-keeper:23.7
  • Create a Keeper cluster with 3 nodes and try to remove server.3

Error message and/or stacktrace

clickhouse-keeper-0:/# clickhouse-keeper client -h localhost -p 2181 -q "get /keeper/config"
server.1=clickhouse-keeper-0.clickhouse-keepers.default.svc.cluster.local:9444;participant;1
server.2=clickhouse-keeper-1.clickhouse-keepers.default.svc.cluster.local:9444;participant;1
server.3=clickhouse-keeper-2.clickhouse-keepers.default.svc.cluster.local:9444;participant;1

clickhouse-keeper-0:/# clickhouse-keeper client -h localhost -p 2181 -q "reconfig -remove 3"
Syntax error: failed at position 1 ('reconfig'):

reconfig -remove 3

Expected one of: Keeper client query, cd, create, delete_stale_backups, find_big_family, find_super_nodes, flwc, get, get_stat, help, ls, rm, rmr, set, touch, lgif, csnp, dump, wchp, rqld, wchc, isro, crst, dirs, cons, srst, envi, conf, stat, ruok, srvr, wchs, mntr
@Slach Slach changed the title clickhouse-keeper-config doesn't support reconfig clickhouse-keeper-client doesn't support reconfig Aug 31, 2023

Slach commented Aug 31, 2023

@tavplubix maybe it's better to assign this to @myrrc?


Slach commented Sep 1, 2023

hm, how exactly does reconfig work?

clickhouse-keeper --version
ClickHouse keeper version 23.8.1.2958 (official build).

cat /etc/clickhouse-keeper/keeper_config.xml

<clickhouse>
    <enable_reconfiguration>true</enable_reconfiguration>
    <include_from>/tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml</include_from>
    <logger>
        <level>trace</level>
        <console>true</console>
    </logger>
    <listen_host>0.0.0.0</listen_host>
    <keeper_server incl="keeper_server">
        <path>/var/lib/clickhouse-keeper</path>
        <tcp_port>2181</tcp_port>
        <four_letter_word_white_list>*</four_letter_word_white_list>
        <coordination_settings>
            <!-- <raft_logs_level>trace</raft_logs_level> -->
            <raft_logs_level>information</raft_logs_level>
        </coordination_settings>
    </keeper_server>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>7000</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
        <!-- https://github.com/ClickHouse/ClickHouse/issues/46136 -->
        <status_info>false</status_info>
    </prometheus>
</clickhouse>
    
cat /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml
<yandex><keeper_server>
<server_id>1</server_id>
<raft_configuration>
<server><id>1</id><hostname>clickhouse-keeper-0.clickhouse-keepers.default.svc.cluster.local</hostname><port>9444</port></server>
<server><id>2</id><hostname>clickhouse-keeper-1.clickhouse-keepers.default.svc.cluster.local</hostname><port>9444</port></server>
<server><id>3</id><hostname>clickhouse-keeper-2.clickhouse-keepers.default.svc.cluster.local</hostname><port>9444</port></server>
</raft_configuration>
</keeper_server></yandex>

clickhouse-keeper client -p 2181 -q "get /keeper/config"
server.1=clickhouse-keeper-0.clickhouse-keepers.default.svc.cluster.local:9444;participant;1
server.2=clickhouse-keeper-1.clickhouse-keepers.default.svc.cluster.local:9444;participant;1
server.3=clickhouse-keeper-2.clickhouse-keepers.default.svc.cluster.local:9444;participant;1

but it doesn't work in the kazoo-based ZK shell

apk add py3-pip
pip install zk-shell
zk-shell --run-once "get /keeper/config" 127.0.0.1:2181

Path /keeper/config doesn't exist

zk-shell --run-once "reconfig remove 3" 127.0.0.1:2181

Not implemented by the server: .

this is weird; zk-shell uses the same kazoo library as the integration tests

zk-shell --run-once "help reconfig" 127.0.0.1:2181

NAME
        reconfig - Reconfigures a ZooKeeper cluster (adds/removes members)

SYNOPSIS
        reconfig <add|remove> <arg> [from_config]

DESCRIPTION

        reconfig add <members> [from_config]

          adds the given members (i.e.: 'server.100=10.0.0.10:2889:3888:observer;0.0.0.0:2181').

        reconfig remove <members_ids> [from_config]

          removes the members with the given ids (i.e.: '2,3,5').

EXAMPLES
        > reconfig add server.100=0.0.0.0:56954:37866:observer;0.0.0.0:42969
        server.1=localhost:20002:20001:participant
        server.2=localhost:20012:20011:participant
        server.3=localhost:20022:20021:participant
        server.100=0.0.0.0:56954:37866:observer;0.0.0.0:42969
        version=100000003

        > reconfig remove 100
        server.1=localhost:20002:20001:participant
        server.2=localhost:20012:20011:participant
        server.3=localhost:20022:20021:participant
        version=100000004
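
Since zk-shell sits on top of kazoo anyway, the same request can be tried directly from Python. A minimal sketch, assuming Keeper listens on 127.0.0.1:2181 and reconfiguration is actually enabled server-side (kazoo's KazooClient.reconfig maps to the standard ZooKeeper reconfig API):

from kazoo.client import KazooClient

client = KazooClient(hosts="127.0.0.1:2181")
client.start()

# Incremental reconfig: ask the ensemble to drop server.3.
# joining/new_members stay empty because nothing is added or replaced.
# Expect the same "Not implemented" error if the server rejects reconfig.
data, stat = client.reconfig(joining=None, leaving="3", new_members=None)
print(data.decode())

client.stop()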


myrrc commented Sep 1, 2023

keeper_server.enable_reconfiguration, not yandex.enable_reconfiguration.
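
i.e. the flag has to sit inside the keeper_server section. A minimal sketch of the corrected placement (only the relevant part; the rest of the config from the earlier comment stays as-is):

<clickhouse>
    <keeper_server>
        <enable_reconfiguration>true</enable_reconfiguration>
        <!-- ... tcp_port, coordination_settings, raft_configuration, etc. as before ... -->
    </keeper_server>
</clickhouse>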


Slach commented Sep 13, 2023

@pufit thanks a lot for your efforts!

Slach added a commit to Altinity/clickhouse-operator that referenced this issue Nov 16, 2023
…ouse#54129, test_keeper_rescale passed, but quorum lost after 2 hour

Signed-off-by: Slach <bloodjazman@gmail.com>

odinsy commented Feb 28, 2024

after enabling and using the reconfig feature, it's impossible to start an instance with ID 0

clickhouse-keeper-client --history-file=/dev/null -h clickhouse-keeper -q "reconfig remove 0"
clickhouse-keeper-client --history-file=/dev/null -h clickhouse-keeper -q "reconfig add 'server.0=clickhouse-keeper-0.clickhouse-keeper-hl.clickhouse.svc.cluster.local:9234;participant;1'"
Coordination error: Bad arguments

config

<clickhouse>
    <keeper_server>
        <server_id from_env="KEEPER_SERVER_ID"></server_id>

        <raft_configuration>
            <server>
                <id>0</id>
                <hostname>clickhouse-keeper-0.clickhouse-keeper-hl.clickhouse.svc.cluster.local</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>1</id>
                <hostname>clickhouse-keeper-1.clickhouse-keeper-hl.clickhouse.svc.cluster.local</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>2</id>
                <hostname>clickhouse-keeper-2.clickhouse-keeper-hl.clickhouse.svc.cluster.local</hostname>
                <port>9234</port>
            </server>
        </raft_configuration>
    </keeper_server>
</clickhouse>
clickhouse-keeper-0:/# grep KEEPER_SERVER_ID /proc/1/environ 
KEEPER_SERVER_ID=0
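
For the record, it may help to confirm what the live ensemble config contains after the remove, reusing the client invocation from above:

clickhouse-keeper-client --history-file=/dev/null -h clickhouse-keeper -q "get /keeper/config"

If server.0 is already gone from the live config, the "Bad arguments" error points at the add request itself (possibly the ID 0), which would be worth including in a new issue.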


Slach commented Feb 28, 2024

@odinsy it's useless to report this in comments on an already closed issue

please create a separate issue with a reproducing docker-compose.yaml + reproduce.sh
