Number of client connections not reevaluated dynamically in the teardown script #482

Closed
jkhalack opened this issue Jun 7, 2022 · 0 comments · Fixed by #483


jkhalack commented Jun 7, 2022

Description

The CONN_COUNT variable is evaluated only once in the zookeeperTeardown.sh script, before the wait loop starts (https://github.com/pravega/zookeeper-operator/blob/master/docker/bin/zookeeperTeardown.sh#L23-L32).
As a result, if any client connections are present when the script starts, the loop always runs for the full 30 seconds. With the default termination grace period of 30s, the ZK pod is then killed and terminates with exit code 137, i.e. SIGKILL (the same as in #91).

Importance

This failure makes it impossible to replace a failed node in a K8s cluster: the affected ZK pod does not shut down properly, so it cannot be migrated.

Location

https://github.com/pravega/zookeeper-operator/blob/master/docker/bin/zookeeperTeardown.sh#L23-L32

Suggestions for an improvement

Move line 23 (the evaluation of CONN_COUNT) inside the for loop, so that it is reevaluated on every iteration and the loop can exit as soon as the client connections are gone. This might also reduce the time spent in upgrades, as in #206.
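
A minimal sketch of what the suggested change could look like, not the actual patch. The names `count_client_connections`, `wait_for_clients_to_drain`, `MAX_ATTEMPTS` and `SLEEP_SECONDS` are hypothetical and do not come from zookeeperTeardown.sh. The sketch also assumes that the connection count is derived from ZooKeeper's `cons` four-letter command on port 2181 and that the 30 seconds are split into 6 attempts of 5 seconds each.

```shell
#!/usr/bin/env bash
# Sketch only: all function and variable names here are hypothetical.

# Assumption: the count comes from the "cons" four-letter command,
# ignoring empty lines and connections from localhost.
count_client_connections() {
  echo cons | nc localhost 2181 | grep -v '^$' | grep -v '/127.0.0.1:' | wc -l
}

# Wait until no non-local clients remain, or until the attempts run out.
# Returns 0 if the connections drained, 1 otherwise.
wait_for_clients_to_drain() {
  local max_attempts="${MAX_ATTEMPTS:-6}"   # assumed: 6 attempts
  local sleep_seconds="${SLEEP_SECONDS:-5}" # assumed: 5s each, 30s total
  local conn_count i
  for (( i = 0; i < max_attempts; i++ )); do
    # The count is reevaluated on every iteration, so the loop can
    # exit early instead of always sleeping for the full period.
    conn_count="$(count_client_connections)"
    if [[ "$conn_count" -gt 0 ]]; then
      echo "$conn_count non-local connections still connected."
      sleep "$sleep_seconds"
    else
      echo "No non-local connections remain."
      return 0
    fi
  done
  return 1
}
```

A teardown script could then call `wait_for_clients_to_drain` before stopping the server. If the total wait stays equal to the termination grace period, a pod whose clients never disconnect would presumably still be killed, so the attempt count or sleep interval may need to leave some headroom.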
