Changed auto-generated erlang cookie causes cluster restart issues #78

Closed
colearendt opened this issue May 25, 2022 · 0 comments · Fixed by #89
colearendt commented May 25, 2022

Describe the bug

The auto-generated Erlang cookie (#68) changes on every Helm deployment, which causes problems when rolling over the cluster.

Version of Helm and Kubernetes: 3.8.3 and 1.21

What happened: Each time we roll over the cluster (i.e. deploy config changes), the rollout short-circuits on a replication failure because the two CouchDB nodes are running with different cookies. As a result, the rollover never completes and stops with one restarted pod left unhealthy:

NAME                READY   STATUS    RESTARTS   AGE
couch-1-couchdb-0   1/1     Running   0          105m
couch-1-couchdb-1   0/1     Running   0          61m

What you expected to happen: The cluster rolls over cleanly

How to reproduce it (as minimally and precisely as possible): Deploy the Helm chart twice with different config and replicas > 1

Anything else we need to know:

Potentially related to #77

A possible fix is to use a stateful generation pattern like #74
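
For illustration, a rough sketch of what "stateful generation" could look like in a chart template, using Helm's `lookup` function to reuse a previously generated value. This is not necessarily what #74 does; the file name, the secret name, and the `erlangCookie` key are all hypothetical, and only `erlangFlags.setcookie` comes from the chart's existing values:

```yaml
# templates/secret.yaml (hypothetical file; secret name and key are illustrative)
{{- $name := printf "%s-couchdb" .Release.Name }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name }}
{{- $data := default (dict) (get $existing "data") }}
{{- $previous := get $data "erlangCookie" }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
type: Opaque
data:
  {{- if .Values.erlangFlags.setcookie }}
  # An explicitly configured cookie always wins.
  erlangCookie: {{ .Values.erlangFlags.setcookie | b64enc | quote }}
  {{- else if $previous }}
  # Reuse the value already stored in the cluster (it is already base64-encoded).
  erlangCookie: {{ $previous | quote }}
  {{- else }}
  # First install: generate a cookie once.
  erlangCookie: {{ randAlphaNum 32 | b64enc | quote }}
  {{- end }}
```

One caveat with this approach: `lookup` needs access to the cluster, so `helm template` and client-side dry runs see no existing secret and would still render a fresh random value.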

Alternatively, we could make clear in the docs that setting the erlangFlags.setcookie value is required for a "rollover" to happen cleanly, change the update policy for the chart, etc.
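
As a workaround in the meantime, pinning the cookie in a values override would look roughly like this (the cookie string is a placeholder, not a recommended value):

```yaml
# values override; keep this value stable across upgrades
erlangFlags:
  setcookie: "replace-with-a-long-random-string"
```

As long as the same value is passed on every `helm upgrade`, all nodes should keep the same cookie across restarts.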

Related: if you set this variable, it is passed as a command-line argument and not as a secret env var. IMO these should be independently toggleable, since an HA cluster's ability to restart safely and consistently depends on the cookie staying consistent, and some people prefer using secrets.

@colearendt colearendt changed the title Changed erlang cookie Changed erlang cookie causes cluster restart issues May 25, 2022
@colearendt colearendt changed the title Changed erlang cookie causes cluster restart issues Changed auto-generated erlang cookie causes cluster restart issues May 25, 2022
colearendt added a commit to colearendt/couchdb-helm that referenced this issue Jun 20, 2022
related to apache#78, apache#88. We auto-generate the secret if it is not provided, and then continue to use that value on upgrades rather than auto-generating fresh each time.
colearendt added a commit to colearendt/couchdb-helm that referenced this issue Sep 19, 2022
related to apache#78, apache#88. We auto-generate the secret if it is not provided, and then continue to use that value on upgrades rather than auto-generating fresh each time.
willholley pushed a commit that referenced this issue Oct 24, 2022
related to #78, #88. We auto-generate the secret if it is not provided, and then continue to use that value on upgrades rather than auto-generating fresh each time.