Changed auto-generated erlang cookie causes cluster restart issues #78

colearendt · 2022-05-25T12:31:14Z

Describe the bug

The auto-generated Erlang cookie (#68) changes on every Helm deployment, which breaks rolling updates of the cluster.

Version of Helm and Kubernetes: 3.8.3 and 1.21

What happened: Each time we roll over the cluster (i.e. deploy config changes), the rollout stalls on a replication failure because the two CouchDB nodes are running with different cookies. It never completes, and stops with the restarted pod stuck unready:

NAME                READY   STATUS    RESTARTS   AGE
couch-1-couchdb-0   1/1     Running   0          105m
couch-1-couchdb-1   0/1     Running   0          61m
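
The stall can be reproduced in a toy model. The sketch below is plain Python, not chart code. It assumes a StatefulSet-style rolling update that restarts pods from the highest ordinal down and waits for each one to become ready, and it treats a pod as ready only when its cookie matches every other pod. The names `render_cookie` and `rolling_update` are hypothetical.

```python
import secrets


def render_cookie() -> str:
    """Stand-in for a template that generates a fresh random cookie on every render."""
    return secrets.token_hex(16)


def rolling_update(pod_cookies: list[str], new_cookie: str) -> list[bool]:
    """Restart pods from the highest ordinal down; stop at the first unready pod.

    Returns the readiness of each pod after the rollout attempt.
    Mutates pod_cookies in place.
    """
    ready = [True] * len(pod_cookies)
    for i in reversed(range(len(pod_cookies))):
        pod_cookies[i] = new_cookie  # pod restarts with the newly rendered cookie
        peers = [c for j, c in enumerate(pod_cookies) if j != i]
        ready[i] = all(c == new_cookie for c in peers)
        if not ready[i]:
            break  # the rollout waits on this pod and never reaches the others
    return ready


first = render_cookie()
pods = [first, first]  # initial install: both pods share one cookie
status = rolling_update(pods, render_cookie())  # second deploy renders a new cookie
print(status)  # [True, False]
```

In this model the pod with the highest ordinal comes back with the new cookie while its peer still holds the old one, which matches the listing above: `couch-1-couchdb-0` stays ready and `couch-1-couchdb-1` does not.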

What you expected to happen: The rollover completes and every pod becomes ready.

How to reproduce it (as minimally and precisely as possible): Deploy the Helm chart twice with different config and replicas > 1.

Anything else we need to know:

Potentially related to #77

A possible fix is to use a stateful generation pattern like #74.
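
As a rough illustration of what "stateful generation" means here (a sketch only, not the code from #74): prefer an explicitly provided cookie, otherwise reuse whatever the previous deployment stored, and generate a new value only when neither exists. The function `resolve_cookie` and its arguments are hypothetical; `existing` stands in for the value a previous release left behind, for example in a Kubernetes Secret.

```python
import secrets
from typing import Optional


def resolve_cookie(provided: Optional[str], existing: Optional[str]) -> str:
    """Pick the cookie for this deploy: explicit value, then stored value, then a new one."""
    if provided:
        return provided  # user-supplied value always wins
    if existing:
        return existing  # reuse the value from the previous deploy
    return secrets.token_hex(16)  # first install: generate once


install = resolve_cookie(None, None)      # first deploy generates a cookie
upgrade = resolve_cookie(None, install)   # later deploys reuse it
print(install == upgrade)  # True
```

With this ordering the cookie changes only when the user supplies a new one, so pods restarted during an upgrade keep the same cookie as their peers.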

Alternatively, the docs could make clear that setting the erlangFlags.setcookie value is required for a "rollover" to complete cleanly, or the chart's update policy could be changed, etc.

Related: if you set this variable, it is passed as a command-line argument and not as a secret env var. The two should be toggleable independently IMO (an HA cluster's ability to restart safely and consistently depends on the cookie being consistent, and some people prefer using secrets).
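
To make the suggested toggle concrete, here is a hedged Python sketch of the two delivery modes as container-spec fragments. Everything named here is an assumption for illustration: the function `cookie_container_settings`, the Secret name `couchdb-cookie`, the key `erlangCookie`, and the env var name `COUCHDB_ERLANG_COOKIE`. Only the `-setcookie` Erlang flag and the `secretKeyRef` shape are standard.

```python
from typing import Any


def cookie_container_settings(
    cookie: str,
    via_secret: bool,
    secret_name: str = "couchdb-cookie",  # hypothetical Secret name
) -> dict[str, Any]:
    """Return the container fields that deliver the cookie, by Secret or by flag."""
    if via_secret:
        # The cookie value stays out of the pod spec; only a reference is rendered.
        ref = {"secretKeyRef": {"name": secret_name, "key": "erlangCookie"}}
        return {"env": [{"name": "COUCHDB_ERLANG_COOKIE", "valueFrom": ref}]}
    # The cookie value appears in plain text in the rendered arguments.
    return {"args": ["-setcookie", cookie]}


as_flag = cookie_container_settings("monster", via_secret=False)
as_secret = cookie_container_settings("monster", via_secret=True)
print(as_flag)
print("monster" in str(as_secret))  # False
```

The point of keeping `via_secret` separate from `cookie` is that a user can pin a consistent cookie and still choose how it reaches the container.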

colearendt changed the title ~~Changed erlang cookie~~ Changed erlang cookie causes cluster restart issues May 25, 2022

colearendt changed the title ~~Changed erlang cookie causes cluster restart issues~~ Changed auto-generated erlang cookie causes cluster restart issues May 25, 2022

colearendt mentioned this issue Jun 20, 2022

Fix broken CI on main #88

Closed

colearendt mentioned this issue Jun 20, 2022

Auto-generate a stateful erlangCookie #89

Merged

4 tasks

colearendt mentioned this issue Sep 29, 2022

Fauxton shows “This database failed to load” after pod restarts #52

Open

willholley closed this as completed in #89 Oct 24, 2022

willholley pushed a commit that referenced this issue Oct 24, 2022

auto-generate the erlangCookie variable

3024366

related to #78, #88. We auto-generate the secret if it is not provided, and then continue to use that value on upgrades rather than auto-generating fresh each time.
