
Fauxton shows “This database failed to load” after pod restarts #52

Open

DB185344 opened this issue Jul 15, 2021 · 4 comments

@DB185344

Describe the bug

After restarting a pod, the node fails to rejoin the cluster properly, and Fauxton displays a "This database failed to load" error on some databases. When refreshing the browser, a different database comes online and a different one shows the error. Only after running a curl request with the finish_cluster action does the error stop.

Version of Helm and Kubernetes: Helm 3.5.4, Kubernetes 1.19

What happened: After restarting a pod, the node fails to rejoin the cluster properly. Only after running:

```sh
curl -X POST http://$adminUser:$adminPassword@<couchdb_pod>:5984/_cluster_setup \
  -H "Accept: application/json" -H "Content-Type: application/json" \
  -d '{"action": "finish_cluster"}'
```

does the pod rejoin the cluster.
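For what it's worth, a quick way to confirm whether the node actually rejoined is to compare the two node lists returned by the `_membership` endpoint before re-running the workaround. A sketch, reusing the placeholders above:

```sh
# GET /_membership reports both the nodes currently connected (all_nodes)
# and the configured cluster members (cluster_nodes); on a healthy cluster
# the two lists match.
curl -s "http://$adminUser:$adminPassword@<couchdb_pod>:5984/_membership"

# If the restarted node is missing from the lists, re-run the setup action:
curl -s -X POST "http://$adminUser:$adminPassword@<couchdb_pod>:5984/_cluster_setup" \
  -H "Accept: application/json" -H "Content-Type: application/json" \
  -d '{"action": "finish_cluster"}'
```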

What you expected to happen: After restart of the pod, the node automatically joins the cluster.

How to reproduce it (as minimally and precisely as possible): restart one pod in the cluster.
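A minimal sketch of the reproduction, assuming the chart's default StatefulSet pod naming (couchdb-couchdb-0 and <namespace> are placeholders for your release):

```sh
# Delete one pod; the StatefulSet controller recreates it automatically,
# which is when the failed rejoin shows up.
kubectl delete pod couchdb-couchdb-0 -n <namespace>

# Watch the pod come back and check whether it reaches Ready.
kubectl get pods -n <namespace> -w
```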

Anything else we need to know:

Screenshot from Fauxton showing the "This database failed to load" error:

[image: Fauxton error screenshot]

The values.yaml used for the deployment:

```yaml
clusterSize: 3

allowAdminParty: false

createAdminSecret: false

adminUsername: admin

networkPolicy:
  enabled: true

serviceAccount:
  enabled: true
  create: true

persistentVolume:
  enabled: true
  accessModes:
    - ReadWriteOnce
  size: 10Gi
  storageClass: "ssd-couchdb"

image:
  repository:
  tag: latest
  pullPolicy: Always

searchImage:
  repository: kocolosk/couchdb-search
  tag: 0.2.0
  pullPolicy: IfNotPresent

enableSearch: false

initImage:
  repository: busybox
  tag: latest
  pullPolicy: Always

podManagementPolicy: Parallel

affinity: {}

annotations: {}

tolerations: []

service:
  annotations:
  enabled: true
  type: LoadBalancer
  externalPort: 5984
  sidecarsPort: 8080
  LoadBalancerIP:

ingress:
  enabled: false
  hosts:
    - chart-example.local
  path: /
  annotations: []
  tls:

resources:
  {}

erlangFlags:
  name: couchdb
  setcookie: monster

couchdbConfig:
  chttpd:
    bind_address: any
    require_valid_user: false

dns:
  clusterDomainSuffix: cluster.local

livenessProbe:
  enabled: true
  failureThreshold: 3
  initialDelaySeconds: 0
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1

readinessProbe:
  enabled: true
  failureThreshold: 3
  initialDelaySeconds: 0
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1

sidecars:
  image: "<sidecar_image>"
  imagePullPolicy: Always
```

@jftanner

Did you ever find a fix for the pod not rejoining the cluster properly? I'm encountering that now.

@willholley
Member

@jftanner can you share the logs from the pod that isn't joined? If the admin hash is not specified in the helm chart then you may be encountering #7.
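For reference, the logs can be pulled with something like this (pod and container names are placeholders and depend on the release):

```sh
# Logs from the CouchDB container of the pod that failed to rejoin.
kubectl logs couchdb-couchdb-0 -c couchdb -n <namespace>

# If the container restarted, the previous instance's logs may hold the error.
kubectl logs couchdb-couchdb-0 -c couchdb -n <namespace> --previous
```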

@jftanner

jftanner commented Sep 28, 2022

Hi @willholley. It might be #7, but it doesn't happen on pod restart. It only happens when there's a new pod after a helm upgrade. It seems that whenever the helm chart is run, it generates new credentials. (I noticed that the auto-generated admin password changes every time I install or update the helm deployment.) New pods pick up the new credential, but old ones don't. So the workaround I found was to kill all the existing pods after scaling. (Obviously not ideal, but I don't have to do that very often.)

Perhaps #89 will fix it?

Alternatively, I could just define my own admin credentials manually and not have a problem anymore.
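Something like the following should pin the credentials across upgrades (a sketch, not verified against every chart version; the release/chart names and <your-fixed-password> are placeholders):

```sh
# Set the admin credentials explicitly so `helm upgrade` stops regenerating them.
helm upgrade couchdb couchdb/couchdb \
  --reuse-values \
  --set adminUsername=admin \
  --set adminPassword=<your-fixed-password>

# Optionally read back the salted hash CouchDB stores for the admin user,
# which could be pinned as well so it never changes between upgrades:
curl -s "http://admin:<your-fixed-password>@<couchdb_pod>:5984/_node/_local/_config/admins/admin"
```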

@colearendt
Contributor

Yes, this sounds just like #78, and #89 would likely fix it / is intended to fix it 😄
