Permanent recreation of db clusters due to changing sidecar order #924

Closed
siku4 opened this issue Apr 21, 2020 · 1 comment

siku4 commented Apr 21, 2020

All DB cluster replicas are repeatedly recreated by the operator after a certain amount of time. We traced the problem to a changing pod spec within the DB cluster statefulsets:

time="2020-04-21T06:22:17Z" level=debug msg="spec diff between old and new statefulsets: 
Template.Spec.Containers[0].TerminationMessagePath: \"/dev/termination-log\" != \"\"
Template.Spec.Containers[0].TerminationMessagePolicy: \"File\" != \"\"
[!!!] Template.Spec.Containers[1].Name: \"postgres-exporter\" != \"filebeat\"
[!!!] Template.Spec.Containers[1].Image: \"our.registry.com/pg-exporter:latest-60eaf1c8\" != \"our.registry.com/filebeat:7.5.1-60eaf1c8\"
Template.Spec.Containers[1].TerminationMessagePath: \"/dev/termination-log\" != \"\"
Template.Spec.Containers[1].TerminationMessagePolicy: \"File\" != \"\"
[!!!] Template.Spec.Containers[2].Name: \"filebeat\" != \"postgres-exporter\"
[!!!] Template.Spec.Containers[2].Image: \"our.registry.com/filebeat:7.5.1-60eaf1c8\" != \"our.registry.com/pg-exporter:latest-60eaf1c8\"
Template.Spec.Containers[2].TerminationMessagePath: \"/dev/termination-log\" != \"\"
Template.Spec.Containers[2].TerminationMessagePolicy: \"File\" != \"\"
Template.Spec.RestartPolicy: \"Always\" != \"\"
Template.Spec.DNSPolicy: \"ClusterFirst\" != \"\"
Template.Spec.DeprecatedServiceAccount: \"postgres-pod\" != \"\"
Template.Spec.SchedulerName: \"default-scheduler\" != \"\"
Template.Spec.Tolerations: []v1.Toleration(nil) != []v1.Toleration{}
VolumeClaimTemplates[0].Status.Phase: \"Pending\" != \"\"
RevisionHistoryLimit: &int32(10) != nil
" cluster-name=postgres-sandbox/acid-minimal-cluster pkg=cluster worker=1

Concretely, the order of the sidecars we configured globally via sidecar_docker_images (in our case filebeat and the Postgres exporter) keeps changing within the pod spec.
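To illustrate why a mere reordering shows up as a spec diff, here is a minimal sketch of an order-sensitive comparison. The `container` type and the `sameSpec` helper are hypothetical, and `reflect.DeepEqual` is only a stand-in for whatever comparison the operator actually performs:

```go
package main

import (
	"fmt"
	"reflect"
)

// container is a simplified, hypothetical stand-in for a container spec.
type container struct {
	Name  string
	Image string
}

// sameSpec reports whether two container lists are identical,
// including the order of their elements (hypothetical helper).
func sameSpec(a, b []container) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	running := []container{
		{Name: "postgres-exporter", Image: "our.registry.com/pg-exporter:latest-60eaf1c8"},
		{Name: "filebeat", Image: "our.registry.com/filebeat:7.5.1-60eaf1c8"},
	}
	// The same two sidecars, generated in the opposite order.
	desired := []container{running[1], running[0]}

	fmt.Println(sameSpec(running, running)) // true
	fmt.Println(sameSpec(running, desired)) // false: only the order differs
}
```

The two lists hold exactly the same sidecars, yet they compare as different, which matches the swapped Containers[1] and Containers[2] entries in the log above.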

We spent some time analyzing https://github.com/zalando/postgres-operator/blob/master/pkg/cluster/k8sres.go, and our assumption is that the map sidecar_docker_images

Sidecars map[string]string `name:"sidecar_docker_images"`
of the operatorconfiguration CR is the cause. We are not Go experts, but we found that the function mergeSidecars() merges global and cluster-specific sidecars in a random order. Since we have no cluster-specific sidecars configured in our cluster manifests, we observed this behavior in the for loop that iterates over the global sidecars (OpConfig.Sidecars):
for name, dockerImage := range c.OpConfig.Sidecars {
To our knowledge, the iteration order over a Go map is not guaranteed to be reproducible.
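The Go specification indeed leaves map iteration order unspecified. A common way to get a reproducible order is to collect the keys, sort them, and iterate over the sorted keys. The sketch below assumes a plain `map[string]string` like the one above; `sortedKeys` is a hypothetical helper, not part of the operator:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, so that callers
// can iterate over the map deterministically (hypothetical helper).
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	sidecars := map[string]string{
		"postgres-exporter": "our.registry.com/pg-exporter:latest-60eaf1c8",
		"filebeat":          "our.registry.com/filebeat:7.5.1-60eaf1c8",
	}
	// Ranging over the sorted keys always visits filebeat first,
	// regardless of how the runtime orders the map internally.
	for _, name := range sortedKeys(sidecars) {
		fmt.Println(name, sidecars[name])
	}
}
```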

We have temporarily hotfixed the issue by extending the function mergeSidecars()

func (c *Cluster) mergeSidecars(sidecars []acidv1.Sidecar) []acidv1.Sidecar {
with a simple alphabetical sort of the sidecar objects in the result before returning it. If you are interested, I can share the fix.
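A sort of this kind could look roughly like the following. This is only a sketch of the idea, not the actual hotfix: `Sidecar` is a simplified, hypothetical stand-in for acidv1.Sidecar, and `sortSidecarsByName` is an illustrative helper name:

```go
package main

import (
	"fmt"
	"sort"
)

// Sidecar is a simplified, hypothetical stand-in for acidv1.Sidecar.
type Sidecar struct {
	Name        string
	DockerImage string
}

// sortSidecarsByName returns a copy of sidecars ordered alphabetically
// by name, leaving the input slice untouched (hypothetical helper).
func sortSidecarsByName(sidecars []Sidecar) []Sidecar {
	result := make([]Sidecar, len(sidecars))
	copy(result, sidecars)
	sort.Slice(result, func(i, j int) bool {
		return result[i].Name < result[j].Name
	})
	return result
}

func main() {
	merged := []Sidecar{
		{Name: "postgres-exporter", DockerImage: "our.registry.com/pg-exporter:latest-60eaf1c8"},
		{Name: "filebeat", DockerImage: "our.registry.com/filebeat:7.5.1-60eaf1c8"},
	}
	for _, s := range sortSidecarsByName(merged) {
		fmt.Println(s.Name, s.DockerImage)
	}
}
```

Sorting the merged result makes the generated container list independent of the map iteration order, at the cost of ignoring any order the user might have intended.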

Is our assumption correct? What would be a more elegant solution here?

fischerman added a commit to fischerman/postgres-operator that referenced this issue Apr 22, 2020
@FxKu FxKu added the bug label Apr 24, 2020
sdudoladov pushed a commit that referenced this issue Apr 27, 2020
* implement fully speced global sidecars

* fix issue #924

FxKu commented May 4, 2020

This was fixed with #890.
