sdudoladov
Member

The operator should retry switchovers before giving up on moving master pods off non-ready nodes.
That is currently not the case: the operator attempts to move the master pods once and then leaves them as they are, potentially blocking Kubernetes cluster-wide processes such as node rotation. Retries avoid some of that blocking, namely the cases where a replica was moved shortly before the master and is not yet ready at the time of the operator's first switchover attempt.

To test, build and start the operator and one PG cluster in kind as usual. Then:

  1. Tag the replica pod with the nofailover tag (the last two commands run in the shell opened inside the pod)
     kubectl exec -it $(kubectl get pods -l spilo-role=replica -o jsonpath='{.items[0].metadata.name}') -- su postgres
     echo -e "tags:\n nofailover: true" >> postgres.yml
     patronictl reload $SCOPE --force
  2. Make the node with the master pod non-schedulable
     kubectl cordon $(kubectl get pods -l spilo-role=master -o jsonpath='{..nodeName}')

The operator will log unsuccessful switchover attempts at 1-minute intervals for 5 minutes.

time="2021-05-31T13:16:26Z" level=debug msg="Waiting for any replica pod to become ready" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:26Z" level=debug msg="Found 1 running replica pods" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:26Z" level=info msg="check failed: pod \"default/acid-minimal-cluster-1\" is already on a live node" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:26Z" level=debug msg="switching over from \"acid-minimal-cluster-0\" to \"default/acid-minimal-cluster-1\"" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:26Z" level=debug msg="making POST http request: http://10.244.2.3:8008/failover" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:26Z" level=debug msg="subscribing to pod \"default/acid-minimal-cluster-1\"" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:27Z" level=debug msg="unsubscribing from pod \"default/acid-minimal-cluster-1\" events" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:16:27Z" level=error msg="could not failover to pod \"default/acid-minimal-cluster-1\": could not switch over from \"acid-minimal-cluster-0\" to \"default/acid-minimal-cluster-1\": patroni returned 'failover is not possible: no good candidates have been found'" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-05-31T13:17:26Z" level=debug msg="switching over from \"acid-minimal-cluster-0\" to \"default/acid-minimal-cluster-1\"" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0

@Jan-M
Member

Jan-M commented Jun 3, 2021

if err := cl.MigrateMasterPod(podName); err != nil {

@FxKu @sdudoladov IMHO the PR currently changes semantics: there is no proper error propagation after 5 failures

@sdudoladov
Member Author

sdudoladov commented Jun 4, 2021

When all switchovers are unsuccessful, the operator log looks like this (only the last attempt is shown):

time="2021-06-04T08:47:28Z" level=debug msg="switching over from \"acid-minimal-cluster-0\" to \"default/acid-minimal-cluster-1\"" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-06-04T08:47:28Z" level=debug msg="making POST http request: http://10.244.2.3:8008/failover" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-06-04T08:47:28Z" level=debug msg="subscribing to pod \"default/acid-minimal-cluster-1\"" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-06-04T08:47:28Z" level=debug msg="unsubscribing from pod \"default/acid-minimal-cluster-1\" events" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-06-04T08:47:28Z" level=error msg="could not failover to pod \"default/acid-minimal-cluster-1\": could not switch over from \"acid-minimal-cluster-0\" to \"default/acid-minimal-cluster-1\": patroni returned 'failover is not possible: no good candidates have been found'" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2021-06-04T08:47:28Z" level=error msg="could not move master pod \"default/acid-minimal-cluster-0\": could not migrate master pod: still failing after 5 retries" pkg=controller
time="2021-06-04T08:47:28Z" level=info msg="0/1 master pods have been moved out from the \"/kind-worker2\" node" pkg=controller
time="2021-06-04T08:47:28Z" level=warning msg="failed to move master pods from the node \"kind-worker2\": could not move master 1/1 pods from the \"/kind-worker2\" node" pkg=controller

After that, the operator gives up on moving a master pod from a non-schedulable node. That matches the current behavior.

@FxKu
Member

FxKu commented Jun 11, 2021

👍

@sdudoladov
Member Author

👍

@sdudoladov sdudoladov merged commit 53fb540 into master Jun 17, 2021