
Conversation

@FxKu (Member) commented Jun 23, 2021

Refactored the restart of instances by moving it into a loop over pods. The same was done for patching Postgres via checkAndSetGlobalPostgreSQLConfiguration.

From:

checkAndSetGlobalPostgreSQLConfiguration()
-> loop over pods and patch Patroni config (Postgresql.Parameters)
-> return true if bootstrap only parameters are defined in manifest to trigger restart

restartInstances()
-> call Patroni's `config` endpoint to get `ttl` duration for post restart wait
-> loop over pods and call Patroni's `restart` endpoint

To:

for pod in pods:
  syncPostgreSQLConfiguration(pod)

func syncPostgreSQLConfiguration(pod)
  call Patroni's `config` endpoint

  restartRequired := checkAndSetPostgreSQLConfiguration( pass config for comparison )
    -> patch Patroni config (Postgresql.Parameters)
    -> return true if bootstrap only parameters are defined in manifest *and differ from effective config* to trigger restart

  if restartRequired:
    -> call Patroni's `restart` endpoint
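
Below is a minimal Go sketch of this restructured flow. All names and signatures here are illustrative assumptions, not the operator's actual code; the real Patroni client and cluster helpers look different.

```go
package sketch

import "log"

// Hypothetical, simplified Patroni client; the operator's real client
// (c.patroni) has different method signatures.
type patroniClient interface {
	GetConfig(pod string) (map[string]string, error)   // GET   /config
	SetConfig(pod string, cfg map[string]string) error // PATCH /config
	Restart(pod string) error                          // POST  /restart
}

// Parameters Patroni applies only at bootstrap/restart time (illustrative subset).
var bootstrapOnly = map[string]bool{"max_connections": true, "shared_buffers": true}

// syncPostgreSQLConfiguration mirrors the "To:" flow above for one pod:
// fetch the effective config, patch it, and restart only when required.
func syncPostgreSQLConfiguration(p patroniClient, pod string, desired map[string]string) error {
	effective, err := p.GetConfig(pod)
	if err != nil {
		return err
	}
	restartRequired := false
	for name, value := range desired {
		// a restart is needed only when a bootstrap-only parameter actually
		// differs from the effective config, not merely because it is set
		if bootstrapOnly[name] && effective[name] != value {
			restartRequired = true
		}
	}
	if err := p.SetConfig(pod, desired); err != nil {
		return err
	}
	if restartRequired {
		log.Printf("restarting Postgres on pod %s to apply bootstrap-only parameters", pod)
		return p.Restart(pod)
	}
	return nil
}
```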

This PR also adds the following functionality:

  • extends checkAndSetGlobalPostgreSQLConfiguration to also compare the Patroni config with c.Spec.Patroni
  • gathers all options to set in one config object and sends it to a new c.patroni.SetConfig endpoint, which supports the full config hierarchy, unlike c.patroni.SetPostgresParameters, which only covers postgresql.parameters (see the sketch after this list)
  • extends (and renames) the test_decrease_max_connections e2e test (or creates a new one) to cover config changes
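
For illustration, a hedged sketch of what that single config object could look like. The struct and helper below are assumptions following Patroni's dynamic-configuration hierarchy (ttl, loop_wait, retry_timeout at the top level, parameters nested under postgresql); they are not the operator's actual types.

```go
package sketch

// patroniConfigPatch is an assumed shape for one PATCH /config payload,
// mirroring Patroni's dynamic configuration hierarchy.
type patroniConfigPatch struct {
	TTL          uint32                 `json:"ttl,omitempty"`
	LoopWait     uint32                 `json:"loop_wait,omitempty"`
	RetryTimeout uint32                 `json:"retry_timeout,omitempty"`
	PostgreSQL   map[string]interface{} `json:"postgresql,omitempty"`
}

// buildConfigPatch gathers the manifest's top-level Patroni options and
// postgresql.parameters into one object, so a single SetConfig call can
// replace SetPostgresParameters, which only reaches the innermost map.
func buildConfigPatch(ttl, loopWait, retryTimeout uint32, pgParameters map[string]string) patroniConfigPatch {
	return patroniConfigPatch{
		TTL:          ttl,
		LoopWait:     loopWait,
		RetryTimeout: retryTimeout,
		PostgreSQL:   map[string]interface{}{"parameters": pgParameters},
	}
}
```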

@FxKu FxKu added this to the 1.7 milestone Jul 6, 2021
@FxKu FxKu added the zalando label Jul 6, 2021
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

def patroni_rest(self, pod, path):
Member

duplicate of lines 246-252?

Member Author

Yes, there's a lot of duplicate code; I don't understand why. I figured I'd better do it the same way, too 😄
Someone should tidy this part up.

try:
k8s.create_with_kubectl("manifests/complete-postgres-manifest.yaml")
k8s.wait_for_pod_start("spilo-role=master", self.test_namespace)
k8s.wait_for_pod_start('spilo-role=replica')
Member

Why does waiting for the master use self.test_namespace while waiting for the replica does not?

Member Author

An oversight, indeed. That might explain why the test fails atm.

self.eventuallyEqual(lambda: self.k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")

def compare_config():
effective_config = k8s.patroni_rest(masterPod.metadata.name, "config")
Member

But that only tells you whether the Patroni config was updated, not whether the change was actually applied at the PG level. Patroni has the pending_restart flag to indicate such a situation:

$ curl -s http://localhost:8008/patroni | jq .
{
  "pending_restart": true,

Member Author

True. I've added a check that queries the settings in the database.
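
A minimal sketch of such a check, assuming direct SQL access to the database; the actual test may obtain the value differently (e.g. by running the query inside the pod).

```go
package sketch

import (
	"database/sql"

	_ "github.com/lib/pq" // Postgres driver, an assumed dependency
)

// effectiveSetting reads the value PostgreSQL is actually running with,
// which can lag behind the Patroni config until a restart happens.
func effectiveSetting(db *sql.DB, name string) (string, error) {
	var value string
	err := db.QueryRow("SELECT setting FROM pg_settings WHERE name = $1", name).Scan(&value)
	return value, err
}
```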

pods, err = c.listPods()
if err != nil {
-	return fmt.Errorf("could not set cluster-wide PostgreSQL configuration options: %v", err)
+	c.logger.Infof("could not list pods of the statefulset: %v", err)
}
Member

That is arguably still an error, because the Sync will fail to complete without the list of pods.

Member Author

The error was not replaced but moved into the loop. I took inspiration from line 276, where it's also just an info log line. The function would not fail with an empty list of pods. I've changed the log level to warning.

@Jan-M (Member) commented Aug 9, 2021

👍

@FxKu (Member Author) commented Aug 9, 2021

👍

@FxKu FxKu merged commit 66620d5 into master Aug 9, 2021