Take and apply some parameters from controldata when starting as replica #703

CyberDem0n · 2018-06-11T07:30:47Z

https://www.postgresql.org/docs/10/static/hot-standby.html#HOT-STANDBY-ADMIN
There is a set of parameters whose values on the replica must not be smaller than on the primary; otherwise the replica will refuse to start:

max_connections
max_prepared_transactions
max_locks_per_transaction
max_worker_processes

The values of these parameters in the global configuration might not be set high enough, which makes it impossible to start a replica without human intervention. This usually happens when we bootstrap a new cluster from a basebackup.

As a solution, we take the values of the above parameters from the pg_controldata output. If the values in the global configuration are not high enough, we apply the values taken from pg_controldata and set the pending_restart flag.
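The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `parse_controldata` and `sketch_effective_configuration`, the sample output, and the choice to skip parameters that are absent from the configuration are all assumptions made for the example. The labels on the right-hand side of the mapping are the ones `pg_controldata` prints for these settings.

```python
# Hypothetical sketch: none of these names come from the PR itself.

# postgresql.conf parameter name -> label printed by pg_controldata
CONTROLDATA_LABELS = {
    'max_connections': 'max_connections setting',
    'max_prepared_transactions': 'max_prepared_xacts setting',
    'max_locks_per_transaction': 'max_locks_per_xact setting',
    'max_worker_processes': 'max_worker_processes setting',
}


def parse_controldata(output):
    """Turn the text output of pg_controldata into a {label: value} dict."""
    data = {}
    for line in output.splitlines():
        label, sep, value = line.partition(':')
        if sep:
            data[label.strip()] = value.strip()
    return data


def sketch_effective_configuration(configured, controldata):
    """Return (effective_settings, pending_restart).

    A configured value that is lower than the one recorded in controldata
    is replaced by the controldata value, and pending_restart is set.
    Parameters missing on either side are skipped (a simplification).
    """
    effective = dict(configured)
    pending_restart = False
    for name, label in CONTROLDATA_LABELS.items():
        if name not in configured or label not in controldata:
            continue
        required = int(controldata[label])
        if int(configured[name]) < required:
            effective[name] = required
            pending_restart = True
    return effective, pending_restart


# Made-up sample output; a real pg_controldata prints many more lines.
SAMPLE_OUTPUT = """\
Database cluster state:               in archive recovery
max_connections setting:              200
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
"""

configured = {
    'max_connections': 100,
    'max_prepared_transactions': 0,
    'max_locks_per_transaction': 64,
    'max_worker_processes': 8,
}
effective, pending_restart = sketch_effective_configuration(
    configured, parse_controldata(SAMPLE_OUTPUT))
print(effective['max_connections'], pending_restart)  # prints: 200 True
```

Only `max_connections` is raised here, because it is the only configured value below the one recorded in controldata; the other three already satisfy the requirement.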


alexeyklyukin · 2018-06-11T09:50:43Z

> As a solution to this problem we will take values of above parameters from the pg_controldata output and in case if the values in the global configuration are not high enough, apply values taken from pg_controldata and set pending_restart flag.

Do we apply those values once the cluster has already started? Otherwise, why do we need the pending_restart flag?

CyberDem0n · 2018-06-11T09:58:28Z

Those values are applied "before" start, but "pending_restart" is still needed because the values don't match the values from the global configuration.

alexeyklyukin · 2018-06-11T10:03:48Z

> "pending_restart" is still needed because values don't match with values from global configuration.

But how would a restart fix that? Given that we take the values from pg_controldata every time we start the instance (and the restart is a combination of stop and start calls), surely we will get the same discrepancy the next time we call the restart, until someone fixes the global configuration. What am I missing?

CyberDem0n · 2018-06-11T10:10:10Z

> until someone fixes the global configuration

The configuration is absolutely fine.

> What am I missing?

The values of these parameters are not CONST; they may change over time...

alexeyklyukin · 2018-06-11T10:25:34Z

> configuration is absolutely fine

If it is fine, why are we correcting the values obtained from it in the first place?

> values of these parameters are not CONST, they may change in time...

How is this different from any other values that may change over time? We don't set the pending_restart flag just because of that.

alexeyklyukin · 2018-06-11T10:50:02Z

Answering my own questions (after talking to @CyberDem0n): the idea is to make sure the replica can start after the master changes some of the parameters recorded in controldata (e.g. max_connections). Since WAL is replayed asynchronously, the replica may still have the old value of the parameter when the new value is applied to it. Once the WAL is replayed up to the point where the new value of the parameter takes effect, we need to make sure the same value as on the master is in force. Therefore, we set pending_restart as a reminder to restart the replica and apply the value from the global configuration (otherwise, the replica will continue to run with the parameters taken from pg_controldata the very first time it crashes by observing non-matching values in the WAL).

I'd suggest adding some of it as a comment to the _build_effective_configuration.

I was originally approaching this from the standpoint of doing a custom bootstrap from the WAL archive, which would produce the very first node in the cluster. That node might be wrongly configured and therefore unable to restore the WAL it is being bootstrapped with. However, this is not addressed by this PR.
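The lagging-replica scenario from this comment can be walked through with a small sketch. It assumes, purely for illustration, that max_connections was lowered from 200 to 100 in the global configuration while the replica's controldata still records 200; the helper `effective_value` and the stage descriptions are hypothetical and not part of Patroni.

```python
# Hypothetical walk-through of one parameter across two replica starts.

def effective_value(configured, from_controldata):
    """Return (value_to_start_with, pending_restart) for a single parameter."""
    if configured < from_controldata:
        # The configured value is too low for the WAL still to be replayed:
        # start with the controldata value and remember that a restart is due.
        return from_controldata, True
    return configured, False


configured_max_connections = 100  # new value in the global configuration

# (stage, value recorded in the replica's controldata at that stage)
timeline = [
    ('replica has not replayed the parameter change yet', 200),
    ('replica has replayed the parameter change', 100),
]

results = [
    (stage,) + effective_value(configured_max_connections, recorded)
    for stage, recorded in timeline
]
for stage, value, pending in results:
    print('%s: start with %d, pending_restart=%s' % (stage, value, pending))
```

At the first stage the replica starts with 200 and the flag is set. After the change has been replayed, a restart picks up 100 from the global configuration and the flag is no longer needed.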

alexeyklyukin · 2018-06-11T14:47:38Z

👍

CyberDem0n · 2018-06-12T12:01:38Z

👍


Document build_effective_configuration behavior

b69dcb2

CyberDem0n merged commit e939304 into master Jun 12, 2018

CyberDem0n deleted the feature/controldata branch June 12, 2018 12:04

CyberDem0n mentioned this pull request Feb 25, 2020

No callbacks on failed start & bringing down a VIP on a failed master #1418

Closed

CyberDem0n pushed a commit that referenced this pull request Feb 26, 2020

On role change callback didn't fire on failed primary

a907497

Bug was introduced in #703 Close #1418

CyberDem0n mentioned this pull request Feb 26, 2020

On role change callback didn't fire on failed primary #1420

Merged

CyberDem0n added a commit that referenced this pull request Feb 27, 2020

On role change callback didn't fire on failed primary (#1420)

4a29caa

Bug was introduced in #703 Close #1418
