Take and apply some parameters from controldata when starting as replica #703

Merged
merged 2 commits into from Jun 12, 2018


CyberDem0n
Collaborator

https://www.postgresql.org/docs/10/static/hot-standby.html#HOT-STANDBY-ADMIN
There is a set of parameters whose values on the replica must not be smaller than on the primary; otherwise the replica will refuse to start:

  • max_connections
  • max_prepared_transactions
  • max_locks_per_transaction
  • max_worker_processes

It might happen that the values of these parameters in the global configuration are not set high enough, which makes it impossible to start a replica without human intervention. This usually happens when we bootstrap a new cluster from a basebackup.

As a solution to this problem, we will take the values of the above parameters from the pg_controldata output and, if the values in the global configuration are not high enough, apply the values taken from pg_controldata and set the pending_restart flag.
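
For illustration, reading the four values from the pg_controldata output could look roughly like the sketch below. This is not the code from this PR: `parse_controldata` and `CONTROLDATA_TO_GUC` are hypothetical names, and the labels are assumed to match what pg_controldata prints for PostgreSQL 10 when run with an untranslated locale (e.g. `LC_ALL=C`). Note that two of the labels use abbreviated names (`max_prepared_xacts`, `max_locks_per_xact`) rather than the GUC names.

```python
# Assumed pg_controldata labels (PostgreSQL 10, untranslated output) mapped to
# the names of the corresponding configuration parameters.
CONTROLDATA_TO_GUC = {
    "max_connections setting": "max_connections",
    "max_prepared_xacts setting": "max_prepared_transactions",
    "max_locks_per_xact setting": "max_locks_per_transaction",
    "max_worker_processes setting": "max_worker_processes",
}


def parse_controldata(output):
    """Extract the four replica-critical settings from pg_controldata text.

    Hypothetical helper: lines with any other label are ignored.
    """
    values = {}
    for line in output.splitlines():
        # Split on the first colon only; values such as timestamps may
        # contain further colons.
        label, sep, value = line.partition(":")
        guc = CONTROLDATA_TO_GUC.get(label.strip())
        if sep and guc is not None:
            values[guc] = int(value.strip())
    return values


# Shortened sample output; a real run prints many more lines.
SAMPLE = """\
Database cluster state:               in archive recovery
max_connections setting:              100
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
"""

if __name__ == "__main__":
    print(parse_controldata(SAMPLE))
```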

@alexeyklyukin
Contributor

As a solution to this problem, we will take the values of the above parameters from the pg_controldata output and, if the values in the global configuration are not high enough, apply the values taken from pg_controldata and set the pending_restart flag.

Do we apply those values once the cluster has already started? Otherwise, why do we need the pending_restart flag?

@CyberDem0n
Collaborator Author

Those values are applied "before" start, but "pending_restart" is still needed because the values don't match the values from the global configuration.

@alexeyklyukin
Contributor

"pending_restart" is still needed because the values don't match the values from the global configuration.

But how would a restart fix that? Given that we take the values from pg_controldata every time we start the instance (and a restart is a combination of stop and start calls), surely we will get the same discrepancy the next time we call restart, until someone fixes the global configuration. What am I missing?

@CyberDem0n
Collaborator Author

until someone fixes the global configuration

configuration is absolutely fine

What am I missing?

values of these parameters are not CONST, they may change over time...

@alexeyklyukin
Contributor

alexeyklyukin commented Jun 11, 2018

configuration is absolutely fine

if it is fine, why are we correcting the values obtained from it in the first place?

values of these parameters are not CONST, they may change over time...

how is that different from any other values that may change over time? We don't set the pending_restart flag just because of that.

@alexeyklyukin
Contributor

Answering my own questions (after talking to @CyberDem0n): the idea is to make sure the replica can start after the master changes some of the parameters recorded in controldata (e.g. max_connections). WAL is replayed asynchronously, so the replica may still have the old value of the parameter when the new value is applied to it. Once the WAL is replayed up to the point where the new value of the parameter takes effect, we need to make sure the same value as on the master is in force; therefore, we set pending_restart as a reminder to restart the replica and apply the value from the global configuration. Otherwise, the replica will continue to run with the parameters taken from pg_controldata until the first time it crashes on observing non-matching values in the WAL.

I'd suggest adding some of it as a comment to the _build_effective_configuration.
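
A minimal sketch of that logic with such a comment, for illustration only. It is not the actual `_build_effective_configuration`: `sketch_effective_configuration` and `DEFAULTS` are hypothetical names, both arguments are assumed to be plain dicts, and the fallback values are the PostgreSQL 10 defaults.

```python
# PostgreSQL 10 defaults, assumed to apply when the global configuration
# does not mention a parameter.
DEFAULTS = {
    "max_connections": 100,
    "max_prepared_transactions": 0,
    "max_locks_per_transaction": 64,
    "max_worker_processes": 8,
}


def sketch_effective_configuration(global_config, controldata):
    """Return (effective_config, pending_restart) for starting a replica.

    A replica refuses to start if any of these parameters is smaller than the
    value recorded in its control file. The global configuration may already
    hold a lower value that the replica has not yet seen in the WAL, because
    WAL is replayed asynchronously. In that case we start with the value from
    the control file and flag a pending restart, so that the value from the
    global configuration can be applied once the WAL has been replayed past
    the point where the change takes effect.
    """
    effective = dict(global_config)
    pending_restart = False
    for name, default in DEFAULTS.items():
        configured = int(global_config.get(name, default))
        recorded = controldata.get(name)
        if recorded is not None and recorded > configured:
            # Configured value is too low for this data directory: override
            # it for this start only and remember that a restart is needed.
            effective[name] = recorded
            pending_restart = True
    return effective, pending_restart


if __name__ == "__main__":
    # max_connections was lowered to 100 in the global configuration, but the
    # replica's control file still records the old value of 200.
    config = {"max_connections": 100, "max_worker_processes": 8}
    recorded = {"max_connections": 200, "max_prepared_transactions": 0,
                "max_locks_per_transaction": 64, "max_worker_processes": 8}
    print(sketch_effective_configuration(config, recorded))
```

Once the control file records values that are no higher than the global configuration, the same call returns the configuration unchanged with the flag unset, which is the state a restart is meant to reach.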

I was originally approaching this from the standpoint of doing a custom bootstrap from the WAL archive, which would produce the very first node in the cluster; that node might be wrongly configured and therefore unable to restore the WAL it is being bootstrapped with. However, this is not addressed by this PR.

@alexeyklyukin
Contributor

👍

@CyberDem0n
Collaborator Author

👍

@CyberDem0n CyberDem0n merged commit e939304 into master Jun 12, 2018
@CyberDem0n CyberDem0n deleted the feature/controldata branch June 12, 2018 12:04
CyberDem0n pushed a commit that referenced this pull request Feb 26, 2020
CyberDem0n added a commit that referenced this pull request Feb 27, 2020