Fix maintenance state related transitions. (#786)
* Fix maintenance state related transitions.

  We used to disallow starting maintenance on a node in some cases, but it seems that users should be able to decide when they need to perform maintenance on their own nodes. After all, we don't stop Postgres when going to maintenance, so users may change their mind without impacting their service.

  A WARNING message is now displayed in some cases that were previously prevented.

  Also, the transition from WAIT_MAINTENANCE to MAINTENANCE had been failing since we improved the Group State Machine for the primary node, which would go from JOIN_PRIMARY to PRIMARY without waiting for the other nodes to reach their assigned state of WAIT_MAINTENANCE.

* Prevent WAIT_PRIMARY state when all secondaries are in maintenance.

  If number-sync-standbys is set to 1 or more, we still allow all the secondary nodes to be put in maintenance mode, and we keep the primary node in the PRIMARY state, which means that writes are going to be blocked on the primary.

* Refrain from DRAINING state when there is no candidate.

  When a primary is not healthy and there is no candidate node to fail over to, assigning the DRAINING state to the primary does not help. Also, the reason why we don't have a candidate at the moment might be that the other nodes are in MAINTENANCE and the operator is restarting Postgres to install some new configuration.

* Consider wait_maintenance a maintenance state in the monitor.

* Allow disabling maintenance in prepare_maintenance state.

* In maintenance, set up Postgres as a standby node without a primary.

  The primary election might not be finished yet, and if the operator is going to restart the local instance, they probably don't want it to connect to any other node in the system during the maintenance window.

* Allow the transition from prepare_maintenance to maintenance to happen early on a multiple standby system, as soon as an election is triggered.

* Per review, allow prepare_maintenance -> catchingup transition.

* Check for the right node to reach reported state.

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
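
As a reading aid, the rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the monitor's actual code: the `Node` class, the helper names, and the `healthy` flag are all hypothetical, and treating `prepare_maintenance` and `maintenance` as maintenance states alongside `wait_maintenance` is an assumption (the message only confirms the latter). The case where number-sync-standbys is 0 is not described in the message and is left unmodeled.

```python
from dataclasses import dataclass
from typing import List

# Assumption: these three states all count as "in maintenance". The commit
# message only confirms that wait_maintenance is now treated as one.
MAINTENANCE_STATES = {"prepare_maintenance", "wait_maintenance", "maintenance"}

# Transitions that the commit message explicitly mentions allowing.
ALLOWED_TRANSITIONS = {
    ("wait_maintenance", "maintenance"),
    ("prepare_maintenance", "maintenance"),
    ("prepare_maintenance", "catchingup"),
}


@dataclass
class Node:
    """Hypothetical stand-in for a node as tracked by the monitor."""
    name: str
    state: str
    healthy: bool = True


def is_in_maintenance(node: Node) -> bool:
    """True when the node's state is one of the assumed maintenance states."""
    return node.state in MAINTENANCE_STATES


def failover_candidates(secondaries: List[Node]) -> List[Node]:
    """Hypothetical helper: healthy secondaries that are not in maintenance."""
    return [n for n in secondaries if n.healthy and not is_in_maintenance(n)]


def should_drain_primary(primary: Node, secondaries: List[Node]) -> bool:
    """Assign DRAINING only when the primary is unhealthy and a candidate exists."""
    return (not primary.healthy) and bool(failover_candidates(secondaries))


def keep_primary_state(secondaries: List[Node], number_sync_standbys: int) -> bool:
    """Stay in PRIMARY (rather than WAIT_PRIMARY) when every secondary is in
    maintenance and number-sync-standbys is 1 or more."""
    return (
        number_sync_standbys >= 1
        and len(secondaries) > 0
        and all(is_in_maintenance(n) for n in secondaries)
    )
```

With two standbys in `maintenance` and `wait_maintenance`, `keep_primary_state(standbys, 1)` is true and `should_drain_primary` is false even for an unhealthy primary, which matches the two "prevent" and "refrain" items in the message.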
Showing 11 changed files with 475 additions and 253 deletions.