Fix maintenance state related transitions. (#786)
* Fix maintenance state related transitions.

We used to disallow starting maintenance on a node in some cases, but users
should be able to decide when they need to perform maintenance on their own
nodes. After all, we don't stop Postgres when going to maintenance, so users
may change their mind without impacting their service. A WARNING message is
now displayed in some cases that were previously prevented.

Also, the transition from WAIT_MAINTENANCE to MAINTENANCE was failing since
we improved the Group State Machine for the primary node, which would go
from JOIN_PRIMARY to PRIMARY without waiting for the other nodes to reach
their assigned state of WAIT_MAINTENANCE.

* Prevent WAIT_PRIMARY state when all secondaries are in maintenance.

If number-sync-standbys is set to 1 or more, we still allow all the
secondary nodes to be put in maintenance mode, and we keep the primary
node in the PRIMARY state, so that writes are blocked on the primary.

* Refrain from DRAINING state when there is no candidate.

When a primary is not healthy and there is no candidate node to fail over to,
assigning the DRAINING state to the primary does not help.

Also, the reason why we don't have a candidate at the moment might be that
the other nodes are in MAINTENANCE and the operator is restarting Postgres
to install some new configuration.

* Consider wait_maintenance a maintenance state in the monitor

* Allow disabling maintenance in prepare_maintenance state

* In maintenance, setup Postgres as a standby node without a primary. 

The primary election might not be finished yet, and also if the operator is to
restart the local instance, they probably don't want it to connect to any other
node in the system during the maintenance window.

* Allow the transition from prepare_maintenance to maintenance to happen
 early on a multiple standby system, as soon as an election is triggered.

* Per review, allow prepare_maintenance -> catchingup transition.

* Check for the right node to reach reported state

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
DimCitus and JelteF committed Sep 2, 2021
1 parent 329b554 commit 2ec279a
Showing 11 changed files with 475 additions and 253 deletions.
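
The commit message describes two monitor-side rules: wait_maintenance counts as a maintenance state, and the primary stays in PRIMARY when every secondary is in maintenance and number-sync-standbys is 1 or more. The sketch below restates those rules in isolation. The `Sketch*` enum and both function names are hypothetical, and treating prepare_maintenance as a maintenance state is an assumption; this is not the monitor's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of node states, for illustration only. */
typedef enum
{
    SKETCH_PRIMARY,
    SKETCH_SECONDARY,
    SKETCH_PREPARE_MAINTENANCE,
    SKETCH_WAIT_MAINTENANCE,
    SKETCH_MAINTENANCE
} SketchState;

/* Assumption: all three maintenance-related states count as maintenance. */
bool
sketch_is_in_maintenance(SketchState state)
{
    return state == SKETCH_PREPARE_MAINTENANCE ||
           state == SKETCH_WAIT_MAINTENANCE ||
           state == SKETCH_MAINTENANCE;
}

/*
 * Returns true when the rule from the commit message applies: there is at
 * least one standby, every standby is in maintenance, and
 * number-sync-standbys is 1 or more. A false result only means this rule
 * does not apply; what happens then is outside this sketch.
 */
bool
sketch_keep_primary_state(const SketchState *standbys, size_t count,
                          int numberSyncStandbys)
{
    assert(standbys != NULL || count == 0);

    if (count == 0 || numberSyncStandbys < 1)
    {
        return false;
    }

    for (size_t i = 0; i < count; i++)
    {
        if (!sketch_is_in_maintenance(standbys[i]))
        {
            return false;
        }
    }

    return true;
}
```

Keeping the predicate separate from the decision makes it easy to add another maintenance-like state in one place.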
13 changes: 9 additions & 4 deletions src/bin/pg_autoctl/cli_enable_disable.c
@@ -553,15 +553,17 @@ cli_enable_maintenance(int argc, char **argv)
             exit(EXIT_CODE_QUIT);
         }

+        NodeState targetStates[] = { MAINTENANCE_STATE };
         if (!monitor_wait_until_node_reported_state(
                 &(keeper.monitor),
                 keeper.config.formation,
                 keeper.config.groupId,
                 keeper.state.current_node_id,
                 keeper.config.pgSetup.pgKind,
-                MAINTENANCE_STATE))
+                targetStates,
+                lengthof(targetStates)))
         {
-            log_error("Failed to wait until a node reached the wait_primary state");
+            log_error("Failed to wait until the node reached the maintenance state");
             exit(EXIT_CODE_MONITOR);
         }
     }
@@ -651,15 +653,18 @@ cli_disable_maintenance(int argc, char **argv)
             (void) pg_usleep(sleepTimeMs * 1000);
         }

+        NodeState targetStates[] = { SECONDARY_STATE, PRIMARY_STATE };
+
         if (!monitor_wait_until_node_reported_state(
                 &(keeper.monitor),
                 keeper.config.formation,
                 keeper.config.groupId,
                 keeper.state.current_node_id,
                 keeper.config.pgSetup.pgKind,
-                SECONDARY_STATE))
+                targetStates,
+                lengthof(targetStates)))
         {
-            log_error("Failed to wait until a node reached the secondary state");
+            log_error("Failed to wait until a node reached the secondary or primary state");
             exit(EXIT_CODE_MONITOR);
         }
     }
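
Both call sites in cli_enable_disable.c now pass an array of acceptable states together with its length. The sketch below shows that pattern on its own. `SKETCH_LENGTHOF` follows the usual `sizeof(array) / sizeof(array[0])` idiom, and the enum and `sketch_state_is_target` are hypothetical names used for illustration, not pg_autoctl's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Element count of a true array; must not be applied to a pointer. */
#define SKETCH_LENGTHOF(array) (sizeof(array) / sizeof((array)[0]))

/* Hypothetical stand-in for the node state enum. */
typedef enum
{
    SKETCH_SECONDARY_STATE,
    SKETCH_PRIMARY_STATE,
    SKETCH_MAINTENANCE_STATE
} SketchNodeState;

/* Returns true when state is one of the targetsLength entries in targets. */
bool
sketch_state_is_target(SketchNodeState state,
                       const SketchNodeState *targets,
                       size_t targetsLength)
{
    assert(targets != NULL || targetsLength == 0);

    for (size_t i = 0; i < targetsLength; i++)
    {
        if (targets[i] == state)
        {
            return true;
        }
    }

    return false;
}
```

An array decays to a pointer when passed to a function, so the callee cannot recover the element count by itself. That is why the length travels as a separate argument next to the array.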
6 changes: 5 additions & 1 deletion src/bin/pg_autoctl/fsm.c
@@ -241,7 +241,7 @@ KeeperFSMTransition KeeperFSM[] = {
      */
     { PRIMARY_STATE, PREPARE_MAINTENANCE_STATE, COMMENT_PRIMARY_TO_PREPARE_MAINTENANCE, &fsm_stop_postgres_for_primary_maintenance },
     { PREPARE_MAINTENANCE_STATE, MAINTENANCE_STATE, COMMENT_PRIMARY_TO_MAINTENANCE, &fsm_stop_postgres_and_setup_standby },
-
+    { PRIMARY_STATE, MAINTENANCE_STATE, COMMENT_PRIMARY_TO_MAINTENANCE, &fsm_stop_postgres_for_primary_maintenance },
     /*
      * was demoted, need to be dead now.
      */
@@ -342,6 +342,7 @@ KeeperFSMTransition KeeperFSM[] = {
     { CATCHINGUP_STATE, WAIT_MAINTENANCE_STATE, COMMENT_SECONDARY_TO_WAIT_MAINTENANCE, NULL },
     { WAIT_MAINTENANCE_STATE, MAINTENANCE_STATE, COMMENT_SECONDARY_TO_MAINTENANCE, &fsm_start_maintenance_on_standby },
     { MAINTENANCE_STATE, CATCHINGUP_STATE, COMMENT_MAINTENANCE_TO_CATCHINGUP, &fsm_restart_standby },
+    { PREPARE_MAINTENANCE_STATE, CATCHINGUP_STATE, COMMENT_MAINTENANCE_TO_CATCHINGUP, &fsm_restart_standby },

     /*
      * Applying new replication/cluster settings (per node replication quorum,
@@ -362,6 +363,9 @@ KeeperFSMTransition KeeperFSM[] = {
      */
     { SECONDARY_STATE, REPORT_LSN_STATE, COMMENT_SECONDARY_TO_REPORT_LSN, &fsm_report_lsn },
     { CATCHINGUP_STATE, REPORT_LSN_STATE, COMMENT_SECONDARY_TO_REPORT_LSN, &fsm_report_lsn },
+    { MAINTENANCE_STATE, REPORT_LSN_STATE, COMMENT_SECONDARY_TO_REPORT_LSN, &fsm_report_lsn },
+    { PREPARE_MAINTENANCE_STATE, REPORT_LSN_STATE, COMMENT_SECONDARY_TO_REPORT_LSN, &fsm_report_lsn },
+
     { REPORT_LSN_STATE, PREP_PROMOTION_STATE, COMMENT_REPORT_LSN_TO_PREP_PROMOTION, &fsm_prepare_standby_for_promotion },

     { REPORT_LSN_STATE, FAST_FORWARD_STATE, COMMENT_REPORT_LSN_TO_FAST_FORWARD, &fsm_fast_forward },
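
Each `KeeperFSM` row pairs a current state and an assigned state with a comment and a transition function, where `NULL` means there is nothing to run. A table of that shape could be searched as sketched below. The types, the sentinel row, and the function names are assumptions for illustration; the source does not show how pg_autoctl walks its table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of states; FSM_NO_STATE marks the end of the table. */
typedef enum
{
    FSM_PRIMARY,
    FSM_PREPARE_MAINTENANCE,
    FSM_MAINTENANCE,
    FSM_CATCHINGUP,
    FSM_NO_STATE
} SketchFsmState;

typedef bool (*SketchTransitionFn)(void);

typedef struct
{
    SketchFsmState current;
    SketchFsmState assigned;
    const char *comment;
    SketchTransitionFn transition; /* NULL: nothing to do for this edge */
} SketchTransition;

/* Placeholder action that always succeeds. */
static bool
sketch_noop(void)
{
    return true;
}

static const SketchTransition SketchFSM[] = {
    { FSM_PRIMARY, FSM_PREPARE_MAINTENANCE, "primary to prepare_maintenance", sketch_noop },
    { FSM_PREPARE_MAINTENANCE, FSM_MAINTENANCE, "prepare_maintenance to maintenance", sketch_noop },
    { FSM_PRIMARY, FSM_MAINTENANCE, "primary to maintenance", sketch_noop },
    { FSM_PREPARE_MAINTENANCE, FSM_CATCHINGUP, "prepare_maintenance to catchingup", NULL },
    { FSM_NO_STATE, FSM_NO_STATE, NULL, NULL } /* sentinel row */
};

/* Linear search for the row matching (current, assigned); NULL if absent. */
const SketchTransition *
sketch_find_transition(SketchFsmState current, SketchFsmState assigned)
{
    assert(current != FSM_NO_STATE);

    for (size_t i = 0; SketchFSM[i].current != FSM_NO_STATE; i++)
    {
        if (SketchFSM[i].current == current &&
            SketchFSM[i].assigned == assigned)
        {
            return &SketchFSM[i];
        }
    }

    return NULL;
}

/* Runs the transition; an edge missing from the table is an error. */
bool
sketch_reach_assigned_state(SketchFsmState current, SketchFsmState assigned)
{
    const SketchTransition *transition =
        sketch_find_transition(current, assigned);

    if (transition == NULL)
    {
        return false;
    }

    return transition->transition == NULL || transition->transition();
}
```

With a table like this, allowing a new edge such as prepare_maintenance to catchingup is a one-row change, which matches the shape of the additions in this file.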
12 changes: 2 additions & 10 deletions src/bin/pg_autoctl/fsm_transition.c
@@ -720,7 +720,7 @@ fsm_stop_postgres_and_setup_standby(Keeper *keeper)
     PostgresSetup *pgSetup = &(postgres->postgresSetup);
     KeeperConfig *config = &(keeper->config);

-    NodeAddress *primaryNode = NULL;
+    NodeAddress upstreamNode = { 0 };

     if (!ensure_postgres_service_is_stopped(postgres))
     {
@@ -737,17 +737,9 @@ fsm_stop_postgres_and_setup_standby(Keeper *keeper)
         return false;
     }

-    /* get the primary node to follow */
-    if (!keeper_get_primary(keeper, &(postgres->replicationSource.primaryNode)))
-    {
-        log_error("Failed to initialize standby for lack of a primary node, "
-                  "see above for details");
-        return false;
-    }
-
     /* prepare a standby setup */
     if (!standby_init_replication_source(postgres,
-                                         primaryNode,
+                                         &upstreamNode,
                                          PG_AUTOCTL_REPLICA_USERNAME,
                                          config->replication_password,
                                          config->replication_slot_name,
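
`NodeAddress upstreamNode = { 0 };` zero-initializes every member, so the standby setup receives an address with no host, in line with the commit message's "standby node without a primary". One way an empty address could be turned into "nothing to follow" is sketched below. The struct layout, the function names, and the choice to emit an empty connection string are all assumptions; the diff does not show how `standby_init_replication_source` handles the zeroed node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified node address. */
typedef struct
{
    char host[256];
    int port;
} SketchNodeAddress;

/* A zero-initialized address has an empty host: no upstream to follow. */
bool
sketch_has_upstream(const SketchNodeAddress *node)
{
    assert(node != NULL);

    return strlen(node->host) > 0;
}

/*
 * Writes a connection string for the upstream node into buffer, or an empty
 * string when there is no upstream. Returns false if the buffer is too small.
 */
bool
sketch_primary_conninfo(const SketchNodeAddress *node,
                        char *buffer, size_t size)
{
    assert(node != NULL && buffer != NULL && size > 0);

    if (!sketch_has_upstream(node))
    {
        buffer[0] = '\0';
        return true;
    }

    int written = snprintf(buffer, size, "host=%s port=%d",
                           node->host, node->port);

    return written >= 0 && (size_t) written < size;
}
```

Passing a value with an explicit empty state avoids the `NULL` pointer the old code handed over, and it removes the dependency on a primary having been elected.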
21 changes: 14 additions & 7 deletions src/bin/pg_autoctl/monitor.c
@@ -169,7 +169,8 @@ typedef struct WaitUntilNodeStateNotificationContext
     int groupId;
     int64_t nodeId;
     NodeAddressHeaders *headers;
-    NodeState targetState;
+    NodeState *targetStates;
+    int targetStatesLength;
     bool done;
     bool firstLoop;
 } WaitUntilNodeStateNotificationContext;
@@ -4092,11 +4093,15 @@ monitor_check_node_report_state(void *context, CurrentNodeState *nodeState)
              NodeStateToString(nodeState->reportedState),
              NodeStateToString(nodeState->goalState));

-    if (nodeState->goalState == ctx->targetState &&
-        nodeState->reportedState == ctx->targetState &&
-        !ctx->firstLoop)
+    for (int i = 0; i < ctx->targetStatesLength; i++)
     {
-        ctx->done = true;
+        if (nodeState->goalState == ctx->targetStates[i] &&
+            nodeState->reportedState == ctx->targetStates[i] &&
+            nodeState->node.nodeId == ctx->nodeId &&
+            !ctx->firstLoop)
+        {
+            ctx->done = true;
+        }
     }

     if (ctx->firstLoop)
@@ -4120,7 +4125,8 @@ monitor_wait_until_node_reported_state(Monitor *monitor,
                                        int groupId,
                                        int64_t nodeId,
                                        PgInstanceKind nodeKind,
-                                       NodeState targetState)
+                                       NodeState *targetStates,
+                                       int targetStatesLength)
 {
     PGconn *connection = monitor->notificationClient.connection;

@@ -4132,7 +4138,8 @@ monitor_wait_until_node_reported_state(Monitor *monitor,
         groupId,
         nodeId,
         &headers,
-        targetState,
+        targetStates,
+        targetStatesLength,
         false, /* done */
         true /* firstLoop */
     };
3 changes: 2 additions & 1 deletion src/bin/pg_autoctl/monitor.h
@@ -218,7 +218,8 @@ bool monitor_wait_until_node_reported_state(Monitor *monitor,
                                             int groupId,
                                             int64_t nodeId,
                                             PgInstanceKind nodeKind,
-                                            NodeState targetState);
+                                            NodeState *targetStates,
+                                            int targetStatesLength);
 bool monitor_wait_for_state_change(Monitor *monitor,
                                    const char *formation,
                                    int groupId,
