Currently, we check the following before completing the upgrade operation on one node and moving on to the next:
- Check that the restarted tserver responds to an RPC call.
- Check that the restarted tserver heartbeats to the leader master.
In the case of smart AMI upgrades and smart instance type changes, we will be keeping nodes down for potentially longer periods of time. This calls for additional checks on the tserver before moving on:
- Check that the restarted tserver reclaims its tablets, in case it was down for more than 15 minutes and its tablets were unassigned as a result.
- Check that the restarted tserver catches up via WAL with its corresponding tablet leaders, in case it was down for less than 15 minutes but has not yet caught up.
- Potentially, another check for masters, in case they are kicked out of the cluster.
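The checks above could be gated roughly as in the sketch below. This is only an illustration of the proposed flow, not the existing upgrade code: `Check`, `select_checks`, `wait_for_checks`, and the four probe callables are hypothetical names, the timeout and poll interval are placeholder values, and treating a downtime of exactly 15 minutes as the "short" case is an assumption, since the description does not say which side it falls on.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

# Threshold from the description above: a tserver that was down longer than
# this may have had its tablets unassigned.
UNASSIGN_THRESHOLD_SECS = 15 * 60


@dataclass
class Check:
    name: str
    probe: Callable[[], bool]  # hypothetical probe; returns True once the condition holds


def select_checks(
    downtime_secs: float,
    rpc_ok: Callable[[], bool],
    heartbeating: Callable[[], bool],
    tablets_reclaimed: Callable[[], bool],
    wal_caught_up: Callable[[], bool],
) -> List[Check]:
    """Pick the checks to run for a restarted tserver, based on its downtime."""
    # The two existing checks always run.
    checks = [Check("rpc", rpc_ok), Check("heartbeat", heartbeating)]
    if downtime_secs > UNASSIGN_THRESHOLD_SECS:
        # Long outage: tablets may have been unassigned and must be reclaimed.
        checks.append(Check("tablets_reclaimed", tablets_reclaimed))
    else:
        # Short outage: tablets stay assigned, but the WAL may be behind.
        checks.append(Check("wal_caught_up", wal_caught_up))
    return checks


def wait_for_checks(
    checks: List[Check],
    timeout_secs: float = 600.0,  # placeholder value
    poll_secs: float = 5.0,       # placeholder value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Poll until every check passes; return the names still failing at timeout."""
    deadline = clock() + timeout_secs
    pending = list(checks)
    while pending:
        # Keep only the checks whose probe has not succeeded yet.
        pending = [c for c in pending if not c.probe()]
        if not pending:
            break
        if clock() >= deadline:
            return [c.name for c in pending]
        sleep(poll_secs)
    return []
```

An empty return value from `wait_for_checks` would mean the node is safe to leave and the upgrade can move on to the next node. A non-empty list names the checks that were still failing when the timeout expired.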
#8882