Implement HA mode
#21
Labels
feature request
New feature request
sponsored
Someone has sponsored this feature, privately or publicly
Some ideas for the
HA
mode implementation (subject to change):Node self-monitoring using REST API and
ha-watchdog
processRaft-like Consensus Algorithm, where no matter the cluster size, there are always 3 candidate nodes that control the whole cluster, and among these 3 candidate nodes there is 1 manager that makes failover decisions
All 3 candidate nodes must be specified manually, in the
ha_config.json
All worker nodes are dynamically added to and removed from the cluster
In case of a node failure:
ha-watchdog
process will reboot the node it's running on, which serves as a simple fencing mechanismNotify cluster admins about the outage, include the list of VMs that were failed over and/or were ignored
Keep a log of things that happen overtime in plain text and JSON formats for later representation by the
Hoster
REST API and/or WebUIThe CLI flags to use (subject to change):
--ha-mode
- start the REST API server, and activate theHA
mode--ha-debug
- only log actions, and do not actually perform them - useful for the initial cluster setup and troubleshootingThe text was updated successfully, but these errors were encountered: