Recovery steps documentation #177

aravindavk · 2022-04-10T12:49:22Z

When a Manager node goes down: no management operations are possible. Mounted Volumes continue to work, but no new mounts are possible.
- Temporary failure: wait until the Manager node comes back online.
- Permanent failure (notify all Storage nodes of the new Mgr URL, or update it on each of them):
  - Set up a new node with the same or a different IP/hostname and restore the Config data from the backup, OR
  - Promote any one existing Storage node and restore the Config data from the backup.
When a Storage node goes down:
- Temporary failure: no action is needed. The node resumes normal operation once it comes back online.
- Permanent failure:
  - Set up a new node with the same IP/hostname and run the node re-add command to add the node back to the Pool, OR
  - Set up a new node with a different IP/hostname, then run the node re-add command with the flag `--new-name=NEW_HOSTNAME`.
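
As a rough sketch of how the two re-add cases might look on the command line: the `node re-add` subcommand and its `--new-name` flag are proposed in this issue and not implemented yet, and the `POOL/HOSTNAME` argument form is an assumption borrowed from the `new-token` example. `server2.example.com` is a hypothetical replacement hostname.

```
  # Proposed syntax only, not an existing command.
  # Replacement node keeps the same IP/hostname:
  kadalu node re-add PROD/server1.example.com

  # Replacement node has a different hostname (hypothetical name):
  kadalu node re-add PROD/server1.example.com --new-name=server2.example.com
```
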
Create a new token for Mgr-to-Node and Node-to-Mgr communication (key rotation):
```
  kadalu node new-token PROD/server1.example.com
```

Identify the code changes required, and update the documentation once they are implemented.
