Recovery steps documentation #177

Open
aravindavk opened this issue Apr 10, 2022 · 0 comments

@aravindavk
Member

  • When a Manager node goes down - No management operations are possible. Mounted Volumes continue to work but no new mounts are possible.

    • Temporary - Wait until the Manager node comes back online.
    • Permanent failure (then update the Mgr URL on all Storage nodes)
      • Set up a new node with the same or a different IP/hostname and restore the Config data from the backup, OR
      • Promote an existing Storage node and restore the Config data from the backup.
  • When a Storage node goes down

    • Temporary - No action is needed; the node resumes once it comes back online.
    • Permanent failure
      • Set up a new node with the same IP/hostname and run the node re-add command to add the node back to the Pool, OR
      • Set up a new node with a different IP/hostname, then run the node re-add command with the flag --new-name=NEW_HOSTNAME
  • Create a new Token for Mgr-to-Node and Node-to-Mgr communication (key rotation)

      kadalu node new-token PROD/server1.example.com
    

Identify the code changes required and update the documentation once they are implemented.
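
As a starting point for the documentation, the recovery cases listed above could be written down as a lookup table. The following is only a sketch: `Role`, `Failure`, `RECOVERY_OPTIONS` and `recovery_options` are hypothetical names that are not part of Kadalu, and the step texts are copied from this issue, so the re-add command and its `--new-name` flag are proposals that may not exist yet.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Role(Enum):
    MANAGER = "manager"
    STORAGE = "storage"


class Failure(Enum):
    TEMPORARY = "temporary"
    PERMANENT = "permanent"


# Recovery options per (role, failure) pair, transcribed from the list above.
# Each inner list is one alternative; its items are the steps in order.
# The commands named here are proposals from this issue, not a confirmed CLI.
RECOVERY_OPTIONS: Dict[Tuple[Role, Failure], List[List[str]]] = {
    (Role.MANAGER, Failure.TEMPORARY): [
        ["Wait until the Manager node comes back online"],
    ],
    (Role.MANAGER, Failure.PERMANENT): [
        [
            "Set up a new node with the same or a different IP/hostname",
            "Restore the Config data from the backup",
            "Update the Mgr URL on all Storage nodes",
        ],
        [
            "Promote an existing Storage node",
            "Restore the Config data from the backup",
            "Update the Mgr URL on all Storage nodes",
        ],
    ],
    (Role.STORAGE, Failure.TEMPORARY): [
        ["No action needed; wait for the node to come back online"],
    ],
    (Role.STORAGE, Failure.PERMANENT): [
        [
            "Set up a new node with the same IP/hostname",
            "Run the node re-add command",
        ],
        [
            "Set up a new node with a different IP/hostname",
            "Run the node re-add command with --new-name=NEW_HOSTNAME",
        ],
    ],
}


def recovery_options(role: Role, failure: Failure) -> List[List[str]]:
    """Return the alternative recovery procedures for a failed node."""
    return RECOVERY_OPTIONS[(role, failure)]


if __name__ == "__main__":
    # Print the alternatives for a permanently failed Storage node.
    for number, steps in enumerate(
        recovery_options(Role.STORAGE, Failure.PERMANENT), start=1
    ):
        print(f"Option {number}:")
        for step in steps:
            print(f"  - {step}")
```

Keeping the cases in one table makes it easy to spot a missing combination, for example a role or failure type that has no documented procedure yet.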
