docs: Use Krew restore-quorum command for mon quorum disaster scenario #11184

Merged
travisn merged 1 commit into rook:master from mon-restore-docs
Oct 20, 2022


travisn
Member

@travisn travisn commented Oct 19, 2022

Description of your changes:
The new krew plugin command `restore-quorum` automates the process of restoring quorum from a single healthy mon. We no longer need the tedious, messy, manual steps from the Rook docs. As of the v0.4.0 release, the krew plugin takes care of all the complexity automatically.

Which issue is resolved by this Pull Request:
Resolves #3985

Checklist:

  • Commit Message Formatting: Commit titles and messages follow the guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

For example, if you have three mons and lose quorum, you will need to remove the two bad mons from quorum, notify the good mon
that it is the only mon in quorum, and then restart the good mon.
The [Rook Krew Plugin](https://github.com/rook/kubectl-rook-ceph/) has a command `restore-quorum` that will
walk you through the mon quorum restoration process.
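
To make the three-mon example concrete: Ceph mons need a strict majority to form quorum, so three mons tolerate one failure but not two. The sketch below only illustrates that arithmetic. The helper names `quorum_needed` and `has_quorum` are hypothetical and are not part of Rook or the krew plugin.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Rook or kubectl-rook-ceph).
# Mons form quorum only with a strict majority of the mons in the monmap.

# Smallest number of healthy mons that is a majority of $1 total mons.
quorum_needed() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) if $1 healthy mons out of $2 total still hold quorum.
has_quorum() {
  [ "$1" -ge "$(quorum_needed "$2")" ]
}

# Three mons with one healthy: quorum is lost, which is the disaster
# scenario that restore-quorum is meant to handle.
if has_quorum 1 3; then
  echo "quorum ok"
else
  echo "quorum lost: $(quorum_needed 3) of 3 mons required, 1 healthy"
fi
```

Once the two bad mons are removed and the good mon is the only one left, `has_quorum 1 1` succeeds, which is why shrinking the mon set to the single healthy mon brings quorum back.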
Contributor

Suggested change
walk you through the mon quorum restoration process.
walk you through the mon quorum automated restoration process.

so that users get a sense of the automation that comes with the new addition


```console
# create the operator. it is safe to ignore the errors that a number of resources already exist.
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
kubectl rook-ceph mons restore-quorum c
```
Contributor

Suggested change
kubectl rook-ceph mons restore-quorum c
kubectl rook-ceph mons restore-quorum c

Can we add a snippet of output from the above command as an example? Also, I noticed we missed adding the output in the krew doc too. If you feel it is a good addition, let's add the output here and we can update the krew doc in a later PR.

Member Author

@travisn travisn Oct 20, 2022

Actually I had a snippet ready to go and just realized I missed opening it. See rook/kubectl-rook-ceph#65

Member Author

@travisn travisn Oct 20, 2022

I'm assuming they'll follow the link to the krew doc instead of needing to see the output here.

Contributor

Yeah, since it had not been added to the krew doc earlier, I was suggesting we add it here, but I think we are good now. Thanks!

Member

@parth-gr parth-gr left a comment

Just a concern: can we have Rook install krew by default? Users often won't have krew installed and will be stuck when they reach these commands.

Member Author

@travisn travisn left a comment

> Just a concern: can we have Rook install krew by default? Users often won't have krew installed and will be stuck when they reach these commands.

Since krew is a client tool to be installed wherever they have kubectl, we can't really install it by default since there isn't a yaml that could install it at the same time as the operator.
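
Because the plugin lives on the client side, a runbook or script may want to confirm it is present before attempting the restore. A minimal sketch, assuming the plugin was installed through krew under the name `rook-ceph`, in which case kubectl resolves `kubectl rook-ceph` to an executable named `kubectl-rook_ceph` on `PATH`. The helper name `require_rook_ceph_plugin` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify the client-side plugin exists before relying on it.
# Assumption: kubectl maps the subcommand "rook-ceph" to an executable
# named "kubectl-rook_ceph" somewhere on PATH.
require_rook_ceph_plugin() {
  if command -v kubectl-rook_ceph >/dev/null 2>&1; then
    echo "rook-ceph plugin found"
    return 0
  fi
  echo "rook-ceph plugin not found; install it with: kubectl krew install rook-ceph" >&2
  return 1
}

# Only point at the restore command when the plugin is available.
# "c" is the healthy mon from the docs example; this sketch prints the
# next step instead of running it.
if require_rook_ceph_plugin; then
  echo "next step: kubectl rook-ceph mons restore-quorum c"
fi
```

A check like this does not remove the need to install krew on each client, but it turns a confusing "unknown command" failure into an actionable message.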

The new krew plugin command restore-quorum will automate the process
to restore quorum from a single healthy mon. No longer do we need
the absolutely tedious, messy, manual steps from the rook docs.
The krew plugin will take care of all the complexity automatically
as of the v0.4.0 release

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
@parth-gr
Member

> > Just a concern: can we have Rook install krew by default? Users often won't have krew installed and will be stuck when they reach these commands.
>
> Since krew is a client tool to be installed wherever they have kubectl, we can't really install it by default since there isn't a yaml that could install it at the same time as the operator.

Yes, I remember the discussion about why it was not merged into the operator image.

But we could still consider:

  1. Providing a separate YAML, like the toolbox.
  2. Or making it available through the operator image.

Contributor

@subhamkrai subhamkrai left a comment

LGTM!

@subhamkrai
Contributor

> > > Just a concern: can we have Rook install krew by default? Users often won't have krew installed and will be stuck when they reach these commands.
> >
> > Since krew is a client tool to be installed wherever they have kubectl, we can't really install it by default since there isn't a yaml that could install it at the same time as the operator.
>
> Yes, I remember the discussion about why it was not merged into the operator image.
>
> But we could still consider:
>
>   1. Providing a separate YAML, like the toolbox.
>   2. Or making it available through the operator image.

We don't yet have a process to install krew using YAML. This is how the official documentation suggests installing it: https://krew.sigs.k8s.io/docs/user-guide/setup/install/

@parth-gr
Member

> > > > Just a concern: can we have Rook install krew by default? Users often won't have krew installed and will be stuck when they reach these commands.
> > >
> > > Since krew is a client tool to be installed wherever they have kubectl, we can't really install it by default since there isn't a yaml that could install it at the same time as the operator.
> >
> > Yes, I remember the discussion about why it was not merged into the operator image.
> >
> > But we could still consider:
> >
> >   1. Providing a separate YAML, like the toolbox.
> >   2. Or making it available through the operator image.
>
> We don't yet have a process to install krew using YAML. This is how the official documentation suggests installing it: https://krew.sigs.k8s.io/docs/user-guide/setup/install/

Can't these steps be bundled into a make target?

@travisn
Member Author

travisn commented Oct 20, 2022

> Can't these steps be bundled into a make target?

If we had a make command, why not just have them run the command to install krew? Let's discuss in huddle.

@travisn travisn merged commit b7cca9d into rook:master Oct 20, 2022
@travisn travisn deleted the mon-restore-docs branch October 20, 2022 15:14
mergify bot added a commit that referenced this pull request Oct 20, 2022
docs: Use Krew restore-quorum command for mon quorum disaster scenario (backport #11184)
Development

Successfully merging this pull request may close these issues.

Automate restoring of mon quorum for disaster recovery
3 participants