FAQ: Document /run/machine-config-daemon-force #2265

Closed
5 changes: 5 additions & 0 deletions docs/FAQ.md
@@ -54,6 +54,11 @@ If a file that *is* managed by MachineConfig is changed, the MCD will detect thi

In the future, we would like to harden things more so that these things are more controlled, and ideally avoid having any persistent "unmanaged" state. But it will take significant work to get there; and the status quo means that we can support other operators such as SDN (and e.g. [nmstate](https://github.com/nmstate/kubernetes-nmstate)) that may control parts of the host without the MCO's awareness.

## Q: I changed something, how do I make the MCO revert it?

First, be sure you've read the above question. If it's a file that is managed by the MCO, then
use e.g. `oc debug node/$nodename -- chroot /host touch /run/machine-config-daemon-force`. See [mco#1086](https://github.com/openshift/machine-config-operator/pull/1086/) and [mco#1891](https://github.com/openshift/machine-config-operator/pull/1891). Wait for the next MCD sync.
Contributor

Sorry, one more thing. To be more specific, you can only use this to "skip validation". The MCD itself cannot revert anything; it can only overwrite files when an update is triggered. So this would only really work if you're seeing:
`unexpected on-disk state validating against rendered-xxx`
and, in the node annotations, currentConfig != desiredConfig.
In that case the force file causes validation to be skipped, and the new config gets applied without asking questions, most likely overwriting the old file.
If there was no update and you want to revert changes you made manually, creating the force file would cause nothing to happen. You would instead have to manually edit the desiredConfig, which is again something we don't want people to do without knowing the correct background.
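
To make the two conditions above concrete, here is a rough sketch (not MCO code) of the check someone could run before reaching for the force file. The function names are hypothetical, the annotation keys are assumed to be the `machineconfiguration.openshift.io/currentConfig` and `machineconfiguration.openshift.io/desiredConfig` annotations on the node, and the error text is the one quoted above. It assumes `oc` is on the PATH and logged in to the cluster.

```python
import json
import subprocess

# Assumed annotation keys on the node object.
CURRENT_KEY = "machineconfiguration.openshift.io/currentConfig"
DESIRED_KEY = "machineconfiguration.openshift.io/desiredConfig"
# Error text quoted in the comment above.
VALIDATION_ERROR = "unexpected on-disk state validating against"


def node_annotations(node_name):
    """Fetch a node's annotations with `oc get node <name> -o json`."""
    result = subprocess.run(
        ["oc", "get", "node", node_name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["metadata"].get("annotations", {})


def force_file_may_help(annotations, mcd_log_excerpt):
    """Return True only if an update is pending AND validation is failing.

    If currentConfig == desiredConfig there is no update to apply, so
    skipping validation would change nothing.
    """
    current = annotations.get(CURRENT_KEY)
    desired = annotations.get(DESIRED_KEY)
    if not current or not desired:
        return False
    update_pending = current != desired
    validation_failed = VALIDATION_ERROR in mcd_log_excerpt
    return update_pending and validation_failed
```

For example, `force_file_may_help(node_annotations("worker-0"), mcd_logs)` with a hypothetical node name and a string of MCD log output. A `True` result only says the force file would have an effect; it says nothing about what caused the drift in the first place.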

I'm hesitant to add this to the OpenShift docs because the hard part is determining the state you're in and what you can do to fix it. Force doesn't really fix anything and could make the cluster worse, since the fundamental cause of the drift might be something else. (We've had problems with community operators overwriting the CRI-O config, for example.)

Maybe, similarly, we should instead just say "the MCO can't reprovision nodes today. If you are seeing a validation error and just want to force an update to a new rendered config, do this"?

Contributor

Agree with Jerry's point about adding this to the OpenShift docs, and about the potential compounding danger of suggesting this without enough nuance.


## Q: How do I debug a node failing to join the cluster?

In clusters that are managed by the [Machine API](https://github.com/openshift/machine-api-operator/), see