The goal of this file is to have a place to easily commit answers to questions in a way that's easily searchable, and can make its way into official documentation later.
This is a supported operation. Today, the MCO does not have good support for "per node" configuration, and configuring things like static IP addresses and partition layouts by customizing the Ignition makes sense.
However, it's important to understand that these custom changes are "invisible" to the MCO today - they won't show up in `oc get machineconfig`. Hence, it's not as straightforward to make "day 2" changes to them.
In the future, it's likely the MCO will gain better support for per-node configuration as well as tools to more easily manipulate Ignition, so there is less need to edit the Ignition JSON directly.
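Today, a quick way to check which configuration the MCO does (and does not) know about - a sketch, assuming `oc` access to a running cluster:

```bash
# MachineConfigs the MCO manages; Ignition customizations made at install
# time will not appear here.
oc get machineconfig

# The rendered config currently applied to the worker pool.
oc describe machineconfigpool/worker
```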
Today, the MCO only blocks on upgrades of control plane nodes. `oc get clusterversion` effectively reports the version of the control plane. To watch the rollout of worker nodes, look at `oc describe machineconfigpool/worker` (as well as other custom pools, if any).
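As a rough sketch, the commands involved look like this:

```bash
# Reports the control plane version only.
oc get clusterversion

# Rollout status per pool; the UPDATED/UPDATING/DEGRADED columns summarize progress.
oc get machineconfigpool
oc describe machineconfigpool/worker
```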
There are two fundamental operators in OpenShift 4 that both include "machine" in their name:
The Machine Config Operator (this repository) manages code and configuration "inside" the OS (and targets specifically RHCOS).
The Machine API Operator manages "machine" objects which represent underlying IaaS virtual (or physical) machines.
In other words, they operate at fundamentally different levels, but they do interact. For example, both currently drain nodes: the MCO drains when it's applying changes, and machineAPI drains when a machine object with an associated node is deleted.
Another linkage between the two is booting an instance; in IaaS scenarios the "user data" field (managed by machineAPI) contains a "pointer Ignition config" that points to the Machine Config Server.
However, these repositories have distinct teams. Also, machineAPI is a derivative of the upstream Kubernetes "Cluster API" project, whereas the MCO is not.
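To see the two levels side by side, something like the following works (a sketch; the `openshift-machine-api` namespace applies to IaaS/IPI clusters):

```bash
# Machine API level: the IaaS machines backing the nodes.
oc get machines -n openshift-machine-api

# MCO level: OS-side configuration, grouped into pools.
oc get machineconfigpool
oc get machineconfig
```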
Usually, no. Today, the MCO does not try to claim "exclusive" ownership over everything on the host system; it's just not feasible to do.
If for example you write a daemonset that writes a custom systemd unit into e.g. `/etc/systemd/system`, or do so manually via `ssh`/`oc debug node` - OS upgrades will preserve that change (via libostree), and the MCO will not revert it. The MCO/MCD only changes files included in `MachineConfig`s; there is no code to look for "unknown" files.
Another case today is that the SDN operator will extract some binaries from its container image and drop them in `/opt` (which is really `/var/opt`).
Stated more generally, on an OSTree-managed system, all content in `/etc` and `/var` is preserved by default across upgrades.
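One way to see this locally on an RHCOS node (a sketch, via `oc debug node/<name>` followed by `chroot /host`):

```bash
# /opt is really part of /var on an OSTree system.
readlink /opt        # prints: var/opt

# Show local modifications to /etc relative to the deployment defaults.
ostree admin config-diff
```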
Further, rpm-ostree supports package layering and overrides - these will also (currently) be preserved by the MCO. Note, however, that there is no mechanism today to trigger an MCO-coordinated drain/reboot, which is particularly relevant for `rpm-ostree install/override` changes.
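A hedged sketch of what such a change looks like in practice ("foo" is an illustrative package name; draining and rebooting are your responsibility):

```bash
# On the node (e.g. via `oc debug node/<name>` + `chroot /host`):
rpm-ostree install foo

# The MCO will not coordinate a drain/reboot for this; drain the node
# yourself from a client machine (oc adm drain), then reboot:
systemctl reboot
```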
If a file that is managed by a MachineConfig is changed, the MCD will detect this and mark the node degraded. We go degraded rather than overwriting the change in order to avoid reboot loops.
In the future, we would like to harden things more so that these things are more controlled, and ideally avoid having any persistent "unmanaged" state. But it will take significant work to get there; and the status quo means that we can support other operators such as SDN (and e.g. nmstate) that may control parts of the host without the MCO's awareness.
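When the MCD does go degraded because a managed file was changed (as described above), a sketch of how to find the cause:

```bash
# Per-node MCD state is exposed via node annotations (Done/Working/Degraded,
# plus a reason).
oc get node <node-name> -o yaml | grep machineconfiguration.openshift.io

# Find the machine-config-daemon pod on that node and read its logs; they
# identify which managed file diverged.
oc -n openshift-machine-config-operator get pods -o wide | grep machine-config-daemon
oc -n openshift-machine-config-operator logs <mcd-pod-name> -c machine-config-daemon
```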
In clusters that are managed by the Machine API, see this question first.
Broadly speaking, a node can fail to join at two separate stages: inside the initramfs (Ignition), or in the real root. If Ignition fails, debugging this currently requires accessing the console of the affected machine. See also this issue.
If the system fails in the real root, and you have configured an SSH key for the cluster, you should be able to `ssh` to the node. A good first command is `systemctl --failed`. Important units to look at are `machine-config-daemon-firstboot.service` and `kubelet.service` - in general, the problem is likely to be some dependency of kubelet.
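A rough sketch of those first debugging steps (addresses are placeholders):

```bash
# From a host that can reach the node, using the SSH key configured for the cluster:
ssh core@<node-address>

# On the node: what failed to start?
systemctl --failed

# Dig into the usual suspects.
journalctl -b -u machine-config-daemon-firstboot.service
journalctl -b -u kubelet.service
```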
Not today. The MachineConfig doc discusses which sections of the rendered Ignition can be changed, and that does not include e.g. the Ignition `storage` section. For example, you cannot currently switch an existing worker node to be encrypted or use RAID after the fact - you must re-provision the system.
The MCO also does not currently support explicitly re-provisioning a system "in place"; however, this is likely to be a future feature. For now, in machineAPI-managed environments you should `oc delete` the corresponding `machine` object, or in UPI installations, cordon and drain the node, then delete the `node` object and re-provision.
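A hedged sketch of the two re-provisioning paths (names are placeholders; exact `oc adm drain` flags vary by client version):

```bash
# machineAPI-managed environments: delete the machine object; if a machineset
# owns it, a replacement will be provisioned automatically.
oc delete machine <machine-name> -n openshift-machine-api

# UPI installations: cordon, drain, delete the node object, then re-provision
# the host out of band.
oc adm cordon <node-name>
oc adm drain <node-name> --ignore-daemonsets --delete-emptydir-data
oc delete node <node-name>
```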
A further problem is that the MCO does not make it easy for new nodes to boot in the new configuration.
Related issues:
All this to say, it's quite hard to change storage layout with the MCO today, but this is a bug.
Yes, RHEL worker nodes will have an instance of the Machine Config Daemon running on them. However, only a subset of MCO functionality is supported on RHEL worker nodes. It is possible to create a MachineConfig to write files and `systemd` units to RHEL worker nodes, but it is not possible to manage OS updates, kernel arguments, or extensions on RHEL worker nodes.
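As an illustration of that supported subset, a minimal sketch of a MachineConfig that writes a file to the worker pool (the name, path, and Ignition version here are illustrative and may need adjusting for your cluster version):

```bash
oc apply -f - <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-example-file        # illustrative name
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/example-mco-managed-file
          mode: 420                   # 0644
          contents:
            source: data:,managed%20by%20the%20MCO
EOF
```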